Android Code Transformation



Instead of an introduction



It all started when I wanted to learn the finer points of configuring Gradle and to understand what it can do in Android development (and in general). I started with the build life cycle and books, gradually wrote simple tasks, tried to create my first Gradle plugin (in buildSrc), and then things took off.







Deciding to do something close to real-world Android development, I wrote a plugin that parses layout XML files and generates a Java object from them with references to the views. Then I experimented with transforming the application manifest (a real task on a work project required this), since after the transformation the manifest was about 5k lines long, and working with an XML file that size in the IDE is quite difficult.







So I figured out how to generate code and resources for an Android project, but over time I wanted something more. I had the idea that it would be cool to transform the AST (Abstract Syntax Tree) at compile time, as Groovy does out of the box. Such metaprogramming opens up many possibilities, limited only by your imagination.







So that the theory would not remain just theory, I decided to reinforce my study of the topic by building something useful for Android development. The first thing that came to mind was preserving state when system components are recreated. Roughly speaking, saving variables to the Bundle as simply as possible, with minimal boilerplate.







Where to begin?



  1. First, you need to understand how to access the necessary files in the Gradle life cycle in an Android project, which we will then transform.
  2. Secondly, when we get the necessary files, we need to understand how to properly transform them.


Let's start in order:







Access files at compile time



Since we will receive files at compile time, we need a Gradle plugin that intercepts the files and handles the transformation. The plugin in this case is as simple as possible. But first, here is what the build.gradle file of the module containing the plugin looks like:







    apply plugin: 'java-gradle-plugin'
    apply plugin: 'groovy'

    dependencies {
        implementation gradleApi()
        implementation 'com.android.tools.build:gradle:3.5.0'
        implementation 'com.android.tools.build:gradle-api:3.5.0'
        implementation 'org.ow2.asm:asm:7.1'
    }





  1. apply plugin: 'java-gradle-plugin' marks this as a module containing a Gradle plugin.
  2. apply plugin: 'groovy' is needed to be able to write in Groovy (the language doesn't matter here: you can write in Groovy, Java, Kotlin, or whatever you like). I originally got used to writing plugins in Groovy because it has dynamic typing, which is sometimes useful; when it is not needed, you can simply add the @TypeChecked annotation.
  3. implementation gradleApi() connects the Gradle API dependency, giving access to org.gradle.api.Plugin, org.gradle.api.Project, etc.
  4. 'com.android.tools.build:gradle:3.5.0' and 'com.android.tools.build:gradle-api:3.5.0' are needed to access the entities of the Android plugin.
  5. 'org.ow2.asm:asm:7.1' is the library for transforming bytecode; we'll talk about it later.


Let's move on to the plugin itself, as I said, it is quite simple:







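    import com.android.annotations.NonNull
    import com.android.build.gradle.BaseExtension
    import org.gradle.api.GradleException
    import org.gradle.api.Plugin
    import org.gradle.api.Project

    class YourPlugin implements Plugin<Project> {

        @Override
        void apply(@NonNull Project project) {
            // The transform can only be registered in an Android application or library module
            boolean isAndroidApp = project.plugins.findPlugin('com.android.application') != null
            boolean isAndroidLib = project.plugins.findPlugin('com.android.library') != null
            if (!isAndroidApp && !isAndroidLib) {
                throw new GradleException(
                    "'com.android.application' or 'com.android.library' plugin required."
                )
            }

            // Register the transform with the Android plugin
            BaseExtension androidExtension = project.extensions.findByType(BaseExtension.class)
            androidExtension.registerTransform(new YourTransform())
        }
    }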





Let's start with isAndroidApp and isAndroidLib: here we simply check that this is an Android application or library project and throw an exception if it is not. Next, we register YourTransform with the Android plugin through androidExtension. YourTransform is the entity that receives the required set of files and optionally transforms them; it must extend the abstract class com.android.build.api.transform.Transform.







Let's move on to YourTransform itself. First, consider the main methods that must be overridden:







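    import com.android.build.api.transform.QualifiedContent
    import com.android.build.api.transform.Transform
    import com.android.build.gradle.internal.pipeline.TransformManager

    class YourTransform extends Transform {

        // Unique name of the transform
        @Override
        String getName() { return YourTransform.simpleName }

        // Work on compiled classes only
        @Override
        Set<QualifiedContent.ContentType> getInputTypes() { return TransformManager.CONTENT_CLASS }

        // Limit the input to the current project
        @Override
        Set<? super QualifiedContent.Scope> getScopes() { return TransformManager.PROJECT_ONLY }

        // Incremental builds are not supported
        @Override
        boolean isIncremental() { return false }
    }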







That leaves the most important and most enjoyable method, transform(TransformInvocation transformInvocation), where the file transformation takes place. Unfortunately, I couldn't find a proper explanation of how to work with this method correctly; I found only Chinese articles and a few examples without much explanation. Here is one of the options.







What I understood while studying how to work with a transformer:







  1. All transforms are hooked into the build process as a chain. That is, you write logic that is inserted into an already established process. After your transform, another one runs, and so on.
  2. VERY IMPORTANT: even if you do not plan to transform a file (for example, you do not want to change the jar files that arrive as input), it still has to be copied unchanged to your output directory. This follows from the first point: if you do not pass a file along the chain to the next transform, then in the end the file simply will not exist.
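The two points above can be illustrated with a toy model that does not use the Transform API at all. In this minimal sketch, a "stage" is just a function from input files to output files, and every name (TransformChainSketch, goodStage, forgetfulStage, sampleInput) is hypothetical. It only shows why uncopied inputs disappear further down the chain.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a transform chain: the next stage sees only what the previous one wrote out. */
final class TransformChainSketch {

    /** A well-behaved stage: rewrites .class entries and copies everything else unchanged. */
    static Map<String, String> goodStage(Map<String, String> input) {
        Map<String, String> output = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : input.entrySet()) {
            if (entry.getKey().endsWith(".class")) {
                output.put(entry.getKey(), entry.getValue() + "+transformed");
            } else {
                output.put(entry.getKey(), entry.getValue()); // pass through untouched
            }
        }
        return output;
    }

    /** A broken stage: handles .class entries but forgets to copy the rest. */
    static Map<String, String> forgetfulStage(Map<String, String> input) {
        Map<String, String> output = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : input.entrySet()) {
            if (entry.getKey().endsWith(".class")) {
                output.put(entry.getKey(), entry.getValue() + "+transformed");
            }
        }
        return output;
    }

    /** Made-up input: one class file and one jar. */
    static Map<String, String> sampleInput() {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("Main.class", "bytes");
        input.put("lib.jar", "jar-bytes");
        return input;
    }

    public static void main(String[] args) {
        System.out.println("good stage output:      " + goodStage(sampleInput()).keySet());
        System.out.println("forgetful stage output: " + forgetfulStage(sampleInput()).keySet());
    }
}
```

After the forgetful stage, lib.jar is simply gone for every stage that follows, which is the situation the second point warns about.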


Consider what the transform method should look like:







    @Override
    void transform(
        TransformInvocation transformInvocation
    ) throws TransformException, InterruptedException, IOException {
        super.transform(transformInvocation)

        // No incremental support: start from a clean output directory
        transformInvocation.outputProvider.deleteAll()

        transformInvocation.inputs.each { transformInput ->
            // Directories with class files: transform them
            transformInput.directoryInputs.each { directoryInput ->
                File inputFile = directoryInput.getFile()
                File destFolder = transformInvocation.outputProvider.getContentLocation(
                    directoryInput.getName(),
                    directoryInput.getContentTypes(),
                    directoryInput.getScopes(),
                    Format.DIRECTORY
                )
                transformDir(inputFile, destFolder)
            }

            // Jar files: copy unchanged so that the next transform receives them
            transformInput.jarInputs.each { jarInput ->
                File inputFile = jarInput.getFile()
                File destFolder = transformInvocation.outputProvider.getContentLocation(
                    jarInput.getName(),
                    jarInput.getContentTypes(),
                    jarInput.getScopes(),
                    Format.JAR
                )
                FileUtils.copyFile(inputFile, destFolder)
            }
        }
    }





The input is a TransformInvocation, which contains all the information needed for the transformation. First, we clean the directory where the new files will be written by calling transformInvocation.outputProvider.deleteAll(). This is necessary because the transform does not support incremental builds, so old files must be deleted before the transformation.







Next, we iterate over all the inputs, and within each input over the directories and jar files. Note that all jar files are simply copied so that they pass on to the next transform. Moreover, they must be copied into your transform's own directory, build/intermediates/transforms/YourTransform/... The correct directory can be obtained using transformInvocation.outputProvider.getContentLocation.







Now consider the method that picks out the specific files to modify:







    private static void transformDir(File input, File dest) {
        // Recreate the destination directory from scratch
        if (dest.exists()) {
            FileUtils.forceDelete(dest)
        }
        FileUtils.forceMkdir(dest)

        String srcDirPath = input.getAbsolutePath()
        String destDirPath = dest.getAbsolutePath()
        for (File file : input.listFiles()) {
            // Mirror the source path under the destination directory
            String destFilePath = file.absolutePath.replace(srcDirPath, destDirPath)
            File destFile = new File(destFilePath)
            if (file.isDirectory()) {
                transformDir(file, destFile)
            } else if (file.isFile()) {
                if (file.name.endsWith(".class")
                    && !file.name.endsWith("R.class")
                    && !file.name.endsWith("BuildConfig.class")
                    && !file.name.contains("R\$")) {
                    transformSingleFile(file, destFile)
                } else {
                    FileUtils.copyFile(file, destFile)
                }
            }
        }
    }





As input we get the directory with the incoming files and the directory where the modified files should be written. We walk all the directories recursively and pick out the class files. Before the transformation there is one more small check that weeds out classes we do not need.







    if (file.name.endsWith(".class")
        && !file.name.endsWith("R.class")
        && !file.name.endsWith("BuildConfig.class")
        && !file.name.contains("R\$")) {
        transformSingleFile(file, destFile)
    } else {
        FileUtils.copyFile(file, destFile)
    }
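To see which file names pass this check, it can be pulled out into a small predicate. The following is a minimal Java sketch of the same four conditions; the class and method names (ClassFileFilter, shouldTransform) are made up for illustration.

```java
/** Sketch of the file-name check performed before a class file is transformed. */
final class ClassFileFilter {

    /** Mirrors the four conditions of the check: a class file that is not R, R$... or BuildConfig. */
    static boolean shouldTransform(String fileName) {
        return fileName.endsWith(".class")
                && !fileName.endsWith("R.class")
                && !fileName.endsWith("BuildConfig.class")
                && !fileName.contains("R$");
    }

    public static void main(String[] args) {
        String[] names = {
            "MainActivity.class", "R.class", "R$id.class", "BuildConfig.class", "UserR.class"
        };
        for (String name : names) {
            System.out.println(name + " -> " + shouldTransform(name));
        }
    }
}
```

One thing this makes visible is that the check works on name suffixes, so a class of your own whose name happens to end in "R" (UserR.class in the sketch) is also skipped and copied unchanged.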
      
      





That brings us to the transformSingleFile method, which leads into the second item of our original plan:







Secondly, when we get the necessary files, we need to understand how to properly transform them.


Transformation in all its glory



For more or less convenient transformation of the resulting class files there are several libraries: javassist, which lets you modify both bytecode and source code (so you do not have to dive into bytecode), and ASM, which modifies bytecode only and has two different APIs.







I opted for ASM because I wanted to dig into the bytecode structure and, in addition, its Core API parses files on the same event-based principle as a SAX parser, which gives high performance.







The transformSingleFile method may differ depending on which file modification tool you choose. In my case, it looks pretty simple:







    private static void transformClass(String inputPath, String outputPath) {
        // ClassReader reads the whole stream in its constructor, so it can be closed right away
        FileInputStream is = new FileInputStream(inputPath)
        ClassReader classReader = new ClassReader(is)
        is.close()

        // Let ASM compute the stack frames automatically
        ClassWriter classWriter = new ClassWriter(ClassWriter.COMPUTE_FRAMES)
        StaterClassVisitor adapter = new StaterClassVisitor(classWriter)
        classReader.accept(adapter, ClassReader.EXPAND_FRAMES)

        // Write the modified class to the output location
        byte[] newBytes = classWriter.toByteArray()
        FileOutputStream fos = new FileOutputStream(outputPath)
        fos.write(newBytes)
        fos.close()
    }





We create a ClassReader for reading the file and a ClassWriter for writing the new one. I use ClassWriter.COMPUTE_FRAMES to calculate stack frames automatically, since I've more or less dealt with Locals and Args_size (bytecode terminology), but I haven't done much with frames yet. Calculating frames automatically is a bit slower than doing it manually.

Then we create our StaterClassVisitor, which extends ClassVisitor, and pass it the classWriter. As a result, our file modification logic is layered on top of the standard ClassWriter. All Visitor entities in the ASM library are built this way. Finally, we produce the byte array for the new file and write the file.
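The layering described here is a chain of delegating visitors. The sketch below is a toy model of that idea in plain Java and deliberately does not use ASM: ToyReader, ToyVisitor, ToyWriter and ToyAdapter are hypothetical stand-ins for the roles played by ClassReader, ClassVisitor, ClassWriter and StaterClassVisitor, and the real classes have many more visit methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Base visitor: by default every event is forwarded to the next visitor in the chain. */
class ToyVisitor {
    protected final ToyVisitor next;

    ToyVisitor(ToyVisitor next) { this.next = next; }

    void visitField(String name) { if (next != null) next.visitField(name); }

    void visitEnd() { if (next != null) next.visitEnd(); }
}

/** End of the chain (the writer role): records every event it receives. */
class ToyWriter extends ToyVisitor {
    final List<String> out = new ArrayList<>();

    ToyWriter() { super(null); }

    @Override void visitField(String name) { out.add("field " + name); }

    @Override void visitEnd() { out.add("end"); }
}

/** Our own logic (the adapter role): injects one extra field before the class ends. */
class ToyAdapter extends ToyVisitor {
    ToyAdapter(ToyVisitor next) { super(next); }

    @Override void visitEnd() {
        super.visitField("generatedField"); // extra event sent down the chain
        super.visitEnd();                   // then forward the original event
    }
}

/** Event source (the reader role): walks the "class" and fires events at the first visitor. */
class ToyReader {
    static void accept(ToyVisitor visitor, String... fields) {
        for (String field : fields) {
            visitor.visitField(field);
        }
        visitor.visitEnd();
    }

    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        ToyReader.accept(new ToyAdapter(writer), "savedInt");
        System.out.println(writer.out);
    }
}
```

The adapter only overrides the events it cares about; everything else flows through unchanged to the writer, which is why an untouched class comes out identical to its input.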







What follows are the details of how I applied this theory in practice.







Saving State in the Bundle Using Annotation



So, I set myself the task of getting rid of as much of the boilerplate for saving data in the Bundle as possible when an Activity is recreated. I wanted it to look like this:







    public class MainActivityJava extends AppCompatActivity {

        @State
        private int savedInt = 0;





But for now, in order to maximize efficiency, I did it like this (I will explain why below):







    @Stater
    public class MainActivityJava extends AppCompatActivity {

        @State(StateType.INT)
        private int savedInt = 0;





And it really works! After the transformation, the MainActivityJava code looks like this:







    @Stater
    public class MainActivityJava extends AppCompatActivity {

        @State(StateType.INT)
        private int savedInt = 0;

        protected void onCreate(@Nullable Bundle savedInstanceState) {
            if (savedInstanceState != null) {
                this.savedInt = savedInstanceState.getInt("com/example/stater/MainActivityJava_savedInt");
            }
            super.onCreate(savedInstanceState);
        }

        protected void onSaveInstanceState(@NonNull Bundle outState) {
            outState.putInt("com/example/stater/MainActivityJava_savedInt", this.savedInt);
            super.onSaveInstanceState(outState);
        }
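The Bundle keys in the generated code look like the internal name of the owning class, an underscore, and the field name. The sketch below assumes exactly that scheme; it is inferred from the single key shown in this article, not taken from the plugin's source, and the names StateKeys, internalName and keyFor are hypothetical.

```java
/** Sketch of a Bundle key scheme: "<internal class name>_<field name>" (an assumption). */
final class StateKeys {

    /** Converts a dotted class name to the JVM internal form with slashes. */
    static String internalName(String className) {
        return className.replace('.', '/');
    }

    /** Builds the key under which one field of one class is stored. */
    static String keyFor(String ownerInternalName, String fieldName) {
        return ownerInternalName + "_" + fieldName;
    }

    public static void main(String[] args) {
        String owner = internalName("com.example.stater.MainActivityJava");
        System.out.println(keyFor(owner, "savedInt"));
    }
}
```

Including the owner in the key keeps fields with the same name in a base class and a subclass from overwriting each other in the same Bundle.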
      
      





The idea is very simple, let's move on to implementation.

The Core API does not give you the full structure of the class file at once; we have to collect all the data we need in specific methods. If you look at StaterClassVisitor, you can see that in the visit method we get information about the class, and in visitAnnotation we check whether our class is marked with the @Stater annotation.







Then our ClassVisitor goes through all the fields of the class, calling the visitField method; if the class needs to be transformed, it returns our StaterFieldVisitor:







    @Override
    FieldVisitor visitField(int access, String name, String descriptor, String signature, Object value) {
        FieldVisitor fv = super.visitField(access, name, descriptor, signature, value)
        if (needTransform) {
            // Wrap the default visitor so that field annotations can be inspected
            return new StaterFieldVisitor(fv, name, descriptor, owner)
        }
        return fv
    }





StaterFieldVisitor checks for the @State annotation and, in turn, returns a StateAnnotationVisitor from its visitAnnotation method:







    @Override
    AnnotationVisitor visitAnnotation(String descriptor, boolean visible) {
        AnnotationVisitor av = super.visitAnnotation(descriptor, visible)
        if (descriptor == Descriptors.STATE) {
            // Only @State fields are of interest
            return new StateAnnotationVisitor(av, this.name, this.descriptor, this.owner)
        }
        return av
    }





That visitor, in turn, builds the list of fields to save and restore:







    @Override
    void visitEnum(String name, String descriptor, String value) {
        // The enum value is the StateType given in @State(...)
        String typeString = (String) value
        SaverField field = new SaverField(this.name, this.descriptor, this.owner, StateType.valueOf(typeString))
        Const.stateFields.add(field)
        super.visitEnum(name, descriptor, value)
    }





The result is a tree-like structure of visitors that ends up building a list of SaverField models with all the information we need to generate the state-saving code.

Next, our ClassVisitor starts going through the methods and transforms onCreate and onSaveInstanceState. If those methods are not found, they are generated from scratch in visitEnd (called after the whole class has been visited).
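The "patch it if it exists, otherwise generate it at the end" decision can be sketched without ASM as a small bookkeeping class. This is only a model of the control flow described above: LifecycleMethodTracker and its method names are hypothetical, and a real visitor would also compare method descriptors, not just names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Tracks which lifecycle methods were seen so the missing ones can be generated at the end. */
final class LifecycleMethodTracker {
    private static final List<String> REQUIRED = Arrays.asList("onCreate", "onSaveInstanceState");

    private final Set<String> seen = new LinkedHashSet<>();

    /** Called once per method; returns true when the method should be patched in place. */
    boolean visitMethod(String name) {
        if (REQUIRED.contains(name)) {
            seen.add(name);
            return true;
        }
        return false;
    }

    /** Called after the whole class was visited; returns the methods that still have to be generated. */
    List<String> visitEnd() {
        List<String> missing = new ArrayList<>(REQUIRED);
        missing.removeAll(seen);
        return missing;
    }

    public static void main(String[] args) {
        LifecycleMethodTracker tracker = new LifecycleMethodTracker();
        tracker.visitMethod("onCreate");
        tracker.visitMethod("onResume");
        System.out.println("to generate: " + tracker.visitEnd());
    }
}
```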







Where is the bytecode?



The most interesting part starts in the classes OnCreateVisitor and OnSavedInstanceStateVisitor. To modify bytecode correctly, you need at least a rough idea of its structure. All ASM methods and opcodes closely mirror the actual bytecode instructions, which lets you work with the same concepts.

Consider an example of modifying the onCreate method and compare it with the generated code:







    if (savedInstanceState != null) {
        this.savedInt = savedInstanceState.getInt("com/example/stater/MainActivityJava_savedInt");
    }





The null check on the bundle corresponds to the following instructions:







    Label l1 = new Label()
    mv.visitVarInsn(Opcodes.ALOAD, 1)
    mv.visitJumpInsn(Opcodes.IFNULL, l1)
    //...
    mv.visitLabel(l1)





In plain terms:







  1. Create a label l1 (just a marker that can be jumped to).
  2. Load the reference variable with index 1 onto the operand stack. Since index 0 in an instance method always holds the reference to this, index 1 here is the reference to the Bundle passed as the argument.
  3. Perform the null check itself, jumping to the l1 label if the reference is null. visitLabel(l1) is placed after the code that works with the bundle.


When working with the bundle, we go over the list of collected fields and emit the PUTFIELD instruction for each one, that is, an assignment to the field. Let's look at the code:







    // Push `this` (target of PUTFIELD), the bundle, and the key
    mv.visitVarInsn(Opcodes.ALOAD, 0)
    mv.visitVarInsn(Opcodes.ALOAD, 1)
    mv.visitLdcInsn(field.key)

    // Wrapper types are stored as Serializable
    final StateType type = MethodDescriptorUtils.primitiveIsObject(field.descriptor) ? StateType.SERIALIZABLE : field.type
    MethodDescriptor methodDescriptor = MethodDescriptorUtils.getDescriptorByType(type, true)
    if (methodDescriptor == null || !methodDescriptor.isValid()) {
        throw new IllegalStateException("StateType for ${field.name} in ${field.owner} is unknown!")
    }

    // Call the matching getter on the bundle
    mv.visitMethodInsn(
        Opcodes.INVOKEVIRTUAL,
        Types.BUNDLE,
        methodDescriptor.method,
        "(${Descriptors.STRING})${methodDescriptor.descriptor}",
        false
    )

    // cast
    if (type == StateType.SERIALIZABLE
        || type == StateType.PARCELABLE
        || type == StateType.PARCELABLE_ARRAY
        || type == StateType.IBINDER
    ) {
        mv.visitTypeInsn(Opcodes.CHECKCAST, Type.getType(field.descriptor).internalName)
    }

    // Assign the value to the field
    mv.visitFieldInsn(Opcodes.PUTFIELD, field.owner, field.name, field.descriptor)





MethodDescriptorUtils.primitiveIsObject checks whether the variable has a wrapper type; if so, the variable is treated as Serializable. Then the getter on the bundle is called, the result is cast if necessary, and it is assigned to the field.
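As a rough illustration of this step, the sketch below shows one way a wrapper-type check on a field descriptor and the choice of Bundle getter could look. It is a hypothetical stand-in, not the code behind MethodDescriptorUtils: the class BundleGetterSketch and its methods are invented here, and only the int and wrapper cases are covered. The getter names getInt and getSerializable and the JVM descriptors are the real ones.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: pick a Bundle getter from a JVM field descriptor. */
final class BundleGetterSketch {

    /** Descriptors of the eight primitive wrapper classes. */
    private static final Set<String> WRAPPERS = new HashSet<>(Arrays.asList(
            "Ljava/lang/Boolean;", "Ljava/lang/Byte;", "Ljava/lang/Character;",
            "Ljava/lang/Short;", "Ljava/lang/Integer;", "Ljava/lang/Long;",
            "Ljava/lang/Float;", "Ljava/lang/Double;"));

    static boolean isWrapper(String fieldDescriptor) {
        return WRAPPERS.contains(fieldDescriptor);
    }

    /** Name of the Bundle getter to call for a field of the given type. */
    static String getterFor(String fieldDescriptor) {
        if (isWrapper(fieldDescriptor)) {
            return "getSerializable"; // all wrapper classes implement Serializable
        }
        if ("I".equals(fieldDescriptor)) {
            return "getInt";
        }
        throw new IllegalStateException("No getter known for " + fieldDescriptor);
    }

    /** Method descriptor of a getter that takes the String key and returns the given type. */
    static String getterDescriptor(String returnDescriptor) {
        return "(Ljava/lang/String;)" + returnDescriptor;
    }

    public static void main(String[] args) {
        System.out.println(getterFor("I") + getterDescriptor("I"));
        System.out.println(getterFor("Ljava/lang/Integer;")
                + getterDescriptor("Ljava/io/Serializable;"));
    }
}
```

For the wrapper case the getter returns a Serializable, which is why a CHECKCAST to the field's own type is needed before the value can be stored with PUTFIELD.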







That's all. Code generation for the onSaveInstanceState method works in a similar way (see the example).







Problems I encountered
  1. The first snag is that the @Stater annotation has to be added. Your activity or fragment may inherit from some BaseActivity, which makes it much harder to determine whether state should be saved. You would have to walk all the parents of the class to find out that it really is an Activity. That could also slow down the build (in the future I plan to get rid of the @Stater annotation in the most efficient way possible).
  2. The reason for explicitly specifying StateType is the same as for the first snag. The class has to be parsed further to find out whether it is Parcelable or Serializable. But I already have ideas for getting rid of StateType :).
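To make the cost of the first snag concrete, here is a minimal sketch of walking the parents of a class. It assumes the class-to-superclass relation has already been collected into a map, which in a real transform would itself mean reading every class file on the path up the hierarchy. The names SuperclassWalker, isSubclassOf and sampleHierarchy, and the hierarchy itself, are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: decide whether a class is an Activity by following its superclass chain. */
final class SuperclassWalker {

    /** Follows superOf from className upwards until target is found or the chain ends. */
    static boolean isSubclassOf(String className, String target, Map<String, String> superOf) {
        String current = className;
        while (current != null) {
            if (current.equals(target)) {
                return true;
            }
            current = superOf.get(current); // null once we leave the known hierarchy
        }
        return false;
    }

    /** A made-up, simplified hierarchy keyed by internal class names. */
    static Map<String, String> sampleHierarchy() {
        Map<String, String> superOf = new HashMap<>();
        superOf.put("com/example/MainActivity", "com/example/BaseActivity");
        superOf.put("com/example/BaseActivity", "android/app/Activity");
        superOf.put("com/example/UserModel", "java/lang/Object");
        return superOf;
    }

    public static void main(String[] args) {
        Map<String, String> superOf = sampleHierarchy();
        System.out.println(isSubclassOf("com/example/MainActivity", "android/app/Activity", superOf));
        System.out.println(isSubclassOf("com/example/UserModel", "android/app/Activity", superOf));
    }
}
```

The same kind of walk over implemented interfaces would be needed to detect Parcelable or Serializable, which is the extra parsing the second point refers to.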


A little bit about performance



To check, I created 10 activities, each with 46 saved fields of different types, and measured with the command ./gradlew :app:clean :app:assembleDebug. The time taken by my transformation ranged from 108 to 200 ms.







Advice





To summarize



Code modification is a powerful tool. With it, you can implement many ideas. ProGuard, Realm, Robolectric and other frameworks work on this principle. AOP is also possible precisely thanks to code transformation.

Knowledge of the bytecode structure also lets a developer understand what the code they wrote ends up being compiled into. And when modifying it, you do not need to think about which language the code was written in, Java or Kotlin; you modify the bytecode directly.







This topic seemed very interesting to me. The main difficulties came from getting to grips with Google's Transform API, since it is not blessed with much documentation or many examples. ASM, unlike the Transform API, has excellent documentation, including a very detailed guide in the form of a 150-page PDF. And since the framework methods are very similar to real bytecode instructions, the guide is doubly useful.







I don't think my immersion in transformation and bytecode ends here; I will continue to study and, maybe, write something else.







References



Github example

ASM

Habr article about bytecode

A little more about bytecode

Transform API

Well, reading the documentation







