Going all in: from ORM to bytecode analysis

As you know, a real programmer must do three things in life: create a programming language, write an operating system and build an ORM. I wrote my language a long time ago (maybe I'll tell that story some other time), and the OS is still ahead of me, so today I want to talk about the ORM. More precisely, not about the ORM itself, but about implementing one small, local and, as it seemed, quite simple feature.







Together we will go all the way from the joy of finding a simple solution to the bitter realization of how fragile and incorrect it is. From using only the public API to dirty hacks. From "almost no reflection" to "knee-deep in a bytecode interpreter".







If you want to know how to analyze bytecode, what difficulties it hides and what a stunning result you can get in the end, read on.







Contents

1 - How it all began.

2-4 - On the way to bytecode.

5 - What bytecode is.

6 - The analysis itself. This chapter is what the whole article was written for, and this is where all the guts are.

7 - What else could be improved? Dreams, dreams…

Afterword - Afterword.







UPD: Right after publication, parts 6-8 were lost (the very parts all of this was written for). Fixed.









Part one. Problem



Imagine a simple schema. There is a client, and the client has several accounts, one of which is the default. A client can also have several SIM cards; each SIM card either has an explicit account or uses the client's default one.













This is how the model is described in our code (getters, setters, constructors and so on are omitted).







@JdbcEntity(table = "CLIENT")
public class Client {
    @JdbcId
    private Long id;
    @JdbcColumn
    private String name;
    @JdbcJoinedObject(localColumn = "DEFAULTACCOUNT")
    private Account defaultAccount;
}

@JdbcEntity(table = "ACCOUNT")
public class Account {
    @JdbcId
    private Long id;
    @JdbcColumn
    private Long balance;
    @JdbcJoinedObject(localColumn = "CLIENT")
    private Client client;
}

@JdbcEntity(table = "CARD")
public class Card {
    @JdbcId
    private Long id;
    @JdbcColumn
    private String msisdn;
    @JdbcJoinedObject(localColumn = "ACCOUNT")
    private Account account;
    @JdbcJoinedObject(localColumn = "CLIENT")
    private Client client;
}
      
      





The ORM itself has two requirements: no proxies (we must create an instance of exactly this class) and a single query. Accordingly, this is the SQL sent to the database when we try to get a card.







select
    CARD.id id,
    CARD.msisdn msisdn,
    ACCOUNT_2.id ACCOUNT_2_id,
    ACCOUNT_2.balance ACCOUNT_2_balance,
    CLIENT_3.id CLIENT_3_id,
    CLIENT_3.name CLIENT_3_name,
    CLIENT_1.id CLIENT_1_id,
    CLIENT_1.name CLIENT_1_name,
    ACCOUNT_4.id ACCOUNT_4_id,
    ACCOUNT_4.balance ACCOUNT_4_balance
from CARD
left outer join CLIENT CLIENT_1 on CARD.CLIENT = CLIENT_1.id
left outer join ACCOUNT ACCOUNT_2 on CARD.ACCOUNT = ACCOUNT_2.id
left outer join CLIENT CLIENT_3 on ACCOUNT_2.CLIENT = CLIENT_3.id
left outer join ACCOUNT ACCOUNT_4 on CLIENT_1.DEFAULTACCOUNT = ACCOUNT_4.id;
      
      





Oops. The client and the account are duplicated. If you think about it, this is understandable: the framework does not know that the card's client and the client of the card's account are the same client. And the query must be generated statically, and there must be only one (remember the single-query restriction?).

By the way, for exactly the same reason the fields Card.account.client.defaultAccount and Card.client.defaultAccount.client are missing here. Only we know that client and client.defaultAccount.client always match. The framework does not know this; to it, this is an arbitrary reference. What to do in such cases is not very clear. I know three options:







  1. Explicitly describe the invariants in annotations.
  2. Make recursive queries (with recursive / connect by).
  3. Give up on it.


Guess which option we chose? Right. As a result, recursive fields are currently not filled at all and always contain null.







But if you look closely, behind the duplication you can see a second problem, and it is much worse. What did we want? The card number and the balance. And what did we get? 4 joins and 10 columns. And this thing grows exponentially! We really do end up in a situation where first, for the sake of beauty and integrity, we describe the model completely in annotations, and then, for the sake of 5 fields, a query with 15 joins and 150 columns goes to the database. At this point it becomes really scary.









Part two. A working but awkward solution



A simple solution immediately suggests itself: fetch only the columns that will actually be used! Easy to say. The most obvious option (writing the select by hand) we discard right away; we did not describe the model just to avoid using it. Quite a long time ago a special method was added: partialGet. Unlike plain get, it accepts a List<String> with the names of the fields that need to be filled. To use it, you first need to assign aliases to the tables







 @JdbcJoinedObject(localColumn = "ACCOUNT", sqlTableAlias = "a")
 private Account account;
 @JdbcJoinedObject(localColumn = "CLIENT", sqlTableAlias = "c")
 private Client client;
      
      





And then enjoy the result.







 List<String> requiredColumns = asList("msisdn", "c_a_balance", "a_balance");
 String query = cardMapper.getSelectSQL(requiredColumns, DatabaseType.ORACLE);
 System.out.println(query);
      
      





 select
     CARD.msisdn msisdn,
     c_a.balance c_a_balance,
     a.balance a_balance
 from CARD
 left outer join ACCOUNT a on CARD.ACCOUNT = a.id
 left outer join CLIENT c on CARD.CLIENT = c.id
 left outer join ACCOUNT c_a on c.DEFAULTACCOUNT = c_a.id;
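Judging by the query above, a requested column name encodes the join path: c_a_balance is the balance column reached through alias c and then alias c_a. A minimal sketch of how the set of joins could be derived from such names, assuming exactly that naming scheme. RequiredJoins and aliasesFor are hypothetical names for illustration, not the ORM's actual code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RequiredJoins {
    /**
     * Returns the table aliases that have to be joined to select the given columns.
     * Assumes a column name looks like "alias1_alias2_column", with no underscores inside
     * an alias or a column; a name without '_' belongs to the root table.
     */
    static Set<String> aliasesFor(List<String> requiredColumns) {
        Set<String> aliases = new LinkedHashSet<>();
        for (String column : requiredColumns) {
            int last = column.lastIndexOf('_');
            if (last < 0) continue; // column of the root table, no join needed
            String path = column.substring(0, last); // e.g. "c_a"
            // every prefix of the path is a join of its own: "c" first, then "c_a"
            for (int i = path.indexOf('_'); i >= 0; i = path.indexOf('_', i + 1)) {
                aliases.add(path.substring(0, i));
            }
            aliases.add(path);
        }
        return aliases;
    }
}
```

For the three columns requested above this yields the aliases c, c_a and a, which matches the three joins in the generated query.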
      
      





Everything seems fine, but in fact it is not. Here is how it will be used in real code.







 Card card = cardDAO.partialGet(cardId, "msisdn", "c_a_balance", "a_balance");
 ...
 ...
 long clientId = card.getClient().getId(); // NPE: client was not requested, so it was never filled in
      
      





So partialGet can only be used when there are just a few lines between the call and the use of its result. If the result travels far or, God forbid, is passed into some method, it becomes extremely difficult to understand later which fields are filled and which are not. Moreover, if an NPE happens somewhere, you still have to figure out whether null really came from the database or whether we simply did not fill this field. All in all, very unreliable.







You can, of course, write a separate object with its own mapping specifically for the query, or even write the whole select by hand and collect the result into some Tuple. In fact, that is what we currently do in most places. Still, I would rather not write selects by hand or duplicate the mapping.









Part three. A convenient but non-working solution



If you think a little more, the answer comes rather quickly: use interfaces. Simply declare







 public interface MsisdnAndBalance {
     String getMsisdn();
     long getBalance();
 }
      
      





And use







 MsisdnAndBalance card = cardDAO.partialGet(cardId, ...);
      
      





And that's all. Nothing extra can be called. Moreover, with a move to Kotlin / Java 10 / Lombok, even this awful explicit type can go away. But the most important point is still missing: what arguments should be passed to partialGet? Strings, as before, are no longer an option: the risk of making a mistake and leaving out a needed field is too great. What we want is to be able to write something like







 MsisdnAndBalance card = cardDAO.partialGet(cardId, MsisdnAndBalance.class);
      
      





Or, even better, in Kotlin with reified generics







 val card = cardDAO.partialGet<MsisdnAndBalance>(cardId)
      
      





Ah, what a beauty. The rest of this story is, in fact, the implementation of exactly this option.
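The starting point for such a partialGet is cheap: plain reflection already tells us which methods the projection interface declares. A minimal sketch; ProjectionMethods is a hypothetical helper name, and working out which fields each method needs is the hard part that the rest of the article deals with.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

interface MsisdnAndBalance {
    String getMsisdn();
    long getBalance();
}

class ProjectionMethods {
    /** Names of all abstract methods of the projection interface, sorted for stable output. */
    static Set<String> abstractMethodNames(Class<?> projection) {
        if (!projection.isInterface())
            throw new IllegalArgumentException(projection + " is not an interface");
        Set<String> names = new TreeSet<>();
        for (Method m : projection.getMethods()) {
            // default and static interface methods need no data of their own
            if (Modifier.isAbstract(m.getModifiers())) names.add(m.getName());
        }
        return names;
    }
}
```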









Part four. On the way to bytecode



The key problem is that the methods come from the interface, while the annotations sit on fields. So we need to find the fields from the methods. The first and most obvious idea is to use the standard Java Bean convention, and for trivial properties it even works. But it turns out to be very fragile. For example, rename a method in the interface (via an IDEA refactoring) and everything instantly falls apart. IDEA is smart enough to rename the methods in the implementing classes, but not smart enough to understand that this was a getter and the field itself must be renamed too. This approach also leads to duplicated fields. For example, if I need a getClientId() method in my interface, I cannot implement it in the only correct way.







 public class Client implements HasClientId {
     private Long id;

     @Override
     public Long getClientId() {
         return id;
     }
 }

 public class Card implements HasClientId {
     private Client client;

     @Override
     public Long getClientId() {
         return client.getId();
     }
 }
      
      





Instead I have to duplicate fields: carry both id and clientId in Client, and keep a clientId in Card next to client, and then make sure they never diverge. Moreover, I would like getters with non-trivial logic to work too, for example







 public class Card implements HasBalance {
     private Account account;
     private Client client;

     public long getBalance() {
         if (account != null)
             return account.getBalance();
         else
             return client.getDefaultAccount().getBalance();
     }
 }
      
      





So searching by name is out; we need something more cunning.
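To make the failure concrete, here is a minimal sketch of a lookup that relies on nothing but the getXxx -> xxx naming convention. BeanNameLookup and its nested classes are made up for this illustration. It finds the field behind getClient, but has nothing to offer for getClientId, which is a perfectly reasonable method.

```java
import java.lang.reflect.Field;
import java.util.Optional;

class BeanNameLookup {
    static class Client {
        Long id;
        public Long getId() { return id; }
    }

    static class Card {
        Client client;
        public Client getClient() { return client; }
        public Long getClientId() { return client.getId(); }
    }

    /** Finds the field behind a getter using only the getXxx -> xxx naming convention. */
    static Optional<Field> fieldByConvention(Class<?> entity, String getterName) {
        if (!getterName.startsWith("get") || getterName.length() == 3) return Optional.empty();
        String fieldName = Character.toLowerCase(getterName.charAt(3)) + getterName.substring(4);
        try {
            return Optional.of(entity.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            return Optional.empty(); // no field with the matching name
        }
    }
}
```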







The next idea was completely insane and did not live long in my head, but for the sake of completeness I will describe it too. At the parsing stage we can create an empty entity, write values into its fields one at a time, then call the getters and see whether what they return has changed. We would see that the value of getClientId does not change when name is written, but does change when id is written. This even automatically supports the case where the getter and the field have different types (such as isActive() = i_active != 0). But there are at least three serious problems here (maybe more, but I did not think any further).







  1. An obvious requirement on the entity under this algorithm is that a getter returns the "same" value if the "corresponding" field has not changed. "Same" from the point of view of the comparison operator we choose. It obviously cannot be == (otherwise something like getAsInt() = Integer.parseInt(strField) stops working). That leaves equals. So if a getter returns a user-defined object built from the fields on every call, that object must override equals.
  2. Lossy mappings, as in the int -> boolean example above. If we probe with the values 0 and 1, we will see the change. But with 40 and 42 we get true both times.
  3. Getters may contain complex converters that rely on certain invariants in the fields (for example, a special string format), and on our made-up data they will throw exceptions.


So, all in all, this option does not work either.
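The second problem is easy to reproduce. Below is a minimal sketch of the probing idea; GetterProbe and getterDependsOn are hypothetical names, and this illustrates the rejected approach rather than code from the ORM. It writes two values into a field via reflection and compares what the getter returns.

```java
import java.lang.reflect.Field;
import java.util.Objects;
import java.util.function.Function;

class GetterProbe {
    static class Flag {
        int i_active;
        public boolean isActive() { return i_active != 0; }
    }

    /**
     * Writes two different values into the field and reports whether the getter result changed.
     * Returns false when the getter maps both probe values to equal results.
     */
    static <T> boolean getterDependsOn(T entity, String fieldName, Object first, Object second,
                                       Function<? super T, ?> getter) {
        try {
            Field field = entity.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(entity, first);
            Object before = getter.apply(entity);
            field.set(entity, second);
            Object after = getter.apply(entity);
            return !Objects.equals(before, after);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Probing Flag.i_active with 0 and 1 detects the dependency, while probing with 40 and 42 reports none, although isActive reads the same field in both cases.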







While discussing all this I said, at first as a joke, "oh, forget it, it's easier to look at the bytecode, everything is written there". At that moment I had no idea that this thought would consume me, or how far it would all go.









Part five. What bytecode is and how it works



new #4, dup, invokespecial #5, areturn





If you understand what is written here and what this code does, then you can skip to the next section.







Disclaimer 1. Unfortunately, following the rest of the story requires at least a basic idea of what Java bytecode looks like, so I will write a couple of paragraphs about it. I make no claim to completeness.







Disclaimer 2. This is exclusively about method bodies. I will not say a word about the constant pool, the structure of a class as a whole, or even the method declarations themselves.







The main thing to understand about bytecode is that it is the assembly language of the Java stack-based virtual machine. This means that instructions take their arguments from the stack and push their results back onto the stack. From this point of view, bytecode can be said to be written in reverse Polish notation. In addition to the stack, a method has an array of local variables. On entry to the method, this and all the method's arguments are written into it, and during execution local variables are stored there as well. Here is a simple example.







 public class Foo {
     private int bar;

     public int updateAndReturn(long baz, String str) {
         int result = (int) baz;
         result += str.length();
         bar = result;
         return result;
     }
 }
      
      





I will write comments in the format







 # [(<local_variable_index>:<actual_value>)*], [(<value_on_stack>)*]
      
      





Top of the stack on the left.







 public int updateAndReturn(long, java.lang.String);
   Code:
     # [0:this, 1:long baz, 3:str], ()
      0: lload_1
     # [0:this, 1:long baz, 3:str], (long baz)
      1: l2i
     # [0:this, 1:long baz, 3:str], (int baz)
      2: istore 4
     # [0:this, 1:long baz, 3:str, 4:int baz], ()
      4: iload 4
     # [0:this, 1:long baz, 3:str, 4:int baz], (int baz)
      6: aload_3
     # [0:this, 1:long baz, 3:str, 4:int baz], (str, int baz)
      7: invokevirtual #2 // Method java/lang/String.length:()I
     # [0:this, 1:long baz, 3:str, 4:int baz], (length(str), int baz)
     10: iadd
     # [0:this, 1:long baz, 3:str, 4:int baz], (length(str) + int baz)
     11: istore 4
     # [0:this, 1:long baz, 3:str, 4:length(str) + int baz], ()
     13: aload_0
     # [0:this, 1:long baz, 3:str, 4:length(str) + int baz], (this)
     14: iload 4
     # [0:this, 1:long baz, 3:str, 4:length(str) + int baz], (length(str) + int baz, this)
     16: putfield #3 // Field bar:I
     # [0:this, 1:long baz, 3:str, 4:length(str) + int baz], (), the value has been written to the field bar
     19: iload 4
     # [0:this, 1:long baz, 3:str, 4:length(str) + int baz], (length(str) + int baz)
     21: ireturn
     # the int on top of the stack is returned from the method
      
      





There are a lot of instructions. The full list is in chapter six of the JVMS, and there is a brief summary on Wikipedia. Many instructions duplicate each other for different types (for example, iload for int and lload for long). There are also dedicated instructions for the first 4 local variables: in the example above, lload_1 takes no arguments at all, whereas plain lload takes the local variable index as an argument, just like the iload in the same example.







Globally, we will be interested in the following groups of instructions:







  1. *load*, *store* - read / write a local variable
  2. *aload, *astore - read / write an array element by index
  3. getfield, putfield - read / write a field
  4. getstatic, putstatic - read / write a static field
  5. checkcast - cast between object types. It is needed because the values on the stack and in local variables are typed. Primitives have their own instructions; for example, l2i above did the long -> int cast.
  6. invoke* - method call
  7. *return - return a value and exit the method
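To make the interplay of the operand stack and the local variable array more tangible, here is a toy interpreter for a handful of int instructions. It is a deliberately tiny model written for this article, not the JVM: the opcode names mirror the real ones, but ToyStackMachine and Insn are invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyStackMachine {
    enum Op { ILOAD, ISTORE, ICONST, IADD, IRETURN }

    static final class Insn {
        final Op op;
        final int arg;
        Insn(Op op, int arg) { this.op = op; this.arg = arg; }
        static Insn of(Op op, int arg) { return new Insn(op, arg); }
        static Insn of(Op op) { return new Insn(op, 0); }
    }

    /** Runs the code against the given local variable array and returns the IRETURN value. */
    static int run(int[] locals, Insn... code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : code) {
            switch (insn.op) {
                case ICONST: stack.push(insn.arg); break;             // constant -> stack
                case ILOAD:  stack.push(locals[insn.arg]); break;      // local -> stack
                case ISTORE: locals[insn.arg] = stack.pop(); break;    // stack -> local
                case IADD:   stack.push(stack.pop() + stack.pop()); break;
                case IRETURN: return stack.pop();
            }
        }
        throw new IllegalStateException("code ended without IRETURN");
    }
}
```

With locals {2, 0}, the sequence ILOAD 0, ICONST 5, IADD, ISTORE 1, ILOAD 1, ILOAD 0, IADD, IRETURN models "int x = a + 5; return x + a;" and returns 9.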




Part six. The main part



For those who skipped the lengthy introduction, and also to step away from the original problem and reason in terms of a library, let us state the problem more precisely.







Given an instance of java.lang.reflect.Method, we need to get the list of all non-static fields (of the current object and of all nested objects) that will be read, directly or transitively, inside this method.

For example, for a method like this







 public long getBalance() {
     Account acc;
     if (account != null)
         acc = account;
     else
         acc = client.getDefaultAccount();
     return acc.getBalance();
 }
      
      





we need to get a list of two fields: account.balance and client.defaultAccount.balance.







Where possible, I will write a general solution. But in a couple of places I will have to use knowledge of the original problem to solve problems that are unsolvable in the general case.







First we need to get the bytecode of the method body itself, but Java offers no way to do that directly. Since a method always lives inside a class, it is easier to get the class itself. Broadly, I know two options: hook into the class loading process and intercept the already-read byte[] there, or simply find the ClassName.class file on disk and read it. Intercepting class loading cannot be done from an ordinary library: you need to either attach a javaagent or use a custom ClassLoader. Either way, extra steps are needed to configure the JVM or the application, and that is inconvenient. There is an easier way. Every "ordinary" class lives in a file of the same name with the ".class" extension, at the path given by the class's package. Yes, this will not find dynamically generated classes or classes loaded by some custom class loader, but we need this for a JDBC model, so we can safely assume that all classes are packaged in jars in the default way. In total:







 private static InputStream getClassFile(Class<?> clazz) {
     String file = clazz.getName().replace('.', '/') + ".class";
     ClassLoader cl = clazz.getClassLoader();
     if (cl == null)
         return ClassLoader.getSystemResourceAsStream(file);
     else
         return cl.getResourceAsStream(file);
 }
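One thing worth keeping in mind with this lookup is that getResourceAsStream returns null when there is no such file, which is exactly the "dynamically generated class" case mentioned above. A small self-contained sketch that uses the same lookup and checks the class file magic number; ClassFileProbe is a hypothetical helper, not part of the library.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileProbe {
    /** True if the class file of clazz is found as a resource and starts with 0xCAFEBABE. */
    static boolean hasReadableClassFile(Class<?> clazz) {
        String file = clazz.getName().replace('.', '/') + ".class";
        ClassLoader cl = clazz.getClassLoader(); // null means the bootstrap class loader
        try (InputStream is = cl == null
                ? ClassLoader.getSystemResourceAsStream(file)
                : cl.getResourceAsStream(file)) {
            if (is == null) return false; // no file behind this class
            return new DataInputStream(is).readInt() == 0xCAFEBABE;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For java.lang.String this finds the class file, while for an array class such as int[] (whose name is "[I") there is nothing to find.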
      
      





Hooray, the byte array has been read. What do we do with it next? There are several libraries for reading and writing bytecode in Java, but ASM is the usual choice for the lowest-level work. Since it is tuned for high performance and on-the-fly operation, its main API is a visitor API: ASM reads the class sequentially and calls the corresponding methods







 public abstract class ClassVisitor {
     public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {...}
     public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {...}
     public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {...}
     ...
 }

 public abstract class MethodVisitor {
     protected MethodVisitor mv;

     public MethodVisitor(final int api, final MethodVisitor mv) {
         ...
         this.mv = mv;
     }

     public void visitJumpInsn(int opcode, Label label) {
         if (mv != null) {
             mv.visitJumpInsn(opcode, label);
         }
     }
     ...
 }
      
      





The user is invited to override the methods of interest and put the analysis or transformation logic there. Separately, using MethodVisitor as an example, I would like to point out that all visitors have a default implementation that delegates.







Besides the core API, ASM ships with a Tree API out of the box. If the Core API is an analogue of a SAX parser, the Tree API is an analogue of a DOM. We get an object that stores all the information about the class or method, and we can analyze it however we like, jumping to any place. In essence, this API consists of *Visitor implementations that simply store the information inside their visit* methods. Nearly all the methods there look like this:







 public class MethodNode extends MethodVisitor {
     @Override
     public void visitJumpInsn(final int opcode, final Label label) {
         instructions.add(new JumpInsnNode(opcode, getLabelNode(label)));
     }
     ...
 }
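The relationship between the two APIs can be shown with a toy model. The classes below are invented for illustration and are not ASM's; they only imitate the shape. There is a base visitor that forwards every event to the next one, a "core style" visitor that reacts on the fly, and a "tree style" visitor that merely records what it is shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy event visitor: every callback forwards to the next visitor in the chain, if any. */
abstract class InsnVisitor {
    protected final InsnVisitor next;
    protected InsnVisitor(InsnVisitor next) { this.next = next; }
    public void visitInsn(String opcode) { if (next != null) next.visitInsn(opcode); }
    public void visitEnd() { if (next != null) next.visitEnd(); }
}

/** "Tree" flavour: stores the events so they can be inspected later in any order. */
class RecordingVisitor extends InsnVisitor {
    final List<String> instructions = new ArrayList<>();
    RecordingVisitor() { super(null); }
    @Override public void visitInsn(String opcode) { instructions.add(opcode); }
}

/** "Core" flavour: looks at each event as it streams by and passes it on untouched. */
class CountingVisitor extends InsnVisitor {
    int getfields;
    CountingVisitor(InsnVisitor next) { super(next); }
    @Override public void visitInsn(String opcode) {
        if (opcode.equals("getfield")) getfields++;
        super.visitInsn(opcode);
    }
}
```

Chaining new CountingVisitor(new RecordingVisitor()) gives both at once: the counter sees the stream, and the recorder at the end of the chain holds the full list afterwards.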
      
      





Now, at last, we can load a method for analysis.







 private static class AnalyzerClassVisitor extends ClassVisitor {
     private final String getterName;
     private final String getterDesc;
     private MethodNode methodNode;

     public AnalyzerClassVisitor(Method getter) {
         super(ASM6);
         this.getterName = getter.getName();
         this.getterDesc = getMethodDescriptor(getter);
     }

     public MethodNode getMethodNode() {
         if (methodNode == null) throw new IllegalStateException();
         return methodNode;
     }

     @Override
     public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
         // skip every method except the one we are looking for
         if (!name.equals(getterName) || !desc.equals(getterDesc)) return null;
         return new AnalyzerMethodVisitor(access, name, desc, signature, exceptions);
     }

     private class AnalyzerMethodVisitor extends MethodVisitor {
         public AnalyzerMethodVisitor(int access, String name, String desc, String signature, String[] exceptions) {
             super(ASM6, new MethodNode(ASM6, access, name, desc, signature, exceptions));
         }

         @Override
         public void visitEnd() {
             // the whole method has been visited; the delegate MethodVisitor is our filled MethodNode
             if (methodNode != null) throw new IllegalStateException();
             methodNode = (MethodNode) mv;
         }
     }
 }
      
      





The full code for reading a method.

What is returned is not the MethodNode itself but a wrapper with a couple of additional fields, because we will need them later too. The entry point (and the only public method) is readMethod(Method): MethodInfo.







 import java.io.IOException;
 import java.io.InputStream;
 import java.lang.reflect.Method;

 import org.objectweb.asm.ClassReader;
 import org.objectweb.asm.ClassVisitor;
 import org.objectweb.asm.MethodVisitor;
 import org.objectweb.asm.tree.MethodNode;

 import static org.objectweb.asm.ClassReader.SKIP_DEBUG;
 import static org.objectweb.asm.ClassReader.SKIP_FRAMES;
 import static org.objectweb.asm.Opcodes.ASM6;
 import static org.objectweb.asm.Type.getInternalName;
 import static org.objectweb.asm.Type.getMethodDescriptor;

 public class MethodReader {
     public static class MethodInfo {
         private final String internalDeclaringClassName;
         private final int classAccess;
         private final MethodNode methodNode;

         public MethodInfo(String internalDeclaringClassName, int classAccess, MethodNode methodNode) {
             this.internalDeclaringClassName = internalDeclaringClassName;
             this.classAccess = classAccess;
             this.methodNode = methodNode;
         }

         public String getInternalDeclaringClassName() { return internalDeclaringClassName; }
         public int getClassAccess() { return classAccess; }
         public MethodNode getMethodNode() { return methodNode; }
     }

     public static MethodInfo readMethod(Method method) {
         Class<?> clazz = method.getDeclaringClass();
         String internalClassName = getInternalName(clazz);
         try (InputStream is = getClassFile(clazz)) {
             ClassReader cr = new ClassReader(is);
             AnalyzerClassVisitor cv = new AnalyzerClassVisitor(internalClassName, method);
             cr.accept(cv, SKIP_DEBUG | SKIP_FRAMES);
             return new MethodInfo(internalClassName, cv.getAccess(), cv.getMethodNode());
         } catch (IOException e) {
             throw new RuntimeException(e);
         }
     }

     private static InputStream getClassFile(Class<?> clazz) {
         String file = clazz.getName().replace('.', '/') + ".class";
         ClassLoader cl = clazz.getClassLoader();
         if (cl == null)
             return ClassLoader.getSystemResourceAsStream(file);
         else
             return cl.getResourceAsStream(file);
     }

     private static class AnalyzerClassVisitor extends ClassVisitor {
         private final String className;
         private final String getterName;
         private final String getterDesc;
         private MethodNode methodNode;
         private int access;

         public AnalyzerClassVisitor(String internalClassName, Method getter) {
             super(ASM6);
             this.className = internalClassName;
             this.getterName = getter.getName();
             this.getterDesc = getMethodDescriptor(getter);
         }

         public MethodNode getMethodNode() {
             if (methodNode == null) throw new IllegalStateException();
             return methodNode;
         }

         public int getAccess() { return access; }

         @Override
         public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
             if (!name.equals(className)) throw new IllegalStateException();
             this.access = access;
         }

         @Override
         public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
             if (!name.equals(getterName) || !desc.equals(getterDesc)) return null;
             return new AnalyzerMethodVisitor(access, name, desc, signature, exceptions);
         }

         private class AnalyzerMethodVisitor extends MethodVisitor {
             public AnalyzerMethodVisitor(int access, String name, String desc, String signature, String[] exceptions) {
                 super(ASM6, new MethodNode(ASM6, access, name, desc, signature, exceptions));
             }

             @Override
             public void visitEnd() {
                 if (methodNode != null) throw new IllegalStateException();
                 methodNode = (MethodNode) mv;
             }
         }
     }
 }
      
      





It is time to do the analysis itself. How? The first thought is to look at all getfield instructions. Each getfield statically records which field of which class it reads. We could treat as needed every field of our class that was accessed this way. Unfortunately, it does not work. The first problem is that it captures too much.







 class Foo {
     private int bar;
     private int baz;

     public int test() {
         return bar + new Foo().baz;
     }
 }
      
      





With this algorithm we would conclude that the field baz is needed, although in fact it is not. This problem we could still live with. But what do we do about methods?







 public class Client implements HasClientId {
     private Long id;

     public Long getId() {
         HasClientId obj = this;
         return obj.getClientId();
     }

     @Override
     public Long getClientId() {
         return id;
     }
 }
      
      





If we look for method calls the same way we look for field reads, we will not find getClientId, because there is no call to Client.getClientId, only to HasClientId.getClientId. We could, of course, treat as used every such method on the current class, all its superclasses and all its interfaces, but that is complete overkill. That way we might accidentally capture toString too, and it prints every field there is.
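If the analysis does know that the receiver of HasClientId.getClientId is this, that is an instance of Client, the call can be resolved against that concrete class. A minimal reflection-based sketch of such a resolution; VirtualCallResolver is a hypothetical helper, it ignores default interface methods, and it is not necessarily how the final analyzer does it.

```java
import java.lang.reflect.Method;

class VirtualCallResolver {
    interface HasClientId { Long getClientId(); }

    static class Client implements HasClientId {
        Long id;
        @Override public Long getClientId() { return id; }
    }

    /**
     * Given the static owner of a call (possibly an interface) and the class of the receiver,
     * returns the method declared in the receiver's class hierarchy that would run, or null.
     */
    static Method resolve(Class<?> receiver, Class<?> owner, String name, Class<?>... params) {
        if (!owner.isAssignableFrom(receiver)) return null; // receiver cannot be the call target
        for (Class<?> c = receiver; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name, params);
            } catch (NoSuchMethodException e) {
                // not declared in this class, keep climbing the superclass chain
            }
        }
        return null;
    }
}
```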







Moreover, we want getter calls on nested objects to work too.







 public class Account {
     private Client client;

     public long getClientId() {
         return client.getId();
     }
 }
      
      





And here the call to the Client.getId method has nothing to do with the class Account at all.







With enough determination you can spend some time trying hacks for particular cases, but quite quickly you realize that "this is not how things are done" and that you need to track the flow of execution and the movement of data properly. We should be interested in those, and only those, getfield instructions that are invoked either directly on this or on some field reached from this. Here is an example:







 class Client {
     public long id;
 }

 class Account {
     public long id;
     public Client client;

     public long test() {
         return client.id + new Account().id;
     }
 }
      
      





 class Account {
   public Client client;

   public long test();
     Code:
        0: aload_0
        1: getfield #2 // Field client:LClient;
        4: getfield #3 // Field Client.id:J
        7: new #4 // class Account
       10: dup
       11: invokespecial #5 // Method "<init>":()V
       14: getfield #6 // Field id:J
       17: ladd
       18: lreturn
 }
      
      







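In this example the read of Account.client.id is one we need, while the read of Account.id is not: it happens on a freshly created object, not on this.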







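In other words, we have to know that the value loaded by aload_0 is this, that the getfield that follows is therefore applied to this, and so on through the whole method. That amounts to interpreting the bytecode. Fortunately, ASM already has the machinery for this, and it works on top of a MethodNode.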
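Before turning to ASM, the idea itself fits in a few lines. Below is a toy symbolic interpreter for the straight-line listing of Account.test() above. It is an illustration written for this article, with invented names and no support for branches or method calls. Instead of real values, each stack slot holds the field path that produced it, or a marker meaning "did not come from this".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

class ToyFieldTracker {
    private static final String THIS = "this";
    private static final String UNKNOWN = "<unknown>";

    /** Returns the field paths, relative to this, that the instruction list reads. */
    static Set<String> fieldsReadFromThis(String... code) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> reads = new LinkedHashSet<>();
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "aload_0": stack.push(THIS); break;
                case "new": stack.push(UNKNOWN); break;
                case "dup": stack.push(stack.peek()); break;
                case "invokespecial": stack.pop(); break; // no-arg constructor: consumes the receiver
                case "getfield": {
                    String owner = stack.pop();
                    if (owner.equals(UNKNOWN)) { stack.push(UNKNOWN); break; } // not reachable from this
                    String path = owner.equals(THIS) ? parts[1] : owner + "." + parts[1];
                    reads.add(path);
                    stack.push(path);
                    break;
                }
                case "ladd": stack.pop(); stack.pop(); stack.push(UNKNOWN); break;
                case "lreturn": stack.pop(); break;
                default: throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return reads;
    }
}
```

Fed with aload_0, getfield client, getfield id, new, dup, invokespecial, getfield id, ladd, lreturn, it reports client and client.id and ignores the id read on the new Account.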







The entry point of ASM's analysis framework looks like this:







 public class Analyzer<V extends Value> {
     public Analyzer(final Interpreter<V> interpreter) {...}
     public Frame<V>[] analyze(final String owner, final MethodNode m) {...}
 }
      
      





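Analyzer walks through the method and computes a Frame for every instruction, that is, the state of the local variables and of the stack at that point. It takes care of the control flow itself: jumps, branches, exception handlers and so on. What the values mean is delegated to the Interpreter.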







 public abstract class Interpreter<V extends Value> {
     public abstract V newValue(Type type);
     public abstract V newOperation(AbstractInsnNode insn) throws AnalyzerException;
     public abstract V copyOperation(AbstractInsnNode insn, V value) throws AnalyzerException;
     public abstract V unaryOperation(AbstractInsnNode insn, V value) throws AnalyzerException;
     public abstract V binaryOperation(AbstractInsnNode insn, V value1, V value2) throws AnalyzerException;
     public abstract V ternaryOperation(AbstractInsnNode insn, V value1, V value2, V value3) throws AnalyzerException;
     public abstract V naryOperation(AbstractInsnNode insn, List<? extends V> values) throws AnalyzerException;
     public abstract void returnOperation(AbstractInsnNode insn, V value, V expected) throws AnalyzerException;
     public abstract V merge(V v, V w);
 }
      
      





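V is our own class, in which we can keep whatever information about a value we need. For each instruction, Analyzer takes the values the instruction consumes, passes them to the matching Interpreter method and stores the value it returns. For example, getfield takes one value (the object) and produces one value (the field content), so it is handled by unaryOperation(AbstractInsnNode insn, V value): V. In the listing above, for 1: getfield we should return a Value meaning "the field client of this, of type Client", and for 14: getfield a value meaning "something of unknown origin that we are not interested in".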







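The merge(V v, V w): V method deserves a separate mention. It is called when execution can reach the same instruction along different paths and the values that arrived along those paths have to be combined into one. For example: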







 public long getBalance() {
     Account acc;
     if (account != null)
         acc = account;
     else
         acc = client.getDefaultAccount();
     return acc.getBalance();
 }
      
      





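Here Account.getBalance() is called on acc, which holds a different value depending on the branch taken. Which of the two should the analysis use? Both, and combining them is exactly the job of merge.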
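For the getBalance() example this boils down to a set union. A toy sketch with field paths represented as plain strings; ToyMerge is an invented name, and the real analysis would work with typed values rather than strings.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ToyMerge {
    /** Joins what two branches know about a variable: it may hold a value from either of them. */
    static Set<String> merge(Set<String> fromThen, Set<String> fromElse) {
        Set<String> merged = new LinkedHashSet<>(fromThen);
        merged.addAll(fromElse);
        return merged;
    }

    /** Reading a field through a merged value yields one path per possible origin. */
    static Set<String> getField(Set<String> owners, String field) {
        Set<String> result = new LinkedHashSet<>();
        for (String owner : owners) result.add(owner + "." + field);
        return result;
    }
}
```

Merging {account} with {client.defaultAccount} and then reading balance gives account.balance and client.defaultAccount.balance, the two fields we expected for getBalance().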







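So all that is left is to write a SuperInterpreter extends Interpreter<SuperValue>? Right. But first we need the SuperValue itself. It has to remember where a value came from, that is, the chain of fields leading to it from this. And since merge can combine several origins, it keeps a set of such chains.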







 public class Value extends BasicValue {
     private final Set<Ref> refs;

     private Value(Type type, Set<Ref> refs) {
         super(type);
         this.refs = refs;
     }
 }

 public class Ref {
     private final List<Field> path;
     private final boolean composite;

     public Ref(List<Field> path, boolean composite) {
         this.path = path;
         this.composite = composite;
     }
 }
      
      





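The composite flag needs an explanation. It says whether it makes sense to look inside a value. Take String: if String.length() is called on the field name, a literal analysis would report a read of name.value.length (and length is not even a real field, it is obtained with the arraylength instruction). Do we need that? No! We need just name. So types that are stored as a single column, such as Date, String or Long, are treated as non-composite, and the analysis does not descend into them.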







 class Person {
     @JdbcColumn(converter = CustomJsonConverter.class)
     private PassportInfo passportInfo;
 }
      
      





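The same goes for PassportInfo here: it is stored in a single column through a converter, so it has to be loaded as a whole. This means the analyzer cannot decide by itself which fields are composite; that information comes from the configuration.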
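The two rules that Ref.childRef applies (do not descend into a non-composite value, and do not walk through the same field twice) can be tried out on a simplified model. PathRef below is a hypothetical stand-in that identifies fields by strings such as "Card.client" instead of java.lang.reflect.Field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class PathRef {
    final List<String> path;
    final boolean composite;

    PathRef(List<String> path, boolean composite) {
        this.path = path;
        this.composite = composite;
    }

    /** The reference to this: an empty path that we are always allowed to look inside. */
    static PathRef thisRef() {
        return new PathRef(new ArrayList<>(), true);
    }

    /** Extends the path by one field, unless the parent is a leaf or the field is already on the path. */
    Optional<PathRef> child(String field, boolean fieldIsComposite) {
        if (!composite || path.contains(field)) return Optional.empty();
        List<String> extended = new ArrayList<>(path);
        extended.add(field);
        return Optional.of(new PathRef(extended, fieldIsComposite));
    }
}
```

With this model, this -> Card.client -> Client.name stops at name, because a String column is not composite, and Card.client -> Client.defaultAccount -> Account.client refuses to continue into Client.defaultAccount a second time.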







 public class Ref {
     private final List<Field> path;
     private final boolean composite;

     public Ref(List<Field> path, boolean composite) {
         this.path = path;
         this.composite = composite;
     }

     public List<Field> getPath() { return path; }

     public boolean isComposite() { return composite; }

     @Override
     public boolean equals(Object o) {
         if (this == o) return true;
         if (o == null || getClass() != o.getClass()) return false;
         Ref ref = (Ref) o;
         return Objects.equals(path, ref.path);
     }

     @Override
     public int hashCode() { return Objects.hash(path); }

     @Override
     public String toString() {
         if (path.isEmpty()) return "<[this]>";
         else return "<" + path.stream().map(Field::getName).collect(joining(".")) + ">";
     }

     public static Ref thisRef() {
         return new Ref(emptyList(), true);
     }

     public static Optional<Ref> childRef(Ref parent, Field field, Configuration configuration) {
         if (!parent.isComposite()) return empty();
         if (parent.path.contains(field)) // the field is already on the path: a cycle, stop here
             return empty();
         List<Field> path = new ArrayList<>(parent.path);
         path.add(field);
         return of(new Ref(path, configuration.isCompositeField(field)));
     }

     public static Optional<Ref> childRef(Ref parent, Ref child) {
         if (!parent.isComposite()) return empty();
         if (child.path.stream().anyMatch(parent.path::contains)) // same check: cut cycles
             return empty();
         List<Field> path = new ArrayList<>(parent.path);
         path.addAll(child.path);
         return of(new Ref(path, child.composite));
     }
 }
      
      





 public class Value extends BasicValue {
     private final Set<Ref> refs;

     private Value(Type type, Set<Ref> refs) {
         super(type);
         this.refs = refs;
     }

     public Set<Ref> getRefs() { return refs; }

     @Override
     public boolean equals(Object o) {
         if (this == o) return true;
         if (o == null || getClass() != o.getClass()) return false;
         if (!super.equals(o)) return false;
         Value value = (Value) o;
         return Objects.equals(refs, value.refs);
     }

     @Override
     public int hashCode() { return Objects.hash(super.hashCode(), refs); }

     @Override
     public String toString() {
         return "(" + refs.stream().map(Object::toString).collect(joining(",")) + ")";
     }

     public static Value typedValue(Type type, Ref ref) {
         return new Value(type, singleton(ref));
     }

     public static Optional<Value> childValue(Value parent, Value child) {
         Type type = child.getType();
         Set<Ref> fields = parent.refs.stream()
                 .flatMap(p -> child.refs.stream().map(c -> childRef(p, c)))
                 .filter(Optional::isPresent)
                 .map(Optional::get)
                 .collect(toSet());
         if (fields.isEmpty()) return empty();
         return of(new Value(type, fields));
     }

     public static Optional<Value> childValue(Value parent, FieldInsnNode childInsn, Configuration configuration) {
         Type type = Type.getType(childInsn.desc);
         Field child = resolveField(childInsn);
         Set<Ref> fields = parent.refs.stream()
                 .map(p -> childRef(p, child, configuration))
                 .filter(Optional::isPresent)
                 .map(Optional::get)
                 .collect(toSet());
         if (fields.isEmpty()) return empty();
         return of(new Value(type, fields));
     }

     public static Value mergeValues(Collection<Value> values) {
         List<Type> types = values.stream().map(BasicValue::getType).distinct().collect(toList());
         if (types.size() != 1) {
             String typesAsString = types.stream().map(Type::toString).collect(joining(", ", "(", ")"));
             throw new IllegalStateException("could not merge " + typesAsString);
         }
         Set<Ref> fields = values.stream().flatMap(v -> v.refs.stream()).distinct().collect(toSet());
         return new Value(types.get(0), fields);
     }

     public static boolean isComposite(BasicValue value) {
         return value instanceof Value
                 && value.getType().getSort() == Type.OBJECT
                 && ((Value) value).refs.stream().anyMatch(Ref::isComposite);
     }
 }
      
      





, . Go!







 public class FieldsInterpreter extends BasicInterpreter {
      
      





As you can see, it extends BasicInterpreter, which works with BasicValue (that is why, as you may have noticed, Value extends BasicValue).







public class BasicValue implements Value {
    public static final BasicValue UNINITIALIZED_VALUE = new BasicValue(null);
    public static final BasicValue INT_VALUE = new BasicValue(Type.INT_TYPE);
    public static final BasicValue FLOAT_VALUE = new BasicValue(Type.FLOAT_TYPE);
    public static final BasicValue LONG_VALUE = new BasicValue(Type.LONG_TYPE);
    public static final BasicValue DOUBLE_VALUE = new BasicValue(Type.DOUBLE_TYPE);
    public static final BasicValue REFERENCE_VALUE = new BasicValue(Type.getObjectType("java/lang/Object"));
    public static final BasicValue RETURNADDRESS_VALUE = new BasicValue(Type.VOID_TYPE);

    private final Type type;

    public BasicValue(final Type type) {
        this.type = type;
    }
}
      
      





( (Value)basicValue



) , , ( " iconst



") .







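Let's start with newValue. It is called whenever a value appears "out of thin air", for example for this or for the exception variable of a catch block. For object types, BasicInterpreter returns BasicValue.REFERENCE_VALUE instead of BasicValue(actualType), so the actual type is lost. Let's fix that.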







@Override
public BasicValue newValue(Type type) {
    if (type != null && type.getSort() == OBJECT) return new BasicValue(type);
    return super.newValue(type);
}
      
      





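Next we need an entry point: this. Somewhere we have to say that this is not just BasicValue(actualType) but Value.typedValue(actualType, Ref.thisRef()). newValue cannot do it, because it does not know that the value it is creating is this. What we do know is that in an instance method this initially lives in local variable 0. So we intercept reads of variable 0 and substitute the this reference, until something is written to that variable.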







@Override
public BasicValue copyOperation(AbstractInsnNode insn, BasicValue value) throws AnalyzerException {
    if (wasUpdated || insn.getType() != VAR_INSN || ((VarInsnNode) insn).var != 0) {
        return super.copyOperation(insn, value);
    }
    switch (insn.getOpcode()) {
        case ALOAD:
            return typedValue(value.getType(), thisRef());
        case ISTORE:
        case LSTORE:
        case FSTORE:
        case DSTORE:
        case ASTORE:
            wasUpdated = true;
    }
    return super.copyOperation(insn, value);
}
      
      





Go ahead. . , , , β€” , . , .







@Override
public BasicValue merge(BasicValue v, BasicValue w) {
    if (v.equals(w)) return v;
    if (v instanceof Value || w instanceof Value) {
        if (!Objects.equals(v.getType(), w.getType())) {
            if (v == UNINITIALIZED_VALUE || w == UNINITIALIZED_VALUE) return UNINITIALIZED_VALUE;
            throw new IllegalStateException("could not merge " + v + " and " + w);
        }
        if (v instanceof Value != w instanceof Value) {
            if (v instanceof Value) return v;
            else return w;
        }
        return mergeValues(asList((Value) v, (Value) w));
    }
    return super.merge(v, w);
}
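To make the merge step less abstract, here is a minimal sketch of what merging two tracked values amounts to: the union of the field paths that can reach the same point from different branches. It is only an illustration under assumptions: paths are plain strings rather than the article's Ref objects, and MergeSketch is a hypothetical name that does not exist in the author's code.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class MergeSketch {
    /** Union of the field paths that may reach a join point from two branches. */
    static Set<String> merge(Set<String> left, Set<String> right) {
        Set<String> merged = new TreeSet<>(left);
        merged.addAll(right);
        return merged;
    }

    public static void main(String[] args) {
        // Models: Account a = flag ? this.account : this.client.defaultAccount;
        Set<String> thenBranch = new TreeSet<>(Arrays.asList("this.account"));
        Set<String> elseBranch = new TreeSet<>(Arrays.asList("this.client.defaultAccount"));
        // After the branches join, the variable may refer to either path.
        System.out.println(merge(thenBranch, elseBranch));
    }
}
```

Both paths survive the join, so an analysis built this way over-approximates: it may report a field that a particular execution never touches, but it should not miss one.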
      
      





. ""? ? Not really. . , .. . , 3 ( ): putfield



, putstatic



, aastore



. . putstatic



( ) . , . putfield



aastore



. , , . ( ) . , . , β€” .







public class Account {
    private Client client;

    public Long getClientId() {
        return Optional.ofNullable(client).map(Client::getId).orElse(null);
    }
}
      
      





, ( ofNullable



Optional



client



value



), . In theory. . , - ofNullable(client)



, - map(Client::getId)



, .







So for composite values we simply forbid putfield, putstatic and aastore.







@Override
public BasicValue binaryOperation(AbstractInsnNode insn, BasicValue value1, BasicValue value2) throws AnalyzerException {
    if (insn.getOpcode() == PUTFIELD && Value.isComposite(value2)) {
        throw new IllegalStateException("could not trace " + value2 + " over putfield");
    }
    return super.binaryOperation(insn, value1, value2);
}

@Override
public BasicValue ternaryOperation(AbstractInsnNode insn, BasicValue value1, BasicValue value2, BasicValue value3) throws AnalyzerException {
    if (insn.getOpcode() == AASTORE && Value.isComposite(value3)) {
        throw new IllegalStateException("could not trace " + value3 + " over aastore");
    }
    return super.ternaryOperation(insn, value1, value2, value3);
}

@Override
public BasicValue unaryOperation(AbstractInsnNode insn, BasicValue value) throws AnalyzerException {
    if (Value.isComposite(value)) {
        switch (insn.getOpcode()) {
            case PUTSTATIC: {
                throw new IllegalStateException("could not trace " + value + " over putstatic");
            }
            ...
        }
    }
    return super.unaryOperation(insn, value);
}
      
      





. checkcast



. : . β€”







Client client1 = ...;
Object objClient = client1;
Client client2 = (Client) objClient;
      
      





, . , , client1



objClient



, . , checkcast



.







.







class Foo {
    private List<?> list;

    public void trimToSize() {
        ((ArrayList<?>) list).trimToSize();
    }
}
      
      





. , , , . , , , , , . ? , ! . , , , null/0/false. . β€”







@JdbcJoinedObject(localColumn = "CLIENT")
private Client client;
      
      





, , ORM , . checkcast









@Override
public BasicValue unaryOperation(AbstractInsnNode insn, BasicValue value) throws AnalyzerException {
    if (Value.isComposite(value)) {
        switch (insn.getOpcode()) {
            ...
            case CHECKCAST: {
                Class<?> original = reflectClass(value.getType());
                Type targetType = getObjectType(((TypeInsnNode) insn).desc);
                Class<?> afterCast = reflectClass(targetType);
                if (afterCast.isAssignableFrom(original)) {
                    return value;
                } else {
                    throw new IllegalStateException("type specification not supported");
                }
            }
        }
    }
    return super.unaryOperation(insn, value);
}
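The checkcast rule above boils down to a single isAssignableFrom test: a cast is accepted only when the target type is the tracked type itself or one of its supertypes, so the cast cannot narrow what the analyzer already knows. The sketch below shows that check in isolation. CastCheckSketch and VipClient are hypothetical names used only for illustration; the Client here is a stand-in, not the entity from the article.

```java
class CastCheckSketch {
    static class Client {}
    static class VipClient extends Client {}  // hypothetical subtype, for illustration only

    /** Same test as in the CHECKCAST branch: accept the cast only if it cannot narrow the type. */
    static boolean isSupportedCast(Class<?> original, Class<?> afterCast) {
        return afterCast.isAssignableFrom(original);
    }

    public static void main(String[] args) {
        // Casting a tracked Client to Client or to Object keeps everything we know about it.
        System.out.println(isSupportedCast(Client.class, Client.class));
        System.out.println(isSupportedCast(Client.class, Object.class));
        // Casting a tracked Client to a subtype would be a type refinement, which is rejected.
        System.out.println(isSupportedCast(Client.class, VipClient.class));
    }
}
```

Assuming the tracked value keeps its type while it sits in the Object variable, the client1 / objClient / client2 example passes this check, because the value is still known to be a Client when it is cast back.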
      
      





β€” getfield



. β€” ?







class Foo {
    private Foo child;

    public Foo test() {
        Foo loopedRef = this;
        while (ThreadLocalRandom.current().nextBoolean()) {
            loopedRef = loopedRef.child;
        }
        return loopedRef;
    }
}
      
      





, . ? child



, child.child



, child.child.child



? ? , . , . ,







null.

child, null, , , . Ref.childRef









 if (parent.path.contains(field)) return empty();
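The one-line guard above is what keeps the analysis of the loopedRef example finite. Here is a small self-contained sketch of the idea, assuming a reference is just the list of field names leading to it from this. RefSketch is a hypothetical stand-in: the article's Ref works with reflected fields and a configuration, neither of which is modelled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class RefSketch {
    final List<String> path;  // field names leading from "this" to the referenced object

    RefSketch(List<String> path) {
        this.path = new ArrayList<>(path);
    }

    static RefSketch thisRef() {
        return new RefSketch(new ArrayList<>());
    }

    /** Descends into a field, unless that field is already on the path (a loop). */
    static Optional<RefSketch> childRef(RefSketch parent, String field) {
        if (parent.path.contains(field)) return Optional.empty();
        List<String> childPath = new ArrayList<>(parent.path);
        childPath.add(field);
        return Optional.of(new RefSketch(childPath));
    }

    public static void main(String[] args) {
        RefSketch child = childRef(thisRef(), "child").get();
        System.out.println(child.path);                               // one level deep
        System.out.println(childRef(child, "child").isPresent());     // second level is cut off
    }
}
```

With this guard the set of possible paths is bounded by the number of distinct fields, so merging values inside the while loop reaches a fixed point instead of growing child.child.child... forever.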
      
      





. , .







" ". . , . , , ( @JdbcJoinedObject



, @JdbcColumn



), , . ORM .







, getfield



, . , , , . No sooner said than done.







@Override
public BasicValue unaryOperation(AbstractInsnNode insn, BasicValue value) throws AnalyzerException {
    if (Value.isComposite(value)) {
        switch (insn.getOpcode()) {
            ...
            case GETFIELD: {
                Optional<Value> optionalFieldValue = childValue((Value) value, (FieldInsnNode) insn, configuration);
                if (!optionalFieldValue.isPresent()) break;
                Value fieldValue = optionalFieldValue.get();
                if (configuration.isInterestingField(resolveField((FieldInsnNode) insn))) {
                    context.addUsedField(fieldValue);
                }
                if (Value.isComposite(fieldValue)) {
                    return fieldValue;
                }
                break;
            }
            ...
        }
    }
    return super.unaryOperation(insn, value);
}
      
      





. , , . , invoke*



. , , , . , :







public long getClientId() {
    return getClient().getId();
}
      
      





, , . , . . , . ? . . .







class Account implements HasClient {
    @JdbcJoinedObject
    private Client client;

    public Client getClient() {
        return client;
    }
}
      
      





Account.client



. , . . β€” , .







public static class Result {
    private final Set<Value> usedFields;
    private final Value returnedCompositeValue;
}
      
      





? , . . , .. ( , β€” ), , areturn



, , , *return



. MethodNode



( , Tree API) . . β€” . , ? . .







private static Value getReturnedCompositeValue(Frame<BasicValue>[] frames, AbstractInsnNode[] insns) {
    Set<Value> resultValues = new HashSet<>();
    for (int i = 0; i < insns.length; i++) {
        AbstractInsnNode insn = insns[i];
        switch (insn.getOpcode()) {
            case IRETURN:
            case LRETURN:
            case FRETURN:
            case DRETURN:
            case ARETURN: {
                Frame<BasicValue> frame = frames[i];
                // The analyzer leaves a null frame for unreachable instructions.
                if (frame == null) break;
                // The returned value is on top of the operand stack.
                BasicValue value = frame.getStack(frame.getStackSize() - 1);
                if (Value.isComposite(value)) {
                    resultValues.add((Value) value);
                }
                break;
            }
        }
    }
    if (resultValues.isEmpty()) return null;
    return mergeValues(resultValues);
}
      
      





analyzeFields









public static Result analyzeFields(Method method, Configuration configuration) {
    if (Modifier.isNative(method.getModifiers()))
        throw new IllegalStateException("could not analyze native method " + method);
    MethodInfo methodInfo = readMethod(method);
    MethodNode mn = methodInfo.getMethodNode();
    String internalClassName = methodInfo.getInternalDeclaringClassName();
    int classAccess = methodInfo.getClassAccess();
    Context context = new Context(method, classAccess);
    FieldsInterpreter interpreter = new FieldsInterpreter(context, configuration);
    Analyzer<BasicValue> analyzer = new Analyzer<>(interpreter);
    try {
        analyzer.analyze(internalClassName, mn);
    } catch (AnalyzerException e) {
        throw new RuntimeException(e);
    }
    Frame<BasicValue>[] frames = analyzer.getFrames();
    AbstractInsnNode[] insns = mn.instructions.toArray();
    Value returnedCompositeValue = getReturnedCompositeValue(frames, insns);
    return new Result(context.getUsedFields(), returnedCompositeValue);
}
      
      





, -, . invoke*



. 5 :







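  1. invokespecial - a direct call without virtual dispatch. It is used for constructors, private methods, and superclass methods (super.call()).
  2. invokevirtual - the ordinary instance method call. The method that actually runs is chosen by the real class of the object.
  3. invokeinterface - the same as invokevirtual, but for methods declared in an interface.
  4. invokestatic - a static method call.
  5. invokedynamic - a special instruction that appeared in Java 7 as part of JSR 292. It was designed for dynamic languages on the JVM (hence "dynamic"). Since Java 8 it is also used for lambdas (and method references). For details, see the article "Invokedynamic: ...?".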
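To connect the five instructions with ordinary source code, the sketch below marks which call javac typically compiles to which invoke* instruction. The class and its members are hypothetical, and the instruction notes describe usual javac behavior, not a guarantee for every compiler version; running javap -c on the compiled class shows the actual instructions.

```java
import java.util.function.Supplier;

class InvokeKindsSketch {
    interface Named {
        String name();
    }

    static class Base implements Named {
        public String name() {
            return "base";
        }
    }

    static class Derived extends Base {
        @Override
        public String name() {
            // invokespecial: a super call bypasses virtual dispatch
            return super.name().concat("+derived");   // concat itself is an invokevirtual
        }
    }

    static String describe() {
        Base viaClass = new Derived();                // invokespecial: constructor <init>
        Named viaInterface = viaClass;
        String a = viaClass.name();                   // invokevirtual: owner is the class Base
        String b = viaInterface.name();               // invokeinterface: owner is the interface Named
        String c = String.valueOf(42);                // invokestatic
        Supplier<String> lazy = () -> "lambda";       // invokedynamic: the lambda is linked at run time
        return String.join("|", a, b, c, lazy.get()); // lazy.get() is again an invokeinterface
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that viaClass.name() and viaInterface.name() run the same Derived.name() at run time; only the static owner recorded in the instruction differs, which is exactly why the analyzer has to resolve the real target itself.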


, , , . invokedynamic



, . , , , (, ), invokedynamic



. , "" . , invokedynamic



, .







Go ahead. , . , . , this



, 0? , - , FieldsInterpreter



copyOperation



. , MethodAnalyzer.analyzeFields



" this



" " " ( this



β€” ). , . , , . , - . , (- Optional.ofNullable(client)



). .







, invokestatic



(.. , this



). invokespecial



, invokevirtual



invokeinterface



. , . , , jvm. invokespecial



, , . invokevirtual



invokeinterface



. , .







public static String objectToString(Object obj) {
    return obj.toString();
}
      
      





public static java.lang.String objectToString(java.lang.Object);
  Code:
     0: aload_0
     1: invokevirtual #104  // Method java/lang/Object.toString:()Ljava/lang/String;
     4: areturn
      
      





, , ( ) . , , . ? ORM . ORM , , . invokevirtual



invokeinterface



.







Hooray! . What's next? , ( , this



), ( , ) . !







@Override
public BasicValue naryOperation(AbstractInsnNode insn, List<? extends BasicValue> values) throws AnalyzerException {
    Method method = null;
    Value methodThis = null;
    switch (insn.getOpcode()) {
        case INVOKESPECIAL: {...}
        case INVOKEVIRTUAL: {...}
        case INVOKEINTERFACE: {
            if (Value.isComposite(values.get(0))) {
                MethodInsnNode methodNode = (MethodInsnNode) insn;
                Class<?> objectClass = reflectClass(values.get(0).getType());
                Method interfaceMethod = resolveInterfaceMethod(
                        reflectClass(methodNode.owner), methodNode.name, getMethodType(methodNode.desc));
                method = lookupInterfaceMethod(objectClass, interfaceMethod);
                methodThis = (Value) values.get(0);
            }
            List<?> badValues = values.stream().skip(1).filter(Value::isComposite).collect(toList());
            if (!badValues.isEmpty())
                throw new IllegalStateException("could not pass " + badValues + " as parameter");
            break;
        }
        case INVOKESTATIC:
        case INVOKEDYNAMIC: {
            List<?> badValues = values.stream().filter(Value::isComposite).collect(toList());
            if (!badValues.isEmpty())
                throw new IllegalStateException("could not pass " + badValues + " as parameter");
            break;
        }
    }
    if (method != null) {
        MethodAnalyzer.Result methodResult = analyzeFields(method, configuration);
        for (Value usedField : methodResult.getUsedFields()) {
            childValue(methodThis, usedField).ifPresent(context::addUsedField);
        }
        if (methodResult.getReturnedCompositeValue() != null) {
            Optional<Value> returnedValue = childValue(methodThis, methodResult.getReturnedCompositeValue());
            if (returnedValue.isPresent()) {
                return returnedValue.get();
            }
        }
    }
    return super.naryOperation(insn, values);
}
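The resolveInterfaceMethod and lookupInterfaceMethod helpers are not shown in the article, so the sketch below is not their implementation. It only illustrates the two-step idea with plain reflection: first find the method as declared on the interface named in the instruction, then find the method that the concrete class of the receiver actually provides. The JVM's own resolution and selection rules cover more cases than Class.getMethod does; InterfaceLookupSketch and its nested types are hypothetical.

```java
import java.lang.reflect.Method;

class InterfaceLookupSketch {
    static class Client {}

    interface HasClient {
        Client getClient();
    }

    static class Account implements HasClient {
        private final Client client = new Client();

        @Override
        public Client getClient() {
            return client;
        }
    }

    /** Step 1: find the public method as declared on (or inherited by) the given owner type. */
    static Method resolve(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            return owner.getMethod(name, parameterTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("could not resolve " + owner.getName() + "." + name, e);
        }
    }

    /** Step 2: find the method with the same name and parameters on the receiver's actual class. */
    static Method lookup(Class<?> objectClass, Method interfaceMethod) {
        return resolve(objectClass, interfaceMethod.getName(), interfaceMethod.getParameterTypes());
    }

    public static void main(String[] args) {
        Method declared = resolve(HasClient.class, "getClient");
        Method actual = lookup(Account.class, declared);
        System.out.println(declared.getDeclaringClass().getSimpleName()
                + " -> " + actual.getDeclaringClass().getSimpleName());
    }
}
```

The second step is what makes recursive analysis possible: the body to analyze lives in Account.getClient, not in the abstract HasClient.getClient that the invokeinterface instruction names.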
      
      





. , . , JVMS 1 1. . β€” . , , . , .. , - , 2 β€” . , , . , β€” . , ResolutionUtil LookupUtil .







!









.



, 80% 20% 20% 80% . , , , ?











Afterword



- , . - partialGet



. , . , , , , " " .







, .







