JVM Internals, Part 2 - Class File Structure

Hello! This translation was prepared specifically for students of the Java Developer course.








We continue our look at how the Java Virtual Machine works internally. In the previous article (original in English), we examined the class loading subsystem. In this article, we will talk about the structure of class files.



As we already know, all source code written in the Java programming language is first compiled into bytecode by the javac compiler, which is part of the Java Development Kit. The bytecode is stored in binary form in a special class file. These class files are then loaded into memory dynamically (as needed) by a class loader (ClassLoader).





Figure - Java source code compilation



Each .java file is compiled into at least one .class file. One .class file is created for each class, interface, and module defined in the source code, including nested classes.



Note: for simplicity, files with the .class extension will be called “class files”.



Let's write a simple program.



    public class ClassOne {
        public static void main(String[] args) {
            System.out.println("Hello world");
        }

        static class StaticNestedClass {
        }
    }

    class ClassTwo {
    }

    interface InterfaceOne {
    }





Running javac on this file produces the following files.



    ClassOne$StaticNestedClass.class
    ClassOne.class
    ClassTwo.class
    InterfaceOne.class





As you can see, a separate class file is created for each class and interface.
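The same experiment can be reproduced from code. The following is a minimal sketch, assuming it runs on a JDK (not a bare JRE) so that the standard javax.tools compiler API is available. The class name CompileAndList and the helper compileAndListClassFiles are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class CompileAndList {

    /** Compiles one source file in a temporary directory and returns the sorted names of the class files produced. */
    static List<String> compileAndListClassFiles(String fileName, String source) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }
        // The temporary directory is left in place so the class files can be inspected afterwards.
        Path dir = Files.createTempDirectory("classfiles");
        Path sourceFile = dir.resolve(fileName);
        Files.write(sourceFile, source.getBytes(StandardCharsets.UTF_8));

        // Equivalent to running: javac -d <dir> <sourceFile>
        int exitCode = compiler.run(null, null, null, "-d", dir.toString(), sourceFile.toString());
        if (exitCode != 0) {
            throw new IllegalStateException("Compilation failed with exit code " + exitCode);
        }
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                        .filter(name -> name.endsWith(".class"))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        String source =
              "public class ClassOne { public static void main(String[] args) {"
            + " System.out.println(\"Hello world\"); } static class StaticNestedClass { } }"
            + " class ClassTwo { } interface InterfaceOne { }";
        compileAndListClassFiles("ClassOne.java", source).forEach(System.out::println);
    }
}
```

For the program above, the returned list should contain the same four class file names that javac produced on the command line.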



What is inside the class file?



The class file is in a binary format. Its items are stored sequentially, without padding or alignment between consecutive items, so everything falls on byte boundaries. All 16-bit and 32-bit values are written as two or four consecutive 8-bit bytes, most significant byte first.
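To make the byte layout concrete, here is a minimal sketch of how a 16-bit and a 32-bit value can be assembled from consecutive bytes. It assumes big-endian order (most significant byte first), which is the order the class file format uses. The names BigEndian, u2, and u4 are illustrative only and are not part of any JDK API.

```java
final class BigEndian {

    /** Reads an unsigned 16-bit value from two consecutive bytes, high byte first. */
    static int u2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    /** Reads an unsigned 32-bit value from four consecutive bytes, high byte first. */
    static long u4(byte[] data, int offset) {
        return ((long) u2(data, offset) << 16) | u2(data, offset + 2);
    }

    public static void main(String[] args) {
        // The first eight bytes of a class file with minor version 0 and major version 55.
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x37};
        System.out.printf("u4 at 0 = 0x%08X%n", u4(bytes, 0)); // prints "u4 at 0 = 0xCAFEBABE"
        System.out.println("u2 at 6 = " + u2(bytes, 6));       // prints "u2 at 6 = 55"
    }
}
```

In practice, java.io.DataInputStream already reads multibyte values in this order, so hand-written shifts like these are mostly useful for understanding the format.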



The class file contains the following information.



Magic number (signature). The first four bytes of every class file are always 0xCAFEBABE. These four bytes identify the file as a Java class file.



File version. The next four bytes contain the minor and major versions of the file, two bytes each. Together, these numbers determine the version of the class file format. If a class file has major version M and minor version m, its version is written as M.m.

Each JVM supports only a limited range of class file versions. For example, Java 11 supports major versions 45 through 55, and Java 12 supports 45 through 56.



Constant pool. A table of structures representing string constants, class and interface names, field names, method names, and other constants referenced within the ClassFile structure and its substructures. Each constant pool entry begins with a one-byte tag that defines the type of the constant. Depending on that type, the bytes that follow hold either the constant value itself or a reference to another entry in the pool.



Access flags. A mask of flags that indicate whether this is a class or an interface, whether it is public, whether it is final, and so on. The individual flags, such as ACC_PUBLIC, ACC_FINAL, ACC_INTERFACE, and ACC_ENUM, are described in the Java Virtual Machine Specification.



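This class. An index into the constant pool that identifies the class or interface defined by this file.

Super class. An index into the constant pool that identifies the direct superclass.

Interfaces. The number of interfaces directly implemented by the class, followed by a constant pool index for each of them.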



The number of fields. The number of fields in the class or interface.



Fields. The field count is followed by a table of variable-length structures, one per field, each describing the field's type and name (by reference to the constant pool).



Number of methods. The number of methods in the class or interface. This number includes only methods that are explicitly defined in the class, without methods inherited from superclasses.



Methods. Next come the methods themselves. For each method, the class file contains the method descriptor (argument list and return type), the number of words needed for the method's local variables, the maximum number of words needed for the method's operand stack, the table of exceptions caught by the method, the method's bytecode, and the line number table.



The number of attributes. The number of attributes in this class, interface, or module.



Attributes. The attribute count is followed by tables or variable-length structures that describe each attribute. For example, class files produced by javac normally include a “SourceFile” attribute, which contains the name of the source file from which the class file was compiled.
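The first items in the list above can be read with a few lines of code. The following is a minimal sketch, not a complete parser: it reads the magic number and version, skips over the constant pool entries using the entry sizes defined in the Java Virtual Machine Specification, then reads the access flags, the this-class and super-class indices, and the interface count. The class name ClassFileSummary and its members are hypothetical names chosen for this illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative reader for the items at the start of a class file. */
final class ClassFileSummary {
    final int minor, major, constantPoolCount, accessFlags, thisClass, superClass, interfacesCount;

    private ClassFileSummary(int minor, int major, int constantPoolCount, int accessFlags,
                             int thisClass, int superClass, int interfacesCount) {
        this.minor = minor;
        this.major = major;
        this.constantPoolCount = constantPoolCount;
        this.accessFlags = accessFlags;
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfacesCount = interfacesCount;
    }

    static ClassFileSummary read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // reads multibyte values big-endian
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a class file: bad magic number");
        }
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        int cpCount = data.readUnsignedShort();
        // Valid constant pool indices run from 1 to cpCount - 1.
        for (int i = 1; i < cpCount; i += skipConstant(data)) {
            // skipConstant consumes one entry and reports how many slots it occupies
        }
        int accessFlags = data.readUnsignedShort();
        int thisClass = data.readUnsignedShort();
        int superClass = data.readUnsignedShort();
        int interfacesCount = data.readUnsignedShort();
        return new ClassFileSummary(minor, major, cpCount, accessFlags,
                                    thisClass, superClass, interfacesCount);
    }

    /** Skips one constant pool entry and returns the number of pool slots it occupies. */
    private static int skipConstant(DataInputStream data) throws IOException {
        int tag = data.readUnsignedByte();
        switch (tag) {
            case 1: // Utf8: u2 length, then that many bytes
                data.readFully(new byte[data.readUnsignedShort()]);
                return 1;
            case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
                data.readFully(new byte[2]);
                return 1;
            case 15: // MethodHandle
                data.readFully(new byte[3]);
                return 1;
            case 3: case 4:                    // Integer, Float
            case 9: case 10: case 11: case 12: // Fieldref, Methodref, InterfaceMethodref, NameAndType
            case 17: case 18:                  // Dynamic, InvokeDynamic
                data.readFully(new byte[4]);
                return 1;
            case 5: case 6: // Long, Double: eight bytes, and they occupy two pool slots
                data.readFully(new byte[8]);
                return 2;
            default:
                throw new IOException("Unknown constant pool tag: " + tag);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the class file of java.lang.Object is readable as a resource on this runtime.
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            if (in == null) {
                System.out.println("Could not locate Object.class");
                return;
            }
            ClassFileSummary s = read(in);
            System.out.println("version " + s.major + "." + s.minor
                + ", constant pool count " + s.constantPoolCount
                + ", access flags 0x" + Integer.toHexString(s.accessFlags)
                + ", interfaces " + s.interfacesCount);
        }
    }
}
```

Fields, methods, and attributes follow the interface table in the file. They are left out here to keep the sketch short.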



Although the class file is not directly human-readable, there is a tool in the JDK called javap that displays its contents in a convenient format.



Let's write a simple Java program as shown below.



    package bytecode;

    import java.io.Serializable;
    public class HelloWorld implements Serializable, Cloneable {

        public static void main(String[] args) {
            System.out.println("Hello World");
        }
    }





Let's compile this program with javac, which creates the HelloWorld.class file, and then use javap to inspect it. Running javap with the -v (verbose) option on HelloWorld.class gives the following result:



    Classfile /Users/apersiankite/Documents/code_practice/java_practice/target/classes/bytecode/HelloWorld.class
      Last modified 02-Jul-2019; size 606 bytes
      MD5 checksum 6442d93b955c2e249619a1bade6d5b98
      Compiled from "HelloWorld.java"
    public class bytecode.HelloWorld implements java.io.Serializable,java.lang.Cloneable
      minor version: 0
      major version: 55
      flags: (0x0021) ACC_PUBLIC, ACC_SUPER
      this_class: #5                          // bytecode/HelloWorld
      super_class: #6                         // java/lang/Object
      interfaces: 2, fields: 0, methods: 2, attributes: 1
    Constant pool:
       #1 = Methodref          #6.#22         // java/lang/Object."<init>":()V
       #2 = Fieldref           #23.#24        // java/lang/System.out:Ljava/io/PrintStream;
       #3 = String             #25            // Hello World
       #4 = Methodref          #26.#27        // java/io/PrintStream.println:(Ljava/lang/String;)V
       #5 = Class              #28            // bytecode/HelloWorld
       #6 = Class              #29            // java/lang/Object
       #7 = Class              #30            // java/io/Serializable
       #8 = Class              #31            // java/lang/Cloneable
       #9 = Utf8               <init>
      #10 = Utf8               ()V
      #11 = Utf8               Code
      #12 = Utf8               LineNumberTable
      #13 = Utf8               LocalVariableTable
      #14 = Utf8               this
      #15 = Utf8               Lbytecode/HelloWorld;
      #16 = Utf8               main
      #17 = Utf8               ([Ljava/lang/String;)V
      #18 = Utf8               args
      #19 = Utf8               [Ljava/lang/String;
      #20 = Utf8               SourceFile
      #21 = Utf8               HelloWorld.java
      #22 = NameAndType        #9:#10         // "<init>":()V
      #23 = Class              #32            // java/lang/System
      #24 = NameAndType        #33:#34        // out:Ljava/io/PrintStream;
      #25 = Utf8               Hello World
      #26 = Class              #35            // java/io/PrintStream
      #27 = NameAndType        #36:#37        // println:(Ljava/lang/String;)V
      #28 = Utf8               bytecode/HelloWorld
      #29 = Utf8               java/lang/Object
      #30 = Utf8               java/io/Serializable
      #31 = Utf8               java/lang/Cloneable
      #32 = Utf8               java/lang/System
      #33 = Utf8               out
      #34 = Utf8               Ljava/io/PrintStream;
      #35 = Utf8               java/io/PrintStream
      #36 = Utf8               println
      #37 = Utf8               (Ljava/lang/String;)V
    {
      public bytecode.HelloWorld();
        descriptor: ()V
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 4: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       5     0  this   Lbytecode/HelloWorld;

      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
        flags: (0x0009) ACC_PUBLIC, ACC_STATIC
        Code:
          stack=2, locals=1, args_size=1
             0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
             3: ldc           #3                  // String Hello World
             5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
             8: return
          LineNumberTable:
            line 7: 0
            line 8: 8
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       9     0  args   [Ljava/lang/String;
    }
    SourceFile: "HelloWorld.java"
      
      





Here you can see that the class is public and has 37 entries in its constant pool. It has one attribute (SourceFile, at the bottom of the output), implements two interfaces (Serializable and Cloneable), has no fields, and has two methods.
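The flags value 0x0021 in the output is a bit mask, and javap prints the names of the bits that are set. As a minimal sketch of that decoding, the class below maps the class-level flag values from the Java Virtual Machine Specification to their names. The class name ClassAccessFlags and the method decode are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClassAccessFlags {
    // Class-level access flags, in ascending bit order.
    private static final Map<Integer, String> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put(0x0001, "ACC_PUBLIC");
        FLAGS.put(0x0010, "ACC_FINAL");
        FLAGS.put(0x0020, "ACC_SUPER");
        FLAGS.put(0x0200, "ACC_INTERFACE");
        FLAGS.put(0x0400, "ACC_ABSTRACT");
        FLAGS.put(0x1000, "ACC_SYNTHETIC");
        FLAGS.put(0x2000, "ACC_ANNOTATION");
        FLAGS.put(0x4000, "ACC_ENUM");
        FLAGS.put(0x8000, "ACC_MODULE");
    }

    /** Returns the names of all flags whose bit is set in the given mask. */
    static List<String> decode(int mask) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Integer, String> flag : FLAGS.entrySet()) {
            if ((mask & flag.getKey()) != 0) {
                names.add(flag.getValue());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```

Method-level flags such as the 0x0009 shown for main use a different table (for example, 0x0008 is ACC_STATIC there), so this sketch applies only to the flags of the class itself.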



You may have noticed that the source code contains only one method, the static main method, but the class file says there are two. Recall the default constructor: this is a no-argument constructor that the javac compiler adds automatically, and its bytecode is also visible in the output. In the class file, constructors are treated as methods.
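The compiler-added constructor can also be observed at run time through reflection. The sketch below is a small illustration with hypothetical class names. Note that the reflection API reports constructors separately from methods, whereas the class file stores them in the same methods table under the special name <init>.

```java
import java.lang.reflect.Constructor;

final class DefaultConstructorDemo {

    /** A class that declares no constructor in its source code. */
    static class NoExplicitConstructor {
    }

    /** Returns how many constructors the compiled class actually declares. */
    static int declaredConstructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        for (Constructor<?> constructor : NoExplicitConstructor.class.getDeclaredConstructors()) {
            System.out.println(constructor + " takes " + constructor.getParameterCount() + " parameter(s)");
        }
    }
}
```

Even though the source declares no constructor, the compiled class has exactly one, and it takes no parameters.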



You can read more about javap in the official JDK documentation.



Tip: you can also use javap to see how lambdas differ from anonymous inner classes.
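As a starting point for that experiment, here is a small hypothetical class that creates the same Runnable both ways. With current javac versions, the anonymous class is compiled into its own class file (named something like LambdaVsAnonymous$1.class), while the lambda body typically becomes a private synthetic method in the enclosing class that is linked through an invokedynamic instruction. The exact names are compiler implementation details, so treat them as assumptions to verify with javap -v -p.

```java
final class LambdaVsAnonymous {

    /** Implemented as an anonymous inner class, which gets a class file of its own. */
    static Runnable anonymous() {
        return new Runnable() {
            @Override
            public void run() {
            }
        };
    }

    /** Implemented as a lambda; its implementing class is created at run time. */
    static Runnable lambda() {
        return () -> { };
    }

    public static void main(String[] args) {
        Class<?> anonymousType = anonymous().getClass();
        Class<?> lambdaType = lambda().getClass();
        System.out.println(anonymousType.getName() + " isAnonymousClass=" + anonymousType.isAnonymousClass());
        System.out.println(lambdaType.getName() + " isAnonymousClass=" + lambdaType.isAnonymousClass());
    }
}
```

After compiling this file, running javap -v -p on LambdaVsAnonymous.class and on the generated class file of the anonymous class shows the difference in the bytecode.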


