Build a cast iron walker on Spring Boot and AppCDS







Application Class Data Sharing (AppCDS) is a JVM feature that speeds up startup and saves memory. It first appeared in HotSpot in embryonic form back in JDK 1.5 (2004), and for a long time it remained very limited and even partly commercial. Only with OpenJDK 10 (2018) was it made available to mere mortals, with its scope widened at the same time. The recently released Java 13 has tried to make it simpler to use.







The idea of AppCDS is to share classes, once loaded, between instances of the same JVM on the same host. This looks like a great fit for microservices, especially Spring Boot “broilers” with their thousands of library classes: these classes no longer need to be loaded (parsed and verified) on every start of every JVM instance, and they are not duplicated in memory. So startup should become faster and memory consumption lower. Wonderful, isn't it?







It is, it is. But if you, fellow Habr reader, are used to trusting concrete numbers and examples rather than street signs, then welcome under the cut: let's try to figure out how things really are ...







Instead of disclaimer



What follows is not a guide to using AppCDS but a summary of a small study. I wanted to understand how applicable this JVM feature is to my project at work, tried to evaluate it from an enterprise developer's point of view, and wrote up the results in this article. It does not cover topics such as using AppCDS on the module path, AppCDS implementations in other virtual machines (not HotSpot), or the intricacies of using it in containers. It does contain a theoretical part that explores the topic and an experimental part written so that you can repeat the experiment yourself. None of the results have been applied in production yet, but who knows what tomorrow will bring ...







Theory



A brief introduction to AppCDS



You may already have come across this topic in several sources, for example:









So as not to retell them, I will highlight only a few points that are important for this article.







First, AppCDS is an extension of the CDS feature that has long existed in HotSpot, the essence of which is as follows:













To bring both ideas to life, you need to do the following (in general terms):







  1. Get a list of classes that you want to share between application instances
  2. Merge these classes into an archive suitable for memory mapping
  3. Attach the archive to each application instance at startup
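
To make the three steps concrete, here is a minimal sketch of how they map onto JVM launches on JDK 11 or later. The names app.jar, classes.lst and app.jsa are placeholders of mine (app.jar stands for a plain, non-nested JAR), and the function only prints the commands instead of running them, so treat it as an illustration rather than a ready recipe.

```shell
#!/bin/sh
# Print the three JVM launches behind the classic AppCDS workflow.
# Arguments (all placeholder names): plain JAR, class list file, archive file.
appcds_commands() {
  jar="$1"; list="$2"; archive="$3"
  # Step 1: run the application once and record the classes it loads
  echo "java -XX:DumpLoadedClassList=$list -jar $jar"
  # Step 2: dump the listed classes into an archive suitable for memory mapping
  echo "java -Xshare:dump -XX:SharedClassListFile=$list -XX:SharedArchiveFile=$archive -cp $jar"
  # Step 3: start every instance with the archive attached
  echo "java -XX:SharedArchiveFile=$archive -jar $jar"
}

appcds_commands app.jar classes.lst app.jsa
```

Printing instead of executing keeps the sketch safe to try anywhere; on a machine with a suitable JDK the printed lines can be run one after another.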


It would seem the algorithm is just 3 steps: take it and do it. But this is where the news begins, of all sorts.







The bad news is that, in the worst case, each of these steps turns into at least one JVM launch with its own specific options, so the whole algorithm becomes a delicate juggling of similar-looking options and files. That doesn't sound very promising, does it?







But there is good news too: work on improving this algorithm is ongoing, and with each Java release it becomes easier to apply. For example:









So whatever the AppCDS preparation process looks like, the 3 steps listed above are always behind it; in some cases they are just hidden.







As you have probably noticed, with the advent of AppCDS many application classes begin to lead a double life: they live both in their former places (most often JAR files) and in the new shared archive. The developer keeps changing, removing and adding them in the old place, while the JVM takes them from the new one at run time. You don't need to be a fortune-teller to see the danger: if nothing is done, sooner or later the copies of the classes will diverge, and we will get many of the delights of a typical “JAR hell”. Clearly the JVM cannot prevent classes from changing, but it should be able to detect a discrepancy in time. Doing so by comparing classes pairwise, even by checksum, is a so-so idea, because it can wipe out the rest of the performance gain. This is probably why the JVM engineers chose the entire classpath rather than individual classes as the object of comparison, and stated in the AppCDS documentation that the classpath used when creating a shared archive must be the same as (or at least a prefix of) the classpath used on subsequent launches of the application:







Note that the classpath used at archive creation time must be the same as (or a prefix of) the classpath used at run time.

But this statement is not unambiguous because, as you remember, a classpath can be formed in different ways, such as:









(and that is not counting the fact that the classpath itself can be specified in different ways, for example through the JVM options -cp/-classpath/--class-path, the CLASSPATH environment variable, or the Class-Path attribute of the JAR file being launched)







Of these methods, AppCDS supports only one: explicit enumeration of JAR files. Apparently the HotSpot JVM engineers decided that comparing the classpath stored in the AppCDS archive with that of the launched application would be fast and reliable enough only if both are specified as explicitly as possible, as a plain exhaustive list.
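
If your JARs sit in one directory, one way to obtain such an exhaustive list is to expand the directory yourself and pass the result to -cp. The helper name explicit_classpath and the lib directory below are my own illustrative assumptions, and the sketch uses the Unix ":" separator.

```shell
#!/bin/sh
# Expand every *.jar in a directory into an explicit classpath string
# (entries appear in the shell's glob order, joined with ":").
explicit_classpath() {
  dir="$1"
  cp=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue      # skip the literal pattern when nothing matches
    cp="${cp:+$cp:}$jar"           # prepend ":" only when cp is already non-empty
  done
  printf '%s\n' "$cp"
}

# "lib" is a hypothetical directory name; prints an empty line if it has no JARs
explicit_classpath lib
```

The same string has to be used both when the archive is created and when the application is launched, which is easier to guarantee when it is generated by one function.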







CDS / AppCDS supports archiving classes from JAR files only.

It is important to note that this statement is not recursive, i.e. it does not apply to JAR files inside JAR files (unless we are talking about Dynamic CDS, see below). This means that the usual nested “matryoshka” JARs produced by Spring Boot simply do not work with regular AppCDS, and you will have to jump through some hoops.







Another catch with CDS is that shared archives are mapped into memory at fixed addresses (usually starting at 0x800000000). That is not bad in itself, but since Address Space Layout Randomization (ASLR) is enabled by default on most operating systems, the required memory range may already be partly occupied. What the HotSpot JVM does in that case is determined by the special -Xshare option, which supports three values:









At the time of writing this article, Oracle is working on smoothing out such problems, but a release number has not yet been assigned.







Some of these options will come in handy later, but for now let's look at ...







AppCDS Applications



There are several ways to ruin your life, sorry, optimize your microservices with AppCDS. They differ greatly in complexity and potential profit, so it is important to decide right away which of them we will be talking about.







The simplest is to use not even AppCDS but plain CDS, where only platform classes get into the shared archive (see "A brief introduction to AppCDS"). We discard this option right away, because for Spring Boot microservices it yields too little profit. This can be seen from the share of such classes in the overall class distribution of one real microservice (see the green segment):













More complicated but more promising is full-fledged AppCDS, that is, including both library and application classes in the same archive. This is a whole family of options, derived from combinations of the number of participating applications and the number of their instances. Below are the author's subjective estimates of the benefit and complexity of the various ways of applying AppCDS.







No.  Applications  Instances  CPU profit  RAM profit  Complexity
1    One           One        +           ±           Low
2    One           Several    ++          ++          Low
3    Several       One each   ++          ++          High
4    Several       Several    +++         +++         High


Pay attention:









In this article we will only get as far as option 2 (via option 1), since it is simple enough for a first close look at AppCDS, and it is the only one to which the recently released JEP 350 Dynamic CDS Archives, which I want to try out in action, can be applied without extra tricks.







Dynamic CDS Archives



JEP 350 Dynamic CDS Archives, one of the major innovations of Java 13, is designed to simplify the use of AppCDS. To appreciate the simplification, you first need to understand the complexity. Recall that the classic, “pure” AppCDS algorithm consists of 3 steps: (1) get a list of shared classes, (2) create an archive from them, and (3) run the application with the archive attached. Of these steps only the third is actually useful; the rest is just preparation for it. And although getting the list of classes (step 1) may look very simple (in some cases it is even optional), with non-trivial applications it turns out to be the hardest part, especially for Spring Boot.

JEP 350 exists precisely to eliminate this step, or rather to automate it. The idea is that the JVM itself compiles the list of classes the application needs and then builds a so-called “dynamic” archive from them. Sounds good, you must admit. The catch is that it is no longer clear at what point to stop accumulating classes and start writing them to the archive. In classic AppCDS we chose that moment ourselves and could even step in between these two actions to change the class list before turning it into an archive. Now it happens automatically and at exactly one moment, for which the JVM engineers chose perhaps the only compromise available: the normal shutdown of the JVM. This means that the archive is not created until the application stops. This decision has a couple of important consequences:









An important difference between dynamic and static archives is that dynamic archives are always an add-on on top of a base static archive, which can be either the archive built into the Java distribution or one created separately in the classic 3-step way.







Syntactically, using Dynamic CDS Archives boils down to two JVM launches with two options:







  1. A trial run with the -XX:ArchiveClassesAtExit=archive.jsa option, at the end of which a dynamic archive is created (you can specify any path and name)
  2. A useful run with the -XX:SharedArchiveFile=archive.jsa option, which uses the previously created archive


The second option is no different from attaching a regular static archive. But if the base static archive is not in its default location (inside the JDK), this option can also include the path to it, for example:







 -XX:SharedArchiveFile=base.jsa:dynamic.jsa
      
      





(under Windows, the path separator must be the “;” character)
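
Since the separator depends on the platform, a launch script can pick it at run time. The sketch below is only an illustration: the function name shared_archive_option is hypothetical, and detecting Windows through the uname output of Cygwin, MSYS or MinGW shells is my assumption about how such a script would be run there.

```shell
#!/bin/sh
# Build a -XX:SharedArchiveFile value that names both the base static
# archive and the dynamic archive, using the platform's path separator.
shared_archive_option() {
  base="$1"; dynamic="$2"
  case "$(uname -s)" in
    CYGWIN*|MINGW*|MSYS*) sep=';' ;;   # POSIX-like shells on Windows
    *)                    sep=':' ;;   # Linux, macOS and other Unix-likes
  esac
  printf '%s%s%s%s\n' '-XX:SharedArchiveFile=' "$base" "$sep" "$dynamic"
}

shared_archive_option base.jsa dynamic.jsa
```

The printed option can then be appended to JAVA_OPTS in the same way as the single-archive form.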







Now you know enough about AppCDS to see it in action.







Practice



Guinea pig



So that our practical use of AppCDS is not limited to a typical HelloWorld, let's take a real Spring Boot application as the basis. My colleagues and I often need to watch application logs on remote test servers, and watch them “live”, as they are being written. Using a full-fledged log aggregator (like ELK) for this is often overkill, endlessly downloading log files takes too long, and staring at the gray console output of tail is depressing. So I made a web application that can stream any logs to the browser in real time, color lines by severity level (and pretty-print XML along the way), merge several logs into one, and do a few other tricks. It is called ANALOG (as in “log analyzer”, although that is not quite true) and lives on GitHub.













Technically, it is a Spring Boot + Spring Integration application that runs tail, docker and kubectl under the hood (to support logs from files, Docker containers and Kubernetes resources, respectively). It ships as a classic “fat” Spring Boot JAR file. At run time about 10K classes sit in the application's memory, the vast majority of them Spring and JDK classes. Obviously these classes change quite rarely, so they can be put into a shared archive and reused by all instances of the application, saving memory and CPU.







Single experiment



Now let's apply what we know about Dynamic AppCDS to our guinea pig. Since everything is relative, we need a reference point: a state of the program against which we will compare the results of the experiment.







Introductory remarks





Reference point



Let this point be the state of a freshly downloaded application, i.e. without explicit use of AppCDS or anything else of the kind. To evaluate it, we need to:







  1. Install OpenJDK 13 (for example, the Liberica distribution, but not its lite version).

    It also needs to be added to the PATH environment variable or set as JAVA_HOME, for example like this:







     export JAVA_HOME=~/tools/jdk-13
          
          





  2. Download ANALOG (at the time of writing, the latest version was v0.12.1).

    If necessary, you can set the external host name for accessing the application in the server.address parameter of the config/application.yaml file (localhost is specified there by default).







  3. Enable JVM class load logging.

    To do this, you can set the JAVA_OPTS environment variable to this value:







     export JAVA_OPTS=-Xlog:class+load=info:file=log/class-load.log
          
          





    This option will be passed to the JVM and tells it to log the source of each class.







  4. Perform a test run:

    1. Start the application with the bin/analog script
    2. Open http://localhost:8083 in the browser and click around the buttons and checkboxes
    3. Stop the application by pressing Ctrl+C in the console of the bin/analog script


  5. Collect the results (from the files in the log/ directory):

    • Total number of classes loaded (from class-load.log):

       cat class-load.log | wc -l
       10463

    • How many of them were loaded from the shared archive (from the same file):

       grep -o 'source: shared' -c class-load.log
       1146

    • Average startup time (over a series of starts; from analog.log):

       grep -oE '\(JVM running for .+\)' analog.log | grep -oE '[0-9]\.[0-9]+' | awk '{ total += $1; count++ } END { print total/count }'
       4.5225
            
            









So at this step the CDS potential was 1146/10463 = 0.1095 ≈ 11%. If you are wondering where the shared classes came from (after all, we have not enabled any AppCDS yet), let me remind you that starting with version 12 the JDK ships with a ready-made CDS archive, $JAVA_HOME/lib/server/classes.jsa, built from an equally ready-made list of classes:







 cat $JAVA_HOME/lib/classlist | wc -l
 1170
      
      





Now, having assessed the initial state of the application, we can apply AppCDS to it and, by comparison, understand what this gives.
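
The share of classes taken from the archive can also be computed in one pass instead of two separate commands. The function name cds_share_ratio is hypothetical; like the commands above, it treats every line of the log as one loaded class and every line containing "source: shared" as a class loaded from a shared archive.

```shell
#!/bin/sh
# Print "<shared> <total> <percent>" for a class-load log written with
# -Xlog:class+load=info:file=...
cds_share_ratio() {
  awk '
    /source: shared/ { shared++ }        # class came from a CDS archive
    END {
      if (NR == 0) { print "0 0 0.0"; exit }
      printf "%d %d %.1f\n", shared, NR, 100 * shared / NR
    }' "$1"
}

# Usage, assuming the log location from step 3:
#   cds_share_ratio log/class-load.log
```

Keeping the count and the total in one command makes it harder to mix up logs from different runs when the experiment is repeated.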







Main experiment



As the documentation instructs, to create a dynamic AppCDS archive you need just one trial run of the application with the -XX:ArchiveClassesAtExit option. From the next launch on, the archive can be used and the profit collected. To verify this on the same guinea pig (ANALOG), you need to:







  1. Add the specified option to the run command:







     export JAVA_OPTS="$JAVA_OPTS -XX:ArchiveClassesAtExit=work/classes.jsa"
          
          





  2. Extend logging:







     export JAVA_OPTS="$JAVA_OPTS -Xlog:cds=debug:file=log/cds.log"
          
          





    This option will force the process of building a CDS archive to be logged when the application is stopped.







  3. Perform the same test run as for the reference point:

    1. Start the application with the bin/analog script
    2. Open http://localhost:8083 in the browser and click around the buttons and checkboxes
    3. Stop the application by pressing Ctrl+C in the console of the bin/analog script

      After that, an enormous wall of warnings should pour into the console, and the log/cds.log file should fill up with details; we are not interested in them yet.


  4. Switch the launch mode from trial to useful:







     export JAVA_OPTS="-XX:SharedArchiveFile=work/classes.jsa -Xlog:class+load=info:file=log/class-load.log -Xlog:class+path=debug:file=log/class-path.log"
          
          





    Here we do not append to the JAVA_OPTS variable but overwrite it with new values that enable (1) use of the shared archive, (2) logging of class sources and (3) logging of class-path checks.







  5. Perform a useful launch of the application following the scheme from step 3.







  6. Collect the results (from the files in the log/ directory):

    • Check that AppCDS was really applied (from class-path.log):

       [0.011s][info][class,path] type=BOOT
       [0.011s][info][class,path] Expecting BOOT path=/home/upc/tools/jdk-13/lib/modules
       [0.011s][info][class,path] ok
       [0.011s][info][class,path] type=APP
       [0.011s][info][class,path] Expecting -Djava.class.path=/home/upc/tmp/analog/lib/analog.jar
       [0.011s][info][class,path] ok
            
            





      The ok marks after the type=BOOT and type=APP lines indicate that the built-in and the application CDS archives, respectively, were successfully opened, verified and loaded.







    • Total number of classes loaded (from class-load.log):

       cat class-load.log | wc -l
       10403

    • How many of them were loaded from the shared archive (from the same file):

       grep -o 'source: shared' -c class-load.log
       6910

    • Average startup time (over a series of starts; from the analog.log file):

       grep -oE '\(JVM running for .+\)' analog.log | grep -oE '[0-9]\.[0-9]+' | awk '{ total += $1; count++ } END { print total/count }'
       4.04167
            
            









At this step the CDS potential was already 6910/10403 ≈ 0.66 = 66%, that is, it grew by 55 percentage points compared to the reference point. At the same time, the average startup time dropped by (4.5225 - 4.04167) = 0.48 seconds, i.e. startup became faster by ≈10.6% of the initial value.
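
The relative gain is easy to recompute for your own measurements. The helper name improvement_percent is my own; the two arguments are the average startup times before and after attaching the archive.

```shell
#!/bin/sh
# Relative improvement of "after" over "before", in percent of "before".
improvement_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", 100 * (before - after) / before }'
}

improvement_percent 4.5225 4.04167   # prints 10.6
```

Expressing the gain relative to the reference value keeps results comparable between machines with very different absolute startup times.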







Results Analysis



The working title of this section was “Why so little?”

We seem to have done everything by the book, yet not all classes ended up in the archive. Their number affects startup time no less than the computing power of the experimenter's machine, so let's concentrate on this number.







If you remember, we ignored the log/cds.log file created when the application was stopped after the trial run. In this file the HotSpot JVM kindly left a warning for each class that did not make it into the CDS archive. Here is the total number of such marks:







 grep -c '\[warning\]' cds.log
 3591
      
      





Given that the class-load.log log mentions just over 10K classes and 66% of them are loaded from the archive, it is easy to see that the roughly 3600 classes listed in cds.log are the “missing” 34% of the CDS potential. Now we need to find out why they were skipped.







If you look into cds.log, it turns out that there are only 4 unique reasons for skipping classes. Here is an example of each:







 Skipping org/springframework/web/client/HttpClientErrorException: Not linked
 Pre JDK 6 class not supported by CDS: 49.0 org/jrobin/core/RrdUpdater
 Skipping java/util/stream/Collectors$$Lambda$554: Unsafe anonymous class
 Skipping ch/qos/logback/classic/LoggerContext: interface org/slf4j/ILoggerFactory is excluded
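
To see how often each reason occurs in your own cds.log, the messages can be tallied by their shape. This is a sketch that assumes the log contains only the four message shapes quoted above; the function name cds_skip_reasons and the short reason labels are mine.

```shell
#!/bin/sh
# Count skipped classes per reason in a log written with -Xlog:cds=debug.
cds_skip_reasons() {
  awk '
    /Pre JDK 6 class not supported by CDS/ { n["Pre JDK 6 class"]++; next }
    /Skipping .*: Not linked/              { n["Not linked"]++; next }
    /Skipping .*: Unsafe anonymous class/  { n["Unsafe anonymous class"]++; next }
    /Skipping .*: .* is excluded/          { n["Dependency excluded"]++; next }
    END { for (r in n) printf "%d\t%s\n", n[r], r }
  ' "$1" | sort -k1,1nr -k2              # most frequent reason first
}

# Usage, assuming the log location from the trial run:
#   cds_skip_reasons log/cds.log
```

Lines that match none of the four patterns are ignored, so if the counts do not add up to the number of warnings, the log contains a reason this sketch does not know about.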
      
      





Among all 3591 skipped classes, these reasons occur with the following frequency:













Take a closer look at them:









Conclusion







CDS 100%.










JEP-310 , AppCDS JDK. . , . CDS (, , ) .







( ), - ; “ ”. Spring Boot, - ; JVM. ANALOG_OPTS



, Gradle'.







 export ANALOG_OPTS="-Djavamelody.enabled=false -Dlogging.config=classpath:logging/logback-console.xml"
 export ANALOG_OPTS="$ANALOG_OPTS -Dnodes.this.agentPort=7801 -Dserver.port=8091"
      
      





JavaMelody, , , . TCP- ; .







, , JVM AppCDS . JAVA_OPTS



JVM Unified Logging Framework :







 export JAVA_OPTS="-Xlog:class+load=info:file=log/class-load-%p.log -Xlog:class+path=debug:file=log/class-path-%p.log"
 export JAVA_OPTS="$JAVA_OPTS -XX:SharedArchiveFile=work/classes.jsa"
      
      





%p



, JVM (PID). AppCDS , ( ).
















  1. server.port



    nodes.this.agentPort



    , :







     export ANALOG_OPTS="$ANALOG_OPTS -Dnodes.this.agentPort=7801 -Dserver.port=8091"
          
          












  2. bin/analog









    () http://localhost:8091 ,







  3. PID ( ), :







     pgrep -f analog
     13792
          
          





  4. pmap



    ( ):







     pmap -XX 13792 | sed -n -e '2p;$p'
     Address Perm Offset Device Inode Size KernelPageSize MMUPageSize Rss Pss Shared_Clean Shared_Dirty Private_Clean Private_Dirty Referenced Anonymous LazyFree AnonHugePages ShmemPmdMapped Shared_Hugetlb Private_Hugetlb Swap SwapPss Locked ProtectionKey VmFlagsMapping
     3186952 1548 1548 328132 325183 3256 0 10848 314028 212620 314024 0 0 0 0 0 0 0 325183 0 KB
          
          












  5. 1-4 (, ).











pmap



. CDS' . , , PSS:







The "proportional set size" (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it. So if a process has 1000 pages all to itself, and 1000 shared with one other process, its PSS will be 1500.
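
On Linux the per-mapping Pss counters are also exposed in /proc/<PID>/smaps, so the total can be obtained without pmap. The sketch below sums them; the function name pss_total_kib is hypothetical, and I am assuming the usual smaps layout where each counter sits on its own "Pss: <value> kB" line.

```shell
#!/bin/sh
# Sum all "Pss:" lines of an smaps-format file and print the total in KiB.
# Only the exact "Pss:" field is counted, not related fields such as Pss_Anon.
pss_total_kib() {
  awk '$1 == "Pss:" { sum += $2 } END { print sum + 0 }' "$1"
}

# Usage on Linux, with the PID found by pgrep:
#   pss_total_kib /proc/13792/smaps
```

Summing one field per instance gives numbers that can be compared directly across iterations as more instances are started.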








PSS , :







Iteration:       1        2        3        4        5
PSS of inst#1:   339 088  313 778  305 517  301 153  298 604
PSS of inst#2:            314 904  306 567  302 555  299 919
PSS of inst#3:                     314 914  311 008  308 691
PSS of inst#4:                              306 563  304 495
PSS of inst#5:                                       294 686
Average:         339 088  314 341  308 999  305 320  301 279











, . AppCDS. , -XX:SharedArchiveFile=work/classes.jsa



-Xshare:off



, CDS . , .






















. , , . , , JVM , Java- . GeekOut:













, , , AppCDS , .. Java-. , JVM, , - .







VisualVM Metaspace AppCDS , :







AppCDS













AppCDS













, 128 Metaspace AppCDS 64.2 MiB / 8.96 MiB



≈7,2 , CDS . (. ) 66.4 MiB / 13.9 MiB



≈4,8 . , AppCDS , Metaspace. Metaspace, , CDS .







Instead of a conclusion



Spring Boot AppCDS – JVM, .









, , AppCDS, , “killer feature”. Spring Boot. , , AppCDS . , , AppCDS Spring Boot. , …







by Nick Fewings on Unsplash







