Implement and scale: our experience with autotests at VTB

Our division builds fully automated pipelines for delivering new application versions to production. Naturally, this requires automated functional tests. Below is the story of how we went from single-threaded testing on a local machine to multithreaded autotest runs on Selenoid inside the build pipeline, with an Allure report published on GitLab Pages. As a result we got a solid automation tool that future teams can use.







Where we started



To implement autotests and embed them in a pipeline, we needed an automation framework that could be flexibly adapted to our needs. Ideally, we wanted a single standard for the autotest engine, designed from the start for embedding autotests in a pipeline. For the implementation we chose the following technologies:









Why this set? Java is one of the most popular languages for autotests, and everyone on the team knows it. Selenium is the obvious choice. Cucumber, among other things, was meant to increase trust in the autotest results among the units doing manual testing.



Single-threaded tests



To avoid reinventing the wheel, we based the framework on ideas from various GitHub repositories and adapted them to our needs. We created one repository for the main library with the core of the autotest framework and another with a Gold example of autotests built on that core. Each team was expected to take the Gold image and develop its tests in it, adapting it to its own project. We deployed GitLab CI inside the bank and configured the following on it:





At first there were few tests, and they ran in a single thread. A single-threaded run on the GitLab Windows runner suited us well: the tests put very little load on the test stand and used almost no resources.



Over time the number of autotests grew, and once a full run started taking about three hours, we began to think about running them in parallel. Other issues appeared as well:





Autotest setup example:



 <plugins>
   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-surefire-plugin</artifactId>
     <version>2.20</version>
     <configuration>
       <skipTests>${skipTests}</skipTests>
       <testFailureIgnore>false</testFailureIgnore>
       <argLine>
         -javaagent:"${settings.localRepository}/org/aspectj/aspectjweaver/${aspectj.version}/aspectjweaver-${aspectj.version}.jar"
         -Dcucumber.options="--tags ${TAGS} --plugin io.qameta.allure.cucumber2jvm.AllureCucumber2Jvm --plugin pretty"
       </argLine>
     </configuration>
     <dependencies>
       <dependency>
         <groupId>org.aspectj</groupId>
         <artifactId>aspectjweaver</artifactId>
         <version>${aspectj.version}</version>
       </dependency>
     </dependencies>
   </plugin>
   <plugin>
     <groupId>io.qameta.allure</groupId>
     <artifactId>allure-maven</artifactId>
     <version>2.9</version>
   </plugin>
 </plugins>
      
      







Allure Report Example





Runner load during tests (8 cores, 8 GB of RAM, 1 thread)



Pros of single-threaded tests:





Cons of single-threaded tests:





JVM Fork Tests



Since we had not taken care of thread safety when implementing the base framework, the most obvious way to run tests in parallel was cucumber-jvm-parallel-plugin for Maven. The plugin is easy to configure, but for parallel runs to work correctly the autotests must run in separate browsers. There was no way around it: we had to use Selenoid.



The Selenoid server was set up on a machine with 32 cores and 24 GB of RAM. The limit was set to 48 browsers: 1.5 threads per core and roughly 400 MB of RAM per browser. As a result, the run time dropped from three hours to 40 minutes. Faster runs also helped with stabilization: we could now quickly run new autotests 20-30 times until we were sure that they passed reliably.
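The sizing above is simple arithmetic, and a short sketch makes it easy to recheck for a different machine. The class and method names here are illustrative, and the 400 MB figure is assumed to be a per-browser budget.

```java
public class SelenoidCapacity {
    // Browser limit derived from the core count and a threads-per-core ratio.
    static int browserLimit(int cores, double threadsPerCore) {
        return (int) Math.floor(cores * threadsPerCore);
    }

    // Total memory in MB needed if every browser session is active at once.
    // Assumption: the per-browser figure is a rough budget, not a measured value.
    static int totalBrowserMemoryMb(int browsers, int mbPerBrowser) {
        return browsers * mbPerBrowser;
    }

    // Speedup factor between two run durations given in minutes.
    static double speedup(double beforeMinutes, double afterMinutes) {
        return beforeMinutes / afterMinutes;
    }

    public static void main(String[] args) {
        int limit = browserLimit(32, 1.5);
        int memoryMb = totalBrowserMemoryMb(limit, 400);
        System.out.println("browser limit: " + limit);
        System.out.println("memory at full load, MB: " + memoryMb);
        System.out.println("fits into 24 GB: " + (memoryMb <= 24 * 1024));
        System.out.println("speedup: " + speedup(180, 40));
    }
}
```

With the numbers from the article this gives 48 browsers, about 19 GB of browser memory at full load, and a 4.5-fold speedup for the move from three hours to 40 minutes.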

The first drawback of this solution was high utilization of runner resources even with a small number of parallel threads: on 4 cores and 8 GB of RAM, the tests ran stably in no more than 6 threads. The second drawback: the plugin generates runner classes for every scenario, no matter how many of them are actually run.



Important! Do not pass the tag variable through argLine, for example like this:



 <argLine>-Dcucumber.options="--tags ${TAGS} --plugin io.qameta.allure.cucumber2jvm.AllureCucumber2Jvm --plugin pretty"</argLine>
 …
 mvn -DTAGS="@smoke"
      
      





If you pass the tag this way, the plugin generates runners for all tests. In other words, it tries to run every test, skips the unneeded ones right after launch, and creates a lot of JVM forks in the process.



The correct way is to pass the tag variable into tags in the plugin configuration; see the setup example below. The other methods we tried had problems with connecting the Allure plugin.
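The difference between the two ways of passing the tag can be shown with a toy model. This is a deliberately simplified sketch, not the plugin's actual code: the scenario names, the single-tag-per-scenario shape and the runner naming are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RunnerGenerationModel {
    // Hypothetical scenario description: a name plus a single tag.
    static final class Scenario {
        final String name;
        final String tag;

        Scenario(String name, String tag) {
            this.name = name;
            this.tag = tag;
        }
    }

    // Models the generation step: one runner class per scenario that passes the filter.
    // A null filter means the generator knows nothing about tags.
    static List<String> generateRunners(List<Scenario> all, String tagFilter) {
        return all.stream()
                .filter(s -> tagFilter == null || s.tag.equals(tagFilter))
                .map(s -> s.name + "IT") // illustrative runner name
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Scenario> all = Arrays.asList(
                new Scenario("Login", "@smoke"),
                new Scenario("Transfer", "@regress"),
                new Scenario("Statement", "@regress"),
                new Scenario("Logout", "@smoke"));

        // Tag passed only via argLine: the generator sees no filter, so every
        // scenario gets a runner and, with reuseForks=false, its own JVM fork.
        System.out.println("tag in argLine: " + generateRunners(all, null).size() + " forks");

        // Tag passed via <tags> in the plugin configuration: filtering happens
        // before generation, so only the matching scenarios get a runner.
        System.out.println("tag in <tags>: " + generateRunners(all, "@smoke").size() + " forks");
    }
}
```

In this model the first case starts four forks, most of which do nothing useful, and the second starts two. That matches the effect described above, where the extra forks are created only to skip their tests.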



Example run time for 6 short tests with incorrect settings:



 [INFO] Total time: 03:17 min
      
      





Example run time when the tag is passed directly to mvn ... -Dcucumber.options:



 [INFO] Total time: 44.467 s
      
      





Autotest setup example:



 <profiles>
   <profile>
     <id>parallel</id>
     <build>
       <plugins>
         <plugin>
           <groupId>com.github.temyers</groupId>
           <artifactId>cucumber-jvm-parallel-plugin</artifactId>
           <version>5.0.0</version>
           <executions>
             <execution>
               <id>generateRunners</id>
               <phase>generate-test-sources</phase>
               <goals>
                 <goal>generateRunners</goal>
               </goals>
               <configuration>
                 <tags>
                   <tag>${TAGS}</tag>
                 </tags>
                 <glue>
                   <package>stepdefs</package>
                 </glue>
               </configuration>
             </execution>
           </executions>
         </plugin>
         <plugin>
           <groupId>org.apache.maven.plugins</groupId>
           <artifactId>maven-surefire-plugin</artifactId>
           <version>2.21.0</version>
           <configuration>
             <forkCount>12</forkCount>
             <reuseForks>false</reuseForks>
             <includes>**/*IT.class</includes>
             <testFailureIgnore>false</testFailureIgnore>
             <!--suppress UnresolvedMavenProperty -->
             <argLine>
               -javaagent:"${settings.localRepository}/org/aspectj/aspectjweaver/${aspectj.version}/aspectjweaver-${aspectj.version}.jar"
               -Dcucumber.options="--plugin io.qameta.allure.cucumber2jvm.AllureCucumber2Jvm TagPFAllureReporter --plugin pretty"
             </argLine>
           </configuration>
           <dependencies>
             <dependency>
               <groupId>org.aspectj</groupId>
               <artifactId>aspectjweaver</artifactId>
               <version>${aspectj.version}</version>
             </dependency>
           </dependencies>
         </plugin>
       </plugins>
     </build>
   </profile>
 </profiles>
      
      







Example of an Allure report (the most unstable test, 4 reruns)



Runner load during tests (8 cores, 8 GB of RAM, 12 threads)



Pros:





Cons:





How to beat instability



Test stands are not perfect, and neither are the autotests themselves. Not surprisingly, we ended up with a number of flaky tests. The Maven Surefire plugin came to the rescue: it supports rerunning failed tests out of the box. You need to update the plugin to at least version 2.21 and add one line with the number of reruns to the pom file, or pass it as an argument to Maven.



Autotest setup example:



 <plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-surefire-plugin</artifactId>
   <version>2.21.0</version>
   <configuration>
     ….
     <rerunFailingTestsCount>2</rerunFailingTestsCount>
     ….
   </configuration>
 </plugin>
      
      





Or at startup: mvn ... -Dsurefire.rerunFailingTestsCount=2 ...

Alternatively, set the Maven options in a PowerShell script (PS1):



  Set-Item Env:MAVEN_OPTS "-Dfile.encoding=UTF-8 -Dsurefire.rerunFailingTestsCount=2"
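What the rerun count does can be pictured with a small model. It is a sketch of the behaviour as we understand it, not Surefire's implementation: a test that fails is run again up to the configured number of times, and a test that passes on a rerun is treated as flaky rather than failed. The `Outcome` enum and the sample tests are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class RerunModel {
    enum Outcome { PASSED, FLAKY, FAILED }

    // Runs the test once, then up to rerunCount more times while it keeps failing.
    static Outcome runWithReruns(BooleanSupplier test, int rerunCount) {
        if (test.getAsBoolean()) {
            return Outcome.PASSED;
        }
        for (int rerun = 1; rerun <= rerunCount; rerun++) {
            if (test.getAsBoolean()) {
                return Outcome.FLAKY; // failed at first, passed on a rerun
            }
        }
        return Outcome.FAILED; // failed on the first run and on every rerun
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        // Hypothetical unstable test: fails on the first attempt, passes on the second.
        BooleanSupplier unstable = () -> attempts.incrementAndGet() >= 2;

        System.out.println(runWithReruns(unstable, 2));    // prints FLAKY
        System.out.println(runWithReruns(() -> false, 2)); // prints FAILED
    }
}
```

The model also shows the cost: a test that is really broken is executed rerunCount + 1 times before it is reported as failed.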
      
      





Pros:





Cons:





Parallel Tests with the Cucumber 4 Library





The number of tests grew every day, and we again started thinking about speeding up the runs. We also wanted to integrate as many tests as possible into the application's build pipeline. The critical factor was that generating runners took too long when running in parallel with the Maven plugin.



Cucumber 4 had already been released by then, so we decided to rewrite the core for this version. The release notes promised parallel execution at the thread level. In theory, this should have:





Adapting the framework for multithreaded autotests turned out not to be that difficult. Cucumber 4 runs each individual test in a dedicated thread from start to finish, so some shared static fields were simply converted to ThreadLocal variables.

The main thing when converting with the IntelliJ IDEA refactoring tools is to check the places where the variable was compared (for example, null checks). In addition, you need to move the Allure plugin into the annotation of the JUnit runner class.
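The conversion and the null-check pitfall can be sketched as follows. This is a minimal illustration under assumptions: `SessionHolder` is a hypothetical class, and a plain `String` stands in for whatever per-test object the framework keeps in a static field, such as a browser session.

```java
public class SessionHolder {
    // Before the conversion this would be: private static String session;
    // which is shared by every thread. After it, each thread has its own value.
    private static final ThreadLocal<String> SESSION = new ThreadLocal<>();

    static void open(String name) {
        SESSION.set(name);
    }

    static String current() {
        return SESSION.get();
    }

    static boolean isOpen() {
        // Pitfall after refactoring: "SESSION != null" is always true, because
        // the ThreadLocal object itself is never null. Compare the value instead.
        return SESSION.get() != null;
    }

    static void close() {
        // Clear the value so a reused pool thread does not see stale state.
        SESSION.remove();
    }

    public static void main(String[] args) throws InterruptedException {
        open("main-session");
        Thread worker = new Thread(() -> {
            // A new thread starts with no value of its own.
            System.out.println("worker sees open session: " + isOpen());
            open("worker-session");
            System.out.println("worker session: " + current());
            close();
        });
        worker.start();
        worker.join();
        // The worker's value never leaked into the main thread.
        System.out.println("main session: " + current());
        close();
    }
}
```

The worker thread reports no open session even though the main thread has one, which is exactly the isolation that lets each test keep its own state while running in its own thread.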



Autotest setup example:



 <profile>
   <id>parallel</id>
   <build>
     <plugins>
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-surefire-plugin</artifactId>
         <version>3.0.0-M3</version>
         <configuration>
           <useFile>false</useFile>
           <testFailureIgnore>false</testFailureIgnore>
           <parallel>methods</parallel>
           <threadCount>6</threadCount>
           <perCoreThreadCount>true</perCoreThreadCount>
           <argLine>
             -javaagent:"${settings.localRepository}/org/aspectj/aspectjweaver/${aspectj.version}/aspectjweaver-${aspectj.version}.jar"
           </argLine>
         </configuration>
         <dependencies>
           <dependency>
             <groupId>org.aspectj</groupId>
             <artifactId>aspectjweaver</artifactId>
             <version>${aspectj.version}</version>
           </dependency>
         </dependencies>
       </plugin>
     </plugins>
   </build>
 </profile>
      
      





An example of an Allure report (the most unstable test, 5 reruns)



Runner load during tests (8 cores, 8 GB of RAM, 24 threads)



Pros:





Cons:





Allure reports in GitLab pages



After introducing multithreaded runs, we began to spend much more time analyzing reports. At that point we had to upload each report to GitLab as an artifact, then download and unpack it. That is slow and not very convenient, and anyone else who wants to view the report has to repeat the same steps. We wanted faster feedback, and there was a solution: GitLab Pages. This is a built-in feature available out of the box in all recent versions of GitLab. It lets you deploy static sites on your server and open them via a direct link.



All the Allure report screenshots in this article were taken from GitLab Pages. Below is the Windows PowerShell script that deploys the report to GitLab Pages (the autotests must be run first):



 New-Item -ItemType directory -Path $testresult\history | Out-Null
 try {Invoke-WebRequest -Uri $hst -OutFile $outputhst} Catch{echo "fail copy history"}
 try {Invoke-WebRequest -Uri $hsttrend -OutFile $outputhsttrnd} Catch{echo "fail copy history trend"}
 mvn allure:report
 #mvn assembly:single -PzipAllureReport
 xcopy $buildlocation\target\site\allure-maven-plugin\* $buildlocation\public /s /i /Y
      
      





The result



So, if you have been wondering whether you need thread-safe code in a Cucumber autotest framework, the answer is now obvious: with Cucumber 4 it is easy to implement, and it significantly increases the number of threads that can run simultaneously. With this way of running tests, the limiting factor becomes the performance of the Selenoid machine and of the test stand.



Practice has shown that running autotests in threads keeps resource consumption to a minimum while giving the best performance. As the graphs show, doubling the number of threads does not give a comparable speedup of the test run. Nevertheless, we were able to add more than 200 autotests to the application build, and even with 5 reruns they complete in about 24 minutes. This gives quick feedback, so that changes can be made and the run repeated when necessary.


