How our Delivery team rebuilt our release process around CI/CD using only the tools we already had
Engineering teams are under constant pressure: they need to ship new features as a polished product while continually reducing cycle time. Often, engineers reach for modern tools without fully understanding them. Continuous integration and delivery (CI/CD) is built into GitLab, our single application for the DevOps lifecycle, and we are now migrating to Kubernetes to reduce cycle time even further. However, our path to CI/CD, and ultimately to Kubernetes, was not the usual one. The Delivery team moved GitLab.com to continuous delivery by pushing the old system to its limits first, and only then did we commit fully to Kubernetes.
How we shipped releases before CI/CD
Between August 7 and September 27, 2019, the large GitLab community and our team members averaged 55 commits per day, continuously iterating on the product to create new features. But before we adopted continuous delivery, we used feature freezes starting on the 7th of each month: during that time our engineers shifted their attention from developing new features to fixing bugs and preparing the upcoming release, which shipped reliably on the 22nd of every month.
By introducing a strict deadline, we instilled in developers a behavior that led them to focus on a specific date rather than on readiness.
"... developers started planning around the 7th: they would look at the calendar and think, well, there's still time, the 7th is a week away, and then at midnight on the 6th they would frantically merge," said Marine Jankowski, CTO of the Delivery team. "They knew that if they missed the deadline they would have to wait another month, and if they made it in time they would have two more good weeks to fix bugs."
"Since the inception of GitLab.com, feature freezes have been used as time to stabilize," explained Marine.
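The monthly cadence described above can be sketched as a small date calculation. This is only an illustration: the function name `release_for_merge` is hypothetical, and the rule that anything merged before the 7th makes that month's release is an assumption drawn from the dates in the text, not a description of GitLab's actual tooling.

```python
from datetime import date

FREEZE_DAY = 7    # feature freeze starts on the 7th (from the text)
RELEASE_DAY = 22  # release ships on the 22nd (from the text)


def release_for_merge(merged_on: date) -> date:
    """Return the release a merge would land in under the old monthly cadence.

    Assumption: anything merged before the 7th makes that month's release;
    anything merged on or after the 7th waits for the following month.
    """
    if merged_on.day < FREEZE_DAY:
        return date(merged_on.year, merged_on.month, RELEASE_DAY)
    # Missed the freeze: roll over to the next month (and year, in December).
    if merged_on.month == 12:
        return date(merged_on.year + 1, 1, RELEASE_DAY)
    return date(merged_on.year, merged_on.month + 1, RELEASE_DAY)


if __name__ == "__main__":
    for merged in (date(2019, 8, 6), date(2019, 8, 7)):
        release = release_for_merge(merged)
        print(merged, "->", release, f"({(release - merged).days} days)")
```

Under these assumptions, a change merged on August 6 ships 16 days later, while the same change merged one day later waits 46 days, which is the incentive behind the midnight merges in the quote.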
However, the number of users was growing, and the demand for new features pushed us to accelerate development. The stabilization period slowed the cycle and significantly delayed bug fixes, regression fixes, and feature delivery, both to users on GitLab.com and to self-managed customers.
"In some cases, [feature freezes] even caused platform instability, because high-priority fixes did not reach customers in time," said Marine. "By moving to CI/CD, we ship new features and fix bugs much faster."
Before we formed the Delivery team to move GitLab.com to continuous delivery, and ultimately to Kubernetes, we depended on a release manager. This was a rotating role among developers, held by whoever was preparing the next release. We iterated on the process for more than five years, and over that time release managers built up a knowledge base and partially automated it.
However, the method proved inefficient: the time needed to deploy and prepare a release varied from half a day to several days, because manual tasks piled up along the way.
"The release manager received a fixed list of tasks and a deadline, and repeated the steps over and over until a fully ready and stable release was running on GitLab.com," said Marine. Broadly speaking, the release manager was expected to:
- Manually synchronize the many repositories that make up GitLab.
- Make sure that the correct versions are included in manually created Git branches.
- Once the release is tagged, manually deploy it to the test and production environments of GitLab.com.
- Make sure everything works, and manually publish packages for self-managed users.
In a presentation on this topic at GitLab Commit in Brooklyn, Marine shared observations from 2018: in the two weeks before a release, the Delivery team spent 60% of its time tending to deployments and another 26% on manual and semi-manual tasks such as writing the monthly release review.
Observations from 2018, before the move to continuous delivery: how the Delivery team spent its time in the two weeks before a release.
"Generally speaking, for 14 days, the two weeks before a release, all the team did was stare at monitors, like watching paint dry," said Marine.
But by reclaiming that 86% of the pie (60% on deployments + 26% on manual tasks), the Delivery team could solve a number of problems:
- New releases without delay.
- Repeatable and faster deployments without downtime.
- More time to migrate GitLab.com to Kubernetes.
- More freedom to prepare the organization for continuous delivery.
Although continuous delivery is only available on GitLab.com, our self-managed customers also benefit from our move to it. Now everything that CI testing does not cover is tested both automatically and manually in our environments before it reaches GitLab.com. Anything that reaches GitLab.com and needs fixing is fixed within a few hours. As a result, the final release for self-managed customers is clean.
The move from feature freezes to CD was only a matter of time as our feature set grew, and a team of engineers led by Marine was formed to oversee it: "The Delivery team was formed with the sole purpose of moving the company to a CD model, and at the same time moving it to the Kubernetes platform to make scaling easier and speed up the cycle."
Most companies in GitLab's position would start the move to CI/CD and Kubernetes by first integrating the new technologies into their workflow and adjusting their development process along the way. We chose a different approach.
"Migrating to Kubernetes requires a change not only in the production system but also in the approach to development," said Marine. Kubernetes offers certain features out of the box, with no additional investment. But to really benefit from those features, you need some form of CI/CD in place.
The Delivery team recognized this: to ease the move to Kubernetes for continuous delivery, our engineers needed to already be working with a CI/CD mindset, which implies stronger quality assurance (QA) and stricter feature planning. So the Delivery team made a tough decision: build a CD system with existing tools and reorganize the infrastructure of the GitLab.com application, instead of immediately adopting new CD tools and technologies.
"The idea was simple," said Marine. "We use the tools we have, automate most manual tasks, and test the entire static system under load. If the static system holds up, we move on to the dynamic test."
This approach provided two key benefits:
First, we identified all the weak points in our approach and shored them up by automating with CI, so our application became more robust and a successful move to Kubernetes became more realistic.
Second, by getting the engineering team working with CD, we brought about a real cultural shift among GitLab engineers, who were used to weekly deployments and to waiting, sometimes a whole day, to see the effect of their merges.
"Ever since we adopted CI/CD, our developers have come to a new understanding of what 'done' means," said Marine.
Prior to the introduction of CI/CD, a change was considered done as soon as its review was complete. That left out deployment to the various environments, which was time consuming. Today, changes are deployed within a few hours, so there is no reason not to confirm that they work in the test and production environments.
Deploying review apps on Kubernetes lets developers run quality checks practically in real time, and using feature flags for progressive delivery speeds up development as well.
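To make the idea of progressive delivery with feature flags concrete, here is a minimal sketch of a percentage-based rollout. It is not GitLab's feature flag implementation: the function `is_enabled`, the flag name, and the hashing scheme are all assumptions chosen for illustration. Each user is hashed into a stable bucket from 0 to 99, and the flag is on for users whose bucket falls below the rollout percentage.

```python
import hashlib


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministically decide whether a flag is on for a given user.

    Hypothetical helper: hashes the flag name and user ID into a bucket
    in the range 0..99 and enables the flag when the bucket is below
    the rollout percentage.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


if __name__ == "__main__":
    # Widen the rollout step by step; a user's bucket never changes, so
    # anyone who had the feature at 10% still has it at 50% and 100%.
    for percent in (10, 50, 100):
        enabled = sum(is_enabled("new_editor", f"user-{i}", percent) for i in range(1000))
        print(f"{percent}% rollout: enabled for {enabled} of 1000 users")
```

Because the bucket depends only on the flag name and the user ID, raising the percentage only ever adds users, which keeps the experience stable for people who already see the feature.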
"From the very first step in CD, developers are expected to respond to every automated quality check, and also to carry out a new level of manual verification in both the test and production environments. In addition, developers can get changes into production within a day, whereas before it took several days (or even weeks)."
With CD, code quality checks can run much more often. And because our CI/CD system delivers code changes around the clock, developers take on-call rotations to handle any unexpected problems as they appear, since the "incubation period" has been greatly reduced.
Our new method
Having implemented the CI/CD system, we automated 90% of the process. The remaining 10% requires human intervention, because it involves coordination among many people with access rights.
"We are gradually shrinking that 10%, so that the only thing we need is approval to publish the release," said Marine. In the current iteration, the CI/CD process works as follows:
- CI automatically looks for specific labels on merge requests approved by reviewers and developers.
- CI automatically synchronizes the required repositories, creates the necessary Git branches and tags, and includes the correct versions for the release we want to deliver.
- When builds are complete, packages are automatically deployed to test environments.
- Automated quality checks run and, if all goes well, the deployment is rolled out to a small segment of users in the production environment.
- In parallel, developers perform a further level of quality checks manually to make sure new features work as intended.
- If manual verification uncovers a high-priority issue, deployments stop.
- Once the previous step is complete, a Delivery team member starts the rollout to all GitLab.com users.
- Then a release for self-managed customers is created from the latest working deployment on GitLab.com.
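The steps above can be sketched as a single gated function. This is a simplified model under stated assumptions, not GitLab's pipeline code: `Deployment`, `run_pipeline`, and the three callback parameters are hypothetical names, and the real process runs as CI jobs rather than as one Python function. What the sketch shows is the ordering of the gates: automated checks before the small production segment, a manual check that can stop the rollout, and a human decision before the full rollout.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Deployment:
    """Hypothetical record of one deployment moving through the stages."""
    version: str
    log: List[str] = field(default_factory=list)


def run_pipeline(
    deployment: Deployment,
    automated_checks_pass: Callable[[Deployment], bool],
    manual_check_finds_blocker: Callable[[Deployment], bool],
    rollout_started_by_delivery: Callable[[Deployment], bool],
) -> bool:
    """Walk one deployment through the stages; return True if fully rolled out."""
    deployment.log.append("synced repositories, created branches and tags")
    deployment.log.append("deployed packages to test environments")

    if not automated_checks_pass(deployment):
        deployment.log.append("stopped: automated quality checks failed")
        return False
    deployment.log.append("deployed to a small segment of production users")

    if manual_check_finds_blocker(deployment):
        deployment.log.append("stopped: high-priority issue found manually")
        return False

    # The remaining manual step: a Delivery team member starts the rollout.
    if not rollout_started_by_delivery(deployment):
        deployment.log.append("waiting: full rollout not started yet")
        return False

    deployment.log.append("rolled out to all GitLab.com users")
    deployment.log.append("customer release created from this deployment")
    return True


if __name__ == "__main__":
    d = Deployment(version="example-version")
    ok = run_pipeline(d, lambda _: True, lambda _: False, lambda _: True)
    print("fully rolled out:", ok)
    for line in d.log:
        print(" -", line)
```

Passing the gates in as callables keeps the human decisions explicit, which matches the point that only a small part of the process still needs people.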
As with any other engineering team, scaling is a real challenge for us. One of the biggest challenges for engineers is making sure quality checks cover everything, and for a project as large as GitLab.com that takes intense work. We also need to make sure there is enough monitoring and alerting in place, so that the project does not rely only on predefined rules.
The second major challenge for us is the complexity of the GitLab.com system and communicating process changes, as they happen, to all engineering teams. "Breaking processes and habits that have been established over years is far from easy," said Marine.
Results
GitLab is already benefiting a great deal from the move to CI/CD.
Observations in 2019 showed that in those same 14 days before a release, the Delivery team now spends 82% of its time more productively: that time has been freed up for other important tasks.
Observations from 2019: how much valuable time was freed up in the same two weeks thanks to the move to CD.
Having automated the manual work, the Delivery team was finally able to turn to changing the infrastructure of GitLab.com to better support development velocity and user traffic, and to the migration to Kubernetes.
"And, as I've already said, all of this was without Kubernetes. Everything was done on the old legacy system," Marine told the audience at GitLab Commit in Brooklyn. "But we bought ourselves time, so now my team is focused on the migration. Still, one of the biggest changes was in how we organize development."
The results since the transition are significant. In May 2019, on the old system, the team delivered about 7 deployments; in August 2019 that figure rose to 35. And that is not the limit: the numbers will grow considerably now that the team delivers multiple deployments every day.
"We just migrated our Registry service to Kubernetes, so if you use the container registry on GitLab.com, all your requests are served from the Kubernetes platform," said Marine. "GitLab is a multi-component system, and we continue to isolate and migrate other services."
Each new release includes new CI/CD features. For example, in release 12.3 we expanded the GitLab Container Registry, allowing users to use CI/CD to build and push images/tags to their own projects. There were other exciting additions as well.
Moving your own system to continuous delivery?
Marine advised companies that are about to move to CD to start with what they have.
"As I see it, sitting and waiting for a migration to a new platform only hurts you," said Marine. "Most systems can be changed in some way to speed up the cycle without migrating to a completely new system. Speeding up the development/release cycle greatly increases the effectiveness of every engineer and frees up more time to migrate to a new platform such as Kubernetes."
If you're curious about what's next, take a look at this detailed summary of the exciting new CI/CD features waiting in the wings, from release 12.4 onward.
Missed the Brooklyn GitLab Commit?
If you were unable to attend Marine's presentation on the background of our move to Kubernetes, watch the full video below, and join us at the European GitLab Commit in London on October 9th.