In short: we set WERF_THREE_WAY_MERGE_MODE=enabled and get deployment “as in kubectl apply” that is compatible with existing Helm 2 installations, and even a bit more.
But let's start with the theory: what are 3-way-merge patches, how did the approach of generating them come about, and why are they important in CI/CD processes with Kubernetes-based infrastructure? After that, we will look at what 3-way-merge means in werf, which modes are used by default, and how to manage them.
What is a 3-way-merge patch?
So, let's start with the task of rolling out resources described in YAML manifests to a Kubernetes cluster.
To work with resources, the Kubernetes API offers the following basic operations: create, patch, replace, and delete. The idea is that a convenient continuous rollout of resources to the cluster should be built on top of them. How?
Imperative kubectl commands
The first approach to managing objects in Kubernetes is to use the imperative kubectl commands to create, modify, and delete these objects. Simply put:
- the kubectl run command can run a Deployment or a Job:
kubectl run --generator=deployment/apps.v1 DEPLOYMENT_NAME --image=IMAGE
- the kubectl scale command changes the number of replicas:
kubectl scale --replicas=3 deployment/mysql
- and so on.
Such an approach may seem convenient at first glance. However, there are problems:
- It’s hard to automate.
- How do you reflect the configuration in Git? How do you review the changes happening to the cluster?
- How do you ensure reproducibility of the configuration on re-runs?
- ...
It is clear that this approach does not fit well with storing the application code and the infrastructure as code (IaC; or even GitOps, as a more modern option gaining popularity in the Kubernetes ecosystem) in Git. That is why these commands were not developed further in kubectl.
Create, get, replace, and delete operations
With the initial creation everything is simple: we send the manifest to the create operation of the kube API and the resource is created. The YAML representation of the manifest can be stored in Git and applied with the kubectl create -f manifest.yaml command.
Deleting is also simple: we pass the same manifest.yaml from Git to the kubectl delete -f manifest.yaml command.
The replace operation allows you to completely replace the resource configuration with a new one without recreating the resource. This means that before making a change to a resource, it is logical to request its current version with a get operation, change it, and update it with a replace operation. Optimistic locking is built into kube-apiserver: if the object has changed after the get operation, the replace operation will fail.
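Here is a minimal sketch of that get-modify-replace cycle (the resource name is hypothetical); the retrieved object carries metadata.resourceVersion, and the replace is rejected with a conflict if the object changed in the meantime:

```shell
# fetch the current version of the object (it includes metadata.resourceVersion)
kubectl get deployment myapp -o yaml > current.yaml

# ... edit current.yaml locally, e.g. merge in the changes from Git ...

# replace the object; if it was modified after our get, the resourceVersion
# no longer matches and the command fails, so we have to start over from get
kubectl replace -f current.yaml
```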
To store the configuration in Git and update resources using replace, we have to perform a get operation, merge the config from Git with what we got, and perform a replace. Out of the box, kubectl only offers the kubectl replace -f manifest.yaml command, where manifest.yaml is the fully prepared (in our case, already merged) manifest to be applied. It turns out that the user has to implement the merging of manifests, and this is far from a trivial matter...
It is also worth noting that although manifest.yaml is stored in Git, we cannot know in advance whether the object needs to be created or updated; this, too, has to be handled by the user's software.
Bottom line: can we build continuous rollout using only create, replace, and delete, while keeping the infrastructure configuration in Git alongside the code and having a convenient CI/CD?
In principle, we can... To do this, we need to implement our own merge operation for manifests, plus some glue logic that:
- checks for the presence of an object in the cluster,
- performs the initial creation of the resource,
- updates or deletes it.
When updating, this logic must take into account that the resource may have changed since the last get, handle the optimistic locking case automatically, and retry the update when needed.
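A rough sketch of what such glue logic boils down to (hypothetical file names; the actual merging of manifests is the hard part left to the user):

```shell
# create the object if it does not exist yet, otherwise replace it
if kubectl get -f manifest.yaml >/dev/null 2>&1; then
  # the object exists: merge manifest.yaml with the live state (non-trivial!)
  # and replace; on a resourceVersion conflict, repeat the whole cycle
  kubectl replace -f merged.yaml
else
  kubectl create -f manifest.yaml
fi
```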
However, why reinvent the wheel when kube-apiserver offers another way to update resources, the patch operation, which takes some of the problems described above off the user's shoulders?
Patch
So, we have arrived at patches.
Patches are the primary way to apply changes to existing objects in Kubernetes. The patch operation works as follows:
- the kube-apiserver user sends a patch in JSON form and specifies the target object,
- apiserver itself deals with the current state of the object and brings it to the desired form.
Optimistic locking is not required in this case. This operation is more declarative than replace, even though at first it might seem the other way around.
Thus:
- using the create operation, we create an object from the manifest stored in Git,
- using delete, we remove the object if it is no longer required,
- using patch, we modify the object, bringing it to the form described in Git.
However, to do this, you must create the correct patch!
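For example, the image of a container can be changed with a single patch operation, without a preliminary get and without client-side optimistic locking (the resource and container names below are hypothetical):

```shell
# a strategic merge patch: apiserver merges it into the current state of the object
kubectl patch deployment myapp \
  --patch '{"spec":{"template":{"spec":{"containers":[{"name":"app","image":"ubuntu:18.04"}]}}}}'
```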
How patches work in Helm 2: 2-way-merge
When a release is installed for the first time, Helm performs the create operation for the chart's resources.
When a Helm release is updated, for each resource Helm:
- calculates a patch between the version of the resource from the previous chart and the version from the current chart,
- applies this patch.
We will call such a patch a 2-way-merge patch, because two manifests participate in its creation:
- the resource manifest from the previous release,
- the resource manifest from the current chart.
As for deletion, the delete operation in kube-apiserver is called for resources that were declared in the previous release but are not declared in the current one.
The 2-way-merge patch approach has a problem: it leads to a desynchronization between the real state of the resource in the cluster and the manifest in Git.
An example of a problem
- The chart in Git stores a manifest in which the image field of a Deployment has the value ubuntu:18.04.
- A user changed the value of this field to ubuntu:19.04 via kubectl edit.
- When the chart is re-deployed, Helm does not generate a patch, because the image field is the same in the previous version of the release and in the current chart.
- After the re-deploy, image remains ubuntu:19.04, although the chart says ubuntu:18.04.
We got desync and lost declarativeness.
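To make it concrete, here is a hypothetical illustration of why the 2-way approach misses the change: both sides of the diff come from Git, so the diff is empty and no patch is sent, while the live object quietly keeps its manually edited value:

```shell
# both inputs of the 2-way diff are Git-side manifests, so nothing differs
diff <(echo "image: ubuntu:18.04") <(echo "image: ubuntu:18.04")
# empty output: Helm 2 sends no patch, and the live ubuntu:19.04 stays put
```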
What is a synchronized resource?
Generally speaking, it is impossible to achieve full correspondence between a resource manifest in a running cluster and the manifest from Git: the real manifest may contain service annotations/labels, additional containers, and other data that is added to and removed from the resource dynamically by various controllers. We cannot and do not want to keep this data in Git. However, we do want the fields that we explicitly specified in Git to take on the appropriate values when rolling out.
This gives us the general rule of a synchronized resource: when rolling out a resource, you may change or delete only those fields that are explicitly specified in the manifest from Git (or were specified in its previous version but have since been removed).
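As an illustration (the snippets below are hypothetical), the manifest in Git specifies only the fields we manage, while the live object carries extra data added by controllers; according to the rule, a rollout may touch the former and must leave the latter alone:

```yaml
# manifest in Git: only the fields we explicitly manage
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: app
        image: ubuntu:18.04
---
# extra fields present only on the live object (added by controllers);
# they are not specified in Git and must not be reset by the rollout
metadata:
  annotations:
    deployment.kubernetes.io/revision: "7"
```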
3-way-merge patch
The main idea of the 3-way-merge patch: we generate a patch between the last applied version of the manifest from Git and the target version of the manifest from Git, taking into account the current version of the manifest from the running cluster. The resulting patch must comply with the synchronized resource rule:
- new fields added in the target version are added by the patch;
- fields that existed in the last applied version but are absent from the target version are reset by the patch;
- fields in the current version of the object that differ from the target version of the manifest are updated by the patch.
It is by this principle that kubectl apply generates its patches:
- the last applied version of the manifest is stored in an annotation of the object itself,
- the target version is taken from the specified YAML file,
- the current version is taken from the running cluster.
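Returning to the example above (the resource name is hypothetical): the last applied version, stored in the kubectl.kubernetes.io/last-applied-configuration annotation, says ubuntu:18.04, the target manifest also says ubuntu:18.04, but the live object says ubuntu:19.04, so the 3-way-merge patch brings the image back to ubuntu:18.04, something a 2-way diff between two identical manifests could never do. The annotation can be inspected directly:

```shell
# the last applied version is kept in an annotation on the object itself
kubectl get deployment myapp \
  -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'

# kubectl apply builds a 3-way-merge patch from that annotation,
# the file being applied, and the live state of the object
kubectl apply -f manifest.yaml
```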
Now that we have covered the theory, it is time to talk about what we have done in werf.
Applying changes in werf
Earlier, werf, like Helm 2, used 2-way-merge patches.
Repair patch
In order to switch to the new type of patches, 3-way-merge, the first step was to introduce the so-called repair patches.
During a deploy, the standard 2-way-merge patch is still used, but werf additionally generates a patch that would synchronize the real state of the resource with what is written in Git (such a patch is created using the same synchronized resource rule described above).
If a desynchronization is detected, at the end of the deploy the user receives a WARNING with the corresponding message and the patch that must be applied to bring the resource to a synchronized state. This patch is also recorded in the special werf.io/repair-patch annotation. It is assumed that the user will apply this patch by hand: werf will not apply it on its own.
Generating repair patches is a temporary measure that allows us to test the creation of patches based on the 3-way-merge principle in practice without applying them automatically. At the moment, this mode of operation is enabled by default.
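Working with a repair patch by hand might look roughly like this (a sketch with a hypothetical resource name; it assumes the werf.io/repair-patch annotation holds a patch in a form that kubectl patch accepts):

```shell
# read the repair patch recorded by werf on the resource
kubectl get deployment myapp \
  -o jsonpath='{.metadata.annotations.werf\.io/repair-patch}'

# after reviewing it, apply it manually (werf itself never applies it)
kubectl patch deployment myapp --patch "$(kubectl get deployment myapp \
  -o jsonpath='{.metadata.annotations.werf\.io/repair-patch}')"
```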
3-way-merge patch for new releases only
Starting December 1, 2019, alpha and beta versions of werf use full-fledged 3-way-merge patches by default to apply changes, but only for new Helm releases rolled out via werf. Existing releases will continue to use the 2-way-merge + repair patches approach.
This mode of operation can already be enabled explicitly by setting WERF_THREE_WAY_MERGE_MODE=onlyNewReleases.
Note: the feature appeared in werf over several releases: in the alpha channel it has been ready since v1.0.5-alpha.19, and in the beta channel since v1.0.4-beta.20.
3-way-merge patch for all releases
Starting December 15, 2019, alpha and beta versions of werf use full-fledged 3-way-merge patches by default to apply changes for all releases.
This mode of operation can already be enabled explicitly by setting WERF_THREE_WAY_MERGE_MODE=enabled.
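For example (a sketch; the deploy flags depend on your project and CI setup):

```shell
# enable full 3-way-merge patches for all releases handled by werf
export WERF_THREE_WAY_MERGE_MODE=enabled
werf deploy --env production
```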
What to do with autoscaling resources?
Kubernetes has 2 types of autoscaling: HPA (horizontal) and VPA (vertical).
The horizontal one automatically adjusts the number of replicas, while the vertical one adjusts the amount of resources. Both the number of replicas and the resource requirements are specified in the resource manifest (see spec.replicas, spec.containers[].resources.limits.cpu, spec.containers[].resources.limits.memory, and others).
The problem: if the user configures a resource in the chart so that it specifies concrete values for resources or replicas, and autoscalers are enabled for this resource, then on every deploy werf will reset these values to what is written in the chart manifest.
There are two solutions to this problem. To begin with, it is best to avoid explicitly specifying autoscaled values in the chart manifest. If for some reason this option does not suit you (for example, because it is convenient to set the initial resource limits and the number of replicas in the chart), werf offers the following annotations:
- werf.io/set-replicas-only-on-creation=true
- werf.io/set-resources-only-on-creation=true
If there is such an annotation, werf will not reset the corresponding values at each deployment, but only set them at the initial creation of the resource.
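For instance, a Deployment managed by an HPA might carry the annotation like this (a hypothetical snippet):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  annotations:
    # werf sets spec.replicas only when the resource is first created;
    # after that, the HPA remains free to manage the replica count
    werf.io/set-replicas-only-on-creation: "true"
spec:
  replicas: 2   # initial value only; not reset on subsequent deploys
```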
For more information, see the project documentation for HPA and VPA.
Disabling the use of 3-way-merge patches
The user can still prohibit the use of the new patches in werf by setting the environment variable WERF_THREE_WAY_MERGE_MODE=disabled. However, starting March 1, 2020, this prohibition will stop working and only 3-way-merge patches will be used.
Adoption of resources in werf
Mastering the 3-way-merge way of applying changes allowed us to immediately implement a feature such as adopting resources that already exist in the cluster into a Helm release.
Helm 2 has a problem: you cannot add a resource that already exists in the cluster to the chart manifests without re-creating that resource from scratch (see #6031, #3275). We taught werf to adopt existing resources into a release. To do this, you need to set an annotation on the current version of the resource in the running cluster (for example, using kubectl edit):
"werf.io/allow-adoption-by-release": RELEASE_NAME
Now the resource needs to be described in the chart, and on the next deploy by werf of the release with the corresponding name, the existing resource will be adopted into this release and will remain under its control. Moreover, in the process of adopting the resource into the release, werf will bring the current state of the resource in the running cluster to the state described in the chart, using the same 3-way-merge patches and the synchronized resource rule.
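Instead of opening an editor, the annotation can also be set with kubectl annotate (the namespace, resource, and release names below are hypothetical):

```shell
kubectl -n production annotate deployment myapp \
  werf.io/allow-adoption-by-release=myapp-production
```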
Note: the WERF_THREE_WAY_MERGE_MODE setting does not affect the adoption of resources; in the case of adoption, a 3-way-merge patch is always used.
Details can be found in the documentation.
Conclusions and Future Plans
I hope that after this article it has become clearer what 3-way-merge patches are and why the industry arrived at them. From the practical standpoint of the werf project's development, their implementation was another step towards improving Helm-like deployment. Now you can forget about the configuration synchronization problems that often occurred when using Helm 2. In addition, a useful new feature was added: adopting Kubernetes resources that have already been deployed into a Helm release.
There are still some problems and inconveniences in Helm-like deployment, such as the use of Go templates, and we will continue to address them.
Information on resource update methods and adoption can also be found on this documentation page.
Helm 3
The recently released new major version of Helm, v3, deserves a special mention: it also uses 3-way-merge patches and gets rid of Tiller. The new version of Helm requires migrating existing installations to convert them to the new release storage format.
Werf, for its part, has already eliminated the use of Tiller, switched to 3-way-merge, and added much more, while remaining compatible with existing Helm 2 installations (no migration scripts need to be run). Therefore, until werf switches to Helm 3, werf users do not miss out on the main advantages of Helm 3 over Helm 2 (they are also available in werf).
However, switching werf to the Helm 3 codebase is inevitable and will happen in the near future. Presumably it will happen in werf 1.1 or werf 1.2 (at the moment, the main version of werf is 1.0; for more details on werf's versioning scheme, see here). During this time, Helm 3 will have had time to stabilize.
PS
Read also in our blog:
- A series of notes on new features in werf:
- “Werf is our tool for CI/CD in Kubernetes (review and video report)”;
- “Building and deploying similar microservices with werf and GitLab CI”;
- “Introducing Helm 3”.