How to cook porridge from microservices

One of the reasons microservices are popular is that they allow autonomous, independent development. In essence, a microservice architecture trades autonomous development for more complex (compared to a monolith) deployment, testing, debugging and monitoring. But keep in mind that microservices do not forgive mistakes in the separation of responsibilities. If responsibilities are divided incorrectly, dependent changes have to be made frequently in different services. This is much more painful and complicated than coordinated changes across different modules or packages inside a monolith, because coordinated changes in microservices also require coordinated rollout, deployment, testing, etc.

In this article I would like to talk about various patterns and antipatterns of dividing responsibilities among microservices.

Entity services as an antipattern

“Entity service” is one of the possible (anti)patterns of microservice architecture design. It leads to tightly coupled code across different services and loosely related code within each service.

To most developers it seems that by carving out services around the entities of the subject area (“deal”, “person”, “client”, “order”, “picture”) they are following the single responsibility principle, and this often looks logical. But the entity-service approach can turn into an antipattern. This happens because most features or changes affect several entities rather than one. As a result, each such service ends up combining the logic of different business processes.

For example, take an online store. Suppose we decided to carve out the services “product”, “order” and “client”.

What changes would have to be made, and in which services, to add home delivery?

For example, you can do this:

in the “order” service, add the delivery address, the desired delivery time and the courier
in the “client” service, add a list of the client’s favorite delivery addresses
in the “product” service, add a “list of goods” entity

For the courier’s interface, the “order” service will need a separate API method that returns the list of orders assigned to that particular courier. In addition, methods will be needed to remove goods from the order that did not fit or that the client refused at the time of delivery.

Or what changes would have to be made, and in which services, to add promotional code discounts?

At a minimum you need to:

add a promotional code to the “order” service
in the “product” service, record whether promotional code discounts apply to a given product
in the “client” service, add a list of the promotional codes issued to the client

In the manager’s interface, issuing a personalized promotional code to a client requires a separate method in the “client” service that is available only to store managers and not to the client himself. And the “product” service needs a method that returns the list of products covered by the promotional code, so that it is easier for the client to choose in his interface.

Changes to a service can originate from several business processes: selection and checkout, payment and billing, delivery. Each of these problem areas has its own constraints, invariants and requirements for the order. As a result, the “product” service ends up storing information about the product, about discounts and about stock levels in warehouses, while the “order” service holds the courier’s logic.

In other words, a change in business logic that is spread across several services leads to dependent changes in several services. At the same time, a single service contains code whose parts are unrelated to each other.

Storage Services

It may seem that this problem can be solved by creating a separate “layer” service on top of the entity services that encapsulates all the logic. But this usually ends badly too, because the entity services then become storage services: all business logic except storage is washed out of them.

If the data is stored in different databases on different machines, then:

we lose performance, because data is served through a service layer rather than directly from the database
we lose query flexibility, because a service API is usually much less flexible than SQL or any other query language
we lose the ability to combine data easily, because it is difficult to join data from different services

If, instead, different entity services have access to each other’s databases, communication between services happens implicitly, through a shared database. Any change that affects the data schema can then be made only after checking that it will not break any of the other services that use this database or table.

Besides being hard to develop, such services become overly critical and heavily loaded: almost every request to a top-level service results in several requests to different entity services. This makes editing them even harder, because they have to satisfy increased reliability and performance requirements.

Because of these difficulties with developing and maintaining them, entity services are rarely seen in their pure form. Usually they turn into one or two central “microservice-monoliths”, which change often and contain the main business logic, plus a scattering of small microservices, usually infrastructural, that rarely change.

Separation by problem areas

Changes do not appear by themselves; they come from some problem area. A problem area is an area of tasks within which the problems that require code changes are formulated in one language, use one set of concepts, or are interconnected by business logic. Accordingly, within one problem area there will most likely be a single set of constraints and invariants that you can rely on when writing code.

Dividing the responsibilities of services by problem area rather than by entity usually leads to a more maintainable and understandable architecture. Problem areas most often correspond to business processes. For the online store, the most likely problem areas are “payment and billing”, “delivery” and “order process”.

Changes that affect several problem areas at once are less common than changes that affect several entities.

In addition, services split by business process can be reused in the future. For example, if we wanted to start selling plane tickets alongside the online store, we could reuse the general “Billing and Payment” service instead of building another one that is similar but specific to ticket sales.

For example, we could split the system into services like this:

A “Delivery” service or group of services, which holds the logic for delivering a specific order, organizing the couriers’ work, assessing the quality of their work, the courier’s mobile application, etc.
A “Billing and Payment” service or group of services, which holds the logic for payments, invoices for legal entities, and generation of contracts and closing documents.
An “Order Process” service or group of services, which holds the logic of the customer’s product selection, the catalog, brands, the shopping basket, etc.
An “Authorization and Authentication” service.
It may even make sense to carve out a separate discount service.

To interact with each other, services can use an event model or exchange simple objects (RESTful API, gRPC, etc.). It is worth noting, though, that organizing the interaction between such services correctly is not easy. At a minimum, decentralized data brings problems with consistency (eventual consistency) and with transactionality (in cases where it matters).

Decentralized data and the exchange of simple objects have their pros, cons and pitfalls. On the one hand, decentralization makes it possible to develop and operate several services independently. On the other hand, there is the cost of storing two or three copies of the data and keeping them consistent across different systems.

In real life, something in between is common: an entity service with a minimal set of attributes that is used by all consumer services, plus a minimal layer of logic, for example a status model and events published to a queue to announce every change to the entity. Consumer services still quite often keep a “cache” of the data. Everything possible is done to keep changes to such a service to a minimum, which is hard in principle because it has so many consumers.

At the same time, it is important to understand that no partition, whether by entity or by problem area, is a silver bullet: there will always be features that require dependent changes in several services. It is just that one partition will produce far more such changes than another, and the task of the developers is to minimize the number of dependent changes.

An ideal split is possible only if you have two completely independent products. In any business everything is connected with everything; the only question is how strongly.

So the question comes down to the separation of responsibilities and the height of the abstraction barriers.

Service API Design

Designing interfaces within a service repeats the story of splitting into services, only on a smaller scale. Changing an interface (as opposed to merely extending it) is complex and time-consuming. In complex applications, an interface should be universal enough not to require constant changes, yet specific enough not to blur responsibility and semantics.

Therefore, service interfaces must be designed so that their semantics are resistant to change. This is possible if the semantics, or area of responsibility, of the interface rests on the constraints of the problem area.

CRUD interfaces for services with complex business logic

An interface that is too wide and nonspecific leads either to erosion of responsibility or to excessive complexity.

A typical example is a CRUD API for a service with complex business logic. Such interfaces do not encapsulate behavior. They not only allow business logic to leak into other services and erode the responsibility of the service, they also provoke the spread of business logic: the constraints, invariants and ways of working with the data now live in other services. The services that consume the interface (API) have to implement the logic themselves.

If we try to move the business logic into the service without changing the interface significantly, we end up with a method that is too universal and too complicated.

For example, there is a ticket service. A ticket can be of different types. Each type has a different set of fields and slightly different validation. The ticket also has a status model: a state machine for transitions from one status to another.

Let the API look like this: POST / PATCH / GET methods at the URL /api/v1/tickets/{ticket_id}.json

Then a ticket can be updated like this:

PATCH /api/v1/tickets/{ticket_id}.json { "type": "bug", "status": "closed", "description": "   " }

If the status model depends on the ticket type, conflicts of business logic are possible. Should the status first be changed according to the old status model and then the ticket type be changed? Or the other way around?
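The ambiguity can be made concrete with a minimal Python sketch. The status models, the `task` ticket type and the `apply_patch` helper are assumptions made up for illustration; they do not describe any real ticket system. The same PATCH body is accepted or rejected depending on which field the handler happens to apply first.

```python
# Hypothetical status models: allowed transitions per ticket type.
STATUS_MODEL = {
    "task": {"open": {"in_progress"}, "in_progress": {"closed"}},
    "bug": {"open": {"in_progress", "closed"}, "in_progress": {"closed"}},
}


def apply_patch(ticket: dict, patch: dict, type_first: bool) -> dict:
    """Apply a generic PATCH; the outcome depends on the field order."""
    result = dict(ticket)
    fields = ["type", "status"] if type_first else ["status", "type"]
    for field in fields:
        if field not in patch:
            continue
        if field == "status":
            # The status is validated against whatever type is current *now*.
            allowed = STATUS_MODEL[result["type"]].get(result["status"], set())
            if patch["status"] not in allowed:
                raise ValueError(
                    f"{result['type']}: {result['status']} -> "
                    f"{patch['status']} is not allowed"
                )
        result[field] = patch[field]
    return result


ticket = {"type": "task", "status": "open"}
patch = {"type": "bug", "status": "closed"}

# Type first: validated against the "bug" model, open -> closed is allowed.
print(apply_patch(ticket, patch, type_first=True))

# Status first: validated against the "task" model, open -> closed is rejected.
try:
    apply_patch(ticket, patch, type_first=False)
except ValueError as err:
    print("rejected:", err)
```

Neither order is obviously the right one, which is exactly why a single generic update method is a poor home for this logic.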

It turns out that a single API method will contain unrelated pieces of code: changing the entity’s fields, the list of available fields (which depends on the ticket type), and the status model. They change for different reasons, so it makes sense to spread them across different API methods and interfaces.

If changing a field through the API’s CRUD methods is not just a data change but an operation involving a coordinated change in the state of an entity, then this operation should be moved to a separate method, and the field should not be modifiable directly. Since changing an API without backward compatibility is very painful (especially for public APIs), it is better to think about this up front, when designing the API.

Therefore, to avoid such problems, it is better to make interfaces small, specific and as problem-oriented as possible, rather than universal and data-centric.

This (anti)pattern is more characteristic of RESTful interfaces, because by default they offer only a few data-centric “verbs”: create, delete, update, read. There are no business-specific operations on entities.
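As a rough illustration of what “small and problem-oriented” can mean for the ticket example, here is a Python sketch. The `TicketService` class, its method names, the status models and the rule that a migration is rejected when the current status does not exist in the target model are all assumptions chosen for the example, not a prescribed design.

```python
from dataclasses import dataclass

# Hypothetical status models per ticket type (illustrative only).
STATUS_MODEL = {
    "task": {"open": {"in_progress"}, "in_progress": {"closed"}, "closed": set()},
    "bug": {"open": {"in_progress", "closed"}, "in_progress": {"closed"}, "closed": set()},
}


@dataclass
class Ticket:
    type: str
    status: str = "open"
    description: str = ""


class TicketService:
    """One method per business operation; each has one reason to change."""

    def update_description(self, ticket: Ticket, description: str) -> None:
        # Plain data change: no invariants involved.
        ticket.description = description

    def change_status(self, ticket: Ticket, new_status: str) -> None:
        # State machine transition, validated against the ticket's own type.
        if new_status not in STATUS_MODEL[ticket.type][ticket.status]:
            raise ValueError(f"{ticket.status} -> {new_status} is not allowed")
        ticket.status = new_status

    def migrate(self, ticket: Ticket, new_type: str) -> None:
        # Assumed rule: the current status must exist in the target model.
        if new_type not in STATUS_MODEL:
            raise ValueError(f"unknown ticket type: {new_type}")
        if ticket.status not in STATUS_MODEL[new_type]:
            raise ValueError(f"status {ticket.status} has no place in {new_type}")
        ticket.type = new_type


service = TicketService()
ticket = Ticket(type="task")
service.migrate(ticket, "bug")          # explicit step 1
service.change_status(ticket, "closed")  # explicit step 2
print(ticket)
```

The caller now states the order of operations explicitly, so the service no longer has to guess it from the contents of a generic update.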

What can be done to make RESTful more problem-oriented?

First, you can add methods to entities. The interface becomes less RESTful, but the option exists. After all, we are not fighting for purity here, but solving practical problems.

Instead of the universal resource /api/v1/tickets.json

add more resources:

/api/v1/tickets/{ticket_id}/migrate.json

- migrate from one type to another

/api/v1/tickets/{ticket_id}/status.json

- if there is a status model

Secondly, within REST any operation can be represented as a resource. Is there an operation that migrates a ticket from one type to another (or from one project to another)? OK, then there will be a resource

/api/v1/tickets/migration.json

Is there a business operation to create a trial subscription?

/api/v1/subscriptions/trial.json

Is there a money transfer operation?

/api/v1/money_transfers.json

Etc.
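To show the idea of “an operation as a resource” without tying it to any web framework, here is a small in-memory Python sketch of the money transfer case. The `MoneyTransfers` class, its fields and the validation rules are hypothetical; `create` stands in for a POST to /api/v1/money_transfers.json and `get` for a GET of one created transfer.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class MoneyTransfer:
    id: int
    from_account: str
    to_account: str
    amount: int  # in minor units, e.g. cents


class MoneyTransfers:
    """In-memory stand-in for the money_transfers resource."""

    def __init__(self, balances: dict):
        self._balances = dict(balances)
        self._transfers = {}
        self._ids = itertools.count(1)

    def create(self, from_account: str, to_account: str, amount: int) -> MoneyTransfer:
        """POST: the business operation runs as part of creating the resource."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self._balances.get(from_account, 0) < amount:
            raise ValueError("insufficient funds")
        self._balances[from_account] -= amount
        self._balances[to_account] = self._balances.get(to_account, 0) + amount
        transfer = MoneyTransfer(next(self._ids), from_account, to_account, amount)
        self._transfers[transfer.id] = transfer
        return transfer

    def get(self, transfer_id: int) -> MoneyTransfer:
        """GET: a completed transfer can be read back like any other resource."""
        return self._transfers[transfer_id]

    def balance(self, account: str) -> int:
        return self._balances.get(account, 0)


transfers = MoneyTransfers({"alice": 1000, "bob": 0})
created = transfers.create("alice", "bob", 300)
print(transfers.get(created.id), transfers.balance("alice"), transfers.balance("bob"))
```

Clients never edit balances directly; the only way to move money is to create a transfer, so the invariants stay inside the service.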

The data-centric API antipattern applies to RPC interaction as well, for example when there are overly general methods such as editAccount() or editTicket(). “Modify an object” carries no semantic load related to the problem area. This means that the method will be called for many different reasons and will have to change for many different reasons.

It should be noted that data-centric interfaces are perfectly fine if the problem area involves only storing, retrieving and modifying data.

Event model

One way to decouple pieces of code is to organize the interaction between services through a message queue.

For example, if registering a user requires sending a welcome email, creating a request in the CRM for an account manager, etc., then it is logical not to call the external services directly but to have the registration service put a “user 123 is registered” message into the queue. All interested services will read this message and take the necessary actions, and changing that business logic will not require changing the registration service.
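The registration example can be sketched in Python as follows. This is a synchronous, in-process stand-in for a real message queue, used only to show the direction of dependencies; the `MessageBus` class, the `user_registered` event name and the two subscribers are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable


class MessageBus:
    """Toy stand-in for a message queue: delivers synchronously, in process."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        for handler in self._subscribers[event_name]:
            handler(payload)


bus = MessageBus()
sent_emails = []
crm_requests = []

# Consumers subscribe on their own; the registration code does not know them.
bus.subscribe("user_registered", lambda e: sent_emails.append(f"welcome:{e['user_id']}"))
bus.subscribe("user_registered", lambda e: crm_requests.append(e["user_id"]))


def register_user(user_id: int) -> None:
    # ... persist the user here ...
    bus.publish("user_registered", {"user_id": user_id})


register_user(123)
print(sent_emails, crm_requests)
```

Adding a third reaction to registration means adding one more `subscribe` call in the interested service; `register_user` stays untouched.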

Most often it is not just messages that are put into the queue, but events. Since the queue is only a transport, the same constraints apply to the data interface as to a regular synchronous interface. Therefore, to avoid problems with changing the interface and subsequent edits in other services, it is best to make events as problem-oriented as possible. Such events are often called domain events. At the same time, using the event model usually does not significantly affect where the boundaries between (micro)services are drawn.

Since domain events translate almost one-to-one into synchronous API methods, it is sometimes even suggested to use an event stream instead of API calls (Event Sourcing). From the stream of events you can always restore the state of objects and also get a history for free. In practice this approach is usually not very flexible: you need to keep supporting all events, and it is often easier to keep a history alongside a regular API.
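A minimal Python sketch of restoring state from an event stream, reusing the ticket example. The three event classes and the `replay` function are hypothetical names introduced for illustration; a real event-sourced system would also need persistence, versioning of event schemas and snapshots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketOpened:
    ticket_id: int
    ticket_type: str


@dataclass(frozen=True)
class TicketMigrated:
    ticket_id: int
    new_type: str


@dataclass(frozen=True)
class TicketStatusChanged:
    ticket_id: int
    status: str


def replay(events) -> dict:
    """Rebuild the current state of all tickets by folding over the events."""
    state = {}
    for event in events:
        if isinstance(event, TicketOpened):
            state[event.ticket_id] = {"type": event.ticket_type, "status": "open"}
        elif isinstance(event, TicketMigrated):
            state[event.ticket_id]["type"] = event.new_type
        elif isinstance(event, TicketStatusChanged):
            state[event.ticket_id]["status"] = event.status
        else:
            # Every event type ever emitted has to stay supported here.
            raise TypeError(f"unsupported event: {event!r}")
    return state


events = [
    TicketOpened(1, "task"),
    TicketMigrated(1, "bug"),
    TicketStatusChanged(1, "closed"),
]

print(replay(events))      # current state
print(replay(events[:1]))  # state as it was after the first event
```

The `else` branch hints at the cost mentioned above: old event types cannot simply be dropped, because the history would no longer replay.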

Microservices and performance. CQRS

In principle, a problem area implies code changes driven not only by functional business requirements but also by non-functional ones, such as performance. If two pieces of code have different performance requirements, it may make sense to separate them. They are usually split into separate services so that each can use the language and technology best suited to its task.

For example, a service written in PHP has a CPU-bound calculator method that performs complex calculations. As the load and the amount of data grew, it stopped coping. One of the obvious options is to do the calculations not in PHP code but in a separate high-performance system daemon.

One example of dividing services by performance is the separation into services that read and services that modify (CQRS). Such a separation is often proposed because the performance requirements for reading and for writing are different. The read load is often an order of magnitude higher than the write load, and the response time requirements for read requests are much stricter than for writes.

A client spends 99% of the time searching for goods and only 1% in the order process. For a client who is searching, display speed matters, as do features such as filters and various ways of displaying goods. Therefore, it makes sense to carve out a separate service responsible for searching, filtering and displaying goods. Such a service will most likely run on something like ELK, a document-oriented database with denormalized data.
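The read/write split can be sketched in Python like this. Both classes, their method names and the sample data are assumptions for illustration. In a real system the projection would usually be fed through a queue and be eventually consistent, and the read side would be a search engine or document store rather than a dictionary.

```python
from typing import Callable, Optional


class ProductSearch:
    """Read side: denormalized documents, optimized for filtering."""

    def __init__(self) -> None:
        self._docs = {}

    def project(self, doc: dict) -> None:
        # Upsert a ready-to-display document; no joins at query time.
        self._docs[doc["id"]] = doc

    def search(self, brand: Optional[str] = None, max_price: Optional[int] = None) -> list:
        return sorted(
            doc["title"]
            for doc in self._docs.values()
            if (brand is None or doc["brand"] == brand)
            and (max_price is None or doc["price"] <= max_price)
        )


class ProductCommands:
    """Write side: normalized data plus validation."""

    def __init__(self, on_change: Callable[[dict], None]) -> None:
        self._brands = {}
        self._products = {}
        self._on_change = on_change

    def add_brand(self, brand_id: int, name: str) -> None:
        self._brands[brand_id] = name

    def add_product(self, product_id: int, title: str, brand_id: int, price: int) -> None:
        if brand_id not in self._brands:
            raise ValueError("unknown brand")
        if price <= 0:
            raise ValueError("price must be positive")
        self._products[product_id] = {"title": title, "brand_id": brand_id, "price": price}
        # Hand a denormalized snapshot to the read side.
        self._on_change(
            {"id": product_id, "title": title, "brand": self._brands[brand_id], "price": price}
        )


search = ProductSearch()
commands = ProductCommands(on_change=search.project)
commands.add_brand(1, "Acme")
commands.add_product(10, "Kettle", brand_id=1, price=30)
commands.add_product(11, "Toaster", brand_id=1, price=55)
print(search.search(brand="Acme", max_price=40))  # ['Kettle']
```

The two sides can now be scaled, stored and even implemented differently, at the price of keeping the projection in sync.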

Obviously, a naive separation into reading and modifying services is not always a good idea.

For example, for a manager who maintains the product range, the main features are the ability to conveniently add, delete, change and view goods. There is not much load, so if we split reading and modification into separate services we gain nothing from the separation except problems whenever coordinated changes have to be made in both services.
