Simple and in C++. Userver basics: a framework for writing asynchronous microservices

Yandex.Taxi follows a microservice architecture. As the number of microservices grew, we noticed that developers were spending a lot of time on boilerplate and routine problems, and that the resulting solutions were not always optimal.



We decided to create our own framework, built on C++17 and coroutines. This is what typical microservice code looks like now:



    Response View::Handle(Request&& request, const Dependencies& dependencies) {
      auto cluster = dependencies.pg->GetCluster();
      auto trx = cluster->Begin(storages::postgres::ClusterHostType::kMaster);

      const char* statement = "SELECT ok, baz FROM some WHERE id = $1 LIMIT 1";
      auto row = psql::Execute(trx, statement, request.id)[0];
      if (!row["ok"].As<bool>()) {
        LOG_DEBUG() << request.id << " is not OK of " << GetSomeInfoFromDb();
        return Response400();
      }

      psql::Execute(trx, queries::kUpdateRules, request.foo, request.bar);
      trx.Commit();

      return Response200{row["baz"].As<std::string>()};
    }





Below we explain why this code is both efficient and fast.



Userver - asynchrony



Our team consists not only of seasoned C++ developers: there are trainees, junior developers, and even people who are not particularly used to writing C++. Therefore, the design of userver is centered on ease of use. However, with our data volumes and load, we also cannot afford to waste hardware resources.



Microservices spend much of their time waiting on input/output: the response of a microservice is often assembled from several responses of other microservices and databases. The problem of waiting for I/O efficiently is usually solved with asynchronous methods and callbacks: with asynchronous operations there is no need to spawn many threads, and therefore no large overhead from context switching between them. The catch is that such code is quite difficult to write and maintain:



    void View::Handle(Request&& request, const Dependencies& dependencies, Response response) {
      auto cluster = dependencies.pg->GetCluster();

      cluster->Begin(storages::postgres::ClusterHostType::kMaster,
        [request = std::move(request), response](auto& trx) {
          const char* statement = "SELECT ok, baz FROM some WHERE id = $1 LIMIT 1";
          psql::Execute(trx, statement, request.id,
            [request = std::move(request), response, trx = std::move(trx)](auto& res) {
              auto row = res[0];
              if (!row["ok"].As<bool>()) {
                if (LogDebug()) {
                  GetSomeInfoFromDb([id = request.id](auto info) {
                    LOG_DEBUG() << id << " is not OK of " << info;
                  });
                }
                *response = Response400{};
                return;  // do not fall through to the update below
              }
              psql::Execute(trx, queries::kUpdateRules, request.foo, request.bar,
                [row = std::move(row), trx = std::move(trx), response]() {
                  trx.Commit([row = std::move(row), response]() {
                    *response = Response200{row["baz"].As<std::string>()};
                  });
                });
            });
        });
    }





This is where stackful coroutines come to the rescue. The framework user writes what looks like ordinary synchronous code:



  auto row = psql::Execute(trx, queries::kGetRules, request.id)[0];
      
      





However, approximately the following occurs under the hood:



  1. TCP packets with the request are generated and sent to the database;
  2. execution of the coroutine in which the View::Handle function is currently running is suspended;
  3. we tell the OS kernel: "Put the suspended coroutine into the queue of tasks ready for execution as soon as enough TCP packets arrive from the database";
  4. without waiting for the previous step to complete, we take another ready coroutine from the queue and run it.
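The steps above can be illustrated with a toy single-threaded scheduler. This is only a sketch under stated assumptions, not how userver is implemented: it uses the POSIX `ucontext` API for stackful context switches, all names (`toy::Spawn`, `toy::WaitForIo`, `toy::RunAll`, `HandlerA`, `HandlerB`) are hypothetical, and the "wait for the database" is simulated by putting the coroutine straight back into the ready queue instead of registering it with the kernel.

```cpp
#include <ucontext.h>

#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy scheduler for stackful coroutines. Hypothetical names, illustration only.
namespace toy {

constexpr std::size_t kStackSize = 64 * 1024;

struct Coroutine {
  ucontext_t context{};
  std::vector<char> stack = std::vector<char>(kStackSize);  // its own stack
  void (*body)() = nullptr;
};

ucontext_t scheduler_context;        // where control returns on suspend
Coroutine* current = nullptr;        // coroutine running right now
std::deque<Coroutine*> ready_queue;  // coroutines ready to run
std::vector<std::string> trace;      // records the order of events

// Steps 2-4: suspend the current coroutine and switch to the scheduler.
// A real framework would re-queue the coroutine when the socket becomes
// readable; this sketch re-queues it immediately.
void WaitForIo() {
  assert(current != nullptr);
  Coroutine* self = current;
  ready_queue.push_back(self);
  swapcontext(&self->context, &scheduler_context);
}

// Entry point of every coroutine; on return, uc_link resumes the scheduler.
void Trampoline() { current->body(); }

void Spawn(Coroutine& coro, void (*body)()) {
  coro.body = body;
  getcontext(&coro.context);
  coro.context.uc_stack.ss_sp = coro.stack.data();
  coro.context.uc_stack.ss_size = coro.stack.size();
  coro.context.uc_link = &scheduler_context;
  makecontext(&coro.context, Trampoline, 0);
  ready_queue.push_back(&coro);
}

void RunAll() {
  while (!ready_queue.empty()) {
    current = ready_queue.front();
    ready_queue.pop_front();
    swapcontext(&scheduler_context, &current->context);
  }
  current = nullptr;
}

// Two "handlers" that look synchronous but yield while waiting.
void HandlerA() {
  trace.push_back("A: send query");  // step 1
  WaitForIo();
  trace.push_back("A: got rows");
}

void HandlerB() {
  trace.push_back("B: send query");
  WaitForIo();
  trace.push_back("B: got rows");
}

std::vector<std::string> RunDemo() {
  trace.clear();
  Coroutine a, b;
  Spawn(a, HandlerA);
  Spawn(b, HandlerB);
  RunAll();
  return trace;
}

}  // namespace toy
```

Calling `toy::RunDemo()` shows that handler B starts while handler A is still "waiting", even though neither handler contains any callbacks and everything runs on a single thread.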


In other words, the function from the first example works asynchronously and is close to the following code written with C++20 coroutines:



    Response View::Handle(Request&& request, const Dependencies& dependencies) {
      auto cluster = dependencies.pg->GetCluster();
      auto trx = co_await cluster->Begin(storages::postgres::ClusterHostType::kMaster);

      const char* statement = "SELECT ok, baz FROM some WHERE id = $1 LIMIT 1";
      // Parentheses are required: [0] binds tighter than co_await.
      auto row = (co_await psql::Execute(trx, statement, request.id))[0];
      if (!row["ok"].As<bool>()) {
        LOG_DEBUG() << request.id << " is not OK of " << co_await GetSomeInfoFromDb();
        co_return Response400{"NOT_OK", "Please provide different ID"};
      }

      co_await psql::Execute(trx, queries::kUpdateRules, request.foo, request.bar);
      co_await trx.Commit();

      co_return Response200{row["baz"].As<std::string>()};
    }





The difference is that with userver the user does not need to think about co_await and co_return: everything works "by itself".



In our framework, switching between coroutines is faster than calling std::this_thread::yield(). The entire microservice gets by with a very small number of threads.



At the moment, userver provides asynchronous drivers and primitives for:

* OS sockets;

* HTTP and HTTPS (client and server);

* PostgreSQL;

* MongoDB;

* Redis;

* working with files;

* timers;

* synchronizing and launching new coroutines.



This asynchronous approach to I/O-bound tasks should be familiar to Go developers. Unlike Go, however, we pay no memory or CPU overhead for a garbage collector. Developers can use a richer language, with a variety of containers and high-performance libraries, without suffering from the lack of const-correctness, RAII, or templates.



Userver - components



Of course, a full-fledged framework is more than just coroutines. The tasks that Taxi developers face are extremely diverse, and each requires its own set of tools. Therefore, userver has everything you need for:

* logging;

* caching;

* working with various data formats;

* working with configs and updating them without restarting the service;

* distributed locks;

* testing;

* authorization and authentication;

* creating and sending metrics;

* writing REST handlers;

* code generation and dependency support (moved to a separate part of the framework).



Userver - code generation



Let's go back to the first line of our example and see what lies behind Response and Request:



 Response Handle(Request&& request, const Dependencies& dependencies);
      
      





With userver you can write any microservice, but our own microservices are required to have a documented API (described with Swagger schemas).



For instance, the Swagger schema for the Handle example might look like this:



    paths:
      /some/sample/{bar}:
        post:
          description: |
            Habr.
          summary: |
            ...
          parameters:
            - in: query
              name: id
              type: string
              required: true
            - in: header
              name: foo
              type: string
              enum:
                - foo1
                - foo2
              required: true
            - in: path
              name: bar
              type: string
              required: true
          responses:
            '200':
              description: OK
              schema:
                type: object
                additionalProperties: false
                required:
                  - baz
                properties:
                  baz:
                    type: string
            '400':
              $ref: '#/responses/ResponseCommonError'





Since the developer already has a schema describing the requests and responses, why not generate the request and response types from it? The schema can also reference protobuf / flatbuffer / ... files: the generated code extracts everything from the incoming request, validates the input data against the schema, and distributes it into the fields of the Request structure. The developer only needs to write the logic in the Handle method, without being distracted by boilerplate for request parsing and response serialization.
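To make this more concrete, here is a rough sketch of what code generated from the schema above could look like. The article does not show the generator's real output, so everything here is an assumption made for illustration: the `sample` namespace, the `RawHttpRequest` stand-in for an already parsed HTTP request, and the `ParseFoo` and `ParseRequest` helpers are hypothetical names.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical shape of generated code; not the actual userver generator output.
namespace sample {

// The "foo" header is declared as an enum in the schema.
enum class Foo { kFoo1, kFoo2 };

// One field per parameter described in the schema.
struct Request {
  std::string id;   // in: query, required
  Foo foo;          // in: header, required, enum [foo1, foo2]
  std::string bar;  // in: path, required
};

using Params = std::map<std::string, std::string>;

// Stand-in for an HTTP request that has already been split into parts.
struct RawHttpRequest {
  Params query;
  Params headers;
  Params path_args;
};

// Maps the textual enum value to the generated enum, if it is allowed.
std::optional<Foo> ParseFoo(const std::string& value) {
  if (value == "foo1") return Foo::kFoo1;
  if (value == "foo2") return Foo::kFoo2;
  return std::nullopt;
}

// Validates the input against the schema and fills the Request fields.
// Returns std::nullopt when a required parameter is missing or the enum
// value is not allowed; a real service would answer with a 400 here.
std::optional<Request> ParseRequest(const RawHttpRequest& raw) {
  const auto id = raw.query.find("id");
  const auto foo = raw.headers.find("foo");
  const auto bar = raw.path_args.find("bar");
  if (id == raw.query.end() || foo == raw.headers.end() ||
      bar == raw.path_args.end()) {
    return std::nullopt;
  }

  const auto parsed_foo = ParseFoo(foo->second);
  if (!parsed_foo) return std::nullopt;

  return Request{id->second, *parsed_foo, bar->second};
}

}  // namespace sample
```

With code of this kind generated for every handler, Handle receives a `Request` whose fields are already typed and validated, so a handler that refers to a field missing from the schema simply does not compile.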



Code generation also works for the clients of a service. You can declare that your service needs a client for a given schema and get a ready-to-use class for making asynchronous requests:



    Request req;
    req.id = id;
    req.foo = foo;
    req.bar = bar;
    dependencies.sample_client.SomeSampleBarPost(req);





This approach has another advantage: the documentation is always up to date. If a developer tries to use parameters that are not in the documentation, they will get a compilation error.



Userver - logging



We love writing logs. Even if we log only the most important information, that adds up to several terabytes of logs per hour. So it is not surprising that our logging has a few tricks of its own:

* it is asynchronous (of course :-));

* we can log bypassing the slow std::locale and std::ostream;

* we can switch the logging level on the fly (without restarting the service);

* we do not execute user code if it is needed only for logging.



For example, during the normal operation of the microservice, the logging level will be set to INFO, and the whole expression



  LOG_DEBUG() << request.id << " is not OK of " << GetSomeInfoFromDb();
      
      





will not be evaluated. In particular, the call to the resource-intensive function GetSomeInfoFromDb() will not happen.



If the service suddenly starts to misbehave, the developer can always tell the running service: "Log at DEBUG level". From then on the "is not OK of" entries will start to appear in the logs, and the GetSomeInfoFromDb() function will be executed.
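A common way to get this behavior is an `if`/`else` macro around a temporary stream object. The sketch below is a minimal illustration of that general technique, not userver's actual logger: `toylog`, `TOY_LOG_DEBUG`, `LogLine` and `CountExpensiveCalls` are hypothetical names, and for brevity the sketch formats through `std::ostringstream` and writes synchronously, whereas the article says the real logger is asynchronous and can bypass `std::ostream`.

```cpp
#include <atomic>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical minimal logger; illustrates lazy evaluation only.
namespace toylog {

enum class Level { kDebug = 0, kInfo = 1 };

// Atomic, so the level can be switched while the service is running.
std::atomic<Level> current_level{Level::kInfo};

inline bool ShouldLog(Level level) {
  return level >= current_level.load(std::memory_order_relaxed);
}

// Collects one message and writes it out when the temporary is destroyed.
class LogLine {
 public:
  ~LogLine() { std::clog << stream_.str() << '\n'; }

  template <typename T>
  LogLine& operator<<(const T& value) {
    stream_ << value;
    return *this;
  }

 private:
  std::ostringstream stream_;
};

}  // namespace toylog

// The whole `<<` chain becomes the `else` branch, so when DEBUG is disabled
// none of its operands are evaluated.
#define TOY_LOG_DEBUG() \
  if (!toylog::ShouldLog(toylog::Level::kDebug)) { \
  } else \
    toylog::LogLine()

int expensive_calls = 0;

// Stand-in for a resource-intensive function.
std::string GetSomeInfoFromDb() {
  ++expensive_calls;
  return "some info";
}

// Logs one DEBUG line at the given level and reports how many times the
// expensive function was actually called.
int CountExpensiveCalls(toylog::Level level) {
  toylog::current_level = level;
  expensive_calls = 0;
  TOY_LOG_DEBUG() << 42 << " is not OK of " << GetSomeInfoFromDb();
  return expensive_calls;
}
```

Here `CountExpensiveCalls(toylog::Level::kInfo)` returns 0 because the expression is skipped entirely, while `CountExpensiveCalls(toylog::Level::kDebug)` returns 1 and writes the line to the log.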



Instead of a conclusion



It is impossible to cover all the features and tricks in a single article, so we started with a short introduction. Tell us in the comments which parts of userver you would like to read about.



We are currently considering whether to release the framework as open source. If we decide to do so, preparing the framework for release will require a lot of effort.


