Rust for web developers: a quick start and a fast flight

Hello! Today I want to share my experience of learning a language and quickly implementing a high-load network project using the now so popular non-blocking asynchronous network connections, in the new, beautiful, elegant and very effective Rust language.

In this post I will put special emphasis on a quick and clear explanation of the language's and platform's capabilities for specialists with extensive web development experience, because I am one myself. There is a misconception that the learning curve for Rust is very, very steep. But I will show that this is far from the case. Pour yourself some coffee and let's go!



A Brief History of Programming Values



So that the material settles well in the head and heart, it is worth briefly recalling what people have wanted from programming over the past 50 years and what they ended up with. No offense intended: this is just a personal, subjective opinion and a bit of flame bait, backed by 20 years of development experience.



Low-Level Languages: C, C++



Of course, you can write a program directly in machine code, and many did so on the ZX Spectrum, BK0010-01 and the PC; the code turns out very fast :-) But we are human: large amounts of information do not fit in our heads and we get distracted, so even the invention of assembler did not help much. Code at such a low level is written very rarely and very carefully, and unless you are developing drivers, microcontrollers or tricky embedded systems, it will most likely never come in handy.





In the early 70s, Bell Labs invented the C language, which took root thanks to its laconic syntax and very "cheap" abstractions, almost becoming a "portable assembler". Of course, if you take monastic vows, write C every night for 10 years, eat no meat, pray, and avoid the distractions of social networks, you can write very useful and fast programs, as GNU eloquently testifies, along with excellent, performant games, the beloved (if hard to replace) Windows, and many more examples.

But the flip side of the coin constantly makes itself felt: security holes open up regularly (an entire industry has grown up around "holes in software"), caused by holes in the very concept of the C language. The compiler is like an irresponsible bull: incredible power and a very short memory. Any negligence, and you can not only crash the program (null pointer dereference, double free, out-of-bounds array access) but also irreparably corrupt data and not notice for a long time, until clients start calling and it is too late ("undefined behavior", which differs from compiler to compiler).



Bjarne Stroustrup only confused the situation further in the early eighties by adding OOP capabilities to C. Despite its great popularity, C++ is, on the whole, seen as a series of programming experiments with varying outcomes, some of them lethal. Sometimes it even seems that C++ had no coherent purpose from the very beginning, or gradually lost it, giving rise to a heap of objectively complicated and conflicting concepts that multiply with each new standard. Despite the excellent goal of "zero-cost abstractions" that let you get fast code, creating a reliable solution in C++, as in C, requires the following conditions:





Clearly, meeting these requirements, especially given the growing business need for "code that works without sudden surprises", is very expensive. Code in such projects takes a long time to write and must be tested long and carefully; and yet, before the invention of Rust, it was sometimes genuinely hard to get by without C/C++.



Once again, the summary for C/C++: we have a powerful but "irresponsible" compiler with "leaky abstractions" that helps the developer very little. As a result, all the problems land on the programmer's shoulders. If even one programmer on the team is inexperienced, not very careful, or does not know all the subtleties of the compiler (in reality nobody knows them all; users discover them later), expect trouble. On the other hand, the program runs fast and is probably correct :-) This, of course, spawned a whole market of "crutches": static analyzers, which, as it turned out, the customer has to pay for. The question arises: couldn't someone write a stricter, safer compiler that helps the developer and produces programs without surprises and low-level security holes?



Java



The situation with frankly weak control over "undefined behavior" and the very high demands placed on C/C++ developers gave rise to a desire to create a safe development environment, including for the Internet, accessible to almost anyone. Thus, in the mid-90s, Java appeared.

In principle, anyone with any level of training could now write just about anything, and IT WORKED without crashing: there were no low-level security "holes" or spontaneous crashes (almost; the ones that remained were caused by bugs in the virtual machine and were fixed centrally). Only logical "holes" or hidden slow code remained (born of ignorance of algorithms and algorithmic complexity, showing up as slowdowns when data volumes grew), which was not nearly as scary and let you build programs quickly; if necessary, small chunks could be rewritten by a qualified C/C++ team.

The interesting and valuable things Java brought to the world include the following:





Of course, the emergence of such a friendly platform made it possible to write many useful and useless programs, which business immediately took advantage of. And Java's support for backward compatibility over 15 years or more made the technology hugely popular in the enterprise world.



Yes, analogies with non-alcoholic beer come to my mind too.



However, the following problems surfaced in Java, though far from immediately:





After reading this, it should be clear that not everything can be written in Java: development is rather slow, the result periodically slows down and hangs for a second under load, and it sometimes "eats" a lot of RAM (usually at least half the RAM on the server). You cannot make performant games in it either. But some pieces of business logic that are not particularly demanding of performance and RAM, especially multi-threaded ones, are quite feasible and useful, which is why the technology is so popular in the enterprise world. We ourselves regularly and actively use Java for the company's services.



C#, Scala, Kotlin



"And when do we get to Rust?" Wait a minute: you need a little more preparation before eating this sweet persimmon. Without covering the features of the other technologies, you will not understand why Rust appeared and why it is so popular and in demand.



So, trying to make Java better and more progressive, Microsoft, in the person of the author of Turbo Pascal and Delphi (Anders Hejlsberg), came up with C# and the .NET concept in the early 2000s. Objectively, according to many reputable experts and to developers' smoking-room chatter, C# is sexier than Java, although, of course, despite Mono it is still strongly tied to Microsoft, with everything that follows from that :-)



Scala is undoubtedly a big step forward as a language, though it turned out scientifically abstruse and sophisticated, having taken a lot of useful and useless things from the world of functional programming. Yet something is not quite working out with its popularity. However, Apache Spark is really good and popular, no question.



Kotlin is popular because it makes Java more efficient and easier for novice developers, especially on mobile platforms, who have no time to study programming seriously :-)



But the main problem of C# (on the .NET platform), as well as of Scala, Kotlin and the other JVM languages, remains: the garbage collector, Carl! It creates a noticeable load on the server, stops code execution for whole seconds under load, and has gluttonous RAM requirements. However many more languages with a garbage collector and JIT compilation people invent, these problems will remain, and so far there is no hope of solving them, even in theory.



Scripting



"But what about PHP?". Yes, now let's talk about scripting. It would seem, why is he? Around the end of the 80s, it became obvious that if you need to quickly solve a problem using code, not necessarily super-fast, and it’s safe, without undefined behavior and low-level holes, you can write a script. Scripts were written before, in bash, perl, awk, but it turned out that in python you can write large, large scripts, especially scientific ones, and for years!



Lua found its niche in gamedev and machine learning (Torch); JavaScript in web development, both in the browser and on the server (did you know, by the way, that the authorization service in the npm infrastructure for Node.js was rewritten from Node.js and Golang to Rust?); Python in machine learning and data analysis, as well as in system scripting. And PHP copes perfectly with server-side tasks in web development.





The advantages of scripting are obvious:





However, scripting has potential disadvantages you may not be aware of:





However, since in most business applications, including site management systems, CRM, etc., most of the execution time is spent on database queries, the JIT advantage of Java/C# is largely leveled out by the speed of writing solutions in PHP/Python/JavaScript. Personally, to create a web application, I would rather write 10-20 lines in PHP than 10,000 lines and a heap of plumbing in Java/Spring. And PHP 7 has somehow been tuned so well that it runs faster than Python 3 ;-)



What conclusion can be drawn here? You do not have to solve every problem, as is now fashionable, with scripting alone: in some cases it is reasonable to take another, more suitable tool, if there are good reasons for it:





In our company we often practice this approach:



Therefore, as a rule, it is better to start with scripting, put it into production, and then, if you really need to, pick up other tools; this usually goes very smoothly and without risk.



Functional Programming: Lisp, Haskell, F#



Many developers never come here in their entire careers, and in vain. It is extremely useful to understand why FP (functional programming) appeared and why it is so popular in certain areas.



I will explain it simply. There is an unsolvable problem in mathematics called the "halting problem". Putting it in very simple, rather loose but clear terms: you cannot write a general algorithm that proves an arbitrary program will work without bugs. Why does this matter? Because from the start, people programmed aggressively and imperatively, using:



And they began to make mistakes. We see this today in the huge number of bugs on the web and in desktop and mobile applications. And no matter how thoroughly you cover the code with autotests, bugs keep leaking through, scattering across the floor and giggling.

To stop this nightmarish advance of buggy and merciless software, the functional programming movement and the Lisp language appeared in the late 50s. Today this family of languages is represented, more or less adequately, by Haskell.



Even though a huge number of possible errors in code really are eliminated thanks to:





Haskell frankly did not take off, due to its aggressive "showing off": to understand its terminology well, you need to be at least a PhD candidate specializing in one of the leading theories of modern mathematics, category theory. And Haskell clearly overdid it with "laziness": either the evaluation graph fills up memory, or evaluation starts and memory runs out later, which often painfully affects its use in production. That is why soldiers in battle on enemy territory drop Haskell and pick up a Kalashnikov.



And let us not forget the garbage collector: Haskell, unfortunately, has one too, which, in terms of efficiency and performance, puts it on a par with competitors like Java/C#.



But this does not mean the language is not worth learning. Haskell develops, above all, the brains a programmer needs to write clear, effective and easily maintained programs over the years. It is well known that besides scripting it is extremely useful to know at least one "real" programming language, and Haskell is a very worthy candidate here.



The New C/C++: Golang, D



Even at the beginning of the 2000s, system programming was done mainly in C/C++ (though one cannot fail to mention the famous Forth in this context). Linus Torvalds' opinion of C++ has not lost its relevance, then or now, despite the attempts of the great and terrible Andrei Alexandrescu to change the situation with D (which has a garbage collector: how could they step on that rake again?).



However, people apparently got tired of the constant systemic sado-masochism of writing C/C++ in knightly armor, with the RAII paradigm and a heap of restrictions and prohibitions bolted onto the language and libraries. And at the end of 2009, the Golang language appeared in the bowels of Google.

To be honest and frank, Golang brought nothing new to the world of system programming; rather, it is a step back and a somersault sideways. Golang looks like a Java/C# cut down in all directions to the minimum (OOP is trimmed and simplified beyond recognition), and yes, with a garbage collector... The only consolations Golang offers:



And the situation with the package manager is very muddy; even Node.js and Python have one.



Golang may have been created as part of the competition between large corporations, Google vs Sun/Oracle, for developers' hearts, but we will probably never know :-) Obviously, a "greatly simplified Java/C#" attracts, and is already attracting, crowds of fans to solving system problems; whether the industry will benefit from this remains to be seen. Docker, written in Golang, has already appeared and turned the world in the right direction. The objective benefit of Golang is its low barrier to entry: if you have no time to learn Java/C# but need to solve a system problem, it is just the thing.





Of course, Swift looks interesting against this background, with its "more advanced" automatic memory management (reference counting rather than a tracing GC) and fresh ideas. But not everyone develops for macOS.



There is a way out - Rust!



We are adults; we can read between the lines and see that over the past 40-50 years people have constantly tried to create a fast, safe, rigorous language for system or near-system programming, preferably with zero-cost abstractions, and it never quite worked out :-) Either the language is relatively strict but frankly slow at times (Java/C#), or it is very fast but hopelessly full of holes (C/C++), or it is somewhere in between but with a ball and chain on its left leg: the garbage collector. So, is it possible to think hard and write a compiler that CAN DO ALL of the above SIMULTANEOUSLY?



It turns out you can. A miracle, and probably the best thing to happen in the last 50 years in the field of programming tools, occurred in mid-2010 at Mozilla Research. Colleagues created a compiler with the following features:





I am sure you felt a shock, as I once did. How can this be? It's impossible! How did they connect the unconnectable, something nobody had managed for the past 50 years? It turned out that they "simply" generalized the mistakes of previous languages, wrote a really competent and pessimistic compiler, and brought into everyday use some unique concepts not previously encountered in programming.



Rust can be thought of as a "lightweight" but, in its essence, very rigorous and powerful Haskell for system programming.

Rust also turned out to be a safer language than, mind you, Java :-) There are no nulls; there are algebraic data types with pattern matching over them, strict and powerful generics and traits, and genuinely safe multi-threaded programming. Nobody expected this. And the code turns out faster, eats fewer resources, and contains fewer errors, thanks to the very strict, "Haskell-like" compiler and the absence of a GC.
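To make the "no nulls" point concrete, here is a minimal sketch (the `find_port` function and its input format are invented for illustration): absence of a value is an explicit `Option`, and pattern matching forces you to handle both cases.

```rust
// A small sketch of "no nulls": absence is an explicit Option value,
// and the compiler forces you to handle both cases.
fn find_port(config: &str) -> Option<u32> {
    // hypothetical parsing: look for a "port=" prefix
    config.strip_prefix("port=")?.parse().ok()
}

fn main() {
    // Pattern matching makes the missing case impossible to forget.
    match find_port("port=8080") {
        Some(p) => println!("listening on {}", p),
        None => println!("no port configured"),
    }
    assert_eq!(find_port("port=8080"), Some(8080));
    assert_eq!(find_port("host=example"), None);
}
```

There is no way to "forget the null check": the compiler rejects code that uses the value without unpacking the `Option` first.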



Now briefly about everything in order.



How did they manage to do without a garbage collector?



Like this! Memory in the compiler came to be treated as a resource that is "owned" by one and only one variable. When the variable goes out of scope (more precisely, when its "lifetime" ends; the details are in the Rust Book, but it is all intuitive), the compiler inserts a call to the destructor (there may be no destructor, in which case nothing is called).

If a variable is passed to a function or method, ownership is transferred there, and at the end of the function the compiler will call the destructor. If a variable is returned from a function or method, ownership returns with it to the higher level, and the destructor is called there.

In this simple way we got rid, once and for all, of the terrible murkiness in C++ around copy constructors and move semantics. If the description seems complicated, try writing Rust code with destructors and you will see that everything is very strict and logical.
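The rules above can be sketched in a few lines (the `Resource` type is invented for illustration): `Drop` is the destructor trait, and the compiler inserts the calls exactly where ownership ends.

```rust
// A minimal sketch of ownership and destructors.
struct Resource {
    name: String,
}

// Drop is Rust's destructor trait: the compiler inserts the call automatically.
impl Drop for Resource {
    fn drop(&mut self) {
        println!("freeing {}", self.name);
    }
}

fn consume(r: Resource) {
    println!("using {}", r.name);
} // ownership of `r` ends at this brace, so its destructor runs here

fn main() {
    let a = Resource { name: String::from("a") };
    consume(a); // ownership moves into `consume`
    // println!("{}", a.name); // would not compile: `a` was moved away

    let _b = Resource { name: String::from("b") };
} // `_b` goes out of scope here and its destructor runs
```

Uncommenting the marked line produces a compile-time "use of moved value" error, which is exactly the point: mistakes that are runtime crashes in C++ are rejected before the program exists.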



How did they achieve zero-cost abstractions?



There is no heavyweight OOP with inheritance in the language, but there are powerful and flexible traits, as well as encapsulation tools. In this, of course, the language resembles Golang. Everything revolves around structures and methods on them, but thanks to algebraic data types and pattern matching the code stays accurate and rigorous (which, unfortunately, they forgot to add to Golang). You can create read references and a write reference to a structure or its data, but only one kind at a time: either one write reference or many read references. In the language's terminology this is called borrowing. It reduces the need to copy objects and also rules out data races (thanks to the "read-write lock" discipline that borrowing enforces at compile time). All of this, mind you, is strictly checked by the compiler: if the code does not compile, you must fix it and prove to the pessimistic compiler that the errors are gone.
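The borrowing rule is easier to see than to describe; a minimal sketch:

```rust
// A sketch of the borrowing rules: many shared (&) reads OR one exclusive (&mut) write.
fn main() {
    let mut scores = vec![10, 20, 30];

    // Many simultaneous read-only borrows are fine.
    let first = &scores[0];
    let last = &scores[2];
    println!("{} {}", first, last);

    // Once the read borrows are no longer used, one mutable borrow is allowed.
    let writer = &mut scores;
    writer.push(40);

    // let reader = &scores; writer.push(50);
    // ^ uncommenting this pair would not compile:
    //   cannot borrow `scores` immutably while it is borrowed mutably.

    assert_eq!(scores, vec![10, 20, 30, 40]);
}
```

The commented-out pair is exactly a would-be data race in disguise, and the compiler rejects it before it can exist.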

Low-level slice types are used for working with strings and other operations, which of course serves speed. There is a nuance, though: strings are arranged "a little intricately", as if they appeared in the language against the developers' will through a mutation of the compiler; but they work, work quickly and predictably, and that is all there is to say about strings for now.
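A short sketch of that "little intricately": there are two string types, the owning `String` and the borrowed `&str` slice, and both are UTF-8 byte sequences under the hood.

```rust
// A sketch of Rust's two string types: the owning String and the borrowed &str slice.
fn main() {
    let owned: String = String::from("hello, rust");

    // A &str is a cheap view into existing bytes; nothing is copied.
    let slice: &str = &owned[0..5];
    assert_eq!(slice, "hello");

    // Strings are UTF-8, so indexing is by byte, not by character;
    // slicing in the middle of a multi-byte character would panic at runtime.
    let smile = "héllo";
    assert_eq!(smile.chars().count(), 5); // characters
    assert_eq!(smile.len(), 6);           // bytes: 'é' takes two
}
```

The byte/character distinction is the main source of the "intricacy", but it is also why string operations stay fast and predictable.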



Safe programming and functional style



The language is nicely balanced with functional programming capabilities:





In general, it is clear that all the complex and contradictory concerns of system and multi-threaded programming (freeing memory, data races) were shifted onto a system of unique affine types with strict compiler guarantees, and it... works. A strict type system and harsh logical algebra, combined with a smart compiler, give you guarantees of safe multi-threaded code that runs at C/C++ speed and consumes just as few resources. Initially, nobody believed this was possible.
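The functional side mentioned above can be sketched with closures and iterator chains, which the compiler optimizes down to a plain loop (the zero-cost abstraction at work):

```rust
// A sketch of the functional style: lazy iterator chains with closures.
fn main() {
    let readings = vec![3, -1, 4, -1, 5, 9, -2, 6];

    // filter/map/sum build a lazy pipeline; nothing runs until `sum` drives it.
    let total: i32 = readings
        .iter()
        .filter(|&&x| x > 0)   // keep positive values
        .map(|&x| x * 10)      // transform each
        .sum();                // consume the iterator

    assert_eq!(total, 270);
}
```

Despite looking like Haskell-style combinators, this compiles to code with no intermediate allocations.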



Multithreaded programming



Thanks to the ownership and borrowing capabilities unique to the language and built into the compiler, there are no problems with the order of taking and releasing locks, and multi-threaded programming turns out to be somehow easy and free of surprises. If something is wrong, the code simply does not compile. I will not describe the details of the Sync/Send traits here; you need to see them live, but nothing complicated or mystical happens: implement the logic, and if the compiler complains, fix it. That's all.
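A minimal sketch of what "it simply does not compile otherwise" means in practice: shared mutable state must be wrapped so the Send/Sync rules are satisfied, and then races are impossible by construction.

```rust
// Compiler-checked multithreading: a bare `&mut` shared across threads
// would not compile, so shared state goes behind Arc<Mutex<...>>.
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0u32));
    let mut handles = Vec::new();

    for _ in 0..8 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // The lock guard unlocks automatically when it goes out of scope.
            *counter.lock().unwrap() += 1;
        }));
    }
    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 8);
}
```

Try removing the `Mutex` and mutating the counter directly: the program will not compile, which is the whole point.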



If you want to implement a multi-threaded network service with non-blocking connections that handles tens of thousands of sockets simultaneously, almost imperceptibly for the processor, you can take a ready-made library and realize your boldest ideas about working with futures in a functionally strict style in 1-2 hours. You get the same thing that lives inside Node.js and Python's async/await, but with the guarantees of strict algebraic types; it runs much faster and consumes far fewer resources. And if the result is the same, why pay more?
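The futures libraries themselves evolve quickly, so instead of pinning one down, here is a scaled-down sketch of the same message-processing pattern using only standard-library channels and threads; a real non-blocking socket service would put a futures runtime in place of the channel, but the ownership guarantees are identical.

```rust
// A scaled-down sketch of the worker pattern: messages flow through a
// channel to a processing thread; ownership of each message moves with it.
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    let worker = thread::spawn(move || {
        let mut processed = Vec::new();
        for msg in rx {
            processed.push(msg.to_uppercase()); // "process" each message
        }
        processed
    });

    for word in ["alpha", "beta", "gamma"] {
        tx.send(word.to_string()).unwrap();
    }
    drop(tx); // closing the sender ends the worker's receive loop

    let result = worker.join().unwrap();
    assert_eq!(result, vec!["ALPHA", "BETA", "GAMMA"]);
}
```

Because a sent message is moved into the channel, no thread can accidentally keep touching data another thread now owns; the compiler enforces it.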



Built-in unit and integration tests



Surprisingly, they thought about this too: the ability to write unit and integration tests is built into the toolchain by default. You write tests right next to the code, and they run with a single command.
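A minimal sketch of what "built in" means: tests live in the same file behind a `#[cfg(test)]` attribute and run with `cargo test`, no external framework required.

```rust
// Tests live next to the code and run with `cargo test`.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    println!("2 + 3 = {}", add(2, 3));
}

#[cfg(test)]
mod tests {
    use super::add;

    #[test]
    fn adds_positive_numbers() {
        assert_eq!(add(2, 3), 5);
    }

    #[test]
    fn adds_negative_numbers() {
        assert_eq!(add(-2, -3), -5);
    }
}
```

The `#[cfg(test)]` module is compiled only for test builds, so the tests cost nothing in the release binary.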



Convenient package manager - cargo



To draw an analogy with Java: a "Maven named cargo" is built into the toolchain. You describe the dependencies, and the magic happens: the sources are downloaded and compiled, and everything works. It is also convenient, by the way, for digging through a dependency's source code, which sometimes helps a lot.



Heap and stack



By default, memory for structures and objects is allocated on the stack; however, it is very easy to allocate it on the heap instead and have it freed automatically. As already described above, memory is freed automatically when its owner (the referring variable) leaves its scope (more precisely, its "lifetime"). And if you really need them, smart pointers of various types are there too (hello, Swift). If what is written above seems complicated, it is not: write code that allocates memory on the heap and prints a message in the destructor, and everything will become clear. There are no tricks: the memory will always be freed, and it is not the programmer who checks this; the compiler guarantees it.
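Here is exactly that exercise as a sketch (the `Payload` type is invented for illustration): `Box` puts the value on the heap, and the destructor message shows where the compiler frees it.

```rust
// Heap allocation with Box: the value lives on the heap, the owning
// variable lives on the stack, and deallocation is automatic.
struct Payload {
    bytes: Vec<u8>,
}

impl Drop for Payload {
    fn drop(&mut self) {
        println!("heap memory for {} bytes released", self.bytes.len());
    }
}

fn main() {
    {
        let boxed: Box<Payload> = Box::new(Payload { bytes: vec![0u8; 1024] });
        println!("allocated {} bytes on the heap", boxed.bytes.len());
    } // `boxed` leaves scope here: the compiler inserts the deallocation

    println!("after the inner scope the memory is already freed");
}
```

Run it and the "released" line prints between the two others, exactly at the closing brace: no `free`, no GC pause, just a statically inserted destructor call.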



What may be hard to understand



In fact, it is clear that it really was possible to create a smart, balanced, very strict and fast language for safe system programming. Yes, it is still young (it has been stable for only about two years) and, they say, the library ecosystem has room to grow, although I found everything I needed: from high-speed non-blocking processing of network sockets, command-line argument parsing and connecting to Amazon Web Services, to serious cryptography with TLS/SSL, which had to be tweaked a little for the task, and that caused no problems.

However, some things, which fortunately come up very rarely in real tasks, will still require in-depth study. Among such features I would list explicit "lifetimes" and their use.

Another place that is apparently not yet very convenient is the small number of development environments. IntelliJ behaves quite well with the Rust plugin, but sometimes fails to cope with hints for complex types. Then again, you can write in Notepad++: the smart compiler with its guarantees will warn you about errors in the code anyway.



General Rust Development Principles









Let me stress once more: to master a technology, you need to start writing code in it that is useful to your company. The Rust compiler is so smart that it gives guarantees (this is genuinely important; the C++ compiler cannot do this) and refuses to compile dangerous, memory-corrupting code. So experiment as much as you want: you will get fast, safe code, and you will even become a better programmer :-)



Project Implementation Details



I will reveal a few details of the project. Amazon SQS pours in hundreds of data packets per second. The queue is read and parsed locally by workers; each message is processed by the broker and redirected to another, external server. There are several external servers. Initially the solution was implemented with scripting: a script, occupying an operating-system process, read messages, processed them and sent them over the network. On several powerful hardware servers (8 cores, 16 GB of RAM), hundreds of scripts were launched (on each!), simultaneously reading from SQS, processing and sending data. Simple and reliable, but the hardware appetite became worrying: the cost of servers kept growing.



The Rust version used mostly the standard library and modules from cargo:





Unfortunately, it was not possible to avoid "unsafe" blocks and "std::mem::transmute": the standard library offered no tools for parsing the binary data into trees.
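The project's actual parsing code is not shown here, but a hedged sketch of the kind of reinterpretation `transmute` allows looks like this; where the layout permits, safe helpers such as `u32::from_le_bytes` are preferable.

```rust
// A sketch of reinterpreting raw bytes with std::mem::transmute
// (illustrative only; not the project's real parsing code).
use std::mem;

fn main() {
    let raw: [u8; 4] = [0x01, 0x00, 0x00, 0x00];

    // Reinterpret four bytes as a u32; the caller vouches for the layout.
    let reinterpreted: u32 = unsafe { mem::transmute::<[u8; 4], u32>(raw) };

    // The safe alternative makes the byte order explicit.
    let safe_version = u32::from_le_bytes(raw);
    assert_eq!(safe_version, 1);

    // On a little-endian machine both give the same answer.
    if cfg!(target_endian = "little") {
        assert_eq!(reinterpreted, safe_version);
    }
}
```

Note that `transmute` only compiles when source and target types have the same size, so even this escape hatch is narrower than a C cast.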



The main snag, if you can call it that, happened at compilation: the libraries would not compile on CentOS 6 because of the "outdated assembler" in binutils, but there were no problems on CentOS 7.



The general impression is that development in Rust resembles "strict scripting" rather than system programming: it takes not much longer than scripting or web development, both in developer effort and in testing. And in return you get strict static compilation, no garbage collector, and algebraic data types.



The overall feeling is very positive. Instead of several hardware servers (8 cores, 16 GB of RAM each), the problem came to be solved by a single process (with dozens of threads) that eats no more than 5 GB of RAM and creates a barely noticeable load on the cores, with traffic in the region of 0.5-1 gigabit.



Conclusion



Well, that is the end of a long but, I hope, inspiring and useful post about an effective technology. Now you know another tool and can use it with more confidence when needed. We have reviewed the history of programming languages, their capabilities and features, and perhaps have drawn, or will draw, the right conclusions. Good luck with your projects, and a good, no, great mood!



PS:

* Yes, I almost forgot: of course, we need to say a few words about the unsafe block. In this block you can:

That is, in an "unsafe" block you cannot engage in the arbitrary debauchery available in C, only in strictly defined kinds of dangerous activity. So you can and should sleep peacefully :-)
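A minimal sketch of what one of those permitted activities looks like: even inside `unsafe`, only the marked operation (here, a raw-pointer dereference) is exempted, and the borrow checker stays fully active around it.

```rust
// A sketch of an `unsafe` block: only a few specific operations are
// permitted inside it (raw-pointer dereference, calling unsafe fns, etc.).
fn main() {
    let value: i32 = 42;
    let ptr: *const i32 = &value; // creating a raw pointer is safe...

    // ...but dereferencing it requires an explicit unsafe block,
    // where the programmer vouches that the pointer is valid.
    let read_back = unsafe { *ptr };

    assert_eq!(read_back, 42);
    println!("read {} through a raw pointer", read_back);
}
```

The `unsafe` keyword also acts as a grep-able marker: when something does go wrong with memory, you know exactly which few lines of the codebase to audit.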


