Data Engineer - The Sexiest Profession of the 21st Century

Recently, in a conversation with HR staff at a large company, I heard: “Every data engineer who comes to us for an interview wants to become a data scientist.” This surprised me a great deal at the time and, to be honest, I felt rather hurt on behalf of data engineers.



We (and not only we) have already published several pieces here about data engineers and their value to business, for example an interview with Nikolai Markov and “4 reasons to become a data engineer”, but that was a long time ago. Time passes, material accumulates, the world moves on, so there is something new to tell.



Perhaps we should first briefly recall what a data engineer's tasks are (give or take, of course, since each company may add something of its own, and some of the items below may be handled by other employees):



- building stable pipelines that make data available to all users within the company;

- collecting, cleaning and preprocessing data as part of an ETL or ELT process;

- working with DBAs to build data warehouses;

- using frameworks and microservices to serve data;

- monitoring data quality;

- deploying models to production.
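
To make the first few items on this list more concrete, here is a minimal sketch of an ETL step with a simple data quality check. Everything in it is an assumption made for illustration: the sample data, the column names, the `payments` table and the function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns are invented for this sketch.
RAW_CSV = """user_id,country,amount
1,DE,10.50
2,us,
3,RU,7.25
3,RU,7.25
"""


def extract(text):
    """Extract: read raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount, normalize country
    codes to upper case and remove exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # missing amount: the row cannot be used
        record = (int(row["user_id"]), row["country"].upper(), float(row["amount"]))
        if record in seen:
            continue  # exact duplicate of a row already kept
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(records, conn):
    """Load: write cleaned records into a (hypothetical) payments table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(user_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def quality_report(raw_rows, records):
    """Data quality: count how many rows were dropped during cleaning."""
    return {
        "rows_in": len(raw_rows),
        "rows_out": len(records),
        "dropped": len(raw_rows) - len(records),
    }


def run_pipeline(text, conn):
    """Run extract, transform and load, then return the quality report."""
    raw = extract(text)
    records = transform(raw)
    load(records, conn)
    return quality_report(raw, records)
```

Calling `run_pipeline(RAW_CSV, sqlite3.connect(":memory:"))` loads the cleaned rows and returns the counts of rows read, kept and dropped. In a real pipeline such counts would typically feed a monitoring system rather than being returned to the caller.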



Let's look at the labor markets of the USA and Germany, and then talk with local data engineers in Moscow.



USA



At the end of last year, Dice studied data from Burning Glass's Nova platform, which analyzes open positions on the US labor market. It turned out that, relative to 2017, demand for data engineers grew by 96.7%, while demand for data scientists grew by 51.1%.



Demand for data engineers is, of course, huge. Right now, for example, Indeed lists more than 131,000 open positions in the United States alone, most of them in Seattle, San Francisco and New York. The overall salary range is shown in the chart below. It is important to note that data engineers' salaries in the USA are not lower than data scientists' salaries (despite the common belief that engineers earn less), although they do depend on, for example, the city: in New York an engineer can earn 132 thousand dollars, and in San Francisco as much as 151 thousand.



image



For comparison: on the same site, Indeed, only 12,000+ data scientist vacancies are listed, at the same pay level. In general, the numbers speak for themselves: data engineer has become the most sought-after and “sexiest” profession!



Germany



There is unmet demand for data engineers in Germany as well. In September we ran a corporate data engineering program for XING. When there are not enough people on the market and a company badly needs engineers, one solution is to develop and train its current employees. Martin Shtoev, Director of Engineering at XING, said that in recent years they have gradually retrained more than a dozen developers into data engineers. This was done mainly out of necessity, but also because these people bring important domain knowledge to the projects.



The first data engineers worked closely with the company's central DS team on specific projects, and this was a very organic addition. As the number of data engineers "grown" in-house increased, difficulties emerged: the central team invested a lot of time in training, and it took months to prepare several data engineers to work in different teams. By the time one group finished, the next was already on its way, but it was impossible to combine them, because everyone worked on different projects, so the training had to be done in small groups. Nor was it possible to simply place the new data engineers in project teams, because for many teams this was their first data engineer.



Most of the developers who decided to switch to data engineering were either junior or senior, and they all wanted to learn, so XING only had to provide books and tutorials and organize workshops. The workshops were run by both employees and external providers and covered core technologies such as Hadoop, Scala and Kafka. Over time, data engineers stopped being a novelty for the teams, and the central team worked less and less with the in-house-trained engineers on long-term projects. According to Martin, it takes about 6 months on average for a retrained engineer to start working independently on more complex tasks, and after another 6 months the company changes the employee's job title. Of course, the smaller the initial gap between the employee's skills and knowledge and the requirements for a data engineer, the faster the training goes.



Over these several years of training internal employees, XING made several observations:

- backend developers who have already worked with pipelines usually pick up new knowledge faster than, for example, front-end developers;

- data scientists retrain as data engineers with less success;

- attempts to retrain developers as data scientists were also unsuccessful, unless the person had a solid mathematical background and knowledge of scientific methods, or a very strong desire to learn all of this independently, because the gap in the required knowledge is too wide.



It seems to me that these are very important observations that can save any employer a lot of money and time, because there are not enough engineers on the Russian market either, and you will have to train your own employees anyway. And those data engineers who dream of becoming data scientists should bear in mind that it doesn't work that way, because a data engineer and a data scientist are two people with different mindsets.



Just recently, Alexey Grigoriev posted a Darwin Recruitment report on the Berlin market in Ocare #career. The quote that opens its data engineering section speaks of data engineering developing in an already mature market, and it confirms the shortage of local data engineers: “more and more companies in Germany are hiring employees from other countries, and these experienced data engineers bring excellent tools and technologies.” The agency cites a figure of about 51%: that is the share of candidates from other countries interviewing with its clients. So data engineering is a good, in-demand profession that can also get you a relocation. That is exactly how Newprolab graduate Nikolai Rekubratsky, whom we interviewed last year, moved to Hamburg for a data engineer position. As for salaries in Berlin, the agency gives a range of 55-70 thousand euros per year, though there may be differences between German cities in salary and in additional corporate and social benefits (in Hamburg, at least, we were told a lot of good things about this).



Russia



In Russia, people started writing about the data engineer profession only in 2017 (although by then real, live data engineers already existed, and good talks on data engineering could be heard at conferences), but so far everyone has heard mainly about the data scientist, and one gets the impression that everyone dreams only of “the sexiest profession of the 21st century.” Habr does not help either: in September 2017 it turned down my request to create a data engineering hub: “To initiate the consideration of an application for a new hub, you must specify links to at least 10 materials that are already posted on Habr's pages and can be attributed to the proposed hub.” I believe one can apply again now, since the condition has been met.



But since DS is all you hear about everywhere you turn, we end up hearing: “Every data engineer who comes to us for an interview wants to become a data scientist.”



Just at the moment this was said, our Data Engineer 5.0 program was coming to an end, and I decided to post the phrase in the group chat and ask our participants for their opinion. Here is the discussion that unfolded and the views that were expressed:



“We are currently hiring for our [data engineering] team. We have already interviewed 30 people, and almost all of them, without exception, want to be data scientists... It is really becoming hurtful for our field :(”



“Everyone wants a lot of money with as little strain as possible. And such specialists believe that DS are paid more than DE, although this is not so. The problem is that a DE needs to learn to build cool things using different technologies, and sometimes to write their own tool if nothing suitable exists, whereas for DS almost all the tools already exist, and for the most part they are the same for different problems (Python / R + libraries with various implementations of ML and neural networks). In general, the entry threshold is now lower for DS than for DE, and that kind of work is much easier thanks to ready-made tools. I think this is a matter of psychology: everyone wants recognition and to be in the spotlight, and in the Big Data stack it is DS who do all the magic. DE act as assistants... Here are a couple of analogies I have noticed:

1) Computer games, for example: no one wants to play support, but everyone wants to be the carry / DD.

2) Or football: everyone wants to be a forward, and few want to be defenders.

Talking to DEs, I have often heard things like: nobody appreciates me on the project, they treat me like a porter, fetch the data from here, bring it there... One good thing is that a rethinking is now under way, and in many teams DEs are starting to be respected and loved. Where I work, for example, everything is great in this regard: as a DE I am treated well by everyone, I help the DS, they help me, and so we live in symbiosis.”



“I would like to see how DS would live without DE) In general, Big Data cannot live without DE, while without DS it gets by well enough. Just don't throw rotten tomatoes at me.”



But there was an alternative opinion: “DE is definitely not an entry-level stage. But, unfortunately, everyone knows only about DS, and a lot of materials and courses have been published about it. That is what people learn. And there are few courses for DE. You have to learn anything and everything, depending on the projects. Unfortunately, the world is moving towards containers. And YARN will probably often be run on Kubernetes. And all because of DS. It is easy for them to spin up a container and go. My point is that everything is heading towards simpler integration and rollout to production, which shrinks the DE zone. #dyingout”



“What I see: there is a logical race for strategic positioning. Advanced DSs are no worse at engineering than DEs and can, or want to, lay claim to production rollout in order to reduce t2m [time to market], while DEs, in turn, can push into the ML zone through AutoML and enter DS territory. If I had to choose between the two, the second is closer to me, of course. On the whole, I think those who try to go beyond their functional boundaries will win, because the drive toward functional separation is typical of a process-oriented approach, but in the long run seamlessness wins.”



One of the program's speakers also joined the discussion: “For me it is the opposite: as a DS, I am often drawn to engineering tasks. For me, moving to DE means a significant drop in grade. There was a period when I tried to get a DE position at Amazon. I even flew to Luxembourg for a 6-hour interview, but I was turned down with the words ‘you are not a DE, you are a DS’.”



I would draw your attention to the wording of that refusal: it confirms once again that these are different people. So if your strengths include a systematic approach, an engineering mindset, and the ability to get to grips with new technologies and documentation, write good code and design stable solutions, then keep developing and become a competent data engineer. Look for a team and a company, in Russia or abroad, where you can realize your full potential as a data engineer, and do not try to become someone else.



And if there are data engineers among you who have something to write and talk about, let's be friends and move data engineering forward :)


