How I left basic science for a startup

Today marks exactly six months since, after twenty years in science, I left for a startup that develops software for generating synthetic data, where I integrate machine learning systems into the final product. Since there are people here who are still deciding whether to take such a step, I will describe my path, or rather the change in its direction.







Academia



I am a physicist by training and received my Ph.D. at the Physics Department of Moscow State University. I worked in both theoretical and experimental physics. I have papers in Nature and PRL, more than 700 citations, and an h-index of 14. I mention these details not to compete with anyone but to give the big picture. Counting only contracts lasting more than a year, I held such long-term positions in four Western European countries. All would have been fine, but in recent years science has started to turn into a business. People at universities think only about getting a permanent position. The constant search for research funding, that is, writing grant applications, began to take more time than doing science itself. If you do not have an ERC (European Research Council) project, no reasonably good European university will be interested in you, even if you have a stack of excellent publications and projects. Moreover, all that effort can end up amounting to nothing. What are the ultimate chances of getting a position? From 0.5 to 2%, and only if you happen to be in the right place at the right time; in other words, it is a matter of luck (hello to Daniel Kahneman and Nassim Taleb). Ten years of research may turn out to be unneeded, not "fashionable." The work began to bring more and more disappointment. And when work ceases to bring pleasure, everything around it ceases to please as well. The statistics suggest that the situation is unlikely to improve. So I decided to end my academic career. What to do next was obvious to me: programming. But, as the saying goes, a tale is quickly told, but a deed is not quickly done ...







Programming



Why programming and not data science? It is simple: I had at least some experience in programming and software development. Yes, it was limited experience, developing for personal and internal academic use, but my software works and is used by other people in laboratories and in calculations. I developed systems for data acquisition and monitoring of large experimental facilities (e.g., a dilution refrigerator with a nuclear demagnetization subsystem), as well as data analysis and visualization tools in Python. Before that, I used Fortran a lot and a bit of C++ for numerical calculations. In any case, this experience was not enough for the transition. I had to brush up on or learn a lot of new things, both in programming (including algorithms and data structures) and in software design. I set myself a time limit of one year: that was the deadline for my transition, since my contract was ending then and I did not intend to renew it, although I had the opportunity. I also sketched out a small practical training plan that included, for example, writing games in Python based on the scenarios from the book “Programming Games and Puzzles” (Jacques Arsac, 1985) and various small programs from “The Programmers Idea” (Coders Lexicon). By the way, I had already used the first book many years ago, when I wrote games for the ZX Spectrum. In parallel, I began to study two more programming languages, JavaScript and Go. On the whole, everything was going well; fortunately, there are now more than enough training resources, books, and examples on the Internet. However, after some time I began to feel the need for personal contact with people directly involved in software development, and for working together on a project. I began to look for such opportunities, bearing in mind that I was still working at the university and studying on my own. And, of course, meetups came to the rescue ...







Meetups, projects



One of the first groups was FreeCodeCampVienna, which focused mainly on web programming, but that was fine, since what mattered to me was communication and contacts, as well as gaining experience in working together. After the third or fourth meeting, we formed a small group of five people (later only three remained) to work on the WebTags project, which lets you post messages to friends or groups on any website. I did the backend, since I am not on good terms with web design, to put it mildly. Unfortunately, we only got the project to an MVP and gave a couple of presentations at the meetups; it went no further, but I acquired the knowledge I needed. Another project, CarTalk, a community for car enthusiasts started within a Chingu Voyage-7 team (Chingu is a free global platform for those who want to learn how to work on joint projects), was likewise never completed. By that time, I had already started attending other groups, including those related to data science: Vienna School of AI, Deep Learning Meetup in Vienna, Vienna Data Science. My interests began to shift towards machine learning and AI, since they had a certain “scientific character” that could not fail to interest me ...







A bit about data science



The opportunity to stay close to science while also programming intrigued me. I found and completed several courses in machine learning, AI, and working with big data. There was hands-on practice at the meetups, although not as much as I would have liked. However, my main priority was communication and contacts. By then, about eight to nine months of the transition period I had set had passed, and I began to rework my resume. Those who have worked in science know that the scientific community uses a curriculum vitae, which includes almost every detail of one's scientific activity. I had to squeeze an eight-page CV into a one-page resume. On the one hand, this is not a big problem, because a scientific CV contains a lot of information that interests no one but universities. On the other hand, it was not clear what information might be useful. In the end, I included almost everything related to software development and data analysis, as well as work on side projects. It took me about two weeks to rework the resume. About two months remained of the time I had allotted, and I began to apply for open positions. It should be noted that this was December, a month with many holidays, when everyone is usually waiting for Christmas and the New Year. I also understood that my skills and knowledge were still not enough to work directly as, for example, a machine learning engineer, so I focused on companies that work in data science but had open developer positions ...







Search



And I started sending out my resume. In most cases, I got an answer fairly quickly, within two to three days, although, to my surprise, not everyone responded right away. I was invited to a couple of on-site interviews after telephone interviews and test tasks. However, rejections followed. There were about ten of them. All this time I continued my self-education and networking at the meetups. I listened, dug in, and asked a lot of questions. Maybe I annoyed some people, but for the most part I received friendly responses. I was struck by the openness of the data science community, something I had not seen in the "normal" scientific environment for a long time, or only very rarely. I became interested in several topics, including synthetic data generation and the protection of private data. It actually looked like magic: creating a human face or generating reasonable C code using machine learning systems. As a child, I read in a book or magazine about a program that could write poetry. I then wrote one for the Sinclair ZX Spectrum, but it simply iterated over arrays and substituted suitable words in the right places. And here there is no hard-coded algorithm, yet something new is created that did not exist before and that makes sense! In a word: cool. I began to dig deeper into these topics, read the literature, and look for algorithms. I wrote several emails to people working on this topic in Austria, and around January I received an invitation to a telephone interview with the technical director of a startup. At the same time, I received an invitation from a company that developed scientific software, but I wanted data science ...







Offer



After a telephone interview with the startup's technical director, I was invited to an interview at the company’s office and sent a few practical tasks, one purely on algorithms and the other on data analysis and transformation in Python. The tasks caused no difficulties; only the names of some functions seemed strange to me. As I later found out, these tasks had been “extracted” from real code originally written in R. The interview took place at the end of January, which meant that I could comfortably meet my planned deadline and not renew my contract at the university. At the interview we mainly talked about programming and discussed the test tasks, along with general questions probing adequacy and psychological stability. We talked for about two hours in total. Everything went as if we had known each other for many years, with no tension. A few days later I received an offer for the position of Senior Software Engineer. And that is when it became scary. First, I had to decline the offer from the other company, the one developing scientific software, but that came without much agony. The main problem was that I had achieved what I had set out to do. Now I had to take the last (or first?) step: to leave the science I had been doing for the past twenty years. I had five days to decide, the longest of my life. But, as they say, having said “A”, you must say “B”. And after working out the required month at the university, I left for a startup ...







Startup



Today I can say that I am glad I started my new life not in a big company but in a small startup. I like that I can talk to every member of the company. Everyone does their job. What's more, we know what we are doing and why. The first week was not the hardest: I mainly fixed bugs. But small companies cannot afford that kind of luxury for long, so ten days later I was already integrating a differential privacy algorithm into our “engine”, which required a deeper study and understanding of TensorFlow and Keras, as well as of the algorithm itself. After that came the development and implementation of an encoder that could process files containing tens of millions of rows and several thousand columns of various types in a reasonable time. To do this, I studied Spark and cloud computing. Now it is SaaS implementation and integration ... Every day brings something new. Ten to twelve hours a day that fly by like one hour ...







