AI Testing and Startup: Interview with Adam Carmi (Applitools)







There is a popular saying, “scratch your own itch”: if you want to create a new product, build the one you yourself are missing. That way, you understand better than anyone how to do it well.







Adam Carmi was keenly aware of the lack of a visual testing tool that would spare people from straining their eyes in search of a broken layout. In the end, he created such a tool himself, applying AI to the problem, and became one of the co-founders of Applitools. It sounds like a dream job: when you fight a pain you know firsthand, you feel that you are changing the world for the better. But what difficulties does an engineer face when the fate of an entire company depends on him?







And since Applitools itself also needs to be tested, Adam has learned a lot about testing projects that contain AI. Tomorrow at Heisenbug he will talk about how to do this, and his talk will be streamed live, so anyone can watch it. In the meantime, we asked him about both topics: what it is like to build a company, and how testing and AI fit together.







Startup life



Evgeny Trifonov (phillennium): You were an employee for many years, and then you decided to found a startup. How did that happen, and what prompted it?







Adam: My previous employer was an information security company. I worked there for 8 years. I think you can imagine how much UI there is in a security product: a huge number of logs, charts, graphs, queries and alerts. And around all of that there was a lot of non-obvious UI.







Everything was complicated by the fact that this UI was translated into 6 languages and shipped under five different brands. Testing it all manually from start to finish in every variation took about a week. There were 20 testers who did this on every iteration. So if a release needed several iterations, the release cycle was at least 2-3 months.







So throughout the years I worked there, I sorely lacked a solution to this problem. Of course, we did automation. Selenium was still a young product then; we used it, but it did not cover the visual side of the UI. I kept explaining the problem to the vendors of the time (HP, Microsoft and IBM) and asking for a solution. The answer was always the same: it is impossible. To verify that the interface looks as it should (rather than just “functions as it should”), manual testers will always be needed.







After hearing this answer for years, I decided to build such a tool for my team as a side project. I have been writing code since I was 10, I really enjoy it, and I am good at it. So even when I led a large team, I always had pet projects: I wrote games for my children, or simply tinkered with interesting things. And I really wanted to solve this problem. I saw right away how difficult it was, but that only spurred me to work harder and find a solution.







After about a year of work, I had managed to lay the foundations. By that time, the company where I worked had been acquired by another one. And I decided to start my own business instead of moving to the new company.







In general, my motivation was to work on something that interested me at the technological level. I had no ambition to become a great businessman. It was just a matter of being able to work on what fascinated me.







Evgeny: When engineers found companies, the problems they run into are often not technological but on the business side. You had experience in management positions. How much did it help? Would you recommend getting such experience before starting a company?







Adam: To begin with, I want to note that management experience should not be confused with business experience. These are very different things.







Managing engineers is a separate skill. You attract talented people, inspire them to work hard, and make sure that they stay on your team for many years and that the work excites them. I believe this has a lot to do with being a good engineer, and nothing to do with business.







There are people who believe: since I am the one founding the startup, I will be the CEO. I know everything myself, and nobody can advise me. Many people think this way and fail. Even those who went on to become very good CEOs. You can make every possible mistake along the way.







My position was completely different. I immediately stated that I had no idea how to be a CEO. Therefore, I said: let's find someone who already knows what to do and who has relevant experience in this matter. And let him be CEO, and I will be responsible for the technical component.







This does not guarantee that everything will work out, but at least I do not waste time on mistakes that can be avoided entirely when an experienced person handles these things.







Before the company where I had worked for 8 years was acquired, our CEO came to tell me about it. We started discussing plans for the future, and I told him what I was already working on. He immediately became interested, studied the situation properly, and decided to join me. We have both been working at Applitools ever since.







I repeat: I believe that an engineer is not the best choice of CEO for a startup. The odds will not be on your side. It is worth finding someone who knows what he is doing. This increases the chances of success, although it guarantees nothing.







Evgeny: Got it. And in this situation, when the complex business tasks were on someone else's shoulders, what made you sweat the most?







Adam: What made me sweat... I don’t want to go into detail about the difficulties specific to Applitools. I will say this: it’s a good thing that budding entrepreneurs are very naive. Of course, this is a cliche that everyone repeats, but until you do it yourself, you cannot even imagine the pressure, the uncertainty, and the psychological ups and downs that you go through with the company. On one and the same day it may seem that you are going to conquer the world and that you will have to fire everyone. It takes time to adapt to this and begin to see things in perspective. It is very difficult.







And then there are the usual difficulties: making the product work properly, dealing with the engineering side, working hard.







Mikhail Druzhinin (xomyakus): It sounds as if development is the easy part. It is at least predictable.







Adam: Exactly.







In the early part of a startup’s life, when it is not clear whether it will survive, it is hard to discover that you have run out of money. You are already dipping into your personal savings to pay your employees' salaries. These are the people you once persuaded, with your charisma, to leave their jobs and work for you for a lower salary simply because they believe in you.







But even when you have left survival mode and have an excellent product and customers, if you raised funds, you have investors who expect a return on their investment. Now there is constant pressure to grow at a very high pace, and this requires creativity and a lot of work, because you have to cope with this growth.







Let's say your sales amounted to X million this year, and you are celebrating the success. But next year you have to sell twice as much. How do you do that?







Mikhail: You mentioned creativity. What exactly do you mean by that?







Adam: Usually you think: I made a great product, now the whole world will use it.







The reality is that the world is very busy. The world has no idea that you exist. The world always has 10 different things to do, and you cannot control where you land on that list. And you don’t have much money to make yourself known. Money here means the budget for advertising, conferences, webinars and personal sales. All of this costs a ton of money.







And creativity here means thinking outside the box and using approaches that require little money and few resources. Applitools manages to do this, and it lets us reach a new height every year. But every time we have to go beyond what we did before.







Mikhail: Exactly, you need to think more broadly. I notice that many developers and testers think in a very straightforward way and see only one solution to a problem: one that takes five months and a lot of testing resources. Then they are told: you know, there is only a month left, and after that we all die. This is where creativity begins!







Adam: Yes, of course. There is also creativity that concerns the product itself: you always need to stay up to date and do some things faster and more efficiently. That goes without saying.













Evgeny: Returning to the topic of growth: how many people work at Applitools today, and how fast is this number growing?







Adam: Today we have 110 people. At the beginning of 2018, there were about 20 of us. The growth was rapid: roughly fivefold in a couple of years.







Evgeny: Impressive, yes. But with such a growth rate, when many new people are joining, wasn't it difficult to maintain the company culture?







Adam: Great question. From my own experience, I have realized how important a well-defined company culture is. I think I could now give a whole talk on this topic.







To begin with, the concept of corporate culture is not very developed in Israel. If someone tries to work on it, people react with "oh, this is corporate bullshit."







And for us that became the starting point. What did I decide to do at Applitools as the head of R&D? Looking back on my experience at other companies, I wanted to run an experiment: I would never allow the bar to be lowered when hiring. Take only excellent specialists, and that's it. No compromises. It was very difficult.







Of course, the first employees are people you know personally and have persuaded to join you. But after that it becomes very difficult to grow. Sometimes it takes us 6-8 months to find the right person.







But over time it becomes easier, because your team already has many strong specialists, and they know other talented people. And when these talented people start looking for a job, they see the names of your company's employees, and they are all speakers at international conferences, well known in the local community. Then it becomes easy: they are more interested in joining you than some big company.







This allowed us to create a unique development process, which is hard to even call a process. Each of our developers is responsible for everything that happens. We have no sprints and no scheduled releases, and we do not even require the product team to provide a full specification.







It is enough that you have an idea, and you are responsible for its implementation. You yourself are looking for the information you need, you yourself are looking for people to help you. If you don’t know how to do something, you are responsible for learning how to do it.







So we managed to create something special. And when we raised a large investment round and knew that we would grow, I was very worried. How could we preserve our uniqueness and not destroy it during growth, when the engineering team triples?







The solution was to write this process down and clearly define the culture.







I started by hiring a very good HR manager who had worked for Dropbox. In fact, she assembled the entire Dropbox team, and it is now considered one of the best places to work in the world. She is very experienced. We simply sat down and began writing a draft of our new culture.







We presented our draft to the team leads, who rejected it completely. And they got very angry. And then a dialogue began. Through discussions and a great deal of feedback, we were able to formulate what it means to work at our company. We ended up with a list of values that all the team leads agreed with.







It took several months, but in the end, when we presented the result to the whole team, people embraced it immediately. They felt right away that the words express exactly why working at our company is so exciting. This time we got great feedback.







Now we religiously safeguard these values and make sure that nothing conflicts with them as we expand the staff. This helps us make decisions and maintain the atmosphere, and I hope it will continue to help in the future.







Evgeny: Another interesting detail about Applitools: your headquarters is in Silicon Valley, but R&D is in Israel. Can you explain why?







Adam: First of all, when you start a company, at first you work from home, from a cafe near your home, or traveling.







My company was founded and grew in Israel, because there is a lot of IT here. And it was precisely our activity in Israel that let us spread to the rest of the world. When a large company has a team in Israel and that team uses some tool, other teams in the company (in the USA, Great Britain, elsewhere) eventually see it. And if you are the one making this tool, your product suddenly has foreign customers, even though you have not done any marketing or sales abroad.







But at some point you want to be closer to your customers. And as a company that raises investment, you want to be closer to investors who have the right kind of money. Israeli funds mostly work with early-stage startups, so it is harder to get large sums from them. Or their investment is less useful, because they do not have the connections that US venture funds have. Being close to these people and maintaining relationships with them is helpful. After all, you want to be in a situation where investors come to you wanting to invest (and you decide whether or not to accept the money), rather than you chasing them.







You also want to be closer to the customer, and we already have dozens of salespeople. In which region of the USA is business most active for us? That is where the office for these employees is located, and it is also our headquarters, close to the investors.







And R&D, having started in Israel, remains here. There are excellent specialists here, keeping an engineering team here is much cheaper than in Silicon Valley, and the competition for talent is not as fierce. For all these reasons, I am happy to stay in Israel and head the Israeli office. This is good for me and for the company. And my two partners have been in San Francisco for almost four years now.







Mikhail: And apart from the high hiring bar you mentioned, do you use any other tricks to keep quality from dropping during rapid growth?







Adam: I can tell you what I'm doing now (maybe this will change in the future).







I strongly believe that a good developer produces software that actually works. He does not produce software that he considers finished while others still have to test it. It is his job to make it really work. I don’t care how he does it. It doesn’t matter whether he tests everything manually each time or automates the testing. I really don't care. But it is his job and his responsibility.







In general, the people who write the algorithms and the backend are happier with this state of affairs. The front-end developers are less happy, because testing a UI is much harder than testing an API with unit tests.







There is one problem with all this. You may well have a developer who understands that he must test everything he writes. He knows that this is his job, and he has agreed to it. But when you ask, “Is everything well tested?” and he answers “Yes,” that is a very subjective answer. He did something, but it may be 50% of what was expected, and he has written in his notebook “come back to this after the release”.







And even a code review will not give us the complete picture, because the person who did the review was busy too. He says that there were some tests and that he looked at some of them: everything is fine, let's move on.







That is why I have a quality director (not QA).







He has his own team. His goal at the company is to make sure that everything is tested and that the developers really have covered everything. Simply put, everything that affects quality must be tested. The quality director has a lot of authority and freedom. If he tells me that something is not covered well enough, we do not move forward until we have dealt with it.







His team also helps the development teams fill in the gaps. Sometimes problems are discovered in retrospect, when the responsible team is busy with new features and has no time to switch. Then the quality director’s team can take on these tasks.







This team is also responsible for end-to-end testing that spans different teams and products. This is an area that engineering teams find hard to cover well: they can handle what they themselves are responsible for, but when something affects several teams or products at once, it is hard to expect one of them to take it on and answer for everyone. It doesn’t work like that. So the quality director is responsible for these complex tests. He can come and tell me “I need engineering resources for this,” because for some parts of the testing he may not have them.







This is how it works in general, although there are nuances. When you work in a company of five people, they do not matter, but when you have an office in another part of the world and a support team, every release suddenly involves a lot of communication. If something does not work as it should, or if something new appears, customers start asking support questions. And if the support staff do not know well enough what has changed, they will not be able to answer properly.







So part of the effort goes into communication: very informative reports that answer all the main questions about the changes and also show me that everything has been tested. And if not everything, then where are the gaps? The quality team is responsible for these summary reports as well, making sure that everyone knows what is going on. This gives the company a sense of high quality, and not just in terms of testing and coverage.







Mikhail: You make a testing tool, but how do you test the tool itself? How is Applitools used to test Applitools?







Adam: All visual testing and all functional UI testing of Applitools is done using Applitools. Of course, some things are tested manually, and some things make no sense to test through the UI and are covered through the API. But on the visual side, it is dogfooding all the way.







We release several times a day, but we cannot do it in a way that lets users see the changes. The development team must not change the user interface without going through the proper process of notifying support, so that the support staff know how to answer questions.







So it usually works like this: if the UI changes, those changes are developed in a separate branch or hidden behind a flag, while backend changes are deployed several times a day. We have periodic releases with big features that are visible in the UI. We also have a roadmap to prepare customers for the changes, we develop a marketing campaign, and we write documentation for all the changes.







As part of this process, before a big release, in addition to the automated tests, we test a bit of everything manually and also rely heavily on exploratory testing. A week before the release, we give the final version to support, and they are trained to use the changes. Once they have tried everything themselves, they know everything about the changes.







That’s the whole so-called process. Of course, some parts of the UI are tested manually, although for the most part it is all automated testing.







I really like (and the company, I think, does too) that we actively use our own product. Each of our teams can choose any technology, but they still prefer our own. It’s also great to feel that you are building something the whole world uses: companies like Apple, Netflix, Dropbox, IBM.







Testing and AI



Evgeny: Since we have come to testing, let's move on to your talk. It is called "AI and testing", and this can be interpreted in different ways: both as "let's test something with the help of AI" and as "let's test the AI itself." Let's first dot the i's: what is the talk about?







Adam: The main part is about how to test AI. If parts of your product are non-deterministic and contain AI in some form, how do you test this product? There are various methods and practices worth using to, let's say, increase the likelihood of quality.







If there is time left, in the final part of the talk I’ll cover the opposite: where it is appropriate to use AI in testing. But 80% of the talk is about how to test AI.







Evgeny: The concept of "AI" is not very strictly defined, and different people use it for different things. How is it defined in the context of your talk?







Adam: The first slide is about exactly that.







I cite the definition: "actions of a machine that, if performed by a person or an animal, we would say require intelligence." By this definition, AI can be implemented in just a few lines of code. If we teach a dog to bark once at intersecting triangles and twice at non-intersecting ones, everyone will agree that the dog is brilliant. Yet this problem can be solved deterministically in four lines of code.
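To give a feel for how small such a deterministic solution can be, here is a rough sketch in Python. It is not from the talk: the function names, the representation of points as `(x, y)` tuples and the choice of the separating axis test are all assumptions made for illustration, and the sketch is somewhat longer than four lines.

```python
def project(points, axis):
    """Project points onto an axis and return the (min, max) interval."""
    dots = [x * axis[0] + y * axis[1] for x, y in points]
    return min(dots), max(dots)


def triangles_intersect(a, b):
    """Separating axis test for two 2D triangles given as three (x, y) points.

    Two convex shapes are disjoint exactly when some edge normal separates
    their projections. Triangles that merely touch count as intersecting.
    """
    for tri in (a, b):
        for i in range(3):
            x1, y1 = tri[i]
            x2, y2 = tri[(i + 1) % 3]
            axis = (y1 - y2, x2 - x1)  # normal to the current edge
            a_min, a_max = project(a, axis)
            b_min, b_max = project(b, axis)
            if a_max < b_min or b_max < a_min:
                return False  # found a separating axis
    return True


def barks(a, b):
    """Hypothetical 'dog': one bark for intersecting triangles, two otherwise."""
    return 1 if triangles_intersect(a, b) else 2
```

There is nothing statistical here: the same input always gives the same answer, so ordinary example-based unit tests are enough to check it.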







But then I show how, as the task becomes more complex, the algorithmic approach stops working well. I talk about traditional machine learning, its advantages and its disadvantages.







And then I cover where it falls short, and I turn to deep learning. I show how all of this complicates development. To people it looks like magic, and magic is hard to test. So although it brings advantages, it also brings a lot of difficulties.







And from the different ways to implement AI, we move on to the different ways to test it. So for me, AI is everything that emulates intelligence, from simple algorithms to deep learning, but in the talk I will dig deeper into the deep kind.







Evgeny: Is AI used to test AI?







Adam: No. Although we use Applitools to test Applitools, what we test this way is the frontend, not the AI part. And I have not heard of anyone else testing AI with AI.







I don’t want to go into too much detail and retell my entire talk, but the main problem with statistical models is that they are "black boxes". On top of that, when you retrain them, they turn into different "black boxes".







The purpose of testing is to know what you have built. And if you test one black box with another black box, it is unclear what actually happened: the test may have passed, but there is no certainty.







Evgeny: Apart from the visual testing that Applitools does, is AI applicable to other areas of testing, such as security?







Adam: Yes. AI is useful for routine, assembly-line tasks. I have formulated three general criteria for when AI is applicable to a problem. They will be in the final part of the talk if there is time left for them, but I will share them now.









Let's take a trivial example: “self-healing tests”. The idea is that when the UI changes, the test can cope with it by itself, not “breaking” but learning the new locators on its own.







Why is this easy to do with AI? Because this situation meets all three criteria. First: you have all the data, because it is your test.







Second: the current process, in which people deal with this problem, leaves much to be desired. In it, a person chooses one specific locator: the best one, but only one. Instead, you can find the element using this locator and automatically save ten locators for it in a database. In subsequent runs, some of them will turn out to be dynamic data, that is, junk by which the element cannot actually be found, but we can simply discard those and keep the “survivors”. Then, when the UI changes, instead of a single locator we have a whole set by which we can find the element and update its locator.







The third criterion: in the worst case, when this does not work, you simply end up where you would have been without any AI.







In general, if all three criteria apply, you can use AI. If they do not fit, do not even try! It will be a nightmare.
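The locator bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a page is modeled as a list of attribute dictionaries rather than a real DOM, a locator is an `(attribute, value)` pair, and the names `find`, `learn_locators` and `locate` are hypothetical. It does not show how Applitools or any other tool implements self-healing.

```python
from typing import Optional

Element = dict[str, str]   # hypothetical stand-in for a DOM element
Locator = tuple[str, str]  # (attribute name, expected value)


def find(page: list[Element], locator: Locator) -> Optional[Element]:
    """Return the single element matching the locator, or None otherwise."""
    attr, value = locator
    matches = [el for el in page if el.get(attr) == value]
    return matches[0] if len(matches) == 1 else None


def learn_locators(element: Element) -> set[Locator]:
    """Save every attribute of a found element as a candidate locator."""
    return set(element.items())


def locate(page: list[Element], locators: set[Locator]):
    """Find the element that most locators agree on.

    Returns (element, survivors). Locators that no longer lead to that
    element (dynamic data, or attributes changed by a UI update) are dropped.
    """
    found = [(loc, find(page, loc)) for loc in locators]
    hits = [el for _, el in found if el is not None]
    if not hits:
        return None, set()  # worst case: same as a broken test without healing
    target = max(hits, key=lambda el: sum(h is el for h in hits))
    survivors = {loc for loc, el in found if el is target}
    return target, survivors
```

A test runner built this way would call `learn_locators` once when the element is first found, call `locate` on every run to prune the set, and add a fresh locator for any attribute that changed. When nothing matches, it simply fails the way a test with a single hand-picked locator would, which is the third criterion.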







In our case, the difficulty is that the companies using our product expect us to find every last bug. If something slips through, we are to blame. But reaching a level where nothing slips through is extremely difficult. It requires many years of work, expertise and huge amounts of data.







Mikhail: This is interesting. I often see that some process is very far from ideal and could be improved to some extent. But people don’t want “to some extent”; they want to go from nothing to the ideal right away. And they say, "you know, if an exact result is not guaranteed, then what is the point? We will keep doing everything manually."







Adam: But there is a point. This can save a ton of time. For example, you have logs from failed tests where all the errors are recorded. Some of the errors are important, others not so much. But it is known how to read these logs; there are patterns. First of all, it is worth looking at what happens rarely or looks unusual. This may not give a completely accurate picture, but you will see the important part right away rather than after two hours of study. That helps, right?







And if it doesn’t help, you can still read the entire log from top to bottom and eventually find the cause. That is, you simply end up at the point where you started. There are many examples like this that increase productivity: not by 100%, but by saving hours of work every day. So it is clearly worth it. There are many similar opportunities wherever repetitive work occurs.
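The "look at the rare things first" idea can be sketched without any machine learning at all. The Python snippet below is a plain frequency heuristic standing in for whatever model a real tool might use; the function names, the `<N>` placeholder and the sample messages in the usage note are assumptions made for illustration.

```python
import re
from collections import Counter


def normalize(line: str) -> str:
    """Replace numbers and hex ids so that similar messages share one pattern."""
    return re.sub(r"0x[0-9a-fA-F]+|\d+", "<N>", line.strip())


def rarest_first(lines: list[str]) -> list[tuple[str, int]]:
    """Group log lines by pattern and list the least frequent patterns first."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return sorted(counts.items(), key=lambda item: item[1])
```

Given a log with dozens of lines like `Timeout after 30 s` and a single `NullPointerException in checkout step 4`, the exception pattern comes out at the top of the list. If the ranking turns out to be unhelpful, nothing is lost: the full log is still there to read in order.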







Evgeny: Heisenbug will also feature Ingo Philip’s keynote on whether bots will steal testers' jobs. Judging by what you have said, can we safely say right now that they will not?







Adam: Of course. I would never trust a machine to test everything for me without understanding what it is doing. This does not mean that bots and the like have no value. They do, but this value complements human work rather than replacing it.







And people resort to these methods when they require no additional effort. Here is an example. In principle, any given project can be brought to the point where a crawler built specifically for it can walk through the entire application from start to finish. It does not even require huge resources. But nobody does that. Yet everyone is happy when there is some universal bot that can simply be plugged into any project. Why? Because you do not have to do anything.







Mikhail: My last question is a philosophical one. What is quality for you?







Adam: Wow. A deep question, not the kind I usually get asked.







I think in the end I want to delight customers with my product. When I hear that someone is unhappy when using it, it saddens me. I really feel that this is my art, my creation, and if the experience of people with it turns out to be not what I intended, then this is a lack of quality.







This is important, but there are other levels. There is also quality in the code itself: what it looks like, what technologies are used, how modern everything is, the caliber of the people working on it. And anything that does not match my utopian idea of the company and the engineering team does not, in that respect, meet my quality standards.







And as for the level of customer satisfaction, this is a very measurable thing. If they use the product more and more actively, then you understand that in the long run you are doing well. This is the main thing.







Adam Carmi will give a talk on the first day of Heisenbug, December 5th. All tickets for Heisenbug are already sold out, but the talk can be watched in the open broadcast. And for those who are interested in Heisenbug and do not want to limit themselves to that, there is also a paid broadcast in which all the talks will be available (and not just the first room on the first day, as in the open one).


