A/B testing, pipelines and retail: a branded Big Data quarter from GeekBrains and X5 Retail Group





Big Data technologies are now used everywhere: in industry, medicine, business and entertainment. Without big data analysis, large retailers could not operate normally, Amazon's sales would fall, and meteorologists could not forecast the weather days, weeks and months ahead. It is no surprise that big data specialists are in great demand, and that demand keeps growing.



GeekBrains trains specialists for this field, aiming to give students both theoretical knowledge and hands-on practice, and brings in experienced experts to do so. This year the Big Data Analytics faculty of the GeekUniversity online university partnered with X5 Retail Group, the largest retailer in Russia. The company's specialists, who have extensive knowledge and experience, helped create a branded course whose students receive both theoretical training and practical experience.



We spoke with Valery Babushkin, director of data modeling and analysis at X5 Retail Group. He is one of the best data scientists in the world (30th in the world ranking of machine learning specialists). Together with other teachers, Valery tells GeekBrains students about A/B testing, the mathematical statistics behind it, modern calculation practices, and the specifics of running A/B tests in offline retail.



Why do we need A/B tests?



This is one of the best methods for finding ways to improve conversion, economic performance and behavioral metrics. There are other methods, but they are more expensive and more complex. The main advantages of A/B tests are their relatively low cost and their accessibility to businesses of any scale.



A/B tests are one of the most important tools for finding and making business decisions, the decisions on which a company's profit and the development of its products depend. Tests make it possible to base decisions not only on theories and hypotheses, but also on practical knowledge of how specific changes affect the way customers interact with the retail chain.



It is important to remember that in retail you need to test everything: marketing campaigns, SMS mailings, tests of the mailings themselves, the placement of products on the shelves and the placement of the shelves on the sales floor. In an online store, you can test the layout of elements, the design, the labels and the texts.



A/B tests are a tool that helps a company, a retailer for example, stay competitive, notice changes in time and adapt. This lets the business operate as efficiently as possible and maximize profit.



What are the nuances of these methods?



The main thing is that testing must be built around a goal or a problem. For example, the problem is a small number of customers at a retail outlet or online store. The goal is to increase the flow of customers. The hypothesis: if product cards in the online store are made larger and the photos brighter, there will be more purchases. An A/B test is then run, and its result is an assessment of the change. Once the results of all tests are in, you can start drawing up an action plan for changing the site.
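To make the assessment step concrete, here is a minimal sketch of how the outcome of such a test could be evaluated, assuming the metric is conversion to purchase and using a standard two-proportion z-test. The function name and all visitor and purchase counts are made up for illustration; the interview does not say which statistical test X5 uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, conv_b: number of purchases in groups A and B (hypothetical data)
    n_a, n_b:       number of visitors in groups A and B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "no difference".
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Group A: old product cards, group B: larger cards with brighter photos.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in conversion is unlikely to be pure chance, but it says nothing about whether the change pays off in revenue.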



It is not recommended to run tests on overlapping processes, otherwise the results will be harder to evaluate. Tests for the highest-priority goals and their formulated hypotheses should be run first.



The test should last long enough for the results to be considered reliable. How long exactly depends, of course, on the test itself. For example, the traffic of most online stores increases before New Year's Eve. If the design of the online store was changed shortly before that, a short-term test will suggest that everything is fine: the changes are successful and traffic is growing. But that conclusion is wrong, because traffic increases before the holidays no matter what you do. Such a test cannot be wrapped up just before the New Year or immediately after it; it must run long enough to separate the effect of the change from the seasonal effect.
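As a rough illustration of "long enough", the sketch below estimates the sample size per group for a conversion test with the usual normal-approximation formula, then converts it into a duration rounded up to whole weeks so that every weekday is covered equally. The baseline conversion, target conversion and daily traffic are assumptions chosen for the example, and the formula does not account for holiday peaks: those still have to be handled by comparing against a control group over the same period.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, p_target, alpha=0.05, power=0.8):
    """Visitors needed in each group to detect p_base -> p_target."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    n = (z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2
    return math.ceil(n)


def duration_in_days(n_per_group, daily_visitors, groups=2):
    """Days needed to collect the sample, rounded up to full weeks."""
    days = math.ceil(groups * n_per_group / daily_visitors)
    return math.ceil(days / 7) * 7


# Hypothetical store: 2% baseline conversion, hoping for 2.6%.
n = sample_size_per_group(p_base=0.02, p_target=0.026)
print(n, "visitors per group,", duration_in_days(n, daily_visitors=3_000), "days")
```

The smaller the expected lift, the faster the required sample grows, which is why tests of subtle changes need to run much longer.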



It is also important to link the goal correctly to the measured indicator. For example, after changing the design of its online store's website, a company sees an increase in the number of visitors or customers and is satisfied. But in fact the average check may have become smaller than usual, so total revenue may even have dropped. That can hardly be called a positive result. The problem is that the company did not check the whole chain at once: the increase in visitors, the increase in the number of purchases, and the dynamics of the average check.
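A short worked example shows how this can happen. It assumes the simplest possible decomposition, revenue = visitors × conversion × average check, and all figures are invented for illustration.

```python
def revenue(visitors, conversion, average_check):
    """Revenue as the product of traffic, conversion and average check."""
    return visitors * conversion * average_check


# Hypothetical numbers before and after a redesign.
before = revenue(visitors=10_000, conversion=0.020, average_check=1_500)
after = revenue(visitors=11_000, conversion=0.022, average_check=1_200)

print(f"before: {before:,.0f}  after: {after:,.0f}")
print("revenue grew" if after > before else "revenue fell")
```

In this made-up case both traffic and conversion improve, yet the drop in the average check outweighs them, so a test that tracked only visitors or purchases would have reported a success.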



Is testing only for online stores?



Not at all. In offline retail, a popular approach is to build a full pipeline for testing hypotheses offline. This means setting up a process that reduces the risk of selecting the wrong groups for the experiment and chooses the optimal balance between the number of stores, the duration of the pilot and the size of the effect to be detected. It also means reusing and continuously improving the methodologies for post-analysis of effects. The approach is needed to reduce the probability of false positives (accepting an effect that is not there) and false negatives (missing a real effect), and to increase sensitivity, because at the scale of a large business even a small effect matters a great deal. You therefore need to be able to detect even the weakest changes and to minimize risks, including the risk of drawing incorrect conclusions from the experiment.
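The trade-off between the number of stores, the pilot duration and the detectable effect can be sketched with a textbook power calculation. This is not X5's pipeline: it assumes a simple two-group comparison of store-level means, independent stores, and a metric whose variance splits into a between-store part and a week-to-week part. All parameter names and values are hypothetical.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect(stores_per_group, weeks, sd_between, sd_within,
                              alpha=0.05, power=0.8):
    """Smallest difference in mean weekly sales per store that a pilot
    of this size can detect with the given error rates.

    sd_between: standard deviation of average sales across stores
    sd_within:  week-to-week standard deviation inside one store
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Averaging over more weeks shrinks only the within-store noise.
    sd_store_mean = math.sqrt(sd_between ** 2 + sd_within ** 2 / weeks)
    return z * sd_store_mean * math.sqrt(2 / stores_per_group)


for stores, weeks in [(50, 2), (50, 8), (200, 2), (200, 8)]:
    mde = minimum_detectable_effect(stores, weeks, sd_between=10.0, sd_within=20.0)
    print(f"{stores} stores per group, {weeks} weeks: MDE = {mde:.2f}")
```

Under these assumptions, a longer pilot helps only up to a point, because the between-store variation remains; that is one reason careful group selection matters so much in offline experiments.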



Retail, Big Data and real cases



Last year, X5 Retail Group experts evaluated the sales dynamics of the products most popular among 2018 World Cup fans. There were no surprises, but the statistics were still interesting.



The number one best seller was water. In the cities that hosted the World Cup, water sales grew by about 46%. Sochi was the leader, with turnover up 87%. On match days the highest figure was recorded in Saransk, where sales volume grew by 160% compared with ordinary days.



In addition to water, fans bought beer. From June 14 to July 15, beer turnover in the host cities grew by an average of 31.8%. Sochi was again the leader: beer purchases there rose by 64%. In St. Petersburg growth was small, only 5.6%. On match days in Saransk, beer sales increased by 128%.



Other products were studied as well. Data collected on peak consumption days make it possible to forecast demand more accurately in the future, taking event factors into account. An accurate forecast makes it possible to anticipate customer expectations.



For this analysis, X5 Retail Group used two methods:

Bayesian structural time series models with an estimate of the cumulative difference;

Regression analysis with an estimate of the shift in the error distribution before and during the championship.
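The idea behind a cumulative difference estimate can be illustrated with a deliberately simplified stand-in. The sketch below is not a Bayesian structural time series model and is not X5's code: it fits an ordinary least-squares line that predicts sales in a host city from sales in a comparable non-host city before the event, uses it to build the "no championship" expectation during the event, and accumulates the gap between actual and expected sales. All series are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cumulative_effect(control_pre, host_pre, control_event, host_event):
    """Running sum of (actual - expected) host-city sales during the event."""
    slope, intercept = fit_line(control_pre, host_pre)
    expected = [intercept + slope * c for c in control_event]
    running, total = [], 0.0
    for actual, exp in zip(host_event, expected):
        total += actual - exp
        running.append(total)
    return running


# Hypothetical daily water sales (units) before and during the event.
control_pre = [100, 110, 120, 130]
host_pre = [210, 230, 250, 270]
control_event = [100, 120]
host_event = [260, 300]
print(cumulative_effect(control_pre, host_pre, control_event, host_event))
```

A full structural time series model would additionally capture trend and seasonality and would report uncertainty intervals around the cumulative effect, which this sketch omits.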



What else does retail use Big Data for?







Lack of specialists



The demand for Big Data experts is constantly growing. In 2018, the number of vacancies related to big data was 7 times higher than in 2015. In the first half of 2019, demand for specialists already exceeded 65% of the demand for the whole of 2018.



Large companies have a particular need for Big Data analysts. At Mail.ru Group, for example, they are needed in any project that processes text data or multimedia content or performs speech synthesis and analysis (primarily cloud services, social networks, games and so on). The number of such vacancies at the company has tripled over the past two years. In the first eight months of this year, Mail.ru hired as many Big Data specialists as in the whole of last year. At Ozon, the Data Science division has tripled in size over the last two years. The situation at Megafon is similar: the data analysis team has grown several times over in the last 2.5 years.



There is no doubt that demand for Big Data specialists will grow even stronger in the future. So if this field interests you, it is worth trying your hand at it.


