Time series in forecasting demand, contact center load, product recommendations and anomaly detection

This article covers where time series are applied, the tasks they solve, and the algorithms used. Time series forecasting is used for tasks such as forecasting demand, contact center load, and road and Internet traffic, solving the cold start problem in recommendation systems, and detecting anomalies in the behavior of equipment and users.



Let's consider the tasks in more detail.







1) Forecasting demand.



Purpose: to reduce warehouse costs and optimize staff work schedules.



How it is solved: with a forecast of product purchases and the number of customers, we minimize the stock held in the warehouse and store exactly as much as will be bought in a given time range. Knowing the number of customers at each moment, we draw up an optimal work schedule so that enough staff are available at minimum cost.



2) Forecasting the load on the delivery service



Purpose: to prevent the collapse of logistics at peak loads.



How it is solved: by forecasting the number of orders, we put the optimal number of cars and couriers on the line.



3) Forecasting the load on the contact center



Purpose: to ensure the required availability of the contact center at minimum payroll cost.



How it is solved: by forecasting the number of calls over time, we draw up an optimal schedule for operators.



4) Traffic forecasting



Purpose: to predict the number of servers and the bandwidth needed for stable operation, so that your service does not go down on the day of a popular series premiere or a football match ;)



5) Forecasting the optimal time for ATM cash collection



Purpose: to minimize the amount of cash stored in the ATM network.



6) Cold start solutions in recommendation systems



Purpose: To recommend relevant products to new users.



When a user has made several purchases, a collaborative filtering algorithm can be used for recommendations, but when there is no information about the user, the best option is to recommend the most popular products.



Solution: The popularity of products depends on the time when the recommendation is made. Using time series forecasting helps to identify relevant products at any given time.
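As a minimal sketch of this idea, we can rank products by how often they were bought on each day of the week and show a new user the leaders for the current day. The function name `popular_by_weekday` and the sample purchase history are hypothetical, and raw historical counts stand in here for a real per-product forecast.

```python
from collections import Counter, defaultdict
from datetime import date


def popular_by_weekday(purchases, top_n=3):
    """Map each weekday (0 = Monday) to its most purchased products.

    purchases: iterable of (date, product_id) pairs.
    """
    counts = defaultdict(Counter)
    for day, product in purchases:
        counts[day.weekday()][product] += 1
    return {
        weekday: [product for product, _ in counter.most_common(top_n)]
        for weekday, counter in counts.items()
    }


# Hypothetical purchase history, for illustration only.
history = [
    (date(2024, 1, 1), "coffee"),  # Monday
    (date(2024, 1, 1), "coffee"),
    (date(2024, 1, 1), "bagel"),
    (date(2024, 1, 6), "wine"),    # Saturday
    (date(2024, 1, 6), "wine"),
    (date(2024, 1, 6), "cheese"),
]
recommendations = popular_by_weekday(history, top_n=1)
```

In a real system the counts would be replaced by forecast demand for the moment the recommendation is shown.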



We reviewed life hacks for building recommender systems in a previous article.



7) Search for anomalies



Purpose: to identify problems in the operation of equipment and non-standard situations in business.

Solution: if the measured value falls outside the confidence interval of the forecast, an anomaly is detected. If this is a nuclear power plant, it's time to increase the square of the distance ;)
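The check can be sketched in a few lines. This is only an illustration under simple assumptions: the interval is taken as the forecast plus or minus `z` standard deviations of past forecast errors, and the names `residual_sigma` and `find_anomalies` and the sample numbers are hypothetical.

```python
from statistics import stdev


def residual_sigma(past_actual, past_forecast):
    """Standard deviation of past forecast errors (actual minus forecast)."""
    return stdev([a - f for a, f in zip(past_actual, past_forecast)])


def find_anomalies(actual, forecast, sigma, z=3.0):
    """Indices where the measured value leaves the band forecast +/- z * sigma."""
    return [
        i
        for i, (a, f) in enumerate(zip(actual, forecast))
        if abs(a - f) > z * sigma
    ]


# Calibrate the band width on a period known to be normal.
sigma = residual_sigma([100, 102, 98, 101, 99], [100, 100, 100, 100, 100])

# Check new measurements against the forecast.
anomalies = find_anomalies([101, 99, 120, 100], [100, 100, 100, 100], sigma)
```

The band width is estimated on a separate, known-normal period so that the anomaly itself does not inflate it.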



Algorithms for solving the problem



1) Moving Average



The simplest algorithm is the moving average: we calculate the average of the last few elements and use it as the prediction. A similar approach is used in weather forecasts for more than 10 days ahead.
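A minimal sketch of the moving average forecast, assuming the window size is chosen by hand; the function name and the sample data are hypothetical.

```python
def moving_average_forecast(series, window):
    """Predict the next value as the mean of the last `window` values."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


sales = [12, 15, 14, 16, 18]
next_value = moving_average_forecast(sales, window=3)  # mean of 14, 16, 18
```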







When it is important that the most recent values in the series carry more weight, we introduce coefficients that depend on how far back the date is, obtaining a weighted model:







So, you can set the coefficient W so that the maximum weight falls on the last 2 days and the input.
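A weighted version might look like the sketch below. It assumes the weights are listed from the most recent value backwards and sum to 1; the function name and the particular weights are hypothetical.

```python
def weighted_moving_average_forecast(series, weights):
    """Predict the next value as a weighted sum of the most recent values.

    weights[0] applies to the latest value, weights[1] to the one before, etc.
    """
    if len(weights) > len(series):
        raise ValueError("more weights than observations")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    newest_first = series[::-1][:len(weights)]
    return sum(w * y for w, y in zip(weights, newest_first))


sales = [12, 15, 14, 16, 18]
# The latest two days get most of the weight: 0.5 * 18 + 0.3 * 16 + 0.2 * 14
forecast = weighted_moving_average_forecast(sales, [0.5, 0.3, 0.2])
```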



Cyclic factors



The quality of recommendations can be affected by cyclic factors, such as the day of the week, the date, the run-up to holidays, etc.





Fig. 1. An example of the decomposition of a time series into a trend, a seasonal component and noise
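A decomposition like the one in Fig. 1 can be sketched with a centered moving average. This is a simplified illustration that assumes an additive model (value = trend + seasonal + noise) and an odd period known in advance; the function name and the synthetic series are hypothetical.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual parts.

    Trend is a centered moving average over one period, so it is None
    at the edges where the window does not fit.
    """
    if period % 2 == 0:
        raise ValueError("this sketch supports odd periods only")
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods")
    half = period // 2
    n = len(series)

    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = sum(series[t - half:t + half + 1]) / period

    # Average the detrended values for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period  # center the seasonal part on zero
    seasonal = [means[t % period] - offset for t in range(n)]

    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating pattern of length 3.
pattern = [1, 0, -1]
series = [t + pattern[t % 3] for t in range(12)]
trend, seasonal, residual = decompose_additive(series, period=3)
```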



Exponential smoothing is one way to take cyclic factors into account.



Consider 3 basic approaches:



1. Simple smoothing (Brown model)



It computes a weighted average of two values: the latest observation and the previous smoothed value of the series.
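A minimal sketch of simple exponential smoothing, assuming the smoothing factor `alpha` is picked by hand and the first observation is used as the starting level; the function name and the data are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after the last observation.

    The level doubles as the forecast for the next step.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        # New level: weighted average of the new value and the old level.
        level = alpha * y + (1 - alpha) * level
    return level


forecast = simple_exponential_smoothing([10, 12, 11, 13], alpha=0.5)
```

The closer `alpha` is to 1, the faster the model forgets old values.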



2. Double smoothing (Holt model)



Takes into account the trend change and fluctuations in the residual values around this trend.







We calculate the prediction of changes in the residuals (r) and the trend (d). The final value of y is the sum of these two quantities.
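Double smoothing can be sketched in its common textbook form, where a smoothed level and a smoothed trend are updated at each step and then summed. The variables `level` and `trend` below are assumed to play the roles of the two summed quantities in the text; the function name, the initialization and the parameter values are hypothetical choices.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Forecast `horizon` steps ahead with Holt's linear trend method."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # simple initial slope estimate
    for y in series[1:]:
        prev_level = level
        # Smooth the level, then smooth the change in level (the trend).
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


next_value = holt_forecast([2, 4, 6, 8], alpha=0.5, beta=0.5)
```

On a perfectly linear series the method simply continues the line, which makes it easy to sanity-check.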



3. Triple smoothing (Holt-Winters model)



Triple smoothing additionally takes into account seasonal variations.







Triple smoothing formulas.
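As an illustration, here is a sketch of the additive variant of triple smoothing. It is a simplified version under stated assumptions: the season length is known, at least two full seasons are available, and the starting level, trend and seasonal offsets are taken from the first two seasons. The function name and parameter values are hypothetical.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    # Average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y - level for y in first]

    for t in range(period, len(series)):
        y = series[t]
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonal[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


# Purely seasonal series with no trend: the pattern should repeat.
data = [10, 20, 30] * 4
forecast = holt_winters_additive(
    data, period=3, alpha=0.5, beta=0.5, gamma=0.5, horizon=3
)
```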



ARIMA and SARIMA Algorithm



ARIMA suits time series in which past values are related to current and future ones.



SARIMA is an extension for seasonal series. SARIMAX is an extension that includes an external regression component.



ARIMA models let you model integrated, or difference-stationary, time series.



The ARIMA approach starts by assessing whether the series is stationary.



Next, the series is differenced to the appropriate order, and an ARMA model is then fitted to the transformed series.



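ARMA is a linear model similar to multiple regression: the current value is expressed through past values of the series and past forecast errors.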



It is important that the series is stationary, i.e. its mean and variance do not change over time. If the series is non-stationary, it should be transformed to a stationary form.
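The differencing step and a very small autoregressive model can be sketched as follows. This is not a full ARIMA: it assumes first-order differencing, a single autoregressive term fitted by least squares, no moving-average part and no intercept. All function names and the sample series are hypothetical; for real work a library implementation is the safer choice.

```python
def difference(series, order=1):
    """Apply differencing `order` times: x[t] - x[t-1]."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise."""
    numerator = sum(prev * cur for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def forecast_next(series):
    """One-step forecast: model the differences, then add back the last value."""
    diffs = difference(series, order=1)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


values = [0, 8, 12, 14, 15]          # increments halve at every step
next_value = forecast_next(values)   # last value plus half the last increment
```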



XGBoost: where would we be without it



If a series has no pronounced internal structure, but there are external influencing factors (manager, weather, etc.), then machine learning models such as boosting, random forests, regression, neural networks and SVM can safely be used.
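To feed a series into such models, it is usually turned into a table of features. The sketch below shows one possible way, assuming that the previous values (lags) and the external factors for the target step are used as inputs; the function name, the rain flag and the data are hypothetical.

```python
def make_lag_features(series, external, n_lags):
    """Build a feature table X and a target list y for a regression model.

    Each row of X holds the previous `n_lags` values of the series followed
    by the external factors known for the step being predicted.
    """
    if len(series) != len(external):
        raise ValueError("series and external must have the same length")
    X, y = [], []
    for t in range(n_lags, len(series)):
        lags = series[t - n_lags:t]
        X.append(list(lags) + list(external[t]))
        y.append(series[t])
    return X, y


sales = [10, 12, 13, 15, 14]
weather = [[0], [1], [0], [1], [1]]  # hypothetical flag: 1 = rainy day
X, y = make_lag_features(sales, weather, n_lags=2)
```

The resulting `X` and `y` can be passed to any regressor, whether boosting, a random forest or a linear model.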



In the DATA4 team's experience, time series forecasting is one of the main tasks in optimizing warehouse costs, personnel costs, ATM network maintenance and logistics, and in building recommendation systems. Sophisticated models such as SARIMA give high-quality results, but they take a lot of time and suit only a certain range of tasks.



In the next article, we will consider the main approaches to the search for anomalies.



To keep the articles relevant to your interests, take the survey below or tell us in the comments which topics the next articles should cover.


