Hello, ladies and gentlemen. I would like to share my thoughts on programs for automated trading on exchanges and, in particular, on the use of sentiment analysis in this area.
First I would like to talk about the established approaches to trading robots. The classic method used in trading robots today is technical analysis. What is it? You have two axes: time runs along the x axis, and the price of some asset runs along the y axis. Technical analysis examines the price chart drawn on these two axes and predicts the future price from it. How can that be done? Well, for example, you can first approximate the chart (say, with splines) and then extrapolate it beyond the last known point; that way we try to predict the price.
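The "fit the chart, then extend it" idea can be sketched in a few lines. To stay dependency-free, this sketch fits a straight line by least squares instead of the splines mentioned above, so it is a simplified stand-in and not a real technical-analysis method. The function names and the sample prices are made up for illustration.

```python
def fit_line(prices):
    """Least-squares line through the points (0, p0), (1, p1), ...

    Returns (slope, intercept). A straight line is an assumption made
    for brevity; the talk mentions splines as one possible choice.
    """
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two prices")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(prices) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, prices))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def extrapolate_next(prices):
    """Predict the price at the next time step from the fitted line."""
    slope, intercept = fit_line(prices)
    return slope * len(prices) + intercept


history = [10.0, 12.0, 14.0, 16.0]  # made-up prices
print(extrapolate_next(history))  # 18.0
```

Note that the prediction uses nothing but the past chart, which is exactly the limitation discussed next.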
And now on to the essence of the issue.
The problem I was thinking about is predicting the next price value. In technical analysis, the predicted price depends on the previous chart; that is, we determine the future price based only on the past chart!
I once saw a piece of news that got me thinking about how to build crypto trading bots. On June 17, 2016, The DAO, a project built on Ethereum, was hacked. The price of Ethereum roughly halved, from $21.52 to $10.23, in one day.
It is clear that the fall in the price of Ethereum was caused by the news of the hack.
I can describe the current situation in the development of trading robots with the following example. Please do not judge it too harshly; it is the best I could come up with.
Please use your imagination.
Suppose your supervisor asked you to predict when a car engine will stall. To make the prediction, you have to sit in the engine compartment of the car. You have a team with you: a data scientist, a trader, and so on. And let's say that you are all very small and the engine is very large, so you all fit in the engine compartment. When developing cryptocurrency bots you have to guess when the price of a cryptocurrency will rise or fall; here you have to guess when the engine will stall. See the similarity?
So, you are sitting next to these guys. The data scientist writes down the pressure readings in the pipes that lead to the engine: time on the x axis, pressure on the y axis. From this he plots a graph and tries to predict when the engine will stall. The trader presses his ear against the engine and says: "I heard the engine sound like this last week when it started to stall, so it is about to stall now." Meanwhile, outside, in front of the car, there are guys with hammers who beat on the hood right above the engine. The guys with the hammers can conspire and all hit the hood above the engine at the same moment, and then the engine stalls. Do you see? You are trying to predict the future state of a system that depends on external data while using only internal data! So how can you find out when the engine will stall? Well, for example, you can open the hood and listen to what the guys with the hammers are saying.
Thus I came to the conclusion that to create a crypto trading bot, you need to analyze the news. But how can a computer analyze news if it does not understand what the news means? Libraries have recently appeared that let you evaluate the "mood" of a text. By "mood" I mean an assessment of how positive or negative the text is. Say, the sentence "The war has begun." is negatively colored, and the sentence "The war has ended." is positively colored. Accordingly, a program that performs sentiment analysis receives a string of text and returns a score in the range from -1.0 (negatively colored text) to 1.0 (positively colored text).
Next, I will describe the prototype of a crypto trading bot using sentiment analysis that I wrote.
So what do we have for sentiment analysis of text? I am writing the prototype in Python, and there is a library called textblob that can give a sentiment score for a text. Great, let's move on.
Next we need news. What is the oldest cryptocurrency news site? CoinDesk. It has been running since 2013, so that is 6 years of news. A great fit. To test the textblob library, I wrote a program that gives a sentiment score for the text in a text file, and I manually pasted the text of CoinDesk articles into that file.
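A test program of that kind might look roughly like the sketch below. It is not the original program. `TextBlob(text).sentiment.polarity` is textblob's polarity score in the range -1.0 to 1.0; the function names, the `scorer` parameter, and the file name in the usage note are assumptions made for illustration. The textblob import is done lazily so that the file-handling part can be tried with any other scoring function.

```python
from pathlib import Path


def textblob_polarity(text):
    """Polarity in [-1.0, 1.0] using textblob (third-party: pip install textblob)."""
    from textblob import TextBlob  # lazy import: only needed for this scorer
    return TextBlob(text).sentiment.polarity


def score_file(path, scorer=textblob_polarity):
    """Read a UTF-8 text file and return its sentiment score.

    `scorer` is a hypothetical hook: any function mapping text to a float.
    """
    text = Path(path).read_text(encoding="utf-8")
    return scorer(text)
```

With textblob installed, calling `score_file("article.txt")` on a file containing a pasted article returns a single number; the file name here is only an example.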
Results: the sentiment score was higher for positive news, that is, as intended. Next, all the news has to be scraped from the site. Website scraping is an extensive topic in itself, so I will only mention that I first collected the links to all the articles on the site, for which I wrote a crawler. Once all the links were collected, I wrote a program for scraping CoinDesk. Following the collected links, it parsed each news item into 3 components: the date, the headline, and the text of the article. Some pages did not parse (for example, announcements that CoinDesk has an open position and is looking for someone to fill it), so the scraping program has to handle such cases. In total, about 13,000 news items came out. All the news then has to be saved to disk in the format I described.
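Splitting a page into date, headline, and text could be sketched as follows, using only the standard library. The markup assumed here (a `<time datetime="...">` tag, one `<h1>` headline, body text in `<p>` tags) is hypothetical and is not taken from the real CoinDesk pages; `ArticleParser` and `parse_article` are illustrative names. Pages that lack any of the three parts, such as a job announcement, come back as `None` so the caller can skip them.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects the first <time datetime=...>, the first <h1>, and all <p> text."""

    def __init__(self):
        super().__init__()
        self.date = None
        self.headline = ""
        self.body_parts = []
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag == "time" and self.date is None:
            self.date = dict(attrs).get("datetime")
        elif tag == "h1" and not self.headline:
            self._current = "h1"
        elif tag == "p":
            self._current = "p"

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "h1":
            self.headline += data
        elif self._current == "p":
            self.body_parts.append(data)


def parse_article(html):
    """Return {'date', 'headline', 'text'} or None if the page is not an article."""
    parser = ArticleParser()
    parser.feed(html)
    headline = parser.headline.strip()
    text = " ".join(part.strip() for part in parser.body_parts if part.strip())
    if not (parser.date and headline and text):
        return None  # e.g. a job announcement without the expected parts
    return {"date": parser.date, "headline": headline, "text": text}
```

Each returned dictionary can then be written out, for example as one JSON object per line, which keeps the three components together for the later steps.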
Next, we need cryptocurrency prices since 2013. Some cryptocurrencies did not exist yet in 2013, so we simplify the task and predict only the price of bitcoin. To get prices from 2013 to 2019, we had to download datasets from 4 different sources and combine them into a time-ordered data structure; several sources were needed because we need a continuous series with no gaps. Once the time-ordered price structure is built, save it to disk.
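Combining several price sources into one time-ordered series, and checking that it is continuous, might be sketched like this. The representation (one dictionary of date to price per source), the rule that an earlier source wins when two sources overlap, and all names and sample prices are assumptions for illustration.

```python
from datetime import date, timedelta


def merge_price_sources(sources):
    """Merge {date: price} dicts into one dict ordered by date.

    Sources are given in priority order: on overlapping days the
    earlier source wins (an assumption made for this sketch).
    """
    merged = {}
    for source in sources:
        for day, price in source.items():
            merged.setdefault(day, price)
    return dict(sorted(merged.items()))


def missing_days(prices):
    """List the dates between the first and last day that have no price."""
    if not prices:
        return []
    first, last = min(prices), max(prices)
    gaps = []
    current = first
    while current <= last:
        if current not in prices:
            gaps.append(current)
        current += timedelta(days=1)
    return gaps


# Made-up sample data
source_a = {date(2013, 1, 1): 13.3, date(2013, 1, 2): 13.4}
source_b = {date(2013, 1, 2): 99.0, date(2013, 1, 4): 13.5}
merged = merge_price_sources([source_a, source_b])
print(missing_days(merged))  # [datetime.date(2013, 1, 3)]
```

An empty result from `missing_days` is the signal that enough sources have been combined.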
Figure: what we want to predict (Bitcoin price from 2013 to 2019).
Now we can build the mathematical model. How the model works: the articles for a given day are taken, and we determine which of them are related to bitcoin (for this, the program searches the news headline for the string "bitcoin" or "BTC"). For each article related to bitcoin we get a sentiment score; if there are several such articles, we take the arithmetic mean of their scores. A score above zero is taken to mean that the price of bitcoin will rise tomorrow; a score below zero is taken to mean that it will fall tomorrow. The result is a data structure that maps each day to the bitcoin price and to the sentiment-based guess about the direction of the price movement on the next day. This data structure also has to be saved to disk.
OK, let's look at what came out.
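The daily rule described above can be sketched as follows. Several details are assumptions, not taken from the prototype: the headline match is made case-insensitive, the score is computed from the article text, an average of exactly zero is treated as "no signal", and the scoring function is passed in as a parameter (for example a wrapper around textblob's polarity).

```python
from statistics import mean


def is_about_bitcoin(headline):
    """True if the headline mentions bitcoin or BTC (case-insensitive: an assumption)."""
    lowered = headline.lower()
    return "bitcoin" in lowered or "btc" in lowered


def daily_signal(articles, scorer):
    """Guess tomorrow's direction from one day's articles.

    articles: list of dicts with 'headline' and 'text' keys.
    scorer:   function mapping text to a score in [-1.0, 1.0].
    Returns 'up', 'down', or None when there is nothing to go on.
    """
    scores = [scorer(a["text"]) for a in articles if is_about_bitcoin(a["headline"])]
    if not scores:
        return None  # no bitcoin news that day
    average = mean(scores)
    if average > 0:
        return "up"
    if average < 0:
        return "down"
    return None  # exactly zero: treated as no signal (assumption)
```

Days that return `None` correspond to the gray points on the chart, `'up'` to green, and `'down'` to red.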
Three types of points are superimposed on the Bitcoin price chart: gray - no bitcoin news was found on this day, green - the assumption that the price will rise the next day, red - the assumption that the price will fall the next day.
Next, the model needs to mark on which days the bot's guess about the price was correct and on which it was wrong. This is possible because we have saved a data structure that maps each date to the price and to the sentiment-based guess about the direction of the price movement on the next day.
This graph reflects whether the assumption about the Bitcoin price movement was correct or not. There are 2 types of points superimposed on the Bitcoin price chart: green - the assumption about the direction of the Bitcoin price the next day was correct, red - the assumption about the direction of the Bitcoin price the next day was incorrect.
The model shows that the prototype bot guessed the direction of the Bitcoin price movement correctly on 832 days and incorrectly on 725 days.
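Counting correct and incorrect guesses could be sketched as below. The list-based layout and the function names are assumptions, as is the choice to count a day with an unchanged price as a miss. The last line only turns the two counts reported above into a hit rate.

```python
def count_hits(prices, signals):
    """Compare each day's guess with the actual move to the next day.

    prices:  list of daily prices.
    signals: signals[i] is 'up', 'down' or None, the guess made on day i
             about day i + 1.
    Returns (hits, misses); days without a guess are skipped.
    """
    hits = misses = 0
    for i in range(min(len(signals), len(prices) - 1)):
        signal = signals[i]
        if signal is None:
            continue
        change = prices[i + 1] - prices[i]
        # An unchanged price counts as a miss (assumption).
        correct = (signal == "up" and change > 0) or (signal == "down" and change < 0)
        if correct:
            hits += 1
        else:
            misses += 1
    return hits, misses


def accuracy(hits, misses):
    """Share of guesses that were correct."""
    return hits / (hits + misses)


print(round(accuracy(832, 725), 3))  # 0.534
```

So the reported counts correspond to a hit rate of a little over 53% across the 1,557 days on which the bot made a guess.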
I also checked what the profitability would have been if this prototype bot had been able to trade in 2013-2019. The trading algorithm for the prototype was as follows. On the very first day of trading, the bot has 1 BTC in its account. After that, if the data says bitcoin will fall in price, we sell BTC and buy USD; if the data says bitcoin will rise in price, we sell USD and buy BTC; if there is no news about bitcoin, we do nothing, since there is not enough information to act. Starting with 1 BTC, the bot would have had approximately 1.05 BTC by the end of the test period.
Nothing prevents you from scraping all the news from cnn.com since 1995 and matching it against stock prices on an exchange (say, the NYSE), and using a neural network for the sentiment scoring, trained on that same news. A neural network trained this way can be used to analyze the news flow in real time, and a trading robot for the same NYSE can be built on top of it. Whoever does this first gets a chocolate bar.
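The trading rule can be replayed over historical data with a sketch like the one below. Several simplifications are assumptions and not details of the prototype: the whole balance is traded at that day's price, there are no fees, and any USD left at the end is valued in BTC at the last price. The function name and the sample numbers are made up.

```python
def backtest(prices, signals, start_btc=1.0):
    """Replay the rule: 'down' means sell all BTC, 'up' means buy BTC with all USD.

    prices:  list of daily BTC prices in USD.
    signals: signals[i] is 'up', 'down' or None for day i.
    Returns the final holdings expressed in BTC.
    """
    if not prices:
        return start_btc
    btc, usd = start_btc, 0.0
    for price, signal in zip(prices, signals):
        if signal == "down" and btc > 0:
            usd, btc = btc * price, 0.0  # sell BTC, buy USD
        elif signal == "up" and usd > 0:
            btc, usd = usd / price, 0.0  # sell USD, buy BTC
        # None: no bitcoin news that day, so do nothing
    return btc + usd / prices[-1]


# Made-up example: sell at 100, buy back at 50, end up with 2 BTC
print(backtest([100.0, 50.0, 100.0], ["down", "up", None]))  # 2.0
```

Measuring the result in BTC, as the prototype did, shows whether the bot beat simply holding the initial 1 BTC.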