What happened on this momentous day?
On this day, the company MapR announced that it would suspend operations if it could not find funding to keep going. Later, in August 2019, MapR was acquired by HPE (Hewlett Packard Enterprise). But returning to June, one cannot fail to note how tragic this period was for the Big Data market. That month, the share price of Cloudera collapsed. Cloudera was a leading player in this market and had merged with the chronically unprofitable Hortonworks in January of the same year. The collapse was severe, amounting to 43%, and in the end Cloudera's capitalization fell from 4.1 to 1.4 billion dollars.
It must be said that rumors of a bubble inflating around Hadoop-based technology had been circulating since December 2014, yet the bubble bravely held out for almost five more years. These rumors were based on the fact that Google, the company where the technology behind Hadoop was born, had abandoned its own invention. But the technology held on while companies were moving to cloud-based processing tools and artificial intelligence was developing rapidly. Looking back, therefore, we can confidently say that the demise was expected.
Thus the era of Big Data came to an end. In the course of working with big data, however, companies came to understand its nuances and the benefits it can bring to a business, and they learned how to use artificial intelligence to extract value from raw data.
All the more interesting is the question of what will replace this technology and how analytics technologies will continue to develop.
Augmented Analytics
During the events described, companies working in data analysis did not sit still, as can be judged from the deals made in 2019. That year saw the largest deal in this market: Salesforce's acquisition of the analytics platform Tableau for 15.7 billion dollars. A smaller deal took place between Google and Looker. And of course, one cannot fail to note Qlik's acquisition of Attunity, a big data platform.
BI market leaders and Gartner experts speak of a tremendous shift in data analysis approaches, one that will completely destroy the BI market and lead to BI being replaced by AI. In this context, note that the abbreviation AI stands not for “Artificial Intelligence” but for “Augmented Intelligence”. Let's take a closer look at what is hidden behind the words “Augmented Analytics”.
Augmented analytics, like augmented reality, rests on several general postulates:
- the ability to communicate using NLP (Natural Language Processing), i.e. in human language;
- the use of artificial intelligence, meaning that the data will be pre-processed by machine intelligence;
- and, of course, recommendations presented to the user of the system, generated by that same artificial intelligence.
According to the vendors of analytical platforms, these products will be usable by people who have no special skills such as knowledge of SQL or a similar scripting language, no statistical or mathematical training, and no knowledge of the popular data processing languages and their libraries. These people, called Citizen Data Scientists, need only outstanding business qualifications. Their task is to capture business insights from the hints and forecasts that artificial intelligence gives them, and they can refine their guesses using NLP.
The work of a user with systems of this class can be pictured as follows. A person comes to work and launches the application. In addition to the usual set of reports and dashboards that can be analyzed using standard approaches (sorting, grouping, arithmetic operations), they see hints and recommendations, something like: “To achieve the KPI for number of sales, apply a discount to products in the "Gardening" category.” The person can also turn to the corporate messenger, such as Skype or Slack, and ask the robot questions by text or voice: “Show me the five most profitable customers.” Having received the answer, they must make the best decisions based on their business experience and bring the company profit.
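To make the scenario more concrete, here is a minimal sketch of what the robot would have to compute behind the question about the five most profitable customers. It uses exactly the "standard approaches" mentioned above: grouping, arithmetic, and sorting. The sales records, the customer names, and the function `top_customers_by_profit` are all invented for illustration and do not describe how any particular product works.

```python
from collections import defaultdict

# Hypothetical sales records: (customer, revenue, cost).
# All names and figures are invented for illustration only.
SALES = [
    ("Acme", 1200.0, 700.0),
    ("Birch", 400.0, 350.0),
    ("Acme", 300.0, 100.0),
    ("Cedar", 900.0, 250.0),
    ("Dune", 150.0, 140.0),
    ("Elm", 800.0, 500.0),
    ("Fir", 500.0, 100.0),
    ("Birch", 600.0, 200.0),
]


def top_customers_by_profit(sales, n=5):
    """Group sales by customer, sum the profit, return the n most profitable."""
    profit = defaultdict(float)
    for customer, revenue, cost in sales:
        # Arithmetic step: profit of a single sale.
        profit[customer] += revenue - cost
    # Sorting step: highest total profit first, then keep the first n entries.
    return sorted(profit.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    for customer, total in top_customers_by_profit(SALES):
        print(f"{customer}: {total:.2f}")
```

The promise of augmented analytics is that the business user never sees this step: the question is typed or spoken, and the grouping and sorting happen out of sight.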
Taking a step back to the composition of the analyzed information, products of the augmented analytics class can simplify people's lives at this stage too. Ideally, the user only needs to point the analytics product to the sources of the desired information, and the program itself takes care of creating the data model, linking the tables, and similar tasks.
All this should, first of all, ensure the “democratization” of data, i.e. anyone can analyze the entire body of information available to the company. The decision-making process should be supported by statistical analysis methods. Data access time should be minimal, since no scripts or SQL queries need to be written. And of course, the company can save on highly paid Data Science specialists.
Hypothetically, these technologies open up very bright prospects for business.
What replaces Big Data
But I did begin this article with Big Data, and I could not develop the topic without a brief excursion into modern BI tools, which are often built on Big Data. The fate of big data is now clearly settled: it lies in cloud technologies. I focused on the deals made with BI vendors to demonstrate that every analytics system now has cloud storage underneath it, and cloud services have BI as a front end.
Nor should we forget such pillars of the database field as Oracle and Microsoft. Their chosen direction of business development is also the cloud. Everything they offer can be found in the cloud, while some cloud services are no longer available on-premise. They have done significant work on machine learning models: they have created libraries accessible to users and built interfaces that make it convenient to work with a model, from selecting it to scheduling when it runs.
Another important advantage of cloud services, as stated by the vendors, is the availability of virtually unlimited data sets on any topic for training models.
However, the question arises: how well is cloud technology taking root in our country?