Trends and Forecasts in the Field of NLP (Natural Language Processing)
This article presents trends and forecasts from our September Almanac "Artificial Intelligence" No. 2, which reviews the market of technologies and companies working in NLP and in speech recognition and synthesis in Russia.
For this study, we surveyed industry experts and, in particular, asked them to forecast the development of the technologies themselves and of AI applications in various fields. We did not receive many answers, but a general picture can nevertheless be drawn. In this article, we summarize those answers and highlight the main trends.
General technology trends
End-to-end NLP Problem Solving
More and more solutions will be based on the end-to-end approach: for example, a neural network model receives an acoustic signal (sound waves) as input and produces an acoustic signal as output, with no intermediate text phase. This will significantly speed up model execution and improve quality, while worsening "transparency" and our understanding of "what's inside."
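To make the end-to-end idea concrete, here is a minimal, purely illustrative PyTorch sketch (our own toy assumption, not a system described by the survey respondents): the model maps a raw waveform tensor directly to another waveform tensor, with no transcript anywhere in between.

```python
import torch
import torch.nn as nn

# Toy illustration only: a direct speech-to-speech mapping with no text stage.
# Real end-to-end systems are far larger and trained on paired audio;
# this architecture is a placeholder.
class SpeechToSpeech(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Encoder: raw waveform -> latent sequence
        self.encoder = nn.Sequential(
            nn.Conv1d(1, hidden, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=9, stride=4, padding=4), nn.ReLU(),
        )
        # Decoder: latent sequence -> raw waveform (no text in between)
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(hidden, hidden, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(hidden, 1, kernel_size=8, stride=4, padding=2),
        )

    def forward(self, waveform):            # waveform: (batch, 1, samples)
        return self.decoder(self.encoder(waveform))

model = SpeechToSpeech()
audio_in = torch.randn(1, 1, 16000)         # one second of 16 kHz audio
audio_out = model(audio_in)                  # audio in, audio out, no text phase
print(audio_out.shape)
```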
Speech recognition and generation approaching human quality
In the coming years, speech recognition quality will improve significantly, with error rates approaching the human level. Recognition of overlapping speech from several people speaking with different accents in noisy environments will improve. Systems will add analysis of acoustic scenes: recognizing the speakers' gender and age, the emotional coloring of their speech, and the nature of the surroundings.
Synthesized speech will be indistinguishable from human speech, and it will be possible to synthesize the voice of any person.
Multilingualism
In the near future, multilingual translation models will appear, driven in part by transfer learning and by the use of much larger monolingual corpora in addition to parallel corpora. As a result, translation quality for low-resource languages (those with relatively small training samples) will improve significantly.
Manual translation will be completely superseded by machine translation thanks to a deeper machine understanding of the context and subject matter of documents. As speech recognition and speech synthesis technologies mature, machine simultaneous interpretation will appear on a 5-10 year horizon.
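As a rough illustration of how pretrained translation models are already applied today, here is a minimal sketch using the Hugging Face transformers library; the specific checkpoint ("Helsinki-NLP/opus-mt-en-ru") is just one publicly available example, not the future multilingual systems the experts are forecasting.

```python
# Minimal sketch: applying an off-the-shelf pretrained translation model.
# The checkpoint name is an illustrative choice, not an endorsement.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-ru"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

text = "Machine translation quality keeps improving for low-resource languages."
batch = tokenizer([text], return_tensors="pt", padding=True)
translated = model.generate(**batch)
print(tokenizer.decode(translated[0], skip_special_tokens=True))
```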
Understanding the meaning of texts
Other applications based on understanding context-specific meaning will appear on the same 5-10 year horizon: various dialogue and help-desk services that understand the context of a conversation, answer user questions intelligently, and steer the dialogue in the right direction. Deeper machine understanding of language will take automatic processing of text streams on the Internet and in social networks to a new level: collecting and compiling facts and analyzing them for consistency and reliability.
Text Generation
End-to-end neural networks will universally replace the classic NLG pipeline. GPT-2-level models already make it possible to create fairly long articles on arbitrary topics in a given domain with controlled content. On a 5-year horizon, neural network models will be able to generate texts no worse than humans do. And then automatically generated content will flood the world.
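To show what "GPT-2-level" generation looks like in practice, here is a minimal sketch using the publicly released GPT-2 model through the Hugging Face transformers library; the prompt and sampling parameters are arbitrary choices for illustration.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Arbitrary prompt; top-p sampling keeps the continuation fluent but varied.
prompt = "In the next five years, natural language processing will"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=80,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```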
Platforms and Cross-Platform Solutions
Many solutions will become standard, and numerous platforms will emerge for building applications on top of voice interfaces. Cloud platforms will improve in response time, capacity and security. Investment is forecast to grow not in standalone interactive services (chatbots) but in multifunctional platforms and cross-platform solutions, thanks to which a voice assistant will work equally well across devices. As a result, we will be able to start a conversation with our assistant in the smart home, continue it on the road in the car, and then at work on our office computer, all without losing the context of the conversation.
Small Data Technologies
The value of machine learning methods that work effectively with small amounts of raw data will grow: transfer learning and knowledge transfer. In such applications, wider use of GANs (generative adversarial networks) to generate training data is also expected.
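As a rough illustration of the transfer-learning side of this trend, the sketch below reuses a pretrained sentence encoder as a fixed feature extractor and trains only a small classifier on a handful of labeled examples. The model name and the toy dataset are assumptions for demonstration; the GAN-based augmentation mentioned above is not shown.

```python
# Minimal transfer-learning sketch: a frozen pretrained sentence encoder
# plus a small classifier trained on very little labeled data.
# The checkpoint name and toy examples are illustrative assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # pretrained on large corpora

# A deliberately tiny labeled set: 1 = complaint, 0 = praise.
texts = [
    "The service was terrible and slow.",
    "I am very happy with my purchase.",
    "Support never answered my request.",
    "Great product, works perfectly.",
]
labels = [1, 0, 1, 0]

# Transfer: reuse the encoder's knowledge, train only the small head.
features = encoder.encode(texts)
clf = LogisticRegression().fit(features, labels)

print(clf.predict(encoder.encode(["The delivery was delayed again."])))
```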
Architectures with lower computing requirements
As neural network models move from laboratory walls to commercial data centers, energy-efficiency requirements will increase. New, more efficient computing architectures are expected: for example, sparse networks that combine the best qualities of distributed and symbolic computation, with model complexity that adapts to the amount of training data.
Market trends
Ubiquitous implementation of voice interfaces
The development of speech-to-text technologies will be the first step toward simplifying office tasks (for example, planning a manager's time, searching for documents and processing confidential information). As recognition accuracy, depth of understanding and quality of speech synthesis improve, voice interfaces will be integrated into almost all devices: dialogue systems in the smart home, the car and household appliances, avatar bots, and assistant bots.
Explosive growth of voice robots
We expect explosive growth in the number of intelligent assistants across business sectors, including the customer-facing services of banks, retailers, telecoms and other companies that interact heavily with customers. All verbal communication with a mass audience in the most popular services will be conducted by robots. Robots will learn to recognize emotions sensitively, including through multimodal assessment, and will themselves use an emotional component in conversation.
Natural Language Information Search
Demand is growing for intelligent search that accepts queries in natural language. More and more organizations want to quickly find unstructured data across all internal sources, automatically determine its content and highlight significant facts in specialized legal or financial texts. Thanks to the development of deep models for extracting facts from texts and summarizing their contents, the quality of information retrieval will improve significantly.
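One common way such natural-language search is built today is with sentence embeddings: documents and the query are embedded in the same vector space and ranked by cosine similarity. The sketch below is a minimal illustration; the model name and sample documents are assumptions.

```python
# Minimal sketch of natural-language search over internal documents:
# embed documents and the query, then rank by cosine similarity.
# The model name and the sample documents are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Loan agreement between the bank and the borrower, signed in March.",
    "Quarterly financial report with revenue and cost breakdown.",
    "Internal memo on the new vacation policy.",
]
query = "Which document describes the company's revenue for the quarter?"

doc_vecs = model.encode(documents, convert_to_tensor=True)
query_vec = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_vec, doc_vecs)[0]
best = int(scores.argmax())
print(f"Best match (score {float(scores[best]):.2f}): {documents[best]}")
```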
In-house expertise
Apparently, large companies (banks, telecoms, industrial firms) will build up their own AI expertise, including conversational AI, with in-house teams of linguists, data scientists, NLP engineers and so on. Outsourcing of individual tasks will remain limited in the near future. We are already seeing rapid growth of AI teams at many large companies. Whether this is good or bad is a topic for a separate article, but the trend is clear.
Industry Trends
Finance and Insurance
In the short term, banks will focus on extracting maximum value from the data they have already accumulated, using AI in general and NLP in particular. In the long run, there is a steady trend toward unifying and simplifying banking processes so that they can be carried out with little or no human participation (opening an account, risk assessment, building a credit file, scoring, etc.). NLP will be combined with other technologies (computer vision, RPA, remote identification, etc.).
Industry and Logistics
Thanks to NLP technologies, we can expect a new generation of tools for drafting project documentation, as well as the emergence of systems that check the consistency of documents describing complex technical objects. Further ahead, we can predict the emergence of automated inspection-planning systems based on NLP analysis of project documentation and standards.
With the advent of systems that understand the meaning of texts, a definitive solution to the problem of normalizing item nomenclatures is expected on a 5-10 year horizon.
Medicine
The widespread introduction of voice interfaces will largely free doctors from typing notes and will produce automatically annotated medical histories. The appearance of large annotated text corpora will make possible a new class of clinical decision support systems (SPPVR, to use the Russian acronym) based on NLP technologies.
IT and telecommunications
Widespread use of voice biometrics (authentication and authorization by voice) is expected for delivering services based on personalized data. Telecom operators, owning the voice channel of communication with the customer, will have a chance to occupy a unique position in the digital services ecosystem. On the other hand, voice messengers rely on the same basic speech recognition and synthesis technologies. We are in for an interesting period of battles between telecom giants and instant messengers over the voice channel to the customer.
Legal practice
On a 3-5 year horizon, we can expect widespread adoption of technologies for automatic contract review and, more broadly, automation of contractual work, including verification of compliance with obligations.
In the next 5-10 years, we can expect the emergence of models that understand legal texts. Based on them, systems will appear that, given a user's question posed in natural language, return an answer that concisely summarizes the existing regulatory documentation, including inconsistencies and alternative interpretations.
The lawyer's computer will cease to be a mere reference tool and will become a full-fledged decision support tool. One of its main tasks will be predicting the outcome of a lawsuit by building a probabilistic decision tree based on existing case law. Most of this work will probably take place in the cloud on trained models of enormous size.
We also expect a mass emergence of niche services, products and companies, each solving a specific problem in the legal domain.
We can expect deeper integration of RPA solutions with NLP technologies, which will shift routine information processing and data entry tasks to software robots.
And finally, the prospect of blockchain smart contracts generated automatically from the analysis of legally binding documents such as contracts or NDAs looks absolutely captivating. Such a combination of technologies could bring self-executing legal documents to life, which still sounds like science fiction but is not far from implementation.
Media and Advertising
We expect widespread adoption of personalized marketing based on online analysis of a person's digital footprint. This will include deep analysis of the texts people write and their sentiment: not just whether a text is negative or positive overall, but its attitude toward a specific product or brand.
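A crude approximation of such brand-targeted sentiment can already be put together from off-the-shelf components: score only the sentences that mention the brand rather than the whole text. The sketch below uses a generic sentiment pipeline from the Hugging Face transformers library; the brand name and review are made up, and real aspect-based sentiment models are considerably more sophisticated.

```python
# Crude sketch of brand-targeted sentiment: score only the sentences that
# mention the brand, not the text as a whole. The brand and review are
# invented examples; production aspect-based sentiment is more involved.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")

review = (
    "The delivery was quick and the courier was polite. "
    "However, the Acme blender broke after two days. "
    "Customer support eventually refunded me."
)
brand = "Acme"

for sentence in review.split(". "):
    if brand.lower() in sentence.lower():
        result = sentiment(sentence)[0]
        print(f"{result['label']} ({result['score']:.2f}): {sentence}")
```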
Each person will have a personal shopping assistant that will take over up to 90% of routine purchases.
Services will appear that automatically generate news about a particular company based on its history and on internal and external events.
Science and education
In the next 5-10 years, we can expect the emergence of models that understand scientific texts. Systems will appear that answer a user's question posed in natural language with a short summary of the existing scientific literature on the issue, including any contradictions found and competing hypotheses. Another application of such models is recommendation systems for research and patent landscape analysis.
Such systems will fundamentally change the technology landscape and accelerate technology transfer by identifying experts and expert communities in a given area through analysis of scientific and patent information sources.
Also on a 5-10 year horizon, we expect full-fledged teacher assistants to emerge for each discipline and for educational institutions as a whole. On the other side, students will gain personal assistants that guide a person along an individual educational path throughout life. The interaction between these intelligent agents is also likely to be in natural language.
State and Security
States are increasingly moving their activities into the media space and social networks. The concept of "information wars", which has emerged in recent years, has taken on very concrete forms and requires new kinds of "weapons" and "defenses". A powerful trend is already visible, and demand for fake news detection will only grow. Unfortunately, one can also confidently predict growing demand for the automated generation of various kinds of fake news. AI will be used more and more both to create bots in social networks and to identify them.
Intelligence work is no less important. AI will increasingly be used to analyze large amounts of information about companies, people and transactions in various forms, solving applied problems such as finding affiliations and implicit relationships between companies and individuals.
As the population grows, automating communication with citizens in order to deliver services to them becomes ever more urgent for the state. AI, probably in the form of intelligent agents, will be actively used to personalize state and municipal services for each citizen: the so-called "cognitive cities" and "state as a service".
The full Almanac “Artificial Intelligence” on NLP and speech recognition / synthesis can be downloaded here.