Bots still cannot talk the way humans do, but Facebook's AI researchers are actively pushing into this area, and the results could seriously affect the company's messengers, and not only them.
Chatbots were a hugely popular topic in 2015. One of the best known was Facebook's M, which, according to the company's plans, was supposed to become a flexible general-purpose bot capable of many things: ordering goods, delivering gifts, reserving restaurant tables, and planning travel. The hype, however, far outpaced the results. When Facebook tested M on 2,500 people in the San Francisco Bay Area, the program failed to cope with most of the tasks assigned to it.
After the first surge of enthusiasm for M and other chatbots (Microsoft CEO Satya Nadella went so far as to declare that "bots are the new apps"), a wave of disappointment followed. Chatbots were poor conversationalists and sounded robotic, because they had been taught to talk about very narrow topics and perform very specific tasks. They could not hold a natural conversation with people or give answers based on an understanding of words and their meaning; they could only produce generic remarks.
Even before M left beta testing, Facebook scaled back its grand plans for the bot, although some of its natural-language technology made its way into less ambitious Facebook Messenger chatbots capable of simple, single tasks such as taking a food order or answering from a fixed list of questions and answers. Companies such as American Express and 1-800-FLOWERS still use similar simple chatbots to answer support questions, take simple orders, and report account balances. Many will still hand you over to a person if you ask a question outside their limited competence.
However, Facebook's AI research team has already moved beyond projects such as simple chatbots. "For the past three to four years, we have been saying that we are not going to follow the path of studying goal-directed dialogue; it is too difficult a task with stakes that are too high," Antoine Bordes, a natural-language researcher at Facebook, told me. If a travel chatbot "books the wrong plane, the wrong flight, it will be a very big mistake in terms of money, travel, and so on," he says.
Instead of focusing on the mechanics of specific tasks, Bordes says, Facebook is taking a step back to a deeper problem: training virtual agents to communicate the way people do. If chatbots can better understand and talk with people, then, as the company envisions it, they will eventually become better assistants that can help with practical tasks, such as booking those same tickets.
Facebook is investing heavily in this work, hiring top natural-language AI experts. The company likes to point out that, unlike other technology giants, it publishes its AI research online for the entire research community, hoping this will help others build the next generation of AI. But this research will, of course, also make its way into Facebook's own products.
Messaging apps such as Messenger and WhatsApp (Facebook still has not figured out how to monetize the latter) seem like a natural place to apply this work. Zuckerberg has talked about the company's plan to concentrate on private communication, so Messenger and WhatsApp will have to add new features in order not to cede ground to similar platforms, in particular WeChat, Telegram, and Apple's iMessage.
Creating an algorithm that can hold a free-flowing conversation with a person has become a key goal for technology companies. Amazon, Google, and Microsoft are joining Facebook in betting on humanlike communication, not only through text messengers but also through voice assistants and in other ways. Thanks to recent research, the path to a computer that can truly converse has suddenly become clearer; the medal for first place, however, is still waiting for its winner.
In other words, Facebook’s natural language research goes far beyond simply resurrecting M or improving chatbots in Messenger. It is connected with the future of the whole company.
Introducing the Neural Network
Creating a digital agent capable of holding a credible conversation with a person is probably the hardest task in natural language processing. The machine must learn a dictionary's worth of words, along with their usage and nuances, and then deploy them in live communication with an unpredictable human.
Only in the last few years has the natural-language AI community begun to take big steps toward a general-purpose bot, largely thanks to breakthroughs in neural networks: machine-learning algorithms that recognize patterns by analyzing huge amounts of data.
For most of AI's history, humans have supervised programs throughout the machine-learning process. In a technique called supervised learning, a person slowly trains a neural network by supplying the correct answers to problems and then adjusting the algorithm until it reaches the same solutions.
Supervised learning works well when there is a large amount of meticulously labeled data, such as photographs of cats, dogs, or other objects. But this approach often fails in the chatbot world. It is hard to find thousands of hours of labeled person-to-person conversations, and producing that volume of data at a single company would be very expensive.
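The supervised setup described above can be sketched in a few lines of Python. This is a toy, hypothetical intent classifier; the data and the word-counting "model" are illustrative, not Facebook's actual approach. The key point is that every training sentence arrives with a human-supplied label, and the model learns only from those answers.

```python
# Toy supervised learning: each training example carries a label that
# a human has provided in advance (hypothetical data, for illustration).
from collections import Counter

labeled_data = [
    ("hi there", "greeting"),
    ("hello bot", "greeting"),
    ("good morning", "greeting"),
    ("one large pizza please", "order"),
    ("i want to order flowers", "order"),
    ("send me two pizzas", "order"),
]

def train(data):
    """Count how often each word appears under each human-given label."""
    counts = {}
    for sentence, label in data:
        for word in sentence.split():
            counts.setdefault(word, Counter())[label] += 1
    return counts

def classify(model, sentence):
    """Pick the label whose vocabulary best matches the sentence."""
    score = Counter()
    for word in sentence.split():
        score.update(model.get(word, Counter()))
    return score.most_common(1)[0][0] if score else "unknown"

model = train(labeled_data)
print(classify(model, "hello there"))    # greeting
print(classify(model, "order a pizza"))  # order
```

The six labeled pairs above are the expensive part: for open-ended conversation, assembling enough such labeled dialogue is exactly the bottleneck the article describes.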
Since the old methods make it difficult to teach chatbots to talk, researchers are looking for alternatives to supervised learning, so that neural networks can learn from data on their own, without human intervention.
One way to reduce the need for training data is to give the machine common sense at a basic level. If a computer can understand the world around it, such as the relative sizes of objects, how people use them, and some notion of how the laws of physics affect them, it can probably narrow the range of options down to the genuinely possible ones.
People do this naturally. Say you are driving along a steep cliff and suddenly see a large boulder on the road. You need to avoid a collision with it, but in weighing your options you are unlikely to swerve sharply toward the cliff: you know gravity would send the car onto the rocks below.
Yann LeCun
"Much of human learning comes from observing the world around us," said Yann LeCun, Facebook's vice president and chief AI scientist, an AI legend who has worked on the field's hardest problems since the 1980s. "We learn a lot from parents and other people, but also just from interacting with the world: trying to do something, failing, and adjusting our behavior."
AI trained with a technique called unsupervised learning works in a similar way. A self-driving car, for example, gathers data about the world through many sensors and cameras, like a child studying the world through its five senses. With this approach, scientists feed the machine a large amount of training data, but they do not give it correct answers or push it toward a specific goal. They ask it only to process the data and learn from it: to find patterns and build relationships between the data points.
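A minimal sketch of that idea in Python (illustrative only; real systems use neural networks, not the set overlap used here): the program receives raw, unlabeled text and is asked for nothing except to find structure, such as which words occur in similar contexts.

```python
# Unsupervised pattern-finding on unlabeled text: no labels, no goal,
# just "find structure". Toy corpus and scoring, for illustration.
from collections import defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Record the words seen immediately around each word (its context).
contexts = defaultdict(set)
for i, word in enumerate(corpus):
    if i > 0:
        contexts[word].add(corpus[i - 1])
    if i < len(corpus) - 1:
        contexts[word].add(corpus[i + 1])

def similarity(a, b):
    """Overlap of contexts: words used alike score higher together."""
    shared = contexts[a] & contexts[b]
    total = contexts[a] | contexts[b]
    return len(shared) / len(total)

# "cat" and "dog" occur in similar positions, so the structure of the
# raw text alone groups them together, without any human labeling.
print(similarity("cat", "dog") > similarity("cat", "mat"))
```

Nobody told the program that cats and dogs are related; the relationship emerged from the patterns in the data, which is the point of the approach.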
In many cases the necessary data is hard to find. But there is one area of AI in which a neural network can learn a lot about the world without any sensors: natural language processing. Researchers can use the vast amount of existing text to help algorithms understand the human world, a necessary part of understanding language.
Suppose a neural network is given the following sentences to consider:
The prize did not fit into the suitcase because it was too big.
The prize did not fit into the suitcase because it was too small.
To understand that "it" refers to a different object in each sentence, a model needs to understand the properties of real-world objects and the relationships between them. "The text they are trained on contains enough structure to understand that if you have one object that fits inside another, then one of them may not fit if it is too large," says LeCun.
This technique could be the key to a new generation of more useful and sociable Facebook chatbots.
Meet BERT and RoBERTa
The current wave of breakthroughs in unsupervised learning for natural language processing began at Google in 2018. The company's researchers built a deep-learning model called BERT (Bidirectional Encoder Representations from Transformers) and fed it the unlabeled text of 11,038 books plus 2.5 billion words of English-language Wikipedia. The researchers randomly removed words from the texts and tasked the model with filling in the missing ones.
After analyzing all that text, the neural network had found patterns in words and sentences that often appear in the same context, which helped it grasp basic relationships between words. And since words represent objects and concepts in the real world, the model learned more than linguistic relationships: it began to understand how objects relate to one another.
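The masked-word training signal can be sketched with a toy stand-in for BERT. This is a hypothetical example: real BERT uses a transformer over billions of words, not the co-occurrence counts used here. The shape of the task is the same, though: hide a word, then predict it from its neighbors.

```python
# Toy version of BERT's fill-in-the-blank objective: hide a word and
# predict it from context. The "model" here is just co-occurrence
# counts over a made-up corpus, purely for illustration.
from collections import Counter, defaultdict

corpus = [
    "the trophy did not fit into the suitcase",
    "the book did not fit into the bag",
    "the shoes did not fit into the box",
]

# Count which word fills each (left neighbor, right neighbor) slot.
slot_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words) - 1):
        slot_counts[(words[i - 1], words[i + 1])][words[i]] += 1

def predict_masked(left, right):
    """Guess the hidden word from its immediate context."""
    candidates = slot_counts.get((left, right))
    return candidates.most_common(1)[0][0] if candidates else None

# "the trophy did [MASK] fit into the suitcase" -> "not"
print(predict_masked("did", "fit"))
```

The crucial property, which carries over to the real model, is that the training signal comes from the text itself: no human had to label anything, so the whole of Wikipedia becomes usable training data.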
BERT was not the first model to use unsupervised learning to understand human language, but it was the first to learn the meaning of a word from its context.
"I would say this project ranks among the top two breakthroughs in natural language processing," said Jianfeng Gao, research manager in the Deep Learning Group at Microsoft Research. "People use this model as a baseline for building all other natural-language-processing models." The BERT paper has already been cited more than 1,000 times, as other researchers build on it.
Among them are LeCun and his team. They created their own version of the model, optimized it, and expanded the training data and training time. After billions of computations, Facebook's neural network, called RoBERTa, performed considerably better than Google's: it reached 88.5% accuracy, versus 80.5% for BERT.
BERT and RoBERTa represent a radically new approach to teaching computers to communicate. "In the process, the system has to represent the meaning of the words it encounters, the structure of sentences, the context," says LeCun. "In the end it seems to grasp the meaning of language, which is rather strange, because it knows nothing about the physical reality of the world. It has no vision, no hearing, nothing. All it knows is language: letters, words, and sentences."
Approaching a real conversation
LeCun says a natural-language model trained with BERT or RoBERTa will not develop much real common sense; it learns just enough to generate chat replies from a broad base of generalized knowledge. This is only the start of teaching an algorithm to speak like a person.
Facebook's natural-language researchers are also trying to build more conversational finesse on top of RoBERTa. They began by studying people's conversations with chatbots to understand when a conversation turns boring or breaks down. Their findings suggest ways to train a bot to avoid the most common conversational mistakes.
For example, chatbots often contradict themselves because they do not remember what they said earlier. A chatbot may say that it loves watching Knight Rider and later declare that it dislikes the show. Chatbots that generate their own replies (rather than retrieving lines from training data) often answer vaguely to avoid making mistakes, and they tend to come across as unemotional, which makes talking to them less interesting.
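One way to address the self-contradiction problem can be sketched as a hypothetical illustration (the class and its memory scheme are invented for this example, not Facebook's actual method): give the bot a record of what it has already claimed and veto replies that conflict with it.

```python
# Hypothetical sketch: a bot with a memory of its own past claims,
# so it cannot say it loves a show and later say it dislikes it.
class ConsistentBot:
    def __init__(self):
        self.stated = {}  # topic -> opinion the bot has committed to

    def say(self, topic, opinion):
        """Return a reply, falling back to the remembered opinion
        when the candidate reply would contradict an earlier one."""
        if topic in self.stated and self.stated[topic] != opinion:
            return f"As I said, I {self.stated[topic]} {topic}."
        self.stated[topic] = opinion
        return f"I {opinion} {topic}."

bot = ConsistentBot()
print(bot.say("Knight Rider", "love"))     # I love Knight Rider.
print(bot.say("Knight Rider", "dislike"))  # As I said, I love Knight Rider.
```

A real system would need to detect contradictions in free-form generated text rather than in neat (topic, opinion) pairs, but the underlying idea, conditioning new replies on a memory of past ones, is the same.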
Chatbots also need to draw on knowledge to be interesting to talk to. A bot that can use a wide range of information is far more likely to sustain long dialogues with people. Existing chatbots, however, are trained on knowledge from the single area that matches their assigned task, which becomes a problem when a person brings up topics outside the bot's competence. Ask a pizza-ordering bot about anything other than pizza, and the conversation quickly dies.
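The single-domain limitation can be illustrated with a hypothetical multi-domain lookup (all names and facts below are invented for the example): a bot that searches several knowledge areas can keep talking where a pizza-only bot would stall.

```python
# Illustrative multi-domain knowledge store: the bot searches every
# area, not just the one it was built for, and degrades gracefully.
knowledge = {
    "pizza": {"menu": "We have margherita and pepperoni."},
    "movies": {"knight rider": "Knight Rider is a 1980s TV series."},
    "travel": {"paris": "Paris is the capital of France."},
}

def answer(question):
    """Search all domains for a relevant fact instead of only one."""
    q = question.lower()
    for domain, facts in knowledge.items():
        for topic, fact in facts.items():
            if topic in q:
                return fact
    return "I don't know, but tell me more."  # graceful fallback

print(answer("What do you know about Paris?"))
```

Real systems retrieve from far larger corpora with learned relevance models rather than substring matching, but the design question is the one the article raises: how to pull in knowledge from many areas and weave it into the conversation.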
To address this, Facebook researchers are working to train natural-language models to draw data from many areas of knowledge and weave that information naturally into a conversation. Future work will focus on teaching bots how and when to steer a conversation from general chat toward a specific task.
One of the biggest challenges in chatbot development is keeping them learning after launch. The meanings of words change over time, and new terms and jargon become culturally important. At the same time, a chatbot must not be too suggestible: Microsoft's Tay bot absorbed too much, too fast from its online conversations and turned into a foul-mouthed racist within 24 hours. Facebook is teaching experimental chatbots to learn from good conversations and to analyze the other person's language to tell whether the bot has said something dull or dumb.
It is hard to predict exactly when Facebook's laboratory breakthroughs will yield chatbots that can hold conversations even slightly resembling human ones, but it may not be long before you can judge the results yourself. "We believe we are very close to having a bot that can talk to people in a way they find valuable," Facebook researcher Jason Weston told me.