Reverse engineering the Yandex.Station activation protocol





Yandex.Station is a smart speaker with the voice assistant Alice. To activate it, you hold your phone next to it and play a sound from the Yandex app. Below, I explain how this signal works, show that the WiFi password is transmitted in plaintext, and try to develop the idea of transmitting data through sound.



Preamble



I graduated from the radio engineering faculty of MIPT, and I have studied and developed communication systems, from physical layer protocols to federal-scale networks. So when my friends gave me a Yandex.Station, I immediately wondered how the data transfer for activation through the audio token was organized.



Activation process



When you turn on a new speaker, you somehow need to give it the information required to connect to a WiFi network and to authorize with Yandex services. The Station does this through sound, as shown in the video below (7:34).







"... information is being transmitted, well, not by sound, of course ..." - says Valentin. If only he had known that at that moment his WiFi password ended up in the video almost in plaintext! But more on that later.



In the meantime, let's look at what is happening. The phone takes the WiFi network data (the ssid from the system, the password entered by the user) and the data for authorization in Yandex. These are somehow encoded, modulated and played through the phone's speaker. The Station demodulates the signal from its microphones, decodes the data and uses it to connect to the network and authorize.



In this process, we are interested in how the data is encoded and modulated.



Visual demodulation



You don't need the Station itself to obtain a signal sample. The phone only has to be connected to WiFi with internet access. I decided to create access points with different ssids and passwords to see how the signal changes. For convenience, I recorded the sound to files and worked with those.



To start, I created an access point with an arbitrary password, “012345678”, and connected a phone to it. I tapped “Play Sound” and recorded the resulting signal. Let's look at its spectrum over time (a waterfall plot). Here the vertical axis is frequency, the horizontal axis is time, and the color shows the amplitude.





So, we can see that frequency modulation is used, and the data is transmitted in symbols of 40 ms each. An increasing subsequence also stands out:







Stop! Our password was increasing too: "012345678". What do these digits look like in ASCII or UTF-8? "30 31 32 33 34 35 36 37 38". Wow! I didn’t even have to change the password! Here it is:







I then changed the password and confirmed that I had identified its position in the signal correctly.
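To make the observation concrete, here is a small sketch of why an increasing password shows up as an increasing pattern. It assumes each hex digit of the ASCII bytes is read as a separate value, high digit first; the variable names are mine, not taken from the Yandex app.

```python
password = "012345678"

# ASCII digits '0'..'8' are the bytes 0x30..0x38.
hex_bytes = password.encode("ascii").hex(" ")
print(hex_bytes)  # 30 31 32 33 34 35 36 37 38

# Assumption: every hex digit becomes one transmitted value, high digit first.
nibbles = [int(ch, 16) for ch in password.encode("ascii").hex()]
print(nibbles)  # [3, 0, 3, 1, 3, 2, 3, 3, 3, 4, 3, 5, 3, 6, 3, 7, 3, 8]
```

Every second value stays at 3 while the values in between climb from 0 to 8, which matches a rising staircase interleaved with a constant tone.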



It turns out that the data is encoded in 4-bit symbols. In effect, a hex string is transmitted, where each value 0 - F has its own frequency, from 1 kHz to 4.6 kHz in steps of 240 Hz. In addition, at the beginning and at the end of the transmission there are tones at frequencies above 5 kHz: the start and end markers are separated from the main part at the physical level.
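The mapping described above can be written down in a few lines. This is only a sketch of the payload tones under the parameters measured in this article (1 kHz base, 240 Hz step, 40 ms symbols). The sample rate and the function names are my own assumptions, and the start and end markers and the periodic 4-symbol inserts are deliberately left out, since their exact structure is not established here.

```python
import math

BASE_HZ = 1000.0      # frequency of hex value 0 (measured in the article)
STEP_HZ = 240.0       # spacing between adjacent values (measured in the article)
SYMBOL_S = 0.040      # symbol duration, 40 ms (measured in the article)
SAMPLE_RATE = 44100   # assumption: a common audio sample rate


def hex_to_frequencies(hex_string):
    """Map each hex digit 0-F to its tone frequency in Hz."""
    return [BASE_HZ + STEP_HZ * int(ch, 16) for ch in hex_string]


def synthesize(hex_string):
    """Return float samples: one constant-frequency sine burst per hex digit."""
    samples_per_symbol = round(SAMPLE_RATE * SYMBOL_S)
    samples = []
    for freq in hex_to_frequencies(hex_string):
        # Phase restarts at every symbol; a real transmitter may smooth this.
        for n in range(samples_per_symbol):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


freqs = hex_to_frequencies("30313F")
print(freqs)  # [1720.0, 1000.0, 1720.0, 1240.0, 1720.0, 4600.0]
```

With 4 bits per 40 ms symbol, the raw rate of the payload works out to 100 bit/s, before any overhead from markers or inserts.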



Decoding



To avoid copying symbols onto a piece of paper every time I look at the spectrum, I sketched a simple Python receiver that converts the audio file into the original hex string reliably enough. I then started changing the ssid of the access point and analyzing which bytes this affects. It turned out that the information about the ssid is stored in two bytes before the password. Moreover, the length of this block does not depend on the length of the ssid. How so?
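The comparison step can be sketched as a diff of two decoded hex strings. The helper names are hypothetical, and the two captures below are made-up placeholder strings chosen only to show the idea, not real recordings.

```python
def changed_positions(hex_a, hex_b):
    """Return indices of hex digits that differ between two equal-length captures."""
    if len(hex_a) != len(hex_b):
        raise ValueError("captures differ in length; align them first")
    return [i for i, (a, b) in enumerate(zip(hex_a, hex_b)) if a != b]


def group_runs(positions):
    """Collapse sorted indices into inclusive (start, end) runs."""
    runs = []
    for p in positions:
        if runs and p == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], p)
        else:
            runs.append((p, p))
    return runs


# Made-up captures: same password, different ssid (illustrative values only).
capture_a = "a1b2" + "303132" + "ff"
capture_b = "7c0e" + "303132" + "ff"
print(group_runs(changed_positions(capture_a, capture_b)))  # [(0, 3)]
```

A single run of four hex digits that changes with the ssid, whatever its length, is what a fixed two-byte field would look like.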



Probably only a hash of the ssid is transmitted to the Station. In that case, the Station most likely computes the hashes of the names of all visible networks after it is switched on, and then selects the network by comparing those values with the received one. This was most likely done to reduce the packet length. (But how does it then connect to hidden networks?)
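A rough sketch of how such matching might work on the Station side. The actual hash function is unknown to me; the low 16 bits of CRC-32 are used here purely as a stand-in, and both function names and the network names are hypothetical.

```python
import zlib


def ssid_hash16(ssid):
    """Hypothetical 2-byte hash: low 16 bits of CRC-32. The real function is unknown."""
    return zlib.crc32(ssid.encode("utf-8")) & 0xFFFF


def pick_network(received_hash, visible_ssids):
    """Return every visible ssid whose hash equals the received one."""
    return [s for s in visible_ssids if ssid_hash16(s) == received_hash]


visible = ["HomeNet", "CoffeeShop", "Neighbor-5G"]  # made-up scan result
token = ssid_hash16("HomeNet")                      # what the phone would send
print(pick_network(token, visible))
```

Two things follow from this scheme. A 16-bit value can collide, so the function returns a list rather than a single name. And a hidden network never appears in the scan list, so it could not be matched this way, which is exactly the open question above.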



You can also see that 4-symbol inserts appear with a certain period. On the spectrum they show up twice inside the password. This is probably some kind of error-correcting code, or synchronization symbols.







I was unable to isolate the data for authorization in Yandex. However, the packet is quite short, so it is safe to say that it contains no OAuth token. I assume that the Yandex app obtains a temporary link, part of which is transmitted to the Station, and the Station in turn uses this link to fetch the complete authentication data. I think this is also done to reduce the packet length.



Did you report this to Yandex?



Yes, on May 8, 2019. I received an automatic response:







Four months passed and nobody contacted me. Under Yandex's rules, I may now disclose the information, which is what I am doing.



Is this a problem at all?



Perhaps Yandex employees do not consider this a problem. Indeed, it can hardly be called a vulnerability, because a Station is rarely activated more than once. Moreover, it usually sits in a "trusted" room. At home or in the office you could say the WiFi password out loud, which is almost the same thing. Information security specialists, what do you think?



In addition, the activation algorithm is already built into the Stations that have been produced, so this vulnerability is unlikely to be removed in the current version.



However, I believe this is no reason to ignore bug bounty reports. At the very least, it is impolite to promise a reply and then not give one. Okay, let's assume that my report got lost somewhere. In case it helps, the ticket number is 19050804473488035.



Personally, I believe there is a vulnerability of some kind here. Therefore, even though I have a stable receiver for this signal, I cannot share it with you.









I would also like to remind Wylsacom, Rozetked, and other bloggers about the need to change passwords regularly. At the very least, I know what yours was at the time of the Yandex.Station review)



What is the result?



The developers at Yandex did a cool thing. They dressed up the Station's activation process and made it unusual. The only problem, in my opinion, is the plaintext password.



But the same process could be made safer using Bluetooth. This made me think that security and speed are not what matters here. What matters is the show. Activation through sounds reminiscent of R2-D2 from Star Wars is impressive and looks unusual.



This inspired me to develop the Yandex developers' idea and design a protocol aimed at making an impression. What if musical note frequencies were used to modulate the hex symbols? Why not transmit the data in C major? The result turned out to be very interesting, but more on that in the next article.



Thank you for reading, and good luck!



UPD: Yandex Response
The information security service replied in the comments and also sent an email:





