About six months ago, I decided to write a program for controlling a computer through a browser. I started with a simple single-socket HTTP server that sent pictures of the screen to the browser and received the cursor coordinates back for control.
At some point I realized that WebRTC is well suited to this task. Chrome already has such a solution, installed as an extension, but I wanted a lightweight program that works without installation.
At first I tried to use the library provided by Google, but it takes up 500MB once compiled. So I had to implement the entire WebRTC stack almost from scratch, and I managed to fit everything into a 2.5MB exe file. A friend helped with the JS interface, and here is the result.
Run the program:
Open the link in a browser tab and get full access to the desktop:
A small animation of the connection setup process:
Chrome, Firefox, Safari, and Opera are supported.
The program can transmit sound and supports audio calls, clipboard management, file transfer, and hotkeys.
While working on the program, I had to study about a dozen RFCs, and I realized that there is not enough information on the Internet about how WebRTC actually works. I want to write an article about the technologies it uses, and I would like to know which of the following topics interest the community:
- SDP (Session Description Protocol)
- ICE candidates and establishing a connection between two peers, STUN and TURN servers
- DTLS connection and passing keys to the RTP session
- RTP and RTCP protocols with encryption for media transmission
- Transmitting H264, VP8, and Opus over RTP
- SCTP connection for binary data
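
To give a taste of the level of detail involved: in a WebRTC session, STUN, DTLS, and RTP/RTCP packets typically arrive on the same UDP port, so the receiver has to tell them apart first. Below is a minimal sketch of that step, assuming the first-byte ranges described in RFC 7983. The function name `classify_packet` is made up for illustration, and this is not code from the program itself.

```python
def classify_packet(data: bytes) -> str:
    """Guess the protocol of a UDP datagram from its first byte.

    The ranges follow RFC 7983 (assumption: only STUN, DTLS and
    RTP/RTCP are expected on this socket; everything else is "unknown").
    """
    if not data:
        return "empty"
    first = data[0]
    if 0 <= first <= 3:
        return "stun"      # STUN messages start with two zero bits
    if 20 <= first <= 63:
        return "dtls"      # DTLS record content types fall in this range
    if 128 <= first <= 191:
        return "rtp/rtcp"  # RTP/RTCP version 2: top two bits are 10
    return "unknown"


if __name__ == "__main__":
    print(classify_packet(bytes([0x00, 0x01])))  # STUN Binding Request header start
    print(classify_packet(bytes([22, 254, 253])))  # DTLS handshake record
    print(classify_packet(bytes([0x80, 111])))  # RTP, version 2
```

Distinguishing RTP from RTCP inside the last range needs a look at the second byte as well, which is left out here to keep the sketch short.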