General principles of operation of QEMU-KVM




My current understanding:



1) KVM



KVM (Kernel-based Virtual Machine) is a hypervisor (VMM, Virtual Machine Monitor) implemented as a Linux kernel module. A hypervisor is needed to run software in a virtual environment while hiding from that software the real physical hardware it runs on. The hypervisor acts as a "layer" between the physical hardware (host) and the virtual OS (guest).



Since KVM is a standard Linux kernel module, it gets all the kernel's goodies (memory management, the scheduler, etc.) for free. Ultimately, all these advantages are passed on to the guests, since the guests run on a hypervisor that itself runs in the Linux kernel.



KVM is very fast, but on its own it is not enough to run a virtual OS, because that also requires I/O emulation. For I/O (disks, network, video, PCI, USB, serial ports, etc.) KVM relies on QEMU.
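To make "KVM is a kernel module" a bit more tangible: the module exposes itself to user space as the device node /dev/kvm, and programs talk to it through ioctl calls. Below is a minimal sketch that asks KVM for its API version. The function name `kvm_api_version` is my own, and the hard-coded request number assumes the ioctl encoding used on x86-64 and arm64 Linux; treat it as an illustration, not as a portable tool.

```python
import fcntl
import os

# KVM_GET_API_VERSION is _IO(0xAE, 0x00) in the kernel headers.
# Assumption: the _IO encoding of x86-64 / arm64 Linux, where this is 0xAE00.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version, or None if KVM is not usable on this host."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        # Module not loaded, no permission, or the device node does not exist.
        return None
    try:
        # For requests without an argument, ioctl() returns the call's result.
        return fcntl.ioctl(fd, KVM_GET_API_VERSION)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    version = kvm_api_version()
    if version is None:
        print("KVM is not available on this host")
    else:
        print(f"KVM API version: {version}")
```

On a host where KVM is loaded and accessible, the kernel documentation says the stable API version is 12; on any other host the sketch simply reports that KVM is unavailable.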



2) QEMU



QEMU (Quick Emulator) is an emulator that lets you run operating systems built for one architecture on another (for example, ARM on x86). In addition to the processor, QEMU emulates various peripheral devices: network cards, hard disks, video cards, PCI, USB, etc.



It works like this:



Guest instructions (for example, ARM machine code) are converted by TCG (Tiny Code Generator) into intermediate, platform-independent code, and this intermediate code is then converted into instructions for the host (for example, x86).



ARM -> intermediate code -> x86
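The two-stage pipeline above can be sketched as a toy translator. Everything here is made up for illustration: the instruction format, the op names, the `env` register file and the functions `frontend` / `backend` are my assumptions and are not QEMU's real TCG data structures. The point is only to show why there are two steps: the front end knows the guest, the back end knows the host, and they meet in the middle.

```python
from typing import NamedTuple


class GuestInsn(NamedTuple):
    """A toy three-operand guest instruction, e.g. ADD r0, r1, r2."""
    op: str
    dst: str
    src1: str
    src2: str


class IrOp(NamedTuple):
    """A toy platform-independent operation."""
    name: str
    args: tuple


def frontend(insn):
    """Stage 1: guest instruction -> intermediate ops (knows only the guest)."""
    ir_name = {"ADD": "add", "SUB": "sub"}[insn.op]
    return [
        IrOp("mov", ("tmp0", insn.src1)),
        IrOp(ir_name, ("tmp0", "tmp0", insn.src2)),
        IrOp("mov", (insn.dst, "tmp0")),
    ]


def loc(name):
    """Map a temporary to a host register, a guest register to a memory slot."""
    return "eax" if name == "tmp0" else f"[env+{name}]"


def backend(ops):
    """Stage 2: intermediate ops -> x86-flavoured text (knows only the host)."""
    host = []
    for op in ops:
        if op.name == "mov":
            dst, src = op.args
            host.append(f"mov {loc(dst)}, {loc(src)}")
        else:
            dst, left, right = op.args
            # The host form is two-operand, so dst must already hold `left`.
            assert dst == left
            host.append(f"{op.name} {loc(dst)}, {loc(right)}")
    return host


if __name__ == "__main__":
    for line in backend(frontend(GuestInsn("ADD", "r0", "r1", "r2"))):
        print(line)
```

One guest instruction turns into three host instructions here, which hints at where the slowdown of pure emulation comes from.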



In fact, QEMU can run virtual machines on any host, even on older processors that do not support Intel VT-x (Intel Virtualization Technology) / AMD SVM (AMD Secure Virtual Machine). In that case, however, it will be very slow, because the guest code has to be translated on the fly in two stages by TCG (TCG is a just-in-time compiler).
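Whether a given x86 Linux host has these extensions can be checked from the CPU flags in /proc/cpuinfo: Intel VT-x shows up as the flag `vmx`, AMD SVM as `svm`. A small sketch follows; the function names are my own, and it assumes the x86 layout of /proc/cpuinfo with a `flags` line.

```python
def virt_extension(cpuinfo_text):
    """Return "Intel VT-x", "AMD SVM" or None, based on /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD SVM"
            return None  # x86 CPU without hardware virtualization flags
    return None  # no "flags" line at all (e.g. not an x86 host)


def host_virt_extension(path="/proc/cpuinfo"):
    """Same check against the running host; None if the file is unreadable."""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return virt_extension(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    found = host_virt_extension()
    print(found or "no hardware virtualization flag found (QEMU would fall back to TCG)")
```

Note that the flag only says the CPU supports the extension; it can still be disabled in the firmware settings.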



I.e., QEMU on its own is mega cool, but it is very slow.



3) Protection rings



[Image: protection rings]



Program code does not simply run on the processor; it runs at one of several privilege levels (protection rings) with different levels of access, from the most privileged (Ring 0) to the most restricted, locked-down level (Ring 3).



The operating system kernel runs in Ring 0 (kernel mode) and can do anything with any data and devices. User applications run in Ring 3 (user mode) and cannot do whatever they want; instead, they must ask the kernel each time they need a privileged operation carried out (so user applications have access only to their own data and cannot get into someone else's "sandbox"). Rings 1 and 2 were intended for drivers, although mainstream operating systems rarely use them.
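"Asking the kernel" in practice means making a system call. As a small illustration, the sketch below pushes a few bytes through a pipe: the Python process runs in user mode the whole time, and each `os.pipe`, `os.write` and `os.read` call is a thin wrapper around the corresponding system call, with the kernel doing the actual work on the process's behalf. The function name `roundtrip_through_kernel` is made up for this example.

```python
import os


def roundtrip_through_kernel(payload):
    """Send bytes through a kernel pipe and read them back."""
    # pipe(2): the kernel creates the buffer and hands back two descriptors.
    read_fd, write_fd = os.pipe()
    try:
        # write(2): user mode cannot touch the pipe buffer, so it asks the kernel.
        os.write(write_fd, payload)
    finally:
        os.close(write_fd)
    try:
        # read(2): again a request to the kernel, not direct memory access.
        return os.read(read_fd, len(payload))
    finally:
        os.close(read_fd)


if __name__ == "__main__":
    print(roundtrip_through_kernel(b"hello from user mode"))
```

The payload here is small enough to fit into the pipe buffer in one go; a larger one would need a reader running concurrently.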



Before Intel VT-x / AMD SVM, hypervisors ran in Ring 0 and guests ran in Ring 1. Since Ring 1 does not have enough privileges for an OS to function normally, on every privileged operation from the guest the hypervisor had to rewrite that operation on the fly and execute it in Ring 0 (much like QEMU does). I.e., this guest code was NOT executed directly on the processor; each time, it went through several intermediate modifications on the fly.
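A very simplified model of that overhead is sketched below. It is not how any real hypervisor was implemented: instead of rewriting code, this toy uses the simpler "trap and emulate" idea, where a privileged instruction issued from Ring 1 faults and the hypervisor then carries it out against the guest's virtual state. All names (`ToyHypervisor`, `cpu_execute`, the instruction labels) are hypothetical.

```python
PRIVILEGED = {"cli", "sti"}  # labels for "disable / enable interrupts"


class PrivilegeFault(Exception):
    """Raised by the toy CPU when a privileged instruction runs outside Ring 0."""


def cpu_execute(insn, ring):
    """Toy hardware check: privileged instructions are allowed only in Ring 0."""
    if insn in PRIVILEGED and ring != 0:
        raise PrivilegeFault(insn)


class ToyHypervisor:
    def __init__(self):
        self.guest_interrupts_enabled = True  # virtual state, not the real CPU's
        self.traps = 0                        # how often the guest was interrupted

    def run_guest(self, program):
        for insn in program:
            try:
                cpu_execute(insn, ring=1)     # the guest kernel lives in Ring 1
            except PrivilegeFault:
                self.traps += 1
                self.emulate(insn)            # extra work on every such instruction

    def emulate(self, insn):
        """Apply the instruction's effect to the guest's virtual state."""
        if insn == "cli":
            self.guest_interrupts_enabled = False
        elif insn == "sti":
            self.guest_interrupts_enabled = True


if __name__ == "__main__":
    hv = ToyHypervisor()
    hv.run_guest(["nop", "cli", "nop", "sti", "cli"])
    print(hv.traps, hv.guest_interrupts_enabled)
```

Every trap is a detour through the hypervisor, and an OS kernel executes privileged operations all the time, which is why the cost added up.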



The overhead was significant and this was a big problem, so the processor manufacturers, independently of each other, released instruction set extensions (Intel VT-x / AMD SVM) that allow guest OS code to be executed DIRECTLY on the host processor (bypassing the costly intermediate steps that were needed before).



With the advent of Intel VT-x / AMD SVM, a special new level, informally called Ring -1 (minus one), appeared. The hypervisor now runs there, while guests run in Ring 0 and get privileged access to the CPU.



I.e., in the end: the hypervisor runs in Ring -1, and the guests run in Ring 0.





4) QEMU-KVM



KVM gives guests access to Ring 0 and uses QEMU to emulate I/O (disks, network, video, PCI, USB, serial ports, etc. that the guests "see" and work with).



Hence the name QEMU-KVM (or KVM-QEMU) :)
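The division of labour can be sketched as a loop. In the real stack, as far as I understand it, the QEMU process asks KVM to run the guest (the KVM_RUN ioctl), the guest executes directly on the CPU until it touches an emulated device, and control then returns to QEMU with an exit reason such as KVM_EXIT_IO or KVM_EXIT_HLT. The toy below only imitates that shape: `toy_guest` stands in for the guest plus KVM, `ToySerialPort` for a QEMU device model, and none of these names come from QEMU or KVM.

```python
EXIT_IO = "io"    # stands in for KVM_EXIT_IO: guest wrote to an I/O port
EXIT_HLT = "hlt"  # stands in for KVM_EXIT_HLT: guest halted

COM1_PORT = 0x3F8  # conventional I/O port of the first PC serial port


def toy_guest(message):
    """Pretend guest: each yield is one return from 'run the guest'."""
    for byte in message:
        yield (EXIT_IO, COM1_PORT, byte)
    yield (EXIT_HLT, None, None)


class ToySerialPort:
    """Pretend device model: collects the bytes the guest writes."""

    def __init__(self):
        self.output = bytearray()

    def write(self, byte):
        self.output.append(byte)


def run_vcpu(guest, devices):
    """Emulator side of the loop: run the guest, handle each exit, repeat."""
    exits = 0
    for reason, port, value in guest:
        exits += 1
        if reason == EXIT_IO:
            devices[port].write(value)  # I/O is emulated in user space
        elif reason == EXIT_HLT:
            break                       # guest is done
    return exits


if __name__ == "__main__":
    serial = ToySerialPort()
    count = run_vcpu(toy_guest(b"hi\n"), {COM1_PORT: serial})
    print(count, bytes(serial.output))
```

Between two exits the guest code runs at native speed; only the moments when it touches a device cost a round trip to the emulator.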



CREDITS

A picture to attract attention

Picture Protection rings



P.S. The text of this article was originally published in the Telegram channel @RU_Voip as an answer to a question from one of the channel's participants.



Write in the comments where I have misunderstood the topic, or if there is something to add.



Thanks!


