General Theory and Archeology of x86 Virtualization

Introduction



Team of authors



Posted by Anton Zhbankov (AntonVirtual, cloudarchitect.cc)

Co-authors: Grigory Pryalukhin, Evgeny Parfenov



General Virtualization Concepts



I have seen many interpretations of what virtualization is and listened to plenty of arguments that got no closer to any practical result. And, as you know, an argument between two smart people eventually comes down to a debate about definitions. So let's define what virtualization is and what follows from it.

Probably the closest definition of virtualization is "abstraction" from object-oriented programming. Or, put in plain language, it is hiding an implementation behind an abstract interface. Which, of course, explains everything at once. Let's try again, this time for those who have not studied programming.

Virtualization is hiding a specific implementation behind a universal, standardized method of accessing resources and data.

If you try to apply this definition in practice, it turns out to work for completely unexpected subjects. Take the clock. The sundial was invented several thousand years ago, and the mechanical clock in the Middle Ages. What do they have in common? The sun and some gears? Hardly. And then came quartz oscillators and everything else.

The bottom line is that we have a standard interface, a dial or a digital display, which indicates the current time in a universal, standard form. Does it matter to us how exactly the mechanism inside the box is implemented, as long as the time is shown with sufficient accuracy?

"Hold on," you may say, "I thought virtualization was about machines, processors and so on!"

Yes, it is about machines and processors, but that is only a special case. Let's look more broadly, since the article boldly claims to offer a general theory.



POZOR!



Uwaga! Achtung! Pozor!



This article has a general educational purpose: to tie a whole pile of technologies and scary words, together with their history, into some kind of structure, and for that reason it contains a significant number of intentional simplifications. Of course, it also contains plenty of annoying omissions and even plain errors and typos. Constructive criticism is most welcome, especially in the form of "let me help you get this part into shape."



Types of Virtualization



Let us return from completely abstract concepts to something more familiar: our beloved computers.



Storage Virtualization



The first type of virtualization a novice geek probably encounters is virtualization of the data storage system. Here "storage system" is used not in the sense of a large disk array connected over Fibre Channel, but as the logical subsystem responsible for long-term data storage.



FS -> LBA -> CHS



Take the simplest case: a storage system on a single hard magnetic disk. The usual way of working with data is through files on a logical drive. A file can be opened, read, closed. But an object such as a file simply does not exist physically; there is only a way to reach certain data blocks using an addressing scheme of the form "drive:\folder1\folder2\file". That is, we meet the first layer of virtualization: everything mnemonic and human-readable is translated into addresses the system understands. In its metadata tables, the file system driver looks up which data blocks the file consists of, and we get an address in the logical block addressing (LBA) scheme. In LBA, blocks have a fixed size and follow one another linearly, i.e. it might have something to do with storing data on magnetic tape, but a hard drive works completely differently! And here we come to the second layer of virtualization: translation of LBA addressing into CHS (cylinder / head / sector).






CHS, in turn, is translated inside the hard disk controller into physical read parameters, but that is a completely different story.

Even in a simple file access, say to watch a video with memes, we have run into three layers of virtualization at once.
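To make the middle layer a bit more tangible, here is a minimal C sketch of the classic LBA-to-CHS arithmetic; the geometry constants are purely illustrative, since modern drives stopped exposing their real geometry long ago.

#include <stdio.h>

/* Illustrative geometry: real disks stopped reporting their true layout long ago. */
#define HEADS_PER_CYLINDER 16
#define SECTORS_PER_TRACK  63

/* Classic translation of a linear block number (LBA) into cylinder/head/sector. */
static void lba_to_chs(unsigned lba, unsigned *c, unsigned *h, unsigned *s)
{
    *c = lba / (HEADS_PER_CYLINDER * SECTORS_PER_TRACK);
    *h = (lba / SECTORS_PER_TRACK) % HEADS_PER_CYLINDER;
    *s = (lba % SECTORS_PER_TRACK) + 1;   /* sectors are traditionally numbered from 1 */
}

int main(void)
{
    unsigned c, h, s;
    lba_to_chs(1048576, &c, &h, &s);      /* a block roughly 512 MiB into the disk */
    printf("LBA 1048576 -> C=%u H=%u S=%u\n", c, h, s);
    return 0;
}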

Everything would be too simple if the layers did not start stacking on top of each other in arbitrary order and in a great variety of ways.



RAID



The next layer of virtualization, which many people mistakenly do not consider virtualization, is RAID (redundant array of inexpensive / independent disks).



The key feature of RAID in the context of the concepts under discussion is not its ability to protect data from the failure of a particular physical disk. RAID provides a second level of LBA addressing on top of several (sometimes very many) independent LBA address spaces. Since we can access a RAID volume, regardless of the RAID level, in exactly the same way as a single disk without RAID, we can say with confidence:
RAID is disk virtualization.


Moreover, the RAID controller does not just create one large virtual disk from several physical disks, but can create an arbitrary number of them by adding another layer of virtualization.
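As a sketch of this "LBA on top of LBA" idea, here is how a simple RAID 0 (striping) layout could map an array-level block address to a member disk and an offset on that disk; the stripe size and disk count are arbitrary illustrative values, not taken from any particular controller.

#include <stdio.h>

#define DISKS         4     /* number of member disks, illustrative */
#define STRIPE_BLOCKS 128   /* blocks per stripe chunk, illustrative */

/* Map a virtual (array-level) LBA to (disk index, LBA on that disk) for RAID 0. */
static void raid0_map(unsigned long vlba, unsigned *disk, unsigned long *dlba)
{
    unsigned long stripe = vlba / STRIPE_BLOCKS;     /* which chunk            */
    unsigned long offset = vlba % STRIPE_BLOCKS;     /* offset inside the chunk */
    *disk = stripe % DISKS;                          /* chunks rotate over disks */
    *dlba = (stripe / DISKS) * STRIPE_BLOCKS + offset;
}

int main(void)
{
    unsigned disk;
    unsigned long dlba;
    raid0_map(1000, &disk, &dlba);
    printf("virtual LBA 1000 -> disk %u, LBA %lu\n", disk, dlba);
    return 0;
}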



Presentation virtualization



The next type of virtualization, which many of us use almost every day without thinking of it as virtualization, is the remote desktop connection.



Terminal servers, VDI, and even just RDP over VPN to a server are all session virtualization. Through a standard interface (monitor, keyboard, mouse) we work either with a real machine or with some unfathomable construction: a virtual desktop on a linked clone running a containerized application, from which we copy data through the clipboard into an application delivered by streaming. Or not; who can tell, apart from the person who designed it?



Introduction to x86 Virtualization



History and overview of the processors



Program execution



In the first class of a specialized programming course, Vladimir Denisovich Lelyukh (may he rest in peace) told his students: a computer, despite its name, cannot compute; it can only pretend that it can compute. But if something looks like a duck, walks like a duck and quacks like a duck, then from a practical point of view it is a duck.



Let's try to remember this for further practical use.



The computer, and specifically the processor, does not actually do anything; it simply expects certain input parameters in certain places and then, through terrible black magic, produces certain results in certain places.

A program, in this case, is a stream of instructions executed strictly sequentially, as a result of which we expect to see a particular result.

But if a program is executing, how can data be entered at all? And how can we interact with the computer in general?



Hardware interrupts were invented for this. The user presses a key, the keyboard controller signals it, and execution of the current code stream is interrupted. The addresses of interrupt handlers are stored in a dedicated memory area, and after the current state is saved, control is transferred to the interrupt handler. The handler, in turn, should in theory process everything quickly (that is what a handler is for), write the pressed key into the appropriate buffer and return control. Thus the application appears to keep running, while we can still interact with the system.
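A deliberately simplified toy model of the mechanism: handler addresses live in a table, and when an interrupt fires the current work is set aside and the corresponding handler is invoked. Real hardware does this in silicon, with the table format dictated by the CPU, so treat this only as an illustration of the control flow.

#include <stdio.h>

#define VECTORS 256

typedef void (*isr_t)(void);          /* an interrupt service routine        */
static isr_t vector_table[VECTORS];   /* toy analogue of the interrupt table */

static void keyboard_isr(void)        /* "driver" for an imaginary keyboard  */
{
    printf("  key stored into the input buffer\n");
}

/* What the CPU effectively does when interrupt n fires:
   save state, look up the handler, run it, restore state. */
static void raise_interrupt(int n)
{
    printf("saving current state, jumping to handler %d\n", n);
    if (vector_table[n])
        vector_table[n]();
    printf("restoring state, resuming the interrupted program\n");
}

int main(void)
{
    vector_table[33] = keyboard_isr;  /* register the handler (vector 33 is arbitrary) */
    raise_interrupt(33);              /* the user pressed a key */
    return 0;
}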



Interrupt handlers (and the main kind of handler is the device driver) can put the processor into a special mode in which no other interrupts are delivered until that mode is exited. In the end this often led to hangs: a bug in a driver could prevent it from ever leaving the interrupt.



Multitasking



What do we do if several programs (streams of code with their data and memory structures) need to be executed at the same time? Obviously, if there are more code streams than devices capable of executing them, this is a problem.

Pseudo-multitasking appears: a task is executed only when the user switches to it directly.

Later, cooperative (non-preemptive) multitasking appears: the running task itself decides that it no longer needs the processor and yields control to someone else. But all this is still not enough.

And here interrupts plus the ability to pretend come to the rescue again. It does not really matter to the user whether the tasks are executed strictly simultaneously; it is enough that it looks that way.

Therefore, a handler is simply attached to the timer interrupt, and it decides which code stream should be executed next. If the timer fires often enough (say, every 15 ms), then to the user everything looks like parallel execution. This is how modern preemptive multitasking appeared.
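A minimal sketch of the same idea: a simulated "timer tick" fires periodically and simply picks the next runnable task in a circle. A real scheduler also saves and restores register state and considers priorities, all of which is omitted here.

#include <stdio.h>

#define TASKS 3

static const char *task_name[TASKS] = { "editor", "player", "compiler" };
static int current = 0;

/* Called on every timer interrupt: preempt the running task, pick the next one. */
static void timer_tick(void)
{
    printf("tick: preempting %s\n", task_name[current]);
    current = (current + 1) % TASKS;          /* simple round-robin */
    printf("      switching to %s\n", task_name[current]);
}

int main(void)
{
    /* Simulate a few timer interrupts, e.g. one every 15 ms on real hardware. */
    for (int i = 0; i < 6; i++)
        timer_tick();
    return 0;
}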



Real mode



For the purposes of this article, real mode can be described quite simply: all memory is available to everyone. Any application, including malware (malicious software), can access any address, both for reading and for writing.



This is the initial mode of operation of the Intel x86 family of processors.



Protected mode



In 1982 the Intel 80286 processor (hereinafter simply 286) introduced an innovation, protected mode, which brought with it changes in memory organization (for example, dedicated types of memory segments: code, data, stack). But the most important thing the 286 brought to the x86 world is the concept of protection rings, which we still use today.



The concept of protection rings originally appeared in the Multics OS for the GE645 mainframe (1967) with a partially software implementation, and fully hardware already in 1970 in the Honeywell 6180 system.






The main idea of protection rings resembles a multi-level medieval fortress: the most valuable things lie at the very center, behind multiple walls. The most valuable thing here is unrestricted direct access to any area of RAM and control over all processes, and it belongs to processes running in the zero protection ring. Behind the wall, in the first ring, run less important processes such as device drivers, and in the outermost ring, user applications. The principle is simple: from the inside you can reach outward, but going from the outside inward is forbidden. I.e. no user process can access OS kernel memory, as was possible earlier in real mode.

In the very first full implementation, in the Honeywell 6180, 8 protection rings were implemented, but Intel decided to simplify the scheme to 4, of which OS vendors in practice ended up using only two: ring zero and ring three.



32bit



In 1985 another architecturally crucial processor in the x86 line was released, the 80386 (hereinafter 386), which implemented 32-bit memory addressing and 32-bit instructions. And, of course, memory virtualization. As already mentioned, virtualization is hiding the actual implementation behind artificial "virtual" resources, and in this case it is about memory addressing: a memory segment has its own addressing that has nothing to do with the actual location of the memory cells.

The processor turned out to be so much in demand that it was produced until 2007.

In Intel terminology this architecture is called IA-32.
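The 386 brought both segmentation with its own addressing and paging. As one concrete illustration of how the address a program sees stops being a physical address, here is a sketch of how a 32-bit linear address is split up under classic 4 KiB x86 paging; the page tables themselves are left out, only the bit arithmetic is shown.

#include <stdio.h>

/* Classic x86 4 KiB paging splits a 32-bit linear address into three parts:
   10 bits of page-directory index, 10 bits of page-table index, 12 bits of offset. */
static void split_linear_address(unsigned la)
{
    unsigned dir_index   = (la >> 22) & 0x3FF;
    unsigned table_index = (la >> 12) & 0x3FF;
    unsigned offset      =  la        & 0xFFF;

    printf("linear 0x%08X -> PDE %u, PTE %u, offset 0x%03X\n",
           la, dir_index, table_index, offset);
    /* The MMU walks the page directory and page table to find the physical
       frame; the program itself never sees where its pages really live. */
}

int main(void)
{
    split_linear_address(0x0804A123u);   /* a typical user-space address */
    return 0;
}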



64bit



By the mid-2000s, even without virtualization, the industry was running into the limits of 32 bits. There were partial workarounds in the form of PAE (Physical Address Extension), but they complicated and slowed down code. The transition to 64 bits was a foregone conclusion.



AMD introduced its version of the architecture, called AMD64. Intel pinned its hopes on the IA-64 platform (Intel Architecture 64), which we also know under the name Itanium. The market, however, met that architecture without much enthusiasm, and in the end Intel was forced to implement its own support for the AMD64 instruction set, first called EM64T and later simply Intel 64.



Ultimately, we all know this architecture as AMD64, x86-64, x86_64, or sometimes x64.



Since servers at that time were expected to be used mostly physically, without virtualization, a funny technical thing happened to the first 64-bit processors with regard to virtualization. Nested hypervisors were often used for lab environments, since not everyone could afford several clusters of physical servers. And it turned out that a workload VM inside a nested hypervisor could only run in 32-bit mode.



In the first x86-64 processors the designers, while maintaining full compatibility with 32-bit mode, dropped a significant part of the functionality in 64-bit mode. In this case the problem was a greatly simplified memory segmentation model: the ability to protect the small piece of memory inside the VM where the hypervisor's exception handler lived was removed, so the guest OS could modify it.

AMD later brought back the ability to limit segments, while Intel simply waited for the arrival of hardware virtualization.



UMA



x86 multiprocessor systems started out with UMA (Uniform Memory Access), in which the distance from any processor to any memory module (i.e. the memory access latency) is the same. In Intel processors this scheme was preserved even after multi-core CPUs appeared, up to the 54xx generation (Harpertown). Starting with the 55xx generation (Nehalem), processors switched to a NUMA architecture.

From the point of view of execution logic, this means additional hardware threads to which code streams can be assigned for parallel execution.



NUMA



NUMA (Non-Uniform Memory Access) is an architecture with non-uniform memory access. In it, each processor has its own local memory, which is accessed directly with low latency. Memory attached to other processors is accessed indirectly, with higher latency, which reduces performance.



As of 2019, in Intel Xeon Scalable v2 processors the internal architecture is still UMA within a socket and becomes NUMA across sockets (well, not really; it only pretends to be). AMD Opteron processors had a NUMA architecture back in the days of the oldest UMA Xeons, then became NUMA even within a socket, until the latest Rome generation returned to NUMA = socket.
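For completeness, a small sketch of what NUMA awareness looks like from user space on Linux, assuming libnuma (the numactl development package) is installed: it allocates a buffer on a specific node instead of wherever the allocator happens to put it.

/* Build with: gcc numa_sketch.c -lnuma   (requires libnuma / the numactl dev package) */
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return EXIT_FAILURE;
    }

    printf("NUMA nodes visible to this process: %d\n", numa_max_node() + 1);

    /* Allocate 64 MiB pinned to node 0: accesses from CPUs of node 0 will be
       local (fast), accesses from other nodes remote (slower). */
    size_t size = 64UL * 1024 * 1024;
    void *buf = numa_alloc_onnode(size, 0);
    if (!buf) {
        perror("numa_alloc_onnode");
        return EXIT_FAILURE;
    }

    numa_free(buf, size);
    return 0;
}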



Virtual machine



A virtual machine (VM) is a software and/or hardware system that emulates the hardware of some platform (the target, or guest, platform) and executes programs for the target platform on a host platform, or that virtualizes some platform and creates environments on it which isolate programs and even operating systems from one another (Wikipedia).

In this article, "virtual machine" means a system virtual machine, one that fully simulates all resources and hardware as software constructs.

There are two main types of software for creating virtual machines: with full and with incomplete virtualization.



Full virtualization is an approach in which all hardware, including the processor, is emulated. It allows creating hardware-independent environments and running, for example, an OS and application software for the x86 platform on SPARC systems, or the well-known ZX Spectrum emulators (Z80 processor) on familiar x86 machines. The flip side of complete independence is the high overhead of virtualizing the processor and low overall performance.



Incomplete virtualization is an approach in which not 100% of the hardware is virtualized. Since incomplete virtualization is by far the most common in the industry, it is what we will talk about: platforms and technologies of system virtual machines with incomplete virtualization on the x86 architecture. In this case the processor is virtualized incompletely, i.e. apart from partial substitution or hiding of certain system calls, the virtual machine's binary code is executed directly by the processor.



Software virtualization



The obvious consequence of the processor architecture and of the operating systems' habit of running in ring zero was a problem: the guest OS kernel cannot run in its usual place. Ring zero is occupied by the hypervisor, and if you simply let the guest OS in there as well, then on the one hand we are back to real mode with all its consequences, and on the other hand the guest OS does not expect anyone else there and will instantly destroy the data structures and crash the machine.



But everything was solved quite simply: to the hypervisor, the guest OS is just a set of memory pages it has full direct access to, and the virtual processor is just a stream of instructions, so why not rewrite them? On the fly, the hypervisor removes from the instruction stream destined for the virtual processor all instructions that require ring-zero privileges, replacing them with less privileged ones, while the results of those instructions are presented exactly as if the guest OS were in ring zero (the technique known as binary translation). In this way you can virtualize anything at all, up to and including the complete absence of a guest OS.

This approach was implemented by the development team in 1999 in the VMware Workstation product, and then in 2001 in the GSX server hypervisors (the second type, like Workstation) and ESX (the first type).
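A toy model of the idea (no real x86 decoding, just the control flow of such a translator): scan a block of guest "instructions", pass harmless ones through to the CPU and replace privileged ones with a call into the hypervisor that produces the same visible result. The instruction set here is entirely made up for illustration.

#include <stdio.h>

/* A made-up, drastically simplified guest "instruction set" for illustration only. */
typedef enum { ADD, LOAD, STORE, WRITE_CR3, HLT } op_t;

static int is_privileged(op_t op)
{
    return op == WRITE_CR3 || op == HLT;   /* would require ring 0 on real hardware */
}

static void emulate_in_hypervisor(op_t op)
{
    /* The hypervisor updates the VM's virtual state (e.g. its shadow page tables)
       so that the guest sees exactly what it would have seen in ring 0. */
    printf("  [vmm] emulating privileged op %d against virtual CPU state\n", (int)op);
}

static void translate_and_run(const op_t *block, int len)
{
    for (int i = 0; i < len; i++) {
        if (is_privileged(block[i]))
            emulate_in_hypervisor(block[i]);   /* replaced on the fly */
        else
            printf("  [cpu] executing op %d directly\n", (int)block[i]);
    }
}

int main(void)
{
    op_t guest_code[] = { ADD, LOAD, WRITE_CR3, STORE, HLT };
    translate_and_run(guest_code, 5);
    return 0;
}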



Paravirtualization



Paravirtualization is a very simple concept: the guest OS knows that it is in a virtual machine and knows how to turn to the hypervisor for certain system functions. This eliminates the problem of emulating ring zero: the guest OS knows that it is not in ring zero and behaves accordingly.

Paravirtualization in x86 appeared in 2003 with the Linux Xen project.



Certain paravirtualized functions are also implemented in hypervisors with full virtualization through special virtual drivers in guest OSs that communicate with the hypervisor to reduce virtualization overhead. For example, VMware ESXi for VMs has a paravirtual SCSI adapter PVSCSI, which improves overall performance for VMs with intensive disk operations, such as loaded DBMSs. Drivers for paravirtual devices come in additional packages (for example VMware Tools), or are already included in Linux distributions (open-vm-tools).



Hardware virtualization



As virtualization developed and grew in popularity, platform vendors wanted both to reduce their support costs and, from a security standpoint, to guarantee protection in hardware.



The problem was solved quite simply: the Intel VT-x and AMD-V hardware virtualization technologies were introduced which, leaving deep technical detail aside, added a "ring minus one" for the hypervisor. This finally restored the situation familiar to the OS: running in ring zero.



Types of Hypervisors



Type 2 (hosted)



Type 2 hypervisors are applications that run on top of a host operating system. All virtual machine calls are handled by the host operating system above. Type 2 hypervisors are severely limited in performance, since the hypervisor process, having no right to exclusive allocation of computing resources, has to compete for them with other user applications. In terms of security, type 2 hypervisors depend directly on the security policies of the host OS and on its vulnerability to attacks. Today the industry is unanimous that such virtualization platforms are not suitable for the enterprise. However, they are well suited for cross-platform development and for deploying test stands directly on developers' machines, since they are easy to manage and deploy.



Examples of the second type of hypervisor: VMware Workstation / Fusion, Oracle VM VirtualBox, Parallels Desktop, VMware Server (ex-GSX), Microsoft Virtual Server 2005



Type 1 (bare-metal)





Type 1 hypervisors do not require a general-purpose OS, unlike the previous type. The hypervisor itself is a monolith that manages both the allocation of computing resources and I/O. A microkernel sits in the zero protection ring, and all control structures run on top of it. In this architecture the hypervisor distributes computing resources and handles all VM calls to devices. For a long time VMware ESX was considered the first type 1 hypervisor for x86, although today we would classify it as type 1+. The only "honest" representative of this type today is VMware ESXi, the successor to ESX after its RHEL-based parent partition was cut off.



Consider the ESXi architecture as an example. Hypervisor management commands are executed through agent APIs that run on top of the VMkernel. This may look like direct access to the hypervisor, but it is not: there is no direct access to the hypervisor, which distinguishes this type from type 2 hypervisors in terms of security.






The disadvantage here is device drivers: to keep the platform "thin" and free of unnecessary complexity from version to version, device drivers are rotated, which makes the physical infrastructure dependent on the HCL (hardware compatibility list).



Type 1+ (Hybrid Hypervisor)



Hybrid hypervisors (also called type 1+, 1a, or 1.5) are characterized by isolating the base OS into a special entity called the parent partition (in Microsoft Hyper-V terminology) or parent domain (dom0 in Xen terminology). After the hypervisor role is installed, the kernel switches into virtualization-support mode and the hypervisor becomes responsible for allocating resources on the host, while the parent partition takes on handling calls to device drivers and I/O.

In effect, the parent partition becomes a kind of intermediary between all the entities of the virtualization stack. This approach is convenient from a hardware-compatibility standpoint: device drivers do not have to be embedded in the hypervisor, as with ESXi, so the list of supported devices is much broader and less dependent on the HCL. Another advantage is that the hypervisor is relieved of processing calls to device drivers, since the parent partition handles them all.



At the top level, the type 1+ architecture looks like this: a thin hypervisor runs in ring zero, and on top of it sit the parent partition and the guest VMs, with the parent partition servicing device drivers and I/O on behalf of the guests.



Hypervisors of this type include the now-defunct VMware ESX, Microsoft Hyper-V, and Xen-based hypervisors (Citrix XenServer and the Xen implementations in various Linux distributions). Recall that Citrix XenServer is a slightly trimmed RHEL-based OS whose version and functionality depended directly on the current version of Red Hat Enterprise Linux. With other Xen implementations the situation is not much different: the same Linux kernel in Xen hypervisor mode and a base OS in the dom0 domain. This leads to the unambiguous conclusion that Xen-based hypervisors are hybrids, not honest type 1 hypervisors.



Main technologies of industrial platforms



We will take VMware terminology as the baseline, since it is the most technologically advanced virtualization platform. In this article we restrict ourselves to the technologies of the hypervisors themselves and of the basic management system; all the advanced functionality delivered by additional products for additional money is left out of scope. The technologies are grouped into rough categories by their main purpose, as it seemed right to the author, with whom you are free to disagree.



SLA



This is a set of technologies that primarily affect meeting availability SLAs (RPO/RTO).



HA



High Availability - a hypervisor technology for ensuring high availability of VMs in a cluster. If a host dies, its VMs are automatically restarted on the surviving hosts. Effect: RTO is reduced to the HA timeout plus OS/service restart time.



FT



Fault Tolerance - a technology for keeping a VM running even if its host dies. A shadow VM is created on a second host, fully identical to the primary one and replaying its instructions just behind it; the difference between the VM states is thus measured in tens or hundreds of milliseconds, which is acceptable for many services. When the host dies, execution automatically switches to the shadow VM. Effect: RTO is reduced to zero.



TCO



This is a collection of technologies that primarily influence TCO.



vMotion



vMotion is live migration of a VM's execution point from one fully functional host to another. The switchover of the execution point takes less time than network connection timeouts, which lets us consider the migration live, i.e. without interrupting production services. Effect: RTO is reduced to zero for planned server-maintenance outages and, as a result, some of those outages are eliminated altogether.



Storage vMotion



Storage vMotion is live migration of a VM's storage point from one fully functional storage system to another. Disk I/O does not stop during the move, which is why the migration is considered live. Effect: RTO is reduced to zero for planned storage-maintenance outages and, as a result, some of those outages are eliminated altogether.



DPM



Distributed Power Management - a technology that monitors host load and powers hosts on and off as the cluster load changes. It requires DRS to operate. Effect: overall reduction in power consumption.



Distributed vSwitch



Distributed vSwitch is a technology for centralized management of the network settings of the hosts' virtual switches. Effect: less volume and complexity in reconfiguring the network subsystem and a lower risk of errors.



EVC



Enhanced vMotion Compatibility automatically masks the processor instructions visible to VMs. It is used to level a heterogeneous cluster down to the oldest processor family present, preserving the ability to migrate any VM to any host. Effect: savings on infrastructure complexity while gradually growing capacity or partially upgrading clusters.



QoS



This is a set of technologies that primarily affect meeting SLAs for quality of service.



vNUMA



vNUMA is a technology that exposes a virtual NUMA topology to the guest OS for wide VMs (vCPU or vRAM larger than a NUMA node). Effect: no performance penalty for application software that is NUMA-aware.



Resource pool



Resource pools - a technology for combining several VMs into a single resource pool in order to control consumption or guarantee resource allocation. Effect: simpler administration and a guaranteed service level.



Limit / reserve



CPU/memory limits and reservations make it possible to cap resource allocation or, conversely, to guarantee it under scarcity and contention, ensuring that high-priority VMs/pools keep being served.



DRS



Distributed Resource Scheduler - automatic balancing of VMs across hosts depending on load, in order to reduce resource fragmentation in the cluster and maintain the service level of the VMs. Requires vMotion.



Storage IO Control



Storage IO Control is a technology for limiting a "noisy neighbor", a low-priority machine with heavy disk load, so that the performance of an expensive storage system remains available to production workloads. An example: an indexing system / internal search engine next to a production DBMS.



Network IO Control



Network IO Control is a technology to limit the “noisy neighbor”, a low-priority machine with high network load.



Storage Integration (VAAI etc)



Two categories of technologies fall into the integration section:





Security



Microsegmentation



In practical terms, microsegmentation of a virtual network is the ability to build a distributed virtual firewall that controls the virtual networks inside the host. It dramatically improves virtual network security.



Agentless AV



Support for agentless antivirus. Instead of being scanned by agents inside the guest OSs, VM disk I/O traffic is redirected by the hypervisor to a dedicated service VM. This significantly reduces CPU and disk load, effectively killing the "antivirus storms".



Hyperconverged Systems



Converged systems, as the name suggests, combine functions; in this case, combining the storage and the execution of VMs. It seems simple, but then marketing bursts in.

Marketers were the first to rush to market with "converged systems". What was sold as a converged system was ordinary classic servers + storage + switches, just under a single part number. Or sometimes nothing was sold at all and a paper called a "reference architecture" was produced instead. We sincerely condemn this approach and move on to the architectural view.



Architecture



Taking convergence as an architectural principle, we get the storage point and the execution point of a VM combined in a single system.

In other words, a converged architecture implies using the same hardware both for executing VMs and for storing them on local disks. And since there has to be fault tolerance, a converged architecture includes a layer of distributed SDS (software-defined storage).



We get:





And it turns out that the term for our converged architecture is already taken; exactly the same situation as with the word "supervisor".

A hyperconverged system is a converged system with a converged architecture.

Of course, a second coming of the marketers could not be avoided. "Hyperconverged" systems appeared in which storage was not combined with compute at all, but instead there were dedicated storage nodes managed by distributed SDS. In the course of the marketing wars the special term disaggregated HCI (disaggregated hyperconverged infrastructure) even appeared. NetApp, for example, at first fought quite hard for the right to call such a system hyperconverged, but eventually gave up; today (late 2019) NetApp HCI stands for hybrid cloud infrastructure.



Implementation options



Due to the fact that hyperconverged systems work with virtualization, there are actually two and a half options for implementation.





Obviously, besides the already discussed impact of embedding on performance, there is the very important question of isolation and of support for third-party hypervisors. In the first case it can clearly only be a single system from the hypervisor vendor, while the second can potentially work under any hypervisor.



Containers



Container virtualization, although technically very different from full virtualization, is quite simple in structure. As with the OSI network model, it is a question of level: container virtualization sits higher, at the level of the application environment rather than of the physical hardware.

The main task of container virtualization is to divide the OS into independent pieces, so that isolated applications cannot interfere with each other. Full virtualization divides up not the OS but the physical server.



VM vs Container



The pros and cons of both approaches are quite simple and directly opposite.



Full virtualization (VMs) gives complete independence down to the hardware level, including a fully independent OS and independent disk and network stacks. On the other hand, since we stick to the scheme of 1 application = 1 server, each application requires its own OS, its own disk stack and its own network stack, i.e. resources are spent several times over.

Containers share the disk and network stacks with the host OS, and all of them together use the single kernel of the whole physical server (or, lately, virtual server), which in homogeneous landscapes saves resources quite significantly.

Historically, containers on x86 came first, alongside physical servers. After the advent of full virtualization, the importance of containers dropped dramatically for almost 15 years, and fat VMs reigned in the corporate world. Containers at that time found a niche with hosting providers serving hundreds of identical web servers, where their light weight was in demand. But in recent years, since about 2015, containers have returned to corporate reality in the form of cloud-native applications.



Containers 0.1



chroot



The prototype of containers in 1979 was chroot.



“Chroot is the operation of changing the root directory on Unix-like operating systems. A program launched with a modified root directory will only have access to the files contained in this directory. ”



I.e. in fact the isolation is only at the file system level; in every other respect it is just an ordinary process in the OS.
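A minimal sketch of that level of isolation using the real chroot() call; it must be run as root, and the new root directory (here the hypothetical /tmp/myroot) is assumed to already contain everything the program needs, including /bin/sh.

/* Build: gcc chroot_demo.c -o chroot_demo ; run as root. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* /tmp/myroot is a hypothetical, pre-populated directory tree. */
    if (chroot("/tmp/myroot") != 0) {
        perror("chroot");
        return EXIT_FAILURE;
    }
    if (chdir("/") != 0) {           /* make sure we are inside the new root */
        perror("chdir");
        return EXIT_FAILURE;
    }

    /* From now on "/" means /tmp/myroot; the rest of the host file system is
       invisible, but in every other respect this is still an ordinary process. */
    execl("/bin/sh", "sh", (char *)NULL);
    perror("execl");                 /* reached only if the shell is missing */
    return EXIT_FAILURE;
}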



FreeBSD jail



Significantly more advanced was jail from FreeBSD, which appeared in 1999. Jail allowed creating full virtual OS instances, each with its own set of applications and configuration files, on top of the base FreeBSD. Surely someone will say: what is jail doing among containers, this is paravirtualization! And they will be partly right.

However, compared to full virtualization (and its paravirtualization variant), jail lacks the ability to run a kernel of a different version in the guest, and lacks clustering with migration of the VM to another host system.



Solaris Zones



Solaris Zones is an operating system virtualization technology (container virtualization), introduced in 2004 in Sun Solaris. The basic principle is low virtualization overhead.



Without gaining much popularity, the technology migrated to OpenSolaris and the distributions based on it, and it is still available in 2019.



Containers 1.0



In the containers 1.0 era, two main directions of containerization emerged: commercial products for hosting providers, and application containerization.



Virtuozzo / OpenVZ



In 2001 the Russian company SWsoft introduced the first version of its container virtualization product Virtuozzo, aimed at the hosting provider market. Thanks to this focus and a specific commercial target audience, the product turned out quite successful and gained popularity. Technologically, in 2002, simultaneous operation of 2,500 containers on an 8-processor server was demonstrated.

In 2005 an open version of Virtuozzo containers for Linux appeared, called OpenVZ, and it all but became the gold standard for VPS hosting.



LXC



LinuX Containers (LXC) is another well-known container virtualization technology, based on namespaces and cgroups, which appeared in 2008. It underlies the currently popular Docker and friends.



Containers 1.1 (Application Virtualization)



If the containers discussed so far are about dividing the base OS into segments, why not tear this layer off the system altogether and pack it, together with the application and everything around it, into a single box? This ready-made package can then be launched as a regular user-level application.



App-V



Microsoft Application Virtualization (App-V), formerly Softricity SoftGrid, is a technology for containerizing an individual application (a container turned inside out, if you like) in an isolated sandbox. In 2006 Microsoft acquired the startup Softricity, which had essentially turned the container inside out.



ThinApp



VMware ThinApp (formerly Thinstall) is an application containerization product from Jilt acquired by VMware in 2008. VMware estimates that 90-95% of all packaged applications in the world use this particular technology.



Containers 2.0



The history of the emergence of containers 2.0 is closely tied to a change in the software development process. The desire of businesses to reduce such an important parameter as time to market forced developers to reconsider their approaches to building software products. The Waterfall development methodology (long release cycles, the whole application is updated at once) gave way to Agile (short, time-boxed release cycles, application components are updated independently) and forced developers to split monolithic applications into components. While the components of monolithic applications are still fairly large and there are not too many of them, they can be placed in virtual machines, but when one application consists of tens or hundreds of components, virtual machines are no longer a good fit. In addition, the problem of versions of auxiliary software, libraries and dependencies arises: there is often a situation where different components require different versions or differently configured environment variables. Such components have to be distributed across different virtual machines, because it is almost impossible to run multiple versions of the same software simultaneously within one OS. The number of VMs starts to grow like an avalanche. Here containers enter the stage, allowing several isolated environments for running application components to be created within a single guest OS. Containerization of applications makes it possible to keep segmenting a monolithic application into ever smaller components and move to the paradigm of one task = one component, a container; this is called the microservice approach, and each such component is a microservice.



Container under the hood



If you look at a container through a system administrator's eyes, it is just a set of Linux processes with their own PIDs and so on. What makes it possible to isolate processes running in containers from one another while they jointly consume the guest OS resources? Two standard mechanisms present in the kernel of any modern Linux distribution: the first, Linux namespaces, ensures that each process sees its own view of the OS (file system, network interfaces, hostname, etc.), and the second, Linux control groups (cgroups), limits the process's consumption of guest OS resources (CPU, memory, network bandwidth, etc.).



Linux Namespaces



By default, every Linux system contains one single namespace. All system resources, such as file systems, process identifiers (Process IDs), user identifiers (User IDs), network interfaces belong to this namespace. But no one is stopping us from creating additional namespaces and redistributing system resources between them.



When a new process starts, it starts in a namespace: either the system default one or one of the namespaces that were created. The process will see only the resources available in the namespace it was started in.



But it is not that simple: a process does not belong to one single namespace, but to one namespace in each of the following categories:

Mount (mnt)
Process ID (pid)
Network (net)
Inter-process communication (ipc)
UTS (hostname and domain name)
User ID (user)

Each type of namespace isolates a corresponding resource group. For example, the UTS space defines the hostname and domain name visible to processes. Thus, two processes within the guest OS can assume that they are running on different servers.



A network namespace determines the visibility of network interfaces: a process inside it sees only the interfaces belonging to that namespace.
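Here is a minimal sketch of a UTS namespace in action, using the real clone() and sethostname() calls (run as root or with CAP_SYS_ADMIN): the child process believes the machine has a different hostname, while the host keeps its own.

/* Build: gcc uts_ns_demo.c -o uts_ns_demo ; run as root. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)
static char child_stack[STACK_SIZE];

static int child_main(void *arg)
{
    (void)arg;
    /* Affects only the new UTS namespace, not the host. */
    sethostname("container1", strlen("container1"));

    char name[64];
    gethostname(name, sizeof(name));
    printf("inside the namespace, hostname is: %s\n", name);
    return 0;
}

int main(void)
{
    /* CLONE_NEWUTS gives the child its own hostname/domainname namespace. */
    pid_t pid = clone(child_main, child_stack + STACK_SIZE,
                      CLONE_NEWUTS | SIGCHLD, NULL);
    if (pid == -1) {
        perror("clone");
        return EXIT_FAILURE;
    }
    waitpid(pid, NULL, 0);

    char name[64];
    gethostname(name, sizeof(name));
    printf("on the host, hostname is still: %s\n", name);
    return 0;
}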



Linux Control Groups (cgroups)



Linux Control Groups (cgroups) is a Linux kernel mechanism that limits the consumption of system resources by processes. A process or group of processes cannot get more resources (CPU, memory, network bandwidth, etc.) than it has been allocated, and cannot grab "foreign" resources, i.e. the resources of neighboring processes.
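A sketch of the same mechanism driven by hand, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup (the usual default on modern distributions) and root privileges: create a group, cap its memory, and move the current process into it. The group name "demo" and the 256 MiB limit are arbitrary.

/* Build: gcc cgroup_demo.c -o cgroup_demo ; run as root on a cgroup v2 system. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write a short string into a cgroup control file. */
static int write_file(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fputs(value, f);
    return fclose(f);
}

int main(void)
{
    char pid[32];

    /* Assumes the cgroup v2 hierarchy is mounted at /sys/fs/cgroup. */
    if (mkdir("/sys/fs/cgroup/demo", 0755) != 0 && errno != EEXIST) {
        perror("mkdir");
        return EXIT_FAILURE;
    }

    /* Cap the group at 256 MiB of RAM... */
    if (write_file("/sys/fs/cgroup/demo/memory.max", "268435456") != 0)
        return EXIT_FAILURE;

    /* ...and move the current process into the group: everything it (and its
       children) allocate from now on is charged against that limit. */
    snprintf(pid, sizeof(pid), "%d", getpid());
    if (write_file("/sys/fs/cgroup/demo/cgroup.procs", pid) != 0)
        return EXIT_FAILURE;

    printf("process %d is now limited by /sys/fs/cgroup/demo\n", getpid());
    return 0;
}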



Docker



As stated above, Docker did not invent containers as such. Containers have existed for many years (including those based on LXC), but Docker made them very popular by creating the first system that made it easy and simple to transfer containers between different machines. Docker has created a tool for creating containers - packaging the application and its dependencies, and running containers on any Linux system with Docker installed.



An important feature of Docker is the portability not only of the application itself and its dependencies between completely different Linux distributions, but also of the environment and the file system. For example, a container created on CentOS can be run on an Ubuntu system; inside the launched container the file system is inherited from CentOS, and the application thinks it is running on top of CentOS. This is somewhat similar to an OVF image of a virtual machine, but a Docker image is built of layers. This means that when only part of the image is updated, there is no need to download the whole image again; it is enough to download only the changed layer, as if an OVF image could update the OS without re-downloading the entire image.



Docker has created an ecosystem for building, storing, transferring and launching containers. There are three key components in the Docker world:

the image - the packaged application together with its environment and file system layers;
the registry (repository) - the storage that images are pushed to and pulled from;
the container - a running instance created from an image.

Consider the container life cycle. First, a developer builds a Docker image with his application (the docker build command), either completely from scratch or using existing images as a base (remember the layers). The image can then be run by the developer directly on his own machine, or transferred to another machine, a server. For portability, registries are commonly used: docker push uploads the image to a registry, after which it can be downloaded to any other machine or server (docker pull). Finally, a running container is created from the image (docker run).



Kubernetes



As we have already said, the microservice concept means splitting a monolithic application into many small services, each usually performing a single function. While there are dozens of such services, they can still be managed by hand, for example with Docker. But what do you do when there are hundreds or thousands of them? Besides the production environment, you need a test environment and additional environments for different product versions, i.e. multiply by 2, by 3 or more. Google faced the same problems; its engineers were among the first to use containers at industrial scale. That is how Kubernetes (K8s) was born, growing out of Google's internal Borg system and later released to the public under its new name.



K8s is a system that makes it easy to deploy, manage and monitor containerized applications (microservices). As we already know, any Linux machine can run containers and containers are isolated from one another, so K8s can manage a fleet of servers with different hardware running different Linux distributions. All this helps us use the available hardware effectively. Like virtualization, K8s gives us a common pool of resources for launching, managing and monitoring our microservices.



Since this article is mainly intended for virtualization engineers, for a general understanding of the principles of operation and the main components of K8s we recommend the article that draws a parallel between K8s and VMware vSphere: https://medium.com/@pryalukhin/kubernetes-introduction-for-vmware-users-232cc2f69c58



History of Industrial x86 Virtualization



VMware



VMware appeared in 1998, starting with the development of a second type of hypervisor, which later became known as VMware Workstation.



The company entered the server market in 2001 with two hypervisors: GSX (Ground Storm X, type 2) and ESX (Elastic Sky X, type 1). Over time the prospects of type 2 in server use became obvious, i.e. nonexistent, and the paid GSX was first turned into the free VMware Server and then discontinued and buried altogether.



In 2003, the Virtual Center central management system, vSMP technology, and live migration of virtual machines appeared.



In 2004 VMware was acquired by the storage giant EMC, but was left operationally independent.



By 2008, having become the industry's de facto standard, VMware had stimulated rapid growth of competing offerings from Citrix, Microsoft and others. The need for a free version of the hypervisor became clear, which was impossible as long as a fully commercial RHEL was used as the parent partition in ESX. The project to replace RHEL with something lighter and free was implemented in 2008 on the basis of busybox; the result is the ESXi everyone knows today.



In parallel, the company grew through internal projects and startup acquisitions. A few years ago the list of VMware products took up a couple of A4 pages, so let's leave it at this: in 2019 VMware is still the de facto standard of the on-premise corporate full-virtualization market, with a market share above 70%, and the absolute technology leader; a detailed review of its history deserves a separate, very large article.



Connectix



Founded in 1988, Connectix worked on a variety of system utilities until it took up virtualization. In 1997, the first VirtualPC product for the Apple Macintosh was created, allowing Windows to run in a virtual machine. The first version of VirtualPC for Windows appeared in 2001.



In 2003, Microsoft bought VirtualPC and, by agreement with Connectix, the developers switched to Microsoft. After that, Connectix closed.



The VHD (virtual hard disk) format was developed by Connectix for VirtualPC, and as a reminder, the virtual disks of Hyper-V machines contain “conectix” in their signature.

Virtual PC, as you might guess, is a classic desktop hypervisor of the second type.



Microsoft



Microsoft's journey into industrial virtualization began with the purchase of Connectix and the rebranding of Connectix Virtual PC as Microsoft Virtual PC 2004. Virtual PC was developed for a while and was included in Windows 7 under the name Windows Virtual PC. In Windows 8 and later, Virtual PC was replaced by the desktop version of Hyper-V.



Based on Virtual PC, the server product Virtual Server was created and existed until the beginning of 2008. Given the obvious technological defeat against VMware ESX, it was decided to curtail development of the type 2 hypervisor in favor of an in-house type 1 hypervisor, which became Hyper-V. There is an unofficial opinion in the industry that Hyper-V is surprisingly similar to Xen in architecture, roughly the same way .NET is similar to Java.

"Of course, you might think that Microsoft stole the idea of Java. But that is not true: Microsoft was inspired by it!" (from a speech by a Microsoft representative at the launch of Windows Server 2003)


As a curious aside, inside Microsoft the use of the company's own virtualization products in the 2000s was, to put it mildly, optional. There are TechNet screenshots from virtualization articles in which the VMware Tools icon is clearly visible in the system tray, and Mark Russinovich ran a demo using VMware Workstation at the Platforma 2009 conference in Moscow.



In an effort to enter new markets, Microsoft built its own public cloud, Azure, using a heavily modified Nano Server with Hyper-V, S2D and SDN support as the platform. It is worth noting that at first Azure in some respects lagged far behind the on-premise products. For example, support for generation 2 virtual machines (with Secure Boot, booting from GPT partitions, PXE boot, etc.) appeared in Azure only in 2018, while on-premise, generation 2 VMs have been known since Windows Server 2012 R2. The same goes for portal solutions: until 2017, Azure and the Windows Azure Pack (a multi-tenancy cloud solution with SDN and Shielded VM support, which replaced System Center App Controller in 2013) used the same portal design. After Microsoft announced its course toward the public cloud, Azure moved ahead, developing and adopting various know-how. Around 2016 a perfectly logical picture emerged: all the innovations in Windows Server now come from Azure, and not in the opposite direction. The copying of parts of the documentation from Azure to on-premise "as is" (see the documentation for Azure SDN and the Network Controller) points to this, which on the one hand hints at the attitude toward on-premise solutions, and on the other indicates the relationship between the solutions in terms of entities and architecture. Who copied from whom and how it really was is a debatable question.



In March 2018 Satya Nadella (Microsoft CEO) officially announced that the public cloud was becoming the company's priority. This obviously signals the gradual winding down and fading of the on-premise server product line (stagnation had been visible back in 2016, but it was confirmed by the first Windows Server beta and the rest of the on-premise product lines), with the exception of Azure Edge, the minimal server infrastructure in the customer's office for services that cannot be moved to the cloud.



Virtual Iron



Founded in 2003, Virtual Iron offered a commercial version of Xen and was one of the first to offer the market full hardware virtualization support.

In 2009 it was absorbed by Oracle to develop Oracle's own virtualization line, Oracle VM, and extend it to x86. Before that, Oracle VM had been offered only on the SPARC platform.



Innotek



In early 2007, Innotek GmbH released the second-type proprietary desktop hypervisor, VirtualBox, which is free for non-commercial use. In the same year, an open source version was released.



In 2008, it was acquired by Sun, which in turn was acquired by Oracle. Oracle has maintained free use of the product for non-commercial purposes.

VirtualBox supports three formats of virtual disks - VDI (native), VMDK (VMware), VHD (Microsoft). The host OS supported are Windows, macOS, Linux, Solaris, and OpenSolaris. The fork of VirtualBox for FreeBSD is known.



IBM



A mainframe is the main computer of a data center, with a large amount of internal and external memory (for reference: in the 1960s, 1 MB of memory was considered unrealistically large). The mainframe effectively was the computing center: the first computers occupied entire machine rooms and consisted of huge racks; today we call such a room a data center. But a modern data center machine room can hold thousands of computers, whereas at the dawn of computing a single computer occupied an entire room. Each rack held one (!) part of the computer (separate racks with memory, separate racks with storage devices, and separate peripheral devices). The core of this huge machine was the rack with the processor; it was called the main frame, hence "mainframe". After the switch to transistor-based integrated circuits, this marvel of scientific and engineering thought shrank considerably, and mainframes came to mean IBM machines and their analogues.



In the 1960s, renting the computing power of a whole mainframe, let alone buying one, cost a great deal of money, and very few companies and institutions could afford such a luxury. Computing power was leased by the hour (a prototype of the modern pay-as-you-go model in public clouds, isn't it?), and tenants were given access to the machine one after another. The logical next step was to parallelize the computing load and isolate the tenants' calculations from each other.



For the first time, the idea of isolating several operating system instances on one mainframe was proposed by the IBM Cambridge Scientific Center on the basis of the IBM System/360-67 mainframe. The development was called CP/CMS, was in effect the first hypervisor, and provided paravirtualization. CP (Control Program) was the hypervisor itself, which created several independent "virtual machines" (VMs). CMS (originally the Cambridge Monitor System, later renamed the Conversational Monitor System) was a lightweight single-user operating system. Curiously, CMS is still alive and is still used in the latest generation of z/VM mainframes. It should be noted that at that time, and right up to the 1990s, a virtual machine meant a logical partitioning of physical disks (disks and storage devices were shared; the hypervisor provided no storage of its own) with a dedicated slice of virtual memory and of processor time via time-sharing. VMs had no networking, since the VMs of that era were about computing and storing data, not transferring it. In that sense, the VMs of the time were more like containers than VMs in the modern sense.



The first commercial hypervisor based on CP/CMS, called VM/370, appeared on System/370 series mainframes on August 2, 1972. The general name of this family of operating systems is VM, and within this section VM will mean precisely the IBM hypervisor. The ability to run several operating systems at the same time while guaranteeing system stability and isolating users from each other (an error in one user's OS could not affect another user's calculations) was revolutionary and became a key factor in the commercial success of VM/370. A curious fact: at that time in the USSR, the research institute of computer technology in Minsk quite successfully cloned the System/370 architecture and created its own analogue of VM/370 within the ES EVM series of computers (with support for nested virtualization! - so that the base OS itself could be developed inside a VM). Such mainframes were used by research institutes and defense enterprises of the socialist bloc.



The 80s can safely be called the "era of the mainframe." VM was a success with operating system developers; applications were written for it and calculations were run on it. This was the decade when databases running under the VM OS came to prevail on mainframes. One of the most important changes was logical partitions (LPARs), which effectively provided two levels of virtualization. Clients could now use the same set of processors, I/O devices and modems across VM systems running in different LPARs, and resources could be migrated from one VM system to another. This allowed IT organizations to deliver consistent performance while handling workload spikes. To streamline the growing customer base, VM was split into three separate products, available by the late 80s:



VM/SP - the standard multipurpose virtualization operating system for System z servers

VM/SP HPO (High Performance Option) - a high-performance version of VM/SP for high-end System z server models

VM/XA (eXtended Architecture) - a variant of VM with support for the extended S/370 architecture



In the early 90s, the simplicity and convenience of the x86 architecture became more attractive to customers, and mainframes rapidly lost relevance. They were replaced by cluster systems, much as grunge replaced glam metal at around the same time. Nevertheless, for a certain class of tasks, such as building a centralized data warehouse, mainframes still justify themselves both in performance and in economic terms. Therefore, some enterprises continue to use mainframes in their infrastructures, and IBM keeps designing, releasing and supporting new generations.



Linux Xen



Xen (pronounced "zen") is a hypervisor developed at the University of Cambridge Computer Laboratory under the direction of Ian Pratt and distributed under the GPL. The first public version appeared in 2003. Ian subsequently continued working on a commercial version of the hypervisor, founding the company XenSource.

In 2013, Xen came under the control of the Linux Foundation.



XenSource



After several years on the market with the XenServer and XenEnterprise products, the company was acquired by Citrix at the end of 2007.



Citrix XenServer



Having absorbed XenSource for $500 million, Citrix failed to turn the acquisition into a commercial success. Or rather, it did not really try, never treating XenServer as a primary product and relying instead on cheap perpetual licenses. After frankly unsuccessful sales against the background of the highly successful VMware ESX, in 2009 it was decided to release XenServer to the world for free and fully open source. However, the code of the proprietary XenCenter management system was never opened.

One cannot fail to note the interesting chronological coincidence of Citrix and Microsoft initiatives in the field of industrial virtualization, all the more so given that the two companies have always been very close.



Despite their common marketing name, Citrix XenApp and XenDesktop have nothing to do with the Xen hypervisor.



Amazon



Amazon unveiled its public IaaS cloud offering, EC2 (Elastic Compute Cloud), in 2006. Initially the EC2 platform used the Xen hypervisor; subsequently Amazon split the platform into three parts, each using its own branch and version of the hypervisor, in order to minimize the effect of code errors on service availability.



In 2017, KVM appeared in EC2 as an additional hypervisor for heavy workloads. There are opinions that this points to a gradual migration of EC2 entirely onto KVM in the future.



Linux QEMU/KVM



QEMU (Quick EMUlator) is universal software for emulating the hardware of various platforms, distributed under the GPL v2 license. In addition to x86 it supports ARM, MIPS, RISC-V, PowerPC, SPARC and SPARC64. For all the versatility of full emulation, QEMU lacked performance comparable to a non-virtualized system. To speed up QEMU on x86, two main options were proposed (the KQEMU and QVM86 kernel accelerators), both ultimately rejected in favor of KVM (Kernel-based Virtual Machine), developed by Qumranet.



When we say KVM, we mean QEMU/KVM, and accordingly we get the qcow2 (QEMU copy-on-write 2) virtual disk format on all platforms based on the KVM hypervisor.

Although QEMU by itself works as a type-2 hypervisor, QEMU/KVM is a type-1 hypervisor.
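To make the QEMU/KVM division of labor concrete, here is a minimal sketch (assuming qemu-img and qemu-system-x86_64 are installed and /dev/kvm is available; the disk name, ISO path and sizes are illustrative) that creates a qcow2 disk and boots a guest with KVM acceleration enabled:

```python
# Minimal sketch: create a qcow2 disk with qemu-img and boot a KVM-accelerated
# guest with qemu-system-x86_64. Assumes both tools are installed and /dev/kvm
# is available; the disk name, ISO path and sizes are illustrative only.
import subprocess

DISK = "demo.qcow2"
ISO = "installer.iso"  # hypothetical installation image

# qcow2 (QEMU copy-on-write 2) is allocated lazily, so the file starts small.
subprocess.run(["qemu-img", "create", "-f", "qcow2", DISK, "20G"], check=True)

# -enable-kvm switches QEMU from pure emulation to hardware-assisted
# virtualization through the KVM kernel module.
subprocess.run([
    "qemu-system-x86_64",
    "-enable-kvm",             # use /dev/kvm instead of software emulation
    "-m", "2048",              # 2 GiB of guest RAM
    "-smp", "2",               # 2 virtual CPUs
    "-drive", f"file={DISK},format=qcow2,if=virtio",
    "-cdrom", ISO,
], check=True)
```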



Qumranet



An Israeli company, the original developer and main sponsor of the KVM hypervisor and the SPICE protocol. Founded in 2005, it gained fame after KVM was incorporated into the Linux kernel. On September 4, 2008, it was acquired by Red Hat.



Red Hat



Like all GNU/Linux distribution vendors, Red Hat until 2010 shipped built-in support for the Xen hypervisor in its distributions. However, being a major market player and a serious brand, the company started thinking about a hypervisor implementation of its own. The then-unremarkable but promising KVM hypervisor was taken as the basis. The first version of Red Hat Enterprise Virtualization (RHEV) 2.2 was introduced in 2010, staking a claim to a piece of the VDI market against Citrix and VMware thanks to the work of Qumranet, acquired two years earlier. Out of the box it offered high-availability clusters, Live Migration, and M2M migration tools (RHEL only). It is noteworthy that, judging by the documentation of the time, Red Hat retained Xen terminology when describing the solution architecture.



On October 28, 2018, IBM announced the purchase of Red Hat.



OpenStack



Historically, the OpenStack project emerged as an initiative to put something up against VMware's de facto monopoly in heavyweight x86 server virtualization. The project appeared in 2010 through the joint efforts of Rackspace Hosting (a cloud provider) and NASA (which opened the code of its own Nebula platform). A piquant touch was added in 2012, when VMware joined the OpenStack project's governance, causing a wave of indignation among the founding activists.



Over time, Canonical (Ubuntu Linux), Debian, SUSE, Red Hat, HP, and Oracle joined the project.



However, not everything was smooth. In 2012, NASA left the project, opting for AWS. In early 2016, HPE completely closed its Helion project based on OpenStack.



Within the OpenStack project, KVM has been adopted as the standard hypervisor. However, thanks to the modular approach, an OpenStack-based system can be built with other hypervisors, keeping, for example, only the control plane from OpenStack.
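For a sense of how small that choice looks in configuration terms, here is a minimal sketch of the compute-node setting through which OpenStack's compute service (Nova) selects its hypervisor backend. The option names follow the usual Nova layout, but treat the exact file path and values as an assumption rather than a reference configuration.

```python
# Minimal sketch: the hypervisor choice in OpenStack lives in the Nova compute
# configuration. Setting [libvirt] virt_type = kvm tells Nova to drive KVM via
# libvirt; "qemu" would fall back to pure emulation. The path and values are
# an assumption for illustration, not a reference configuration.
import configparser

conf = configparser.ConfigParser()
conf["DEFAULT"] = {"compute_driver": "libvirt.LibvirtDriver"}
conf["libvirt"] = {"virt_type": "kvm"}   # other backends exist: qemu, xen, lxc, ...

with open("nova.conf", "w") as f:        # normally /etc/nova/nova.conf
    conf.write(f)
```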



There is a wide range of opinions about the OpenStack project, from enthusiastic worship to serious skepticism and harsh criticism. The criticism is not without grounds: a significant number of problems and data losses have been recorded when using OpenStack. That, however, does not stop the fans from denying everything and blaming sloppy implementation and operation of the specific systems involved.



The OpenStack project is not limited to virtualization alone; over time it has grown a significant number of subprojects and components that extend it toward a full public cloud service stack. And it is probably in this respect that the significance of OpenStack should be judged: these components have become key elements of many commercial products and systems, both in virtualization and beyond.



In Russia, outside of public clouds, OpenStack is known primarily for its role in import substitution: the vast majority of domestic virtualization solutions and products, including hyperconverged systems, are OpenStack repackaged with varying degrees of refinement.



Nutanix AHV



From its inception, Nutanix was a product and platform exclusively for VMware vSphere. However, partly out of a desire to extend the offering to other hypervisors, and partly because of a political crisis in its relationship with VMware, the company decided to develop a hypervisor of its own, which would complete the boxed platform and make it possible to drop third-party products. KVM was chosen as the basis, and within the platform it was named AHV (Acropolis HyperVisor).



Parallels



In version 7 of Virtuozzo, the company switched from its own hypervisor to KVM.



Proxmox



Proxmox VE (Virtual Environment) is an open source project of the Austrian company Proxmox Server Solutions GmbH based on Debian Linux. The first release was in 2008.

The product supports container virtualization with LXC (formerly OpenVZ) and full virtualization with the KVM hypervisor.



Parallels / Virtuozzo / Rosplatform



Founded in 1999 by Sergey Belousov, SWsoft took up hosting management software. In 2003, it acquired its Novosibirsk rival Plesk.



In 2004, SWsoft acquired Nikolai Dobrovolsky's Russian company Parallels, with its product Parallels Workstation (a type-2 desktop hypervisor for Windows).

The combined company kept the Parallels name and soon blew up the market with Parallels Desktop for Mac (a type-2 desktop hypervisor for macOS).



In server virtualization, the focus remained on hosting providers and data centers rather than corporate use. Because of the specifics of that market, the key product became Virtuozzo and OpenVZ containers rather than system virtual machines. Later, Parallels tried, without much success, to enter the enterprise server virtualization market with the Parallels Bare Metal Server product (subsequently Parallels Hypervisor and Cloud Server, and then Virtuozzo), and added hyperconvergence with its Cloud Storage. Work on automation and orchestration for hosting providers continued.



In 2015, on the basis of the server virtualization products, the Rosplatform project was created - technically (leaving aside legal and organizational issues) the same Virtuozzo, only with different wallpaper and an entry in the Russian software registry. Based on Rosplatform software and Depo hardware, IBS created the packaged hyperconverged offering Scala-R.



Prior to version 7, Virtuozzo used a hypervisor of its own design; in version 7, a transition to KVM was made. Accordingly, Rosplatform is also based on KVM.

After several mergers, acquisitions and rebrandings, the following picture had taken shape by 2019.



Parallels Desktop remained with the Parallels subsidiary, which was sold to Corel. All the automation business went to Odin and was sold to Ingram Micro. Server virtualization remained under the Virtuozzo / Rosplatform brands.


