Huawei Dorado V6: Sichuan Heat



Summer in Moscow this year was, frankly, underwhelming. It started too early and too abruptly, not everyone had time to react, and it was already over by the end of June. So when Huawei invited me to China, to the city of Chengdu, where their R&D center is located, I looked at the forecast of +34 degrees in the shade and agreed immediately. I am not as young as I used to be, and my bones could use some warming up. As it turned out, it was possible to warm not only the bones but also the insides, because Sichuan province, where Chengdu is located, is famous for its love of spicy food. Still, this blog is not about travel, so back to the main purpose of the trip: the new line of storage systems, Huawei Dorado V6. This article is a small greeting from the past, since it was written before the official announcement but published only after the release. So today we look at everything interesting and tasty that Huawei has prepared for us.









The new line consists of 5 models. All models except the 3000 V6 come in two versions: SAS and NVMe. This choice determines the drive interface, the back-end ports, and the number of drives you can install in the system. The NVMe version uses palm-sized SSDs, which are thinner than classic 2.5" SAS SSDs, so up to 36 of them can be installed. The new line is All Flash only; there are no configurations with spinning disks.









Palm NVMe SSD







In my opinion, the Dorado 8000 and 18000 are the most interesting models. Huawei positions them as High-end systems and, thanks to its pricing policy, pits them against competitors' Mid-range segment. These are the models I will concentrate on in this review. I should note right away that, because of their design, the junior dual-controller systems have a somewhat different architecture from the Dorado 8000 and 18000, so not everything I describe today applies to the junior models.







One of the main features of the new systems is the use of several chips of Huawei's own design, each of which offloads part of the logical load from the controller's central processor and adds functionality to different components.







The heart of the new systems is the Kunpeng 920 processor, built on ARM technology and produced by Huawei itself. Depending on the model, the number of cores, their frequency, and the number of processors installed in each controller vary:

Huawei Dorado V6 8000 - 2 CPUs per controller, 64 cores each

Huawei Dorado V6 18000 - 4 CPUs per controller, 48 cores each







Huawei developed this processor on the ARM architecture and, as far as I know, originally planned to use it only in the top-end Dorado 8000 and 18000 models, as was the case with some V5 models, but the sanctions changed that plan. ARM also talked about ending cooperation with Huawei when the sanctions were imposed, but the situation here differs from the one with Intel. Huawei produces these chips itself, and no sanctions can stop that process. A break with ARM threatens only the loss of access to new designs. As for performance, it can be judged only after independent tests. Although I watched a Dorado 18000 system deliver 1M IOPS without any trouble, I won't believe it until I reproduce it with my own hands in my own rack. That said, there is certainly no shortage of compute power in the controllers. The top-end 18000 has 4 controllers, each with 4 processors of 48 cores, which gives 768 cores in total.
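The core count quoted above is simple multiplication, and a few lines make it easy to re-check. This is only a sketch of the arithmetic using the figures given in this article for the Dorado 18000; the function name `total_cores` is my own and not anything from Huawei.

```python
# Figures as quoted in this article for the Dorado 18000 V6.
CONTROLLERS = 4            # controllers in the system
CPUS_PER_CONTROLLER = 4    # Kunpeng 920 processors per controller
CORES_PER_CPU = 48         # cores per processor in this model


def total_cores(controllers: int, cpus_per_controller: int, cores_per_cpu: int) -> int:
    """Return the total number of CPU cores across all controllers."""
    return controllers * cpus_per_controller * cores_per_cpu


print(total_cores(CONTROLLERS, CPUS_PER_CONTROLLER, CORES_PER_CPU))  # 768
```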







I'll come back to the cores a bit later, when we look at the architecture of the new systems; for now, let's turn to another chip installed in the system. The Ascend 310 looks like an extremely interesting solution (as I understand it, it is the younger brother of the recently introduced Ascend 910). Its job is to analyze the data blocks arriving at the system in order to raise the read hit ratio. It is still hard to say how it will perform in practice, because today it works only according to a preset pattern and cannot learn in an intelligent mode. The intelligent mode is promised in a future firmware release, most likely at the beginning of next year.







Let's move on to the architecture. Huawei has continued to develop its own Smart Matrix technology, which implements a full-mesh approach to connecting components. In V5 this applied only to access from the controllers to the disks; now all controllers have access to all ports, both back-end and front-end.







Thanks to the new microservice architecture, this also makes it possible to balance the load across all controllers, even if there is only one LUN. The OS for this line of arrays was developed from scratch rather than simply optimized for flash drives. Because all controllers have access to the same ports, the host does not lose a single path to the storage system when a controller fails or reboots; path switching is performed at the storage system level. Using UltraPath on the host is therefore not strictly necessary. Another “saving” when installing the system is the smaller number of links required. With the “classical” approach, 4 controllers need 8 links to 2 fabrics; with Huawei, even 2 are enough (I am leaving aside whether the bandwidth of a single link is sufficient).
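To make the link arithmetic and the path behavior concrete, here is a small toy model. It is a sketch under my own assumptions, not a description of Huawei's implementation: the controller names, fabric names, and helper functions are all hypothetical, and a "port" here is simply a host-facing link that stays up as long as at least one controller can serve it.

```python
from __future__ import annotations


def classic_links(controllers: int, fabrics: int) -> int:
    """Classical design: each controller owns its ports, so it needs one link per fabric."""
    return controllers * fabrics


def shared_frontend_links(fabrics: int) -> int:
    """Shared front end: every controller reaches every port, so one link per fabric is the minimum."""
    return fabrics


def surviving_ports(port_servers: dict[str, set[str]], failed: str) -> list[str]:
    """Return the ports that still have at least one controller serving them after a failure."""
    return [port for port, servers in port_servers.items() if servers - {failed}]


controllers = ["A", "B", "C", "D"]   # hypothetical controller names
fabrics = ["fab1", "fab2"]           # hypothetical fabric names

# Classical: one port per controller per fabric, served only by its owner.
classic = {f"{c}-{f}": {c} for c in controllers for f in fabrics}
# Shared front end: one port per fabric, served by every controller.
shared = {f: set(controllers) for f in fabrics}

print(classic_links(len(controllers), len(fabrics)))  # 8
print(shared_frontend_links(len(fabrics)))            # 2
print(len(surviving_ports(classic, "A")))             # 6 of 8 host paths remain
print(len(surviving_ports(shared, "A")))              # 2 of 2 host paths remain
```

In the classical model the host loses the two paths that ended on the failed controller and has to fail over itself; in the shared model the host sees no path loss at all, which is the behavior described above.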







As in the previous version, a global cache with mirroring is used. This allows the system to lose up to two controllers simultaneously, or three controllers one after another, without affecting availability. It is worth noting, however, that on the demo stand we did not see full load balancing across the remaining 3 controllers after one failed: the load of the failed controller was taken over entirely by one of the remaining ones. Perhaps the system needs to run longer in this configuration for rebalancing to happen. In any case, I will check this in more detail in my own tests.

Huawei positions the new systems as End-to-End NVMe, but so far an NVMe-oF front end is not supported, only FC, iSCSI, or NFS. RoCE support, like other features, is promised for the end of this year or the beginning of the next.







Shelves are also connected to the controllers via RoCE, and this brings one drawback: there is no “loop” connection of shelves, as there was with SAS. In my opinion, this is still a fairly big drawback if you are planning a fairly large system. All shelves in a chain are connected in series, so the failure of one shelf makes all the shelves after it completely inaccessible. To ensure fault tolerance, we have to connect every shelf directly to the controllers, which increases the number of back-end ports required in the system.
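The trade-off between serial chains and direct connection can be sketched in a few lines. This is a simplified model of my own: shelf names and function names are hypothetical, a chain has no return loop, and I count one back-end port per chain while ignoring redundant links.

```python
import math


def reachable_in_chain(chain: list[str], failed: str) -> list[str]:
    """Shelves still reachable in a serial chain: only those before the failed one."""
    if failed not in chain:
        return list(chain)
    return chain[: chain.index(failed)]


def reachable_direct(shelves: list[str], failed: str) -> list[str]:
    """Shelves still reachable when each shelf has its own link to the controllers."""
    return [s for s in shelves if s != failed]


def backend_ports(shelf_count: int, shelves_per_chain: int) -> int:
    """Back-end ports needed, assuming one port per chain (a simplification)."""
    return math.ceil(shelf_count / shelves_per_chain)


shelves = ["shelf1", "shelf2", "shelf3", "shelf4"]
print(reachable_in_chain(shelves, "shelf2"))     # ['shelf1']
print(reachable_direct(shelves, "shelf2"))       # ['shelf1', 'shelf3', 'shelf4']
print(backend_ports(4, 4), backend_ports(4, 1))  # 1 4
```

With a single chain of four shelves, one failure in the second position takes three shelves offline; connecting each shelf directly limits the loss to the failed shelf but quadruples the port count in this toy example.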







One more thing worth mentioning is non-disruptive update (NDU). As I mentioned above, the OS of the new Dorado line is built on a container approach, which makes it possible to update and restart services without a full reboot of the controller. I should say right away that some updates will include kernel updates, and in those cases a classic controller reboot will still be required, but not always. This should reduce the impact of updates on the production system.







The vast majority of arrays in our arsenal come from NetApp. So I think it is logical to make a small comparison with the systems I work with quite a lot. This is not an attempt to determine who is better or worse, or whose architecture is more advantageous. I will try to compare, soberly and without fanaticism, two vendors' different approaches to the same problem. Of course, in this case we are considering Huawei systems “in theory,” and I will separately note the points that are only planned for future firmware versions. Here are the advantages I see at the moment:







  1. The number of supported NVMe drives. NetApp currently supports 288; Huawei, depending on the model, 1600 to 6400. At the same time, Huawei's maximum usable capacity is 32 PBe, about the same as NetApp systems (31.64 PBe, to be precise), even though both support drives of the same capacity (up to 15 TB). Huawei explains this as follows: they did not have the opportunity to assemble a larger test stand. In theory there is no capacity limit, but they simply have not been able to test it. It is worth noting, though, that flash drives are very capable today, and with NVMe systems 24 drives are enough to saturate a top-end 2-controller system. Adding more drives therefore not only brings no performance gain but also worsens the IOPS/TB ratio. Of course, it remains to be seen how many drives the 4-controller 8000 and 18000 systems can handle, because the capabilities and potential of the Kunpeng 920 are still not entirely clear.
  2. On NetApp systems a LUN has an owner, i.e. only one controller can perform operations on the LUN, while the second merely passes I/O through. Huawei systems, by contrast, have no owners: any controller can perform operations on data blocks (compression, deduplication) and write them to disk.
  3. No port loss when one of the controllers fails. For some, this point is extremely critical. The idea is that switching inside the storage system should be faster than switching on the host side. With NetApp, we have in practice seen a freeze of about 5 seconds when pulling a controller and switching paths; switching on Huawei has yet to be tested in practice.
  4. No need to restart the controller during an upgrade. This matters to me especially because of the fairly frequent release of new versions and firmware branches for NetApp. Yes, some Huawei updates will still require a restart, but not all.
  5. 4 Huawei controllers for the price of two NetApp controllers. As I said above, thanks to its pricing policy, Huawei can compete in the Mid-range segment with its High-end models.
  6. Additional chips in the shelf controllers and port cards, which are potentially intended to increase the efficiency of the system.
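The IOPS/TB argument from the first point can be illustrated with a toy model. All numbers here are hypothetical and chosen only so that the controllers saturate at 24 drives, as described above; they are not measurements of any Huawei or NetApp system, and the function names are my own.

```python
def system_iops(drives: int, per_drive_iops: int, controller_ceiling_iops: int) -> int:
    """System throughput: drives add IOPS until the controllers become the bottleneck."""
    return min(drives * per_drive_iops, controller_ceiling_iops)


def iops_per_tb(drives: int, per_drive_iops: int,
                controller_ceiling_iops: int, drive_tb: float) -> float:
    """IOPS available per terabyte of raw capacity."""
    return system_iops(drives, per_drive_iops, controller_ceiling_iops) / (drives * drive_tb)


PER_DRIVE_IOPS = 50_000                   # hypothetical per-drive figure
DRIVE_TB = 15                             # drive size mentioned in the article
CONTROLLER_CEILING = 24 * PER_DRIVE_IOPS  # assumption: controllers saturate at 24 drives

for drives in (12, 24, 48, 96):
    total = system_iops(drives, PER_DRIVE_IOPS, CONTROLLER_CEILING)
    ratio = iops_per_tb(drives, PER_DRIVE_IOPS, CONTROLLER_CEILING, DRIVE_TB)
    print(f"{drives:3d} drives: {total:>9,d} IOPS, {ratio:7.0f} IOPS/TB")
```

In this model, total IOPS stop growing after 24 drives, while IOPS/TB halves every time the drive count doubles beyond that point.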


Drawbacks and general concerns:







  1. Serial, non-looped connection of shelves, or else the need for a large number of back-end ports to connect every shelf directly to the controllers.
  2. The ARM architecture and the large number of chips: how efficiently will it all work, and will there be enough performance?


Most of these fears and doubts can be dispelled by testing the new line myself. I hope the systems will appear in Moscow soon after the release, and in sufficient numbers that I can quickly get one for my own tests. For now, I can say that the company's approach looks interesting overall and that the new line looks very good against competitors. The final implementation, however, raises many questions, because many things will be seen only at the end of the year, and some perhaps only in 2020.







