How to measure the performance of blockchain networks: key metrics

There are many metrics related to the logic and quality of a blockchain. They help identify bottlenecks in the code and find logical and optimization problems in the consensus and finality algorithms. Developing any distributed system, including a blockchain, requires analyzing the work of many nodes at once, and metrics let the project team monitor the state of the entire network, spot problems with individual nodes, detect DoS attacks against the network, and much more. Let's look at the main ones.







"Transactions per second" (TPS)



In the case of distributed systems, TPS is a capricious and ambiguous number that does not always reflect the real quality of the service provided to users. TPS measurements came to us from distributed databases, where TPS means standardized test transactions or sets of them (so many INSERTs, so many UPDATEs, so many DELETEs against a background of constant SELECTs) run on a fixed cluster configuration or even on a single machine. Such metrics give only rough estimates of the performance of distributed databases or blockchains, since transaction processing time can vary greatly depending on many factors.







Consistency-oriented databases (see the CAP theorem) do not commit a transaction until they receive a sufficient number of confirmations from other nodes, and that is slow. Availability-oriented databases consider a transaction successful once it has simply been written to disk; they immediately give the client the updated data, which is very fast (although the transaction may later be rolled back). Also, if the transactions used in a benchmark update only a single data cell, TPS will obviously be higher than when transactions can touch many cells and block each other. Each database implements its locking mechanisms in its own way, which is why we don't see "TPS competitions" between Oracle, MSSQL, and PostgreSQL on the one hand and MongoDB, Redis, and Tarantool on the other: they have very different internal mechanisms and solve different tasks.







In my opinion, "measuring the TPS" of a blockchain means taking a full range of measurements of its performance under clearly stated conditions.









To talk about the cherished "transactions per second", you need to describe all the conditions (the number of validators, their geographic distribution, the packet loss level, etc.) and the logic of the benchmark. In blockchains, simply applying a transaction to the internal database does not mean it has been accepted by consensus. In Proof-of-Work, for example, transactions are, strictly speaking, never final at all, and the fact that a transaction was included in a block on one machine does not mean the entire network will accept it (for example, if another fork wins).
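
A minimal sketch of the point above, in Go: a TPS figure only means something together with the conditions it was measured under, so a benchmark result should carry its full configuration. All field names here are illustrative and not taken from any real tool.

```go
package main

import "fmt"

// BenchmarkConditions records everything a "TPS" claim depends on.
type BenchmarkConditions struct {
	Validators     int      // number of validator nodes
	Regions        []string // geographic distribution of the nodes
	PacketLossPct  float64  // emulated packet loss, in percent
	TxKind         string   // what the benchmark transactions actually do
	CountFinalized bool     // were transactions counted only after finalization?
}

type BenchmarkResult struct {
	Conditions BenchmarkConditions
	TPS        float64
}

func main() {
	r := BenchmarkResult{
		Conditions: BenchmarkConditions{
			Validators:     21,
			Regions:        []string{"eu-west", "us-east", "ap-south"},
			PacketLossPct:  1.0,
			TxKind:         "token transfer",
			CountFinalized: true,
		},
		TPS: 450, // hypothetical number, for illustration only
	}
	// A bare "450 TPS" tells the reader nothing; printing it together
	// with the conditions turns it into a verifiable claim.
	fmt.Printf("%+v\n", r)
}
```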







If the blockchain has an additional algorithm that ensures the finality of transactions (EOS, Ethereum 2.0, Polkadot parachains using consensus with GRANDPA finality), then the processing time can be taken as the gap between the moment the node first saw the transaction and the next finalized block that includes it. Such closer-to-reality "TPS" figures are rarely seen in project promises. Naturally, they are lower than the numbers in the whitepaper, but they are far more informative.
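
A hedged sketch of this measurement in Go: remember when the node first saw each transaction and report the gap once a finalized block containing it arrives. The callback names and types are illustrative, not a real node API.

```go
package main

import (
	"fmt"
	"time"
)

type TxID string

// seenAt records when this node first saw each pending transaction.
var seenAt = map[TxID]time.Time{}

// onTxSeen would be called when the node first receives a transaction.
func onTxSeen(id TxID) { seenAt[id] = time.Now() }

// onBlockFinalized would be called when a finalized block arrives;
// the gap between the two timestamps is the realistic processing time.
func onBlockFinalized(txs []TxID) {
	now := time.Now()
	for _, id := range txs {
		if t0, ok := seenAt[id]; ok {
			fmt.Printf("tx %s finalized after %v\n", id, now.Sub(t0))
			delete(seenAt, id)
		}
	}
}

func main() {
	onTxSeen("tx-1")
	time.Sleep(50 * time.Millisecond) // stand-in for consensus plus finality delay
	onBlockFinalized([]TxID{"tx-1"})
}
```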







So I warn you again: many different meanings can hide behind the term "TPS". Be skeptical and ask for details.







Blockchain-specific metrics










Local TPS



The number of transactions processed by a node, and the max/avg/min time of their processing on the local node, are very convenient to measure, since the functions performing this operation are usually explicitly separated in the code. You can simply measure how long a transaction took to update the state database. These transactions may not yet have been accepted by consensus, but they have already been validated, and the node can already serve the updated data to clients (assuming no competing fork of the chain appears).

This metric is not entirely honest: if another fork of the chain is chosen as the main one, the statistics for the rolled-back transactions must be rolled back too. In testing, however, this can almost always be neglected.
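
For illustration, a minimal Go sketch of this measurement: wrap the function that applies a transaction to the state database in a timer and keep min/avg/max. The transaction body below is a placeholder for real state-update code.

```go
package main

import (
	"fmt"
	"time"
)

// stats keeps the simple latency aggregates mentioned above.
type stats struct {
	n               int
	total, min, max time.Duration
}

func (s *stats) observe(d time.Duration) {
	if s.n == 0 || d < s.min {
		s.min = d
	}
	if d > s.max {
		s.max = d
	}
	s.total += d
	s.n++
}

// timedApply measures one state-database update.
func timedApply(applyTx func(), s *stats) {
	start := time.Now()
	applyTx()
	s.observe(time.Since(start))
}

func main() {
	var s stats
	for i := 0; i < 1000; i++ {
		// placeholder: a real node would apply a validated transaction here
		timedApply(func() { time.Sleep(100 * time.Microsecond) }, &s)
	}
	fmt.Printf("txs=%d min=%v avg=%v max=%v local TPS=%.0f\n",
		s.n, s.min, s.total/time.Duration(s.n), s.max,
		float64(s.n)/s.total.Seconds())
}
```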







This is often the number quoted in brief reports ("our blockchain hit 8,000 TPS yesterday"), since it is easy to measure: a single running node and a script that loads it are enough. In this case there is no network delay slowing the network's path to consensus, and the metric shows the performance of the state database without the network's influence. This number is not the real bandwidth of the blockchain network, but it shows the limit the network will approach if consensus and networking are fast enough.







The result of any blockchain transaction is several atomic updates to storage. For example, a payment transaction in Bitcoin is the deletion of several old UTXOs and the insertion of several new ones, while in Ethereum it is the execution of a short smart-contract code and, again, the update of several key-value pairs. The number of these "atomic" write operations makes an excellent metric for identifying bottlenecks in the storage subsystem and in the internal transaction logic.
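
One possible way to collect this metric, sketched in Go with an illustrative store interface: wrap the key-value layer with counters, so that a UTXO-style payment shows up as a few deletes plus a few inserts.

```go
package main

import "fmt"

// countingStore wraps a key-value map and counts atomic write operations.
type countingStore struct {
	data                      map[string][]byte
	inserts, updates, deletes int
}

func (s *countingStore) Put(k string, v []byte) {
	if _, ok := s.data[k]; ok {
		s.updates++
	} else {
		s.inserts++
	}
	s.data[k] = v
}

func (s *countingStore) Delete(k string) {
	delete(s.data, k)
	s.deletes++
}

func main() {
	s := &countingStore{data: map[string][]byte{"utxo:a": {1}, "utxo:b": {1}}}
	// A Bitcoin-style payment: spend two old outputs, create two new ones.
	s.Delete("utxo:a")
	s.Delete("utxo:b")
	s.Put("utxo:c", []byte{1})
	s.Put("utxo:d", []byte{1})
	fmt.Printf("inserts=%d updates=%d deletes=%d\n", s.inserts, s.updates, s.deletes)
}
```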







Also, blockchain nodes may exist in several programming-language implementations, which improves reliability, and this should be taken into account when evaluating network performance. For example, Ethereum has node implementations in both Rust and Go, and other blockchains likewise seek additional implementations for the sake of reliability.







Locally produced blocks



This simple metric shows how many blocks each validator has produced. It is a product of consensus and can be considered the main metric for assessing the "usefulness" of individual validators to the network.







Since validators earn money on each block, they have a stake in the stable operation and security of their machines. This number helps determine which of the candidate validators are the most qualified, protected, and prepared to work on a public network holding real users' assets. The value of the metric can be publicly verified by simply downloading the blockchain and counting how many blocks each validator produced.
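
Because anyone can recompute it, the counting itself is trivial; an illustrative Go sketch (the Block structure is hypothetical) that walks the chain and tallies producers:

```go
package main

import "fmt"

// Block is a minimal stand-in for a real block header.
type Block struct {
	Height   uint64
	Producer string // the validator that produced/signed the block
}

// producedBlocks tallies how many blocks each validator produced.
func producedBlocks(chain []Block) map[string]int {
	counts := map[string]int{}
	for _, b := range chain {
		counts[b.Producer]++
	}
	return counts
}

func main() {
	chain := []Block{
		{1, "validator-A"}, {2, "validator-B"},
		{3, "validator-A"}, {4, "validator-C"},
	}
	fmt.Println(producedBlocks(chain)) // map[validator-A:2 validator-B:1 validator-C:1]
}
```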







Finality and Last Irreversible Block



In networks with explicitly implemented finality (EOS, Ethereum, Tendermint, Polkadot, etc.), in addition to the basic fast consensus (in which one validator signature per block is enough), certain blocks must be confirmed by a group of validators. Such blocks are considered final, and the signature-collection procedure is the finality algorithm. Its task is to guarantee that all transactions included in the chain before a finalized block can never be dropped or replaced by another fork. This is both protection against double-spend attacks in Proof-of-Stake networks and a way to give the user a reliable confirmation of a cryptocurrency transaction quickly, within a few seconds.







From the point of view of a blockchain user, a transaction completes not at the moment it is accepted by a node, but when a block appears that finalizes the chain containing it. To finalize a block, validators must receive it over the p2p network and exchange signatures with each other. This is where the real speed of the blockchain shows itself, because the user cares about the moment the block with their transaction is finalized, not about the transaction merely being accepted and written to the disk of one of the nodes.







Finality algorithms also differ, overlap, and combine with the main consensus (for further reading: Casper in Ethereum, Last Irreversible Blocks in EOS, GRANDPA in Parity's Polkadot, and their modifications, such as MixBytes RANDPA).







For networks where not every block is finalized, a useful metric is the lag of the last finalized block behind the current head block. This number shows how far behind the validators are in agreeing on the correct chain. If the gap is large, the finality algorithm requires additional analysis and optimization.
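
The metric itself is a simple subtraction; exported as a gauge, it immediately shows validators falling behind. A tiny Go sketch:

```go
package main

import "fmt"

// finalityLag returns how many blocks the finalized chain trails the head.
func finalityLag(headHeight, lastFinalizedHeight uint64) uint64 {
	if lastFinalizedHeight > headHeight {
		return 0 // should not happen on a healthy node
	}
	return headHeight - lastFinalizedHeight
}

func main() {
	// 42 blocks of lag; worth investigating if it keeps growing.
	fmt.Println("finality lag (blocks):", finalityLag(1042, 1000))
}
```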







Other blockchain metrics



The remaining metrics usually depend heavily on the type of consensus, so it is not quite correct to list them among the main ones. Among them are, for example, the number of chain forks, their length in blocks, the occupancy of blocks with transactions, and so on. They can be used to detect network partitions or to quickly localize the problems of a specific validator.







P2P layer










It is extremely important to remember the layer underlying blockchain networks: the peer-to-peer subsystem. It is what introduces unpredictable delays in the delivery of blocks and transactions between validators. When the validators are few and geographically concentrated and the peer lists are hard-coded, everything works well and fast. But as soon as you add validators, distribute the nodes geographically, and emulate packet loss, significant drops in "TPS" appear.







For example, when testing EOS consensus with its optional finality algorithm, increasing the number of validators even to 80-100 machines spread across four continents did not significantly affect the speed of reaching finality. At the same time, increased packet loss strongly affected the finality lag, which indicates that the p2p layer needs additional tuning for resistance to packet loss rather than to high latency. Unfortunately, there are many different settings and factors, so only benchmarks let us find the effective number of validators that still provides a comfortable speed for the blockchain.







How a p2p subsystem is designed can be understood from the documentation on, for example, libp2p, or from documentation on the Kademlia or BitTorrent protocols.







Important metrics for p2p are:

- the number of connected peers;
- hits and misses when nodes request data from each other;
- the amount of p2p traffic received and sent by each node.

For example, a large number of misses when requesting data means that only a few nodes hold the requested data and cannot distribute it to everyone in time, while the amount of p2p traffic received and sent helps identify a node with network configuration or channel problems.
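
A sketch of such counters in Go (all names illustrative): a miss ratio for data requests plus per-node traffic totals:

```go
package main

import "fmt"

// p2pMetrics collects the counters discussed above for one node.
type p2pMetrics struct {
	hits, misses      uint64 // data requests served from / missing in local storage
	bytesIn, bytesOut uint64 // p2p traffic totals
}

func (m *p2pMetrics) missRatio() float64 {
	total := m.hits + m.misses
	if total == 0 {
		return 0
	}
	return float64(m.misses) / float64(total)
}

func main() {
	m := p2pMetrics{hits: 900, misses: 100, bytesIn: 5 << 20, bytesOut: 42 << 20}
	// A high miss ratio means few nodes hold the requested data;
	// heavily skewed in/out traffic points at channel or configuration problems.
	fmt.Printf("miss ratio=%.2f in=%d MB out=%d MB\n",
		m.missRatio(), m.bytesIn>>20, m.bytesOut>>20)
}
```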







Blockchain node system metrics










The standard system metrics of blockchain nodes are described in a large number of sources, so I will cover them briefly. Their role is to help find bottlenecks and errors in all parts of the code by showing which node subsystems are the most loaded and by which tasks.







CPU



CPU metrics show how much computation the processor performs. If the CPU load is high, the node is computing something, actively using logic or the FPU (the latter is almost never used in blockchains). In blockchains, high CPU load can arise, for example, because the node is verifying electronic signatures, processing transactions with heavy cryptography, or making complex calculations.







The CPU metric can be "cut" into several more useful ones to understand which parts of the code are the most expensive: for example, system (kernel code), user (user processes), io (waiting for I/O on slow external devices such as disk or network), and so on. Here is a good article on the topic.
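
On Linux, these per-mode counters come from /proc/stat; below is a small Go sketch that reads the cumulative jiffies from the first "cpu" line. It is Linux-only, and a real monitor would sample twice and report the difference.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	f, err := os.Open("/proc/stat")
	if err != nil {
		panic(err) // not on Linux, or /proc is unavailable
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	if sc.Scan() {
		// First line looks like: "cpu  user nice system idle iowait irq softirq ..."
		fields := strings.Fields(sc.Text())
		for i, name := range []string{"user", "nice", "system", "idle", "iowait"} {
			v, _ := strconv.ParseUint(fields[i+1], 10, 64)
			fmt.Printf("%s=%d ", name, v)
		}
		fmt.Println()
	}
}
```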







Memory



Modern blockchains use key-value databases (LevelDB, RocksDB) that constantly keep hot data in memory. As with any loaded service, memory leaks are always possible, whether from bugs or from targeted attacks on the node's code. If the node's memory consumption grows steadily or jumps sharply, this is most likely caused by a growing number of keys in the state database, large transaction queues, or an increased number of messages between the node's subsystems. Memory underutilization, on the other hand, may indicate room to raise the data limits of blocks or the maximum transaction complexity.
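
For a node written in Go, the runtime can report its own heap numbers (other languages have analogous facilities); a steadily growing heap under stable load is the classic leak signal. A minimal sketch:

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// HeapAlloc growing steadily under constant load suggests a leak
	// or an ever-growing state database / message queue.
	fmt.Printf("heap=%d MB sys=%d MB gc_runs=%d\n",
		m.HeapAlloc>>20, m.Sys>>20, m.NumGC)
}
```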







For full nodes, which serve network clients, file cache metrics are also important. Clients access various parts of the state database and the transaction log, which causes old blocks to be read up from disk and possibly evict newer data from the cache, which in turn slows down responses to clients.







Network



The main internal network metrics are the amount of traffic in bytes, the number of network packets sent and received per interface and per protocol, and the packet loss ratio. In blockchains, these metrics often receive little attention, because blockchains do not yet process transactions at speeds of 1 Gbit/s.







There are blockchain projects that let users share their WiFi or that provide services for storing and transferring files or messages. When testing such networks, the quantity and quality of traffic through the network interface become extremely important metrics, since a saturated network channel affects every other service on the machine without exception.







Storage



The disk subsystem is the slowest component of any service and is often the cause of serious performance problems. Excessive logging, an unexpected backup, an inconvenient read/write pattern, a large total blockchain volume: any of these can significantly slow the node down or impose seriously excessive hardware requirements.







The transaction log can technically be viewed as a write-ahead log (WAL) for the state database, so the important storage metrics are those that help find bottlenecks in the mechanisms of modern key-value databases: the number of read/write IOPS, max/min/avg latency, and many other metrics that help optimize disk operations.
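
As an illustration, a hedged Go sketch of a WAL-style latency measurement: time each write-plus-fsync and bucket the results into a rough histogram. A dedicated tool such as fio is far more thorough; this only shows the principle.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	f, err := os.CreateTemp("", "wal-bench-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	buckets := map[string]int{"<1ms": 0, "1-10ms": 0, ">10ms": 0}
	record := make([]byte, 4096) // one "log record"
	for i := 0; i < 100; i++ {
		start := time.Now()
		f.Write(record)
		f.Sync() // the fsync is what costs: a WAL entry must actually hit the disk
		switch d := time.Since(start); {
		case d < time.Millisecond:
			buckets["<1ms"]++
		case d < 10*time.Millisecond:
			buckets["1-10ms"]++
		default:
			buckets[">10ms"]++
		}
	}
	fmt.Println(buckets)
}
```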







Conclusion



So, we have examined several sets of metrics that can provide very valuable information about the operation of a blockchain network and the opportunities for optimizing it. To summarize, they can be collected into three groups:

- blockchain-specific metrics: local TPS, produced blocks, finality and finalization lag;
- p2p subsystem metrics: peers, data hits and misses, traffic volumes;
- system metrics of the nodes: CPU, memory, network, storage.

Each group is important in its own way, since any of the subsystems can contain errors that restrict the operation of the other components, and slowing down even a small number of validators can seriously affect the entire network. Moreover, the trickiest errors in consensus and finality algorithms show up only under a large transaction flow or after changes to consensus parameters, so analyzing them requires reproducible test conditions and complex load scenarios.







Developing a blockchain always means orchestrating several machines: scripts for rolling out configs and launching nodes and benchmarks in a coordinated way, and a server collecting metrics and logs from all the machines. So when developing your blockchain, consider hiring a qualified DevOps engineer: they will provide invaluable support to the development team. Good luck!







