The benefits of strong typing in C++: hands-on experience

Our program processes network packets, in particular TCP/IP headers. In them, numeric values (offsets, counters, addresses) are stored in network byte order (big-endian), while we work on x86 (little-endian). In the standard structures that describe the headers, these fields are plain integer types (uint32_t, uint16_t). After several bugs caused by forgetting to convert the byte order, we decided to replace the field types with classes that forbid implicit conversions and unintended operations. Below is the utility code and concrete examples of the errors that strict typing revealed.



Byte order



A quick primer for those unfamiliar with endianness (byte order); it has been covered in more detail on Habr before.



In ordinary positional notation, digits go from the most significant (on the left) to the least significant (on the right): 432 = 4×10^2 + 3×10^1 + 2×10^0. Integer data types have a fixed size, for example 16 bits (numbers from 0 to 65535). Such a number is stored in memory as two bytes; for example, 432 = 0x01b0, that is, the bytes 01 and b0.



Print the bytes of this number:



#include <cstdio>   // printf()
#include <cstdint>  // uint8_t, uint16_t

int main() {
    uint16_t value = 0x01b0;
    printf("%04x\n", value);
    const auto bytes = reinterpret_cast<const uint8_t*>(&value);
    for (auto i = 0; i < sizeof(value); i++) {
        printf("%02x ", bytes[i]);
    }
}
      
      





On ordinary Intel or AMD (x86) processors, the output is:



01b0
b0 01
      
      





In memory, the bytes go from the least significant to the most significant, not in the order we write numbers. This order is called little-endian (LE). The same holds for 4-byte numbers. Byte order is determined by the processor architecture; the order native to the processor is also called CPU or host byte order. In our case, host byte order is little-endian.
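If you want to check which order your machine uses, here is a minimal sketch (not from our project code) that inspects the first byte of a known 16-bit value:

#include <cstdint>
#include <cstdio>

int main() {
    const uint16_t probe = 0x0102;
    const auto first_byte = *reinterpret_cast<const uint8_t*>(&probe);
    // On little-endian hosts the least significant byte (0x02) comes first.
    printf("host is %s-endian\n", first_byte == 0x02 ? "little" : "big");
}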



However, the Internet was not born on x86 hardware, and the machines it was born on used the opposite byte order: from the most significant byte to the least significant (big-endian, BE). This order ended up in the headers of network protocols (IP, TCP, UDP), which is why big-endian is also called network byte order.



Example: port 443 (0x01bb), which HTTPS uses, is written in a TCP header as the bytes 01 bb; read on a little-endian host without conversion, they give 0xbb01 = 47873.
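A minimal sketch of that misread (my illustration, not from the article's code): put the wire bytes 01 bb into memory and read them back as a host-order uint16_t.

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    const uint8_t wire[] = {0x01, 0xbb};  // port 443 in network byte order
    uint16_t port;
    std::memcpy(&port, wire, sizeof(port));
    printf("%u\n", port);  // prints 47873 (0xbb01) on a little-endian host
}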



// The fields are plain uint16_t and uint32_t; nothing says they hold BE values.
struct tcp_hdr {
    uint16_t th_sport;
    uint16_t th_dport;
    uint32_t th_seq;
    uint32_t th_ack;
    uint32_t th_flags2 : 4;
    uint32_t th_off : 4;
    uint8_t  th_flags;
    uint16_t th_win;
    uint16_t th_sum;
    uint16_t th_urp;
} __attribute__((__packed__));

tcp_hdr* tcp = ...;  // points into the packet data

// Bug: th_dport is BE, while 443 is LE.
if (tcp->th_dport == 443) { ... }

// Bug: ++ operates in LE, while th_seq is BE.
tcp->th_seq++;
      
      





The byte order of a number can be converted. For example, for uint16_t there is the standard function htons() (host to network, short: from host order to network order for short integers) and its inverse ntohs(). Similarly, for uint32_t there are htonl() and ntohl() (l for long integer).



// Correct: the BE field is compared with a BE constant.
if (tcp->th_dport == htons(443)) { ... }

// The BE value is converted to LE, 1 is added,
// and the LE result is converted back to BE.
tcp->th_seq = htonl(ntohl(tcp->th_seq) + 1);
      
      





Unfortunately, the compiler does not know where the value of a particular uint32_t variable came from, so it does not warn when you mix values with different byte orders and get an incorrect result.



Strong typing



The risk of confusing byte order is obvious. How do we deal with it?





When working with the network, byte order cannot be ignored, so we wanted to make it impossible to ignore when writing code. Moreover, we never deal with just a number in BE: it is a port number, an IP address, a TCP sequence number, or a checksum, and one must not be assignable to another even if the bit width matches.



The solution is well known: strong typing, that is, separate types for ports, addresses, and sequence numbers. In addition, these types must support BE/LE conversion. Boost.Endian does not suit us, because the project does not use Boost.



The project is about 40 thousand lines of C++17. If we introduce safe wrapper types and rewrite the header structures in terms of them, every place that works with BE values automatically stops compiling. We have to go through all of them once, but from then on the code can only be written safely.



The big-endian number class
#include <cstdint>
#include <iosfwd>

#define PACKED __attribute__((packed))

constexpr auto bswap(uint16_t value) noexcept { return __builtin_bswap16(value); }
constexpr auto bswap(uint32_t value) noexcept { return __builtin_bswap32(value); }

template<typename T>
struct Raw { T value; };

template<typename T> Raw(T) -> Raw<T>;

template<typename T>
struct BigEndian {
    using Underlying = T;
    using Native = T;

    constexpr BigEndian() noexcept = default;

    // Construct from a native (host-order) value: swap it to BE.
    constexpr explicit BigEndian(Native value) noexcept : _value{bswap(value)} {}

    // Construct from a value that is already BE: store it as-is.
    constexpr BigEndian(Raw<Underlying> raw) noexcept : _value{raw.value} {}

    constexpr Underlying raw() const { return _value; }
    constexpr Native native() const { return bswap(_value); }

    explicit operator bool() const { return static_cast<bool>(_value); }

    bool operator==(const BigEndian& other) const { return raw() == other.raw(); }
    bool operator!=(const BigEndian& other) const { return raw() != other.raw(); }

    friend std::ostream& operator<<(std::ostream& out, const BigEndian& value) {
        return out << value.native();
    }

private:
    Underlying _value{};
} PACKED;









The controversial point here is PACKED: packed structures are believed to be less friendly to the optimizer. The only answer is to measure, and our code benchmarks did not reveal any slowdowns. Besides, in the case of network packets the positions of the fields in the header are fixed anyway.
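A few sanity checks and a usage sketch for the class above (my own illustration, not project code): PACKED adds no padding, and the two construction paths, from a host-order value and from raw wire bytes, end up equal:

#include <cassert>

static_assert(sizeof(BigEndian<uint16_t>) == sizeof(uint16_t));
static_assert(sizeof(BigEndian<uint32_t>) == sizeof(uint32_t));

void example() {
    BigEndian<uint16_t> a{443};                    // from host order: bytes get swapped
    BigEndian<uint16_t> b{Raw<uint16_t>{0xbb01}};  // already BE, e.g. read from a packet
    assert(a == b);                                // both hold the wire bytes 01 bb
    assert(a.native() == 443);                     // back to host order for arithmetic
}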



In most cases a BE value needs no operations other than comparison. Sequence numbers, however, must also be correctly added to LE integers:



#include <type_traits>  // std::is_integral_v

using BE16 = BigEndian<uint16_t>;
using BE32 = BigEndian<uint32_t>;

struct Seqnum : BE32 {
    using BE32::BE32;

    template<typename Integral>
    Seqnum operator+(Integral increment) const {
        static_assert(std::is_integral_v<Integral>);
        return Seqnum{static_cast<uint32_t>(native() + increment)};
    }
} PACKED;

struct IP : BE32 { using BE32::BE32; } PACKED;
struct L4Port : BE16 { using BE16::BE16; } PACKED;
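A quick sketch (my example, built on the types above) of what this buys us: sequence arithmetic works, while mixing unrelated fields no longer compiles.

void demo(Seqnum seq, L4Port port) {
    Seqnum next = seq + 1;        // OK: operator+ converts, adds, converts back
    // Seqnum bad = port;         // does not compile: L4Port is not a Seqnum
    // if (seq == port) { ... }   // does not compile: different types
    // seq = seq + next;          // does not compile: only integral increments allowed
    (void)next;
}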
      
      





The safe TCP header structure
enum TCPFlag : uint8_t {
    TH_FIN  = 0x01,
    TH_SYN  = 0x02,
    TH_RST  = 0x04,
    TH_PUSH = 0x08,
    TH_ACK  = 0x10,
    TH_URG  = 0x20,
    TH_ECE  = 0x40,
    TH_CWR  = 0x80,
};

using TCPFlags = std::underlying_type_t<TCPFlag>;

struct TCPHeader {
    L4Port   th_sport;
    L4Port   th_dport;
    Seqnum   th_seq;
    Seqnum   th_ack;
    uint32_t th_flags2 : 4;
    uint32_t th_off : 4;
    TCPFlags th_flags;
    BE16     th_win;
    uint16_t th_sum;
    BE16     th_urp;

    uint16_t header_length() const { return th_off << 2; }
    void set_header_length(uint16_t len) { th_off = len >> 2; }

    uint8_t* payload() {
        return reinterpret_cast<uint8_t*>(this) + header_length();
    }

    const uint8_t* payload() const {
        return reinterpret_cast<const uint8_t*>(this) + header_length();
    }
};

static_assert(sizeof(TCPHeader) == 20);
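A short usage sketch (mine, with a hypothetical inspect() function and buffer pointer) of how such a header might be read from raw packet memory; the field types force typed comparisons and explicit .native() conversions at every use:

// 'data' is assumed to point at the start of the TCP header inside a packet.
void inspect(uint8_t* data) {
    auto* tcp = reinterpret_cast<TCPHeader*>(data);

    if (tcp->th_dport == L4Port{443}) {         // compare a BE field with a typed constant
        // handle HTTPS traffic
    }

    Seqnum next_seq = tcp->th_seq + 1;          // arithmetic goes through Seqnum::operator+
    uint16_t window = tcp->th_win.native();     // explicit conversion to host order
    (void)next_seq;
    (void)window;
}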
      
      







Results



Most of the edits were trivial, and the code became cleaner:



  auto tcp = packet->tcp_header();
- return make_response(packet,
-                      cookie_make(packet, rte_be_to_cpu_32(tcp->th_seq)),
-                      rte_cpu_to_be_32(rte_be_to_cpu_32(tcp->th_seq) + 1),
-                      TH_SYN | TH_ACK);
+ return make_response(packet, cookie_make(packet, tcp->th_seq.native()),
+                      tcp->th_seq + 1, TH_SYN | TH_ACK);
  }
      
      





In part, the types documented the code:



- void check_packet(int64_t, int64_t, uint8_t, bool);
+ void check_packet(std::optional<Seqnum>, std::optional<Seqnum>, TCPFlags, bool);
      
      





Unexpectedly, it turned out that the TCP window size could be read incorrectly while the unit tests still passed and traffic still flowed:



  // window size
  auto wscale_ratio = options().wscale_dst - options().wscale_src;
  if (wscale_ratio < 0) {
-     auto window_size = header.window_size() / (1 << (-wscale_ratio));
+     auto window_size = header.window_size().native() / (1 << (-wscale_ratio));
      if (header.window_size() && window_size < 1) {
          window_size = WINDOW_SIZE_MIN;
      }
      header_out.window_size(window_size);
  } else {
-     auto window_size = header.window_size() * (1 << (wscale_ratio));
+     auto window_size = header.window_size().native() * (1 << (wscale_ratio));
      if (window_size > WINDOW_SIZE_MAX) {
          window_size = WINDOW_SIZE_MAX;
      }
      
      





An example of a logical error: the author of the original code thought that the function accepts a BE value, although in fact it does not. An attempt to pass Raw{} instead of 0 simply did not compile (fortunately, this is only a unit test). It also exposed an unfortunate choice of test data: the error would have been found sooner if the value had not been 0, which looks the same in any byte order.



- auto cookie = cookie_make_inner(tuple, rte_be_to_cpu_32(0));
+ auto cookie = cookie_make_inner(tuple, 0);
      
      





A similar example: first the compiler pointed out the type mismatch between def_seq and cookie, and then it became clear why the test had been passing before: the old constants read the same in either byte order.



- const uint32_t def_seq = 0xA7A7A7A7;
- const uint32_t def_ack = 0xA8A8A8A8;
+ const Seqnum def_seq{0x12345678};
+ const Seqnum def_ack{0x90abcdef};
  ...
- auto cookie = rte_be_to_cpu_32(_tcph->th_ack);
+ auto cookie = _tcph->th_ack;
  ASSERT_NE(def_seq, cookie);
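Why those particular constants hid the bug, as a one-line check (a sketch of mine): 0xA7A7A7A7 byte-swaps to itself, while the new value does not.

static_assert(__builtin_bswap32(0xA7A7A7A7) == 0xA7A7A7A7);  // identical in BE and LE
static_assert(__builtin_bswap32(0x12345678) != 0x12345678);  // becomes 0x78563412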
      
      





Summary



The bottom line: strict typing revealed real errors that the tests had not caught, the code became cleaner and the types partly document it, and new code simply cannot mix up byte order without failing to compile. All three points are important to us, so we think the refactoring was worth it.



Do you guard against mistakes with strong types?


