Comparison Operations in C++20

The Cologne meeting is behind us, and the C++20 standard has taken a more or less finished shape (at least until the comments on it come in), so I would like to talk about one of the upcoming features. It is usually referred to as operator<=> (the standard calls it the "three-way comparison operator", and its informal nickname is "spaceship"), but I believe its scope is much wider.



We will not just have a new operator - the semantics of comparisons will undergo significant changes at the level of the language itself.



Even if you take nothing else away from this article, remember this table:

             Equality    Ordering
Basic        ==          <=>
Derived      !=          <, >, <=, >=



Now we will have a new operator, <=>, but, more importantly, the operators are now systematized. There are basic operators and derived operators, and each group has its own capabilities.



We will talk about these features briefly in the introduction and discuss them in more detail in the following sections.



Basic operators can be inverted (i.e. rewritten with the parameters in reverse order). Derived operators can be rewritten in terms of the corresponding basic operator. Neither inverted nor rewritten candidates generate new functions; they are simply source-level rewrites selected from an extended set of candidates. For example, the expression a < 9 can now be evaluated as a.operator<=>(9) < 0, and the expression 10 != b as !operator==(b, 10). This means that one or two operators will suffice where today you have to write 2, 4, 6, or even 12 operators by hand to get the same behavior. A brief overview of the rules is given below, along with a table of all possible transformations.



Both basic and derived operators can be defaulted. For a basic operator, this means that the operator is applied to each member in declaration order; for a derived operator, it means that the rewritten candidates are used.



Note that no transformation expresses an operator of one kind (equality or ordering) through an operator of the other kind. In other words, the columns of our table are completely independent. The expression a == b will never be implicitly evaluated as operator<=>(a, b) == 0 (although, of course, nothing prevents you from defining your operator== in terms of operator<=> if you want to).



Consider a small example showing how the code looks before and after the new functionality. We will write a case-insensitive string type, CIString, whose objects can be compared both with each other and with char const*.



In C++17, this task requires 18 comparison functions:



    class CIString {
        string s;

    public:
        // CIString compared with CIString
        friend bool operator==(const CIString& a, const CIString& b) {
            return a.s.size() == b.s.size() && ci_compare(a.s.c_str(), b.s.c_str()) == 0;
        }
        friend bool operator< (const CIString& a, const CIString& b) {
            return ci_compare(a.s.c_str(), b.s.c_str()) < 0;
        }
        friend bool operator!=(const CIString& a, const CIString& b) { return !(a == b); }
        friend bool operator> (const CIString& a, const CIString& b) { return b < a; }
        friend bool operator>=(const CIString& a, const CIString& b) { return !(a < b); }
        friend bool operator<=(const CIString& a, const CIString& b) { return !(b < a); }

        // CIString compared with const char*
        friend bool operator==(const CIString& a, const char* b) {
            return ci_compare(a.s.c_str(), b) == 0;
        }
        friend bool operator< (const CIString& a, const char* b) {
            return ci_compare(a.s.c_str(), b) < 0;
        }
        friend bool operator!=(const CIString& a, const char* b) { return !(a == b); }
        friend bool operator> (const CIString& a, const char* b) { return b < a; }
        friend bool operator>=(const CIString& a, const char* b) { return !(a < b); }
        friend bool operator<=(const CIString& a, const char* b) { return !(b < a); }

        // const char* compared with CIString
        friend bool operator==(const char* a, const CIString& b) {
            return ci_compare(a, b.s.c_str()) == 0;
        }
        friend bool operator< (const char* a, const CIString& b) {
            return ci_compare(a, b.s.c_str()) < 0;
        }
        friend bool operator!=(const char* a, const CIString& b) { return !(a == b); }
        friend bool operator> (const char* a, const CIString& b) { return b < a; }
        friend bool operator>=(const char* a, const CIString& b) { return !(a < b); }
        friend bool operator<=(const char* a, const CIString& b) { return !(b < a); }
    };
      
      





In C++20, just 4 functions are enough:



    class CIString {
        string s;

    public:
        bool operator==(const CIString& b) const {
            return s.size() == b.s.size() && ci_compare(s.c_str(), b.s.c_str()) == 0;
        }
        std::weak_ordering operator<=>(const CIString& b) const {
            return ci_compare(s.c_str(), b.s.c_str()) <=> 0;
        }

        bool operator==(char const* b) const {
            return ci_compare(s.c_str(), b) == 0;
        }
        std::weak_ordering operator<=>(const char* b) const {
            return ci_compare(s.c_str(), b) <=> 0;
        }
    };
      
      





I will explain what all of this means in more detail, but first let's step back and recall how comparisons worked before C++20.



Comparisons in Standards from C++98 to C++17



Comparison operations have hardly changed since the language was created. We had six operators: ==, !=, <, >, <=, and >=. The standard defines each of them for built-in types, but in general they all obey the same rules. When evaluating any expression a @ b (where @ is one of the six comparison operators), the compiler looks up member functions, free functions, and built-in candidates named operator@ that can be called with arguments of types A and B in that order. The best candidate is selected from them. That's all. In fact, all operators worked the same way: operator< was no different from operator<<.



Such a simple set of rules is easy to learn. All the operators are completely independent and equivalent. It doesn't matter what we humans know about the fundamental relationship between == and !=. As far as the language is concerned, they are all treated alike. So we use idioms. For example, we define operator!= in terms of ==:



    bool operator==(A const&, A const&);
    bool operator!=(A const& lhs, A const& rhs) {
        return !(lhs == rhs);
    }
      
      





Similarly, we define all the other relational operators in terms of operator<. We use these idioms because, despite the rules of the language, we do not actually consider all six operators to be equivalent. We treat two of them as basic (== and <) and express all the others through them.



In fact, the Standard Template Library is built entirely on these two operators, and a vast number of types in production code define only one of them or both.



However, the < operator is not very suitable for the base role for two reasons.



First, the other relational operators cannot always be expressed through it. Yes, a > b means exactly the same as b < a, but it is not true that a <= b means exactly the same as !(b < a). The last two expressions are equivalent only when trichotomy holds, i.e. when exactly one of three statements is true for any two values: a < b, a == b, or a > b. Under trichotomy, a <= b means that we are in the first or the second case, which is the same as saying that we are not in the third. Therefore (a <= b) == !(a > b) == !(b < a).



But what if the relation does not have the trichotomy property? This is typical of partial orders. The classic example is floating-point numbers, for which each of 1.f < NaN, 1.f == NaN, and 1.f > NaN yields false. Therefore 1.f <= NaN is also false, yet !(NaN < 1.f) is true.
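The NaN case is easy to reproduce. In this small sketch, less_equal_via_less is a hypothetical helper that derives <= from < in the idiomatic way, and the asserts show that it disagrees with the built-in <= as soon as NaN is involved.

```cpp
#include <cassert>
#include <limits>

// operator<= derived from operator<, which is only valid under trichotomy.
inline bool less_equal_via_less(float a, float b) {
    return !(b < a);
}

inline void nan_breaks_trichotomy() {
    const float nan = std::numeric_limits<float>::quiet_NaN();

    // None of the three trichotomy cases holds for 1.f and NaN.
    assert(!(1.f < nan));
    assert(!(1.f == nan));
    assert(!(1.f > nan));

    // The built-in <= says false, while the derived version says true.
    assert(!(1.f <= nan));
    assert(less_equal_via_less(1.f, nan));
}
```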



The only way to implement operator<= in general through the basic operators is to spell out both operations as (a == b) || (a < b). This is a big step backwards when we are in fact dealing with a total order, because two functions are called instead of one (for example, the expression "abc..xyz9" <= "abc..xyz1" has to be rewritten as ("abc..xyz9" == "abc..xyz1") || ("abc..xyz9" < "abc..xyz1"), which compares the entire string twice).



Second, operator< is poorly suited to the role of the basic operator because of how it is used in lexicographic comparisons. Programmers often make this mistake:



    struct A {
        T t;
        U u;

        bool operator==(A const& rhs) const {
            return t == rhs.t && u == rhs.u;
        }
        bool operator< (A const& rhs) const {
            return t < rhs.t && u < rhs.u;
        }
    };
      
      





To define operator== for a collection of members, it is enough to apply == to each member once, but this does not work for operator<. With this implementation, the objects A{1, 2} and A{2, 1} are considered equivalent (since neither is less than the other). To fix this, apply operator< twice to each member except the last:



    bool operator< (A const& rhs) const {
        if (t < rhs.t) return true;
        if (rhs.t < t) return false;
        return u < rhs.u;
    }
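The difference between the two implementations shows up on exactly the pair of values mentioned above. The sketch uses a hypothetical struct Pair with two int members and writes both versions as free functions so that they can be compared side by side.

```cpp
#include <cassert>

// Hypothetical two-member type standing in for A with T = U = int.
struct Pair {
    int t;
    int u;
};

// The common mistake: requires both members to be smaller.
inline bool buggy_less(const Pair& a, const Pair& b) {
    return a.t < b.t && a.u < b.u;
}

// Lexicographic comparison: < is applied twice to every member but the last.
inline bool lexicographic_less(const Pair& a, const Pair& b) {
    if (a.t < b.t) return true;
    if (b.t < a.t) return false;
    return a.u < b.u;
}

inline void compare_implementations() {
    const Pair x{1, 2};
    const Pair y{2, 1};

    // The buggy version cannot tell the two objects apart.
    assert(!buggy_less(x, y) && !buggy_less(y, x));

    // The lexicographic version orders them by the first member.
    assert(lexicographic_less(x, y));
    assert(!lexicographic_less(y, x));
}
```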
      
      





Finally, to make heterogeneous comparisons work correctly, i.e. to ensure that the expressions a == 10 and 10 == a mean the same thing, the usual recommendation is to write comparisons as free functions. In fact, this is the only way to implement such comparisons at all. This is inconvenient because, first, you have to make sure the recommendation is followed, and second, you usually have to declare such functions as hidden friends (i.e. inside the class body) for a more convenient implementation.



Note that comparing objects of different types does not always require writing operator==(X, int); heterogeneous comparison also covers cases where int is implicitly convertible to X.



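To summarize the rules before C++20: the language treats all six operators alike; we rely on idioms to cut down the amount of code, with == and < as the conventional basic operators; operator< is a poor basic operator because it cannot express partial orders and makes lexicographic comparison awkward; and heterogeneous comparisons require additional free functions.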





New basic ordering operator: <=>



The most significant and noticeable change to comparisons in C++20 is the addition of a new operator, operator<=>, the three-way comparison operator.



We already know three-way comparisons from the functions memcmp/strcmp in C and basic_string::compare() in C++. They all return an int, which is an arbitrary positive number if the first argument is greater than the second, 0 if they are equal, and an arbitrary negative number otherwise.



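The "spaceship" operator returns not an int but an object of one of the comparison categories, whose value reflects the relationship between the compared objects. There are three main categories: strong_ordering (a total order in which equal values are indistinguishable), weak_ordering (a total order in which equivalent values may still be distinguishable), and partial_ordering (a partial order in which some values may be unordered with respect to each other).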





You will mostly work with the strong_ordering category; it is also the best category to use by default. For example, 2 <=> 4 returns strong_ordering::less, and 3 <=> -1 returns strong_ordering::greater.



Stronger categories can be implicitly converted to weaker ones (i.e. strong_ordering converts to weak_ordering). The kind of relationship is preserved by the conversion (i.e. strong_ordering::equal becomes weak_ordering::equivalent).



Values of the comparison categories can be compared with the literal 0 (not with an arbitrary int, and not with an int equal to 0, but only with the literal 0) using any of the six comparison operators:



    strong_ordering::less < 0        // true
    strong_ordering::less == 0       // false
    strong_ordering::less != 0       // true
    strong_ordering::greater >= 0    // true

    partial_ordering::less < 0       // true
    partial_ordering::greater > 0    // true

    // unordered is a special value that is neither
    // less than, equal to, nor greater than 0
    partial_ordering::unordered < 0  // false
    partial_ordering::unordered == 0 // false
    partial_ordering::unordered > 0  // false
      
      





It is thanks to the comparison with the literal 0 that we can implement the relational operators: a @ b is equivalent to (a <=> b) @ 0 for each of them.



For example, 2 < 4 can be evaluated as (2 <=> 4) < 0, which becomes strong_ordering::less < 0 and yields true.



The <=> operator fits the role of the basic element much better than the < operator, since it eliminates both problems of the latter.



First, the expression a <= b is guaranteed to be equivalent to (a <=> b) <= 0 even for a partial order. For two unordered values, a <=> b yields partial_ordering::unordered, and partial_ordering::unordered <= 0 yields false, which is exactly what we need. This is possible because <=> can return more kinds of values: the partial_ordering category, for example, has four possible values. A bool can only be true or false, so previously we could not distinguish comparisons of ordered values from comparisons of unordered ones.



For clarity, consider an example of a partial order that has nothing to do with floating-point numbers. Suppose we want to add a NaN state to int, where NaN is simply a value that is unordered with respect to every other value. We can do this by storing the value in a std::optional:



    struct IntNan {
        std::optional<int> val = std::nullopt;

        bool operator==(IntNan const& rhs) const {
            if (!val || !rhs.val) {
                return false;
            }
            return *val == *rhs.val;
        }

        partial_ordering operator<=>(IntNan const& rhs) const {
            if (!val || !rhs.val) {
                // if either operand is NaN, the values are unordered
                return partial_ordering::unordered;
            }
            // <=> on int returns strong_ordering, which
            // converts implicitly to partial_ordering
            return *val <=> *rhs.val;
        }
    };

    IntNan{2} <=> IntNan{4}; // partial_ordering::less
    IntNan{2} <=> IntNan{};  // partial_ordering::unordered

    // the other comparison operators give the expected results
    IntNan{2} < IntNan{4};   // true
    IntNan{2} < IntNan{};    // false
    IntNan{2} == IntNan{};   // false
    IntNan{2} <= IntNan{};   // false
      
      





The <= operator returns the correct value because now we can express more information at the level of the language itself.



Secondly, to get all the necessary information, it’s enough to apply <=> once, which facilitates the implementation of lexicographic comparison:

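    struct A {
        T t;
        U u;

        bool operator==(A const& rhs) const {
            return t == rhs.t && u == rhs.u;
        }

        strong_ordering operator<=>(A const& rhs) const {
            // compare the t members first; if the result is != 0
            // (i.e. the t's differ), it is the result of the
            // whole comparison
            if (auto c = t <=> rhs.t; c != 0) return c;

            // otherwise the last member decides
            return u <=> rhs.u;
        }
    };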
      
      





See P0515, the original proposal to add operator<=>, for a more detailed discussion.



New operator features



We are not just getting a new operator. After all, if the example above with struct A only meant that instead of x < y we now had to write (x <=> y) < 0 every time, nobody would like it.



Comparison resolution in C++20 differs noticeably from the old approach, and the change is directly tied to the new concept of two basic comparison operators: == and <=>. Previously this was just an idiom (writing everything in terms of == and <) that we used but the compiler knew nothing about; now the compiler understands the distinction.



Here, once again, is the table you saw at the beginning of the article:

             Equality    Ordering
Basic        ==          <=>
Derived      !=          <, >, <=, >=



Each of the basic and derived operators has gained a new ability, which I will describe briefly below.



Inversion of basic operators



As an example, take a type that can only be compared with int :



    struct A {
        int i;
        explicit A(int i) : i(i) { }

        bool operator==(int j) const { return i == j; }
    };
      
      





Under the old rules, it is no surprise that the expression a == 10 works and evaluates as a.operator==(10).



But what about 10 == a? In C++17, this expression is clearly ill-formed: there is no such operator. To make this code work, you would have to write a symmetric operator== taking an int first and an A second, and it would have to be implemented as a free function.



In C++20, basic operators can be inverted. For 10 == a, the compiler finds the candidate operator==(A, int) (it is actually a member function, but I write it as a free function here for clarity) and, in addition, a variant with the parameters in reverse order, i.e. operator==(int, A). This second candidate matches our expression (exactly, in fact), so it is selected. The expression 10 == a in C++20 is evaluated as a.operator==(10). The compiler understands that equality is symmetric.



Now we expand our type so that it can be compared with int not only through the equality operator, but also through the ordering operator:



    struct A {
        int i;
        explicit A(int i) : i(i) { }

        bool operator==(int j) const { return i == j; }
        strong_ordering operator<=>(int j) const { return i <=> j; }
    };
      
      





Again, the expression a <=> 42 works fine and is evaluated under the old rules as a.operator<=>(42), but 42 <=> a would be ill-formed under the C++17 rules even if operator<=> had already existed in the language. In C++20, however, operator<=>, like operator==, is symmetric: it recognizes inverted candidates. For 42 <=> a, the member function operator<=>(A, int) is found (again, I write it as a free function only for clarity), along with the synthesized candidate operator<=>(int, A). This inverted version matches our expression exactly, so it is selected.



However, 42 <=> a is NOT evaluated as a.operator<=>(42). That would be wrong. The expression is evaluated as 0 <=> a.operator<=>(42). Try to work out why this form is correct.



It is important to note that the compiler does not create any new functions. Evaluating 10 == a did not produce a new operator==(int, A), and evaluating 42 <=> a did not produce an operator<=>(int, A). The two expressions are merely rewritten in terms of inverted candidates. To repeat: no new functions are created.



Also note that reversing the parameter order is available only for basic operators, not for derived ones. That is:



    struct B {
        bool operator!=(int) const;
    };

    b != 42; // ok in C++17 and in C++20
    42 != b; // error in C++17 and in C++20
      
      





Rewriting Derived Operators



Let's go back to our example with structure A :



    struct A {
        int i;
        explicit A(int i) : i(i) { }

        bool operator==(int j) const { return i == j; }
        strong_ordering operator<=>(int j) const { return i <=> j; }
    };
      
      





Take the expression a != 17. In C++17, this is an error because no operator!= exists. In C++20, however, for expressions containing derived comparison operators, the compiler also looks up the corresponding basic operators and expresses the derived comparison through them.



We know that in mathematics != essentially means NOT ==. Now the compiler knows it too. For the expression a != 17, it looks not only for operator!= but also for operator== (and, as in the previous examples, the inverted operator==). In this example, we found an equality operator that almost fits; it only needs to be rewritten to give the desired semantics: a != 17 is evaluated as !(a == 17).



Similarly, 17 != a is evaluated as !a.operator==(17), which is both a rewritten and an inverted candidate.



Similar transformations apply to the ordering operators. If we write a < 9, the compiler tries (unsuccessfully) to find operator< and also considers the basic candidates: operator<=>. The rewrite for the relational operators looks like this: a @ b (where @ is one of the relational operators) is evaluated as (a <=> b) @ 0. In our case, that is a.operator<=>(9) < 0. Similarly, 9 <= a is evaluated as 0 <= a.operator<=>(9).



Note that, as with inversion, the compiler does not create any new functions for rewritten candidates. The expressions are simply evaluated differently, and all transformations happen only at the source-code level.



The above leads me to the following advice:



BASIC OPERATORS ONLY: Define only the basic operators (== and <=>) in your type.



Since the basic operators provide the full set of comparisons, defining them alone is enough. This means you need only 2 operators to compare objects of the same type (instead of 6 today) and only 2 operators to compare objects of different types (instead of 12). If you only need equality, a single function suffices for same-type comparison (instead of 2) and a single function for comparing different types (instead of 4). The std::sub_match class is an extreme case: it has 42 comparison operators in C++17 and only 8 in C++20, with no loss of functionality.



Since the compiler also considers inverted candidates, all these operators can be implemented as member functions. You no longer have to write free functions just for the sake of comparing objects of different types.



Special rules for finding candidates



As I already mentioned, candidate lookup for a @ b in C++17 followed this principle: find all operator@ candidates and select the best one.



C++20 uses an extended set of candidates. We still look for every operator@. Let @@ be the basic operator for @ (it may be the same operator). We also find every operator@@ and, for each one, add its inverted version. From all of the candidates found, we select the best one.



Note that overload resolution is performed in a single pass. We do not try different sets of candidates one after another. We first collect all of them and then choose the best. If there is no best candidate, lookup fails, as before.



Now we have many more potential candidates, and therefore more room for ambiguity. Consider the following example:



    struct C {
        bool operator==(C const&) const;
        bool operator!=(C const&) const;
    };

    bool check(C x, C y) {
        return x != y;
    }
      
      





In C++17, we had only one candidate for x != y, and now there are three: x.operator!=(y), !x.operator==(y), and !y.operator==(x). Which one should be chosen? They are all equally good matches! (Note: the candidate y.operator!=(x) does not exist, since only basic operators can be inverted.)



Two additional rules have been introduced to resolve this ambiguity. Non-rewritten candidates are preferred over rewritten ones, and non-inverted rewritten candidates are preferred over inverted ones. Thus x.operator!=(y) beats !x.operator==(y), which in turn beats !y.operator==(x).
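The preference for non-rewritten candidates can be observed at run time. The sketch below gives the operators of struct C trivial bodies that count their calls; the counters and the return values are assumptions added for illustration, since the article only declares the operators.

```cpp
#include <cassert>

struct C {
    // Illustrative call counters (not part of the article's example).
    static inline int eq_calls = 0;
    static inline int ne_calls = 0;

    bool operator==(C const&) const { ++eq_calls; return true; }
    bool operator!=(C const&) const { ++ne_calls; return false; }
};

inline bool check(C x, C y) {
    return x != y;
}

inline void check_tie_breaking() {
    C::eq_calls = 0;
    C::ne_calls = 0;

    const bool result = check(C{}, C{});

    // The hand-written operator!= wins over both rewritten == candidates.
    assert(!result);
    assert(C::ne_calls == 1);
    assert(C::eq_calls == 0);
}
```

If operator!= is removed from C, the same expression is evaluated through operator== instead.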



: operator@@ . . , .



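Second, if the best candidate is a rewritten or inverted one (for example, we wrote x < y and the best candidate is (x <=> y) < 0) but the rewritten expression is invalid (for example, x <=> y returns void or some other type that cannot be compared with 0, as may happen in a DSL), the program is ill-formed. The compiler does not go back and try another candidate. In addition, a rewritten equality candidate must return bool (after all, if operator== does not return bool, can we really assume that it means equality?)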



For example:



    struct Base {
        friend bool operator<(const Base&, const Base&);  // #1
        friend bool operator==(const Base&, const Base&);
    };

    struct Derived : Base {
        friend void operator<=>(const Derived&, const Derived&);  // #2
    };

    bool f(Derived d1, Derived d2) {
        return d1 < d2;
    }
      
      





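There are two candidates for d1 < d2: #1 and #2. The best match is #2, since it is an exact match, whereas #1 requires a derived-to-base conversion. Therefore d1 < d2 is evaluated as (d1 <=> d2) < 0. But void cannot be compared with 0, so the expression is ill-formed. The compiler does not fall back and choose #1 instead.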





, , C++17, . , - . :





, . .



. , , , , , ( ). , :



Expression    Option 1           Option 2
a == b        b == a
a != b        !(a == b)          !(b == a)
a <=> b       0 <=> (b <=> a)
a < b         (a <=> b) < 0      (b <=> a) > 0
a <= b        (a <=> b) <= 0     (b <=> a) >= 0
a > b         (a <=> b) > 0      (b <=> a) < 0
a >= b        (a <=> b) >= 0     (b <=> a) <= 0



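Strictly speaking, the inverted forms are written with the literal 0 on the left, i.e. the inverted candidate for a < b is 0 < (b <=> a), but this means the same thing as the form shown in the table.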





C++17 . . :



    struct A {
        T t;
        U u;
        V v;

        bool operator==(A const& rhs) const {
            return t == rhs.t && u == rhs.u && v == rhs.v;
        }
        bool operator!=(A const& rhs) const {
            return !(*this == rhs);
        }

        bool operator< (A const& rhs) const {
            // there are many ways to write this,
            // for example with ?: or &&/||
            if (t < rhs.t) return true;
            if (rhs.t < t) return false;
            if (u < rhs.u) return true;
            if (rhs.u < u) return false;
            return v < rhs.v;
        }
        bool operator> (A const& rhs) const { return rhs < *this; }
        bool operator<=(A const& rhs) const { return !(rhs < *this); }
        bool operator>=(A const& rhs) const { return !(*this < rhs); }
    };
      
      





- std::tie() , .



, : :



    struct A {
        T t;
        U u;
        V v;

        bool operator==(A const& rhs) const {
            return t == rhs.t && u == rhs.u && v == rhs.v;
        }

        strong_ordering operator<=>(A const& rhs) const {
            // compare the T members first
            if (auto c = t <=> rhs.t; c != 0) return c;
            // ... then the U members
            if (auto c = u <=> rhs.u; c != 0) return c;
            // ... then the V members
            return v <=> rhs.v;
        }
    };
      
      





. <=> < . , . c != 0 , , ( ), .



. C++20 , :



    struct A {
        T t;
        U u;
        V v;

        bool operator==(A const& rhs) const = default;
        strong_ordering operator<=>(A const& rhs) const = default;
    };
      
      





, . , :



    struct A {
        T t;
        U u;
        V v;

        bool operator==(A const& rhs) const = default;
        auto operator<=>(A const& rhs) const = default;
    };
      
      





. , , :



    struct A {
        T t;
        U u;
        V v;

        auto operator<=>(A const& rhs) const = default;
    };
      
      





, , . : operator== , operator<=> .





C++20: . . , , , .





PVS-Studio , <=> . , -. , , (. " "). ++ .



PVS-Studio <, :



    bool operator< (A const& rhs) const {
        return t < rhs.t && u < rhs.u;
    }
      
      





. , - . .



: Comparisons in C++20 .


