The Cologne meeting is over, the C++20 standard has reached a more or less finished state (at least until formal comments on the draft arrive), and I would like to talk about one of the upcoming innovations. This mechanism is usually called operator<=> (the standard defines it as the "three-way comparison operator", and it has the informal nickname "spaceship"), but I believe its scope is much wider.
We are not just getting a new operator: the semantics of comparisons change significantly at the level of the language itself.
Even if you take nothing else away from this article, remember this table:
            Equality    Ordering
Basic       ==          <=>
Derived     !=          <, >, <=, >=
We get a new operator, <=>, but more importantly, the operators are now systematized. There are basic operators and derived operators, and each group has its own capabilities.
We will touch on these capabilities briefly in the introduction and discuss them in more detail in the following sections.
Basic operators can be inverted (that is, rewritten with the operands in reverse order). Derived operators can be rewritten in terms of the corresponding basic operator. Neither inverted nor rewritten candidates generate new functions; they are simply source-level rewrites, selected from an extended candidate set. For example, the expression a < 9 can now be evaluated as a.operator<=>(9) < 0, and the expression 10 != b as !operator==(b, 10). This means one or two operators will do where the same behavior currently requires writing 2, 4, 6, or even 12 operators by hand.
A brief overview of the rules is given below, along with a table of all possible transformations.
Both basic and derived operators can be defaulted. For basic operators this means the operator is applied to each member in declaration order; for derived operators it means the rewritten candidates are used.
Note that no transformation expresses an operator of one kind (equality or ordering) through an operator of the other kind. In other words, the columns of the table are completely independent. The expression a == b will never be implicitly evaluated as operator<=>(a, b) == 0 (though, of course, nothing prevents you from defining your operator== in terms of operator<=> if you want).
Consider a small example showing how the code looks before and after the new functionality. We will write a case-insensitive string type, CIString, whose objects can be compared both with each other and with char const*.
In C++17, this task requires 18 comparison functions:
class CIString {
    string s;

public:
    // CIString compared with CIString
    friend bool operator==(const CIString& a, const CIString& b) {
        return a.s.size() == b.s.size() && ci_compare(a.s.c_str(), b.s.c_str()) == 0;
    }
    friend bool operator< (const CIString& a, const CIString& b) {
        return ci_compare(a.s.c_str(), b.s.c_str()) < 0;
    }
    friend bool operator!=(const CIString& a, const CIString& b) { return !(a == b); }
    friend bool operator> (const CIString& a, const CIString& b) { return b < a; }
    friend bool operator>=(const CIString& a, const CIString& b) { return !(a < b); }
    friend bool operator<=(const CIString& a, const CIString& b) { return !(b < a); }

    // CIString compared with const char*
    friend bool operator==(const CIString& a, const char* b) {
        return ci_compare(a.s.c_str(), b) == 0;
    }
    friend bool operator< (const CIString& a, const char* b) {
        return ci_compare(a.s.c_str(), b) < 0;
    }
    friend bool operator!=(const CIString& a, const char* b) { return !(a == b); }
    friend bool operator> (const CIString& a, const char* b) { return b < a; }
    friend bool operator>=(const CIString& a, const char* b) { return !(a < b); }
    friend bool operator<=(const CIString& a, const char* b) { return !(b < a); }

    // const char* compared with CIString
    friend bool operator==(const char* a, const CIString& b) {
        return ci_compare(a, b.s.c_str()) == 0;
    }
    friend bool operator< (const char* a, const CIString& b) {
        return ci_compare(a, b.s.c_str()) < 0;
    }
    friend bool operator!=(const char* a, const CIString& b) { return !(a == b); }
    friend bool operator> (const char* a, const CIString& b) { return b < a; }
    friend bool operator>=(const char* a, const CIString& b) { return !(a < b); }
    friend bool operator<=(const char* a, const CIString& b) { return !(b < a); }
};
In C++20, just 4 functions are enough:
class CIString {
    string s;

public:
    bool operator==(const CIString& b) const {
        return s.size() == b.s.size() && ci_compare(s.c_str(), b.s.c_str()) == 0;
    }
    std::weak_ordering operator<=>(const CIString& b) const {
        return ci_compare(s.c_str(), b.s.c_str()) <=> 0;
    }

    bool operator==(char const* b) const {
        return ci_compare(s.c_str(), b) == 0;
    }
    std::weak_ordering operator<=>(const char* b) const {
        return ci_compare(s.c_str(), b) <=> 0;
    }
};
I will explain what all of this means in more detail, but first let's step back and recall how comparisons worked before C++20.
Comparisons in the Standards from C++98 to C++17
Comparison operations have not changed much since the language was created. We had six operators: ==, !=, <, >, <= and >=. The standard defines each of them for built-in types, but in general they all obey the same rules. When evaluating any expression a @ b (where @ is one of the six comparison operators), the compiler looks up member functions, free functions, and built-in candidates named operator@ that can be called with arguments of types A and B, in that order. The best candidate is selected from them. That's all. In fact, all operators worked the same way: the < operation was no different from <<.
Such a simple set of rules is easy to learn. All operators are completely independent and equivalent. It does not matter what we humans know about the fundamental relationship between == and !=. As far as the language is concerned, they are handled in exactly the same way. So we rely on idioms. For example, we define != in terms of ==:
bool operator==(A const&, A const&);
bool operator!=(A const& lhs, A const& rhs) {
    return !(lhs == rhs);
}
Similarly, we define all the other relational operators in terms of <. We use these idioms because, despite the language rules, we do not actually consider all six operators equivalent. We treat two of them as basic (== and <) and express all the others through them.
In fact, the Standard Template Library is built entirely on these two operators, and a vast number of types in production code define only one of them or both.
However, the < operator is not well suited to the basic role, for two reasons.
First, the other relational operators cannot be reliably expressed through it. Yes, a > b means exactly the same as b < a, but it is not true that a <= b means exactly the same as !(b < a). The last two expressions are equivalent only under trichotomy, where for any two values exactly one of three statements holds: a < b, a == b, or a > b. Given trichotomy, a <= b means that we are in the first or second case, which is equivalent to saying that we are not in the third. Therefore (a <= b) == !(a > b) == !(b < a).
But what if the relation does not have the trichotomy property? This is typical of partial orders. The classic example is floating-point numbers, for which each of 1.f < NaN, 1.f == NaN, and 1.f > NaN yields false. Therefore 1.f <= NaN is also false, yet !(NaN < 1.f) is true.
The only way to implement <= in general through the basic operators is to spell out both operations as (a == b) || (a < b), which is a big step backwards if we are in fact dealing with a total order, because two functions are then called instead of one (for example, "abc..xyz9" <= "abc..xyz1" would have to be rewritten as ("abc..xyz9" == "abc..xyz1") || ("abc..xyz9" < "abc..xyz1"), comparing the entire string twice).
Second, < is poorly suited to the basic role because of how it is used in lexicographic comparisons. Programmers often make this mistake:
struct A {
    T t;
    U u;

    bool operator==(A const& rhs) const {
        return t == rhs.t && u == rhs.u;
    }

    bool operator< (A const& rhs) const {
        return t < rhs.t && u < rhs.u;  // wrong: not a lexicographic comparison
    }
};
To define == for an aggregate of members, it is enough to apply == to each member once, but that does not work for <. Under this implementation, A{1, 2} and A{2, 1} are considered equivalent (since neither is less than the other). To fix this, apply < twice to each member except the last:
bool operator< (A const& rhs) const {
    if (t < rhs.t) return true;
    if (rhs.t < t) return false;
    return u < rhs.u;
}
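To make the difference concrete, here is a sketch with int members. Broken and Fixed are hypothetical names, and Fixed uses std::tie as a compact way to get the same lexicographic result as the hand-written version above:

```cpp
#include <tuple>

// The common mistake: requires every member to be smaller.
struct Broken {
    int t;
    int u;
    constexpr bool operator<(const Broken& rhs) const {
        return t < rhs.t && u < rhs.u;  // not a lexicographic comparison
    }
};

// Lexicographic comparison: t decides, u only breaks ties.
struct Fixed {
    int t;
    int u;
    constexpr bool operator<(const Fixed& rhs) const {
        return std::tie(t, u) < std::tie(rhs.t, rhs.u);
    }
};

// Neither value is "less", so the broken version treats them as equivalent.
static_assert(!(Broken{1, 2} < Broken{2, 1}));
static_assert(!(Broken{2, 1} < Broken{1, 2}));

// The lexicographic version orders them by t first.
static_assert(Fixed{1, 2} < Fixed{2, 1});
static_assert(!(Fixed{2, 1} < Fixed{1, 2}));
```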
Finally, to make heterogeneous comparisons work correctly, that is, to ensure that a == 10 and 10 == a mean the same thing, the usual recommendation is to write comparisons as free functions. In fact, this is the only way to implement such comparisons at all. It is inconvenient because, first, you have to make sure the recommendation is followed, and second, you usually have to declare such functions as hidden friends (that is, inside the class body) to keep the implementation convenient.
Note that comparing objects of different types does not always require writing operator==(X, int); it can also cover cases where int is implicitly convertible to X.
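A minimal sketch of the implicit-conversion case, using a hypothetical type X with a converting constructor and a single hidden-friend operator==:

```cpp
// Hypothetical type that is implicitly constructible from int.
struct X {
    int value;
    constexpr X(int v) : value(v) {}  // intentionally not explicit

    // Hidden friend: a free function defined inside the class body.
    friend constexpr bool operator==(const X& lhs, const X& rhs) {
        return lhs.value == rhs.value;
    }
};

// Both operand orders work because either argument may be converted to X.
static_assert(X{10} == 10);
static_assert(10 == X{10});
static_assert(!(X{10} == 11));
```

Because the operator is a free function, the conversion can apply to the left operand as well; a member operator== would only allow it on the right.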
Let's summarize the rules before C++20:
- All operators are treated the same way.
- We use idioms to simplify implementation. We take == and < as the basic operators and express the remaining relational operators through them.
- However, the < operator is not well suited to the basic role.
- It is important (and recommended) to write comparisons of objects of different types as free functions.
New basic ordering operator: <=>
The most significant and visible change to comparisons in C++20 is the addition of a new operator: operator<=>, the three-way comparison operator.
We already know three-way comparison from the functions memcmp/strcmp in C and basic_string::compare() in C++. They all return an int that is an arbitrary positive number if the first argument is greater than the second, 0 if they are equal, and an arbitrary negative number otherwise.
The spaceship operator does not return an int; it returns an object belonging to one of the comparison categories, whose value reflects the relationship between the compared objects. There are three main categories:
- strong_ordering: a total order in which equality implies substitutability, that is, (a <=> b) == strong_ordering::equal implies that f(a) == f(b) holds for all suitable functions f. The term "suitable function" is deliberately left without a precise definition, but it excludes functions that return the addresses of their arguments, the capacity() of a vector, and so on. We only care about the "essential" properties, which is also rather vague, but you can roughly take it to mean the value of the type: the value of a vector is the elements it contains, not its address. This category has the values strong_ordering::greater, strong_ordering::equal, and strong_ordering::less.
- weak_ordering: a total order in which equality only defines an equivalence class. The classic example is case-insensitive string comparison, where two objects can be weak_ordering::equivalent without being equal in the strict sense (which is why the value is named equivalent rather than equal).
- partial_ordering: a partial order. To the values greater, equivalent, and less (as in weak_ordering), this category adds one more: unordered. It can be used to express partial orders in the type system: 1.f <=> NaN yields partial_ordering::unordered.
You will mostly work with strong_ordering; it is also the best category to use by default. For example, 2 <=> 4 returns strong_ordering::less, and 3 <=> -1 returns strong_ordering::greater.
Stronger categories convert implicitly to weaker ones (for example, strong_ordering converts to weak_ordering). The relationship itself is preserved by the conversion (strong_ordering::equal becomes weak_ordering::equivalent).
Values of the comparison categories can be compared with the literal 0 (not with an arbitrary int, and not with an int equal to 0, but only with the literal 0) using any of the six comparison operators:
strong_ordering::less < 0
It is this comparison with the literal 0 that lets us implement the relational operators: a @ b is equivalent to (a <=> b) @ 0 for each of them.
For example, 2 < 4 can be evaluated as (2 <=> 4) < 0, which becomes strong_ordering::less < 0 and yields true.
The <=> operator fits the basic role much better than <, because it eliminates both of the latter's problems.
First, a <= b is guaranteed to be equivalent to (a <=> b) <= 0 even for partial orders. For two unordered values, a <=> b yields partial_ordering::unordered, and partial_ordering::unordered <= 0 yields false, which is exactly what we need. This works because <=> can return more kinds of values: partial_ordering, for example, has four possible values. A bool can only be true or false, so previously we could not distinguish comparisons of ordered values from comparisons of unordered ones.
For clarity, consider an example of a partial order that has nothing to do with floating point. Suppose we want to add a NaN state to int, where NaN is simply a value that is unordered with respect to every other value. We can do this by storing the value in std::optional:
struct IntNan {
    std::optional<int> val = std::nullopt;

    bool operator==(IntNan const& rhs) const {
        if (!val || !rhs.val) {
            return false;
        }
        return *val == *rhs.val;
    }

    partial_ordering operator<=>(IntNan const& rhs) const {
        if (!val || !rhs.val) {
            return partial_ordering::unordered;
        }
        // int <=> int yields strong_ordering, which converts to partial_ordering
        return *val <=> *rhs.val;
    }
};

IntNan{2} <=> IntNan{};  // partial_ordering::unordered
IntNan{2} <= IntNan{};   // false
The <= operator returns the correct value because we can now express more information at the language level.
Second, a single application of <=> yields all the information we need, which makes lexicographic comparison easier to implement:
struct A {
    T t;
    U u;

    bool operator==(A const& rhs) const {
        return t == rhs.t && u == rhs.u;
    }

    strong_ordering operator<=>(A const& rhs) const {
        // compare the T members first
        if (auto c = t <=> rhs.t; c != 0) return c;
        // ...then the U members
        return u <=> rhs.u;
    }
};
See P0515, the original proposal to add operator<=>, for a more detailed discussion.
New operator features
We are not merely getting a new operator. After all, if the declaration of struct A shown above only meant that instead of x < y we now had to write (x <=> y) < 0 every time, nobody would like it.
Comparison resolution in C++20 differs noticeably from the old approach, and the change is directly tied to the new concept of two basic comparison operators: == and <=>. Previously this was an idiom (writing everything in terms of == and <) that we used but the compiler knew nothing about; now the compiler understands the distinction.
Once again, here is the table you saw at the beginning of the article:
            Equality    Ordering
Basic       ==          <=>
Derived     !=          <, >, <=, >=
Each of the basic and derived operators has gained a new ability, which I will describe briefly below.
Inversion of basic operators
As an example, take a type that can only be compared with int:
struct A {
    int i;
    explicit A(int i) : i(i) { }

    bool operator==(int j) const {
        return i == j;
    }
};
Under the old rules it is no surprise that a == 10 works and evaluates as a.operator==(10).
But what about 10 == a? In C++17 this expression is simply ill-formed: no such operator exists. To make it work, you would have to write a symmetric operator== taking an int first and an A second, and it would have to be implemented as a free function.
In C++20, basic operators can be inverted. For 10 == a the compiler finds the candidate operator==(A, int) (it is really a member function, but I write it as a free function here for clarity) and then, additionally, a variant with the parameters in reverse order, operator==(int, A). This second candidate matches our expression (perfectly, in fact), so it is selected. The expression 10 == a is evaluated in C++20 as a.operator==(10). The compiler understands that equality is symmetric.
Now let's extend the type so that it can be compared with int not only for equality but also for ordering:
struct A {
    int i;
    explicit A(int i) : i(i) { }

    bool operator==(int j) const {
        return i == j;
    }

    strong_ordering operator<=>(int j) const {
        return i <=> j;
    }
};
Again, a <=> 42 works fine and is evaluated under the old rules as a.operator<=>(42), but 42 <=> a would be ill-formed from the C++17 point of view, even if the <=> operator had already existed in the language. In C++20, however, operator<=>, like operator==, is symmetric: it admits inverted candidates. For 42 <=> a the member function operator<=>(A, int) is found (again, written as a free function purely for clarity), along with the synthesized candidate operator<=>(int, A). This inverted version matches our expression exactly, so it is selected.
However, 42 <=> a is NOT evaluated as a.operator<=>(42). That would be wrong. The expression is evaluated as 0 <=> a.operator<=>(42). Try to work out why this form is correct.
It is important to note that the compiler does not create any new functions. Evaluating 10 == a did not produce a new operator==(int, A), and evaluating 42 <=> a did not produce an operator<=>(int, A). The two expressions are merely rewritten in terms of inverted candidates. To repeat: no new functions are created.
Also note that reversing the operands is available only for basic operators, not for derived ones. That is:
struct B {
    bool operator!=(int) const;
};

B b;
b != 42;  // OK in both C++17 and C++20
42 != b;  // error in both C++17 and C++20: derived operators are not inverted
Rewriting Derived Operators
Let's return to our struct A example:
struct A {
    int i;
    explicit A(int i) : i(i) { }

    bool operator==(int j) const {
        return i == j;
    }

    strong_ordering operator<=>(int j) const {
        return i <=> j;
    }
};
Take the expression a != 17. In C++17 this is ill-formed because no operator!= exists. In C++20, however, for expressions involving derived comparison operators the compiler also looks up the corresponding basic operators and expresses the derived comparison through them.
We know that in mathematics != essentially means NOT ==. Now the compiler knows it too. For the expression a != 17 it looks not only for operator!= but also for operator== (and, as in the previous examples, the inverted operator==). In this example we find an equality operator that almost fits; it only needs to be rewritten to match the desired semantics: a != 17 is evaluated as !(a == 17).
Similarly, 17 != a is evaluated as !a.operator==(17), which is both a rewritten and an inverted candidate.
Similar transformations apply to the ordering operators. If we write a < 9, the compiler tries (unsuccessfully) to find operator< and also considers the basic candidate operator<=>. The rewrite for the relational operators looks like this: a @ b (where @ is one of the relational operators) is evaluated as (a <=> b) @ 0. In our case, that is a.operator<=>(9) < 0. Likewise, 9 <= a is evaluated as 0 <= a.operator<=>(9).
Note that, just as with inversion, the compiler does not create any new functions for rewritten candidates. The expressions are simply evaluated differently, and all transformations happen only at the source level.
The above leads me to the following advice:
BASIC OPERATORS ONLY: Define only the basic operators (== and <=>) for your type.
Since the basic operators provide the full set of comparisons, it is enough to define only them. That means just 2 operators to compare objects of the same type (instead of the current 6) and just 2 operators to compare objects of different types (instead of 12). If you only need equality, write 1 function to compare objects of the same type (instead of 2) and 1 function to compare objects of different types (instead of 4). The std::sub_match class is an extreme case: in C++17 it uses 42 comparison operators, and in C++20 only 8, with no loss of functionality.
Since the compiler also considers inverted candidates, all these operators can be implemented as member functions. You no longer have to write free functions just for the sake of comparing objects of different types.
Special rules for finding candidates
As I already mentioned, candidate lookup for a @ b in C++17 worked as follows: find all operator@ functions and select the best one.
C++20 uses an extended candidate set. We still look up every operator@. Let @@ be the basic operator for @ (it may be the same operator). We also find every operator@@ and, for each of them, add its inverted version. From all the candidates found, we select the best one.
Note that overload resolution is performed in a single pass. We do not try one set of candidates and then another. We first collect them all and then pick the best. If there is no best candidate, lookup fails, as before.
Now we have many more potential candidates, and therefore more potential ambiguity. Consider the following example:
struct C {
    bool operator==(C const&) const;
    bool operator!=(C const&) const;
};

bool check(C x, C y) {
    return x != y;
}
In C++17 there was only one candidate for x != y; now there are three: x.operator!=(y), !x.operator==(y), and !y.operator==(x). Which one to choose? They are all equally good matches! (Note: there is no candidate y.operator!=(x), since only basic operators can be inverted.)
Two additional tie-breaking rules were introduced to resolve this ambiguity. Non-rewritten candidates are preferred over rewritten ones, and non-inverted candidates are preferred over inverted ones. Thus x.operator!=(y) beats !x.operator==(y), which in turn beats !y.operator==(x).
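The preference for the non-rewritten candidate can be made visible with a deliberately inconsistent type. In this sketch (the bodies are my own; the article only declares the operators), both operators return true, so the result of x != y tells us which one was called:

```cpp
struct C {
    int v;
    // The two operators deliberately disagree so we can see which one runs.
    constexpr bool operator==(const C&) const { return true; }
    constexpr bool operator!=(const C&) const { return true; }
};

constexpr bool check(C x, C y) { return x != y; }

// If a rewritten candidate were chosen, the result would be !true == false.
static_assert(check(C{1}, C{2}));
```

Real code should of course keep == and != consistent; the mismatch here is only a probe.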
:
operator@@ . . , .
-. — (,
x < y , —
(x <=> y) < 0 ), (,
x <=> y void - , DSL), . . ,
bool ( :
operator== bool , ?)
For example:
struct Base { friend bool operator<(const Base&, const Base&);
d1 < d2 :
#1 #2 . —
#2 , , , . ,
d1 < d2 (d1 <=> d2) < 0 . ,
void 0 — , . , - ,
#1 .
, , C++17, . , - . :
, . .
. , , , , , ( ). , :
« » , , ..
a < b 0 < (b <=> a) , , , .
C++17 . . :
struct A { T t; U u; V v; bool operator==(A const& rhs) const { return t == rhs.t && u == rhs.u && v == rhs.v; } bool operator!=(A const& rhs) const { return !(*this == rhs); } bool operator< (A const& rhs) const {
-
std::tie() , .
, : :
struct A { T t; U u; V v; bool operator==(A const& rhs) const { return t == rhs.t && u == rhs.u && v == rhs.v; } strong_ordering operator<=>(A const& rhs) const {
.
<=> < . , .
c != 0 , , (
), .
. C++20 , :
struct A {
    T t;
    U u;
    V v;

    bool operator==(A const& rhs) const = default;
    strong_ordering operator<=>(A const& rhs) const = default;
};
, . , :
struct A {
    T t;
    U u;
    V v;

    bool operator==(A const& rhs) const = default;
    auto operator<=>(A const& rhs) const = default;
};
. , , :
struct A {
    T t;
    U u;
    V v;

    auto operator<=>(A const& rhs) const = default;
};
, , . :
operator== ,
operator<=> .
C++20: . . , , , .
PVS-Studio , <=> . , -. , , (. "
"). ++ .
PVS-Studio <, :
bool operator< (A const& rhs) const {
    return t < rhs.t && u < rhs.u;
}
. , - . .
:
Comparisons in C++20 .