Binary compatibility: now or never

Translation of Titus Winters' paper submitted to Working Group 21 (WG21), the C++ language standardization committee. The author discusses an important issue: whether to keep supporting backward binary compatibility, that is, a stable ABI (application binary interface).







For the past several years in WG21, I have actively argued that progress is more important than backward compatibility. I am no longer convinced of this, at least where binary compatibility (ABI) is concerned. Over the last three releases (C++14, C++17 and C++20) the ABI has been as stable as we could make it. Even if WG21 decides to break ABI backward compatibility in C++23, we will have provided binary compatibility on many platforms for more than 10 years. In my opinion, Hyrum's Law dominates large-scale changes to software systems. At this point we cannot tell how many users firmly rely on the assumption that the standard library ABI is stable (however wise or unwise, explicit or implicit that assumption may be): perhaps as many as half of the C++ developers in the world.







I keep a list of what WG21 could fix in the language if we decided to “break” the ABI. I cannot say with confidence that the combined value of everything on that list is comparable to the cost of breaking the ABI across the whole ecosystem. We would see many small improvements in API consistency, in the quality of standard library code, and so on, but there is clearly no single “breakthrough” change that would justify this cost for the average developer. Perhaps implementations would conform to the standard more closely, and implementations that do not meet the specification today would get a chance to fix that. But no single improvement on my list is clearly worth the cost.







More importantly, ABI constraints prevent us from eliminating significant performance losses. We cannot get rid of the significant cost of passing unique_ptr by value [Chandler's talk at CppCon 2019, to be published later], and we cannot change std::hash or the in-memory layout of unordered_map without forcing everyone to recompile everything everywhere. Hashing performance has been studied extensively over the years and, taking into account optimizations to both table lookup and hashing itself, we are confident that we could provide an API-compatible unordered_map / std::hash implementation with a 200-300% performance boost. ABI constraints do not allow it. Further research on optimizing and tuning SSO (the small string optimization) for std::string suggests a non-trivial performance gain (1% in microbenchmarks and at scale); the API is not affected, but again ABI constraints do not allow it.



The total performance loss attributable solely to the ABI reaches several percentage points, possibly as much as 5-10%. The ecosystem as a whole can live with this, but it may be unacceptable for some organizations (Google among them). It is certainly a bigger performance loss than should be acceptable for C++: remember that this is a language that claims to leave no room for a faster competitor. Most users do not seem concerned about this degradation: there are other hash table implementations for those who care about absolute performance. The general inefficiency of passing unique_ptr by value and other language-level ABI problems matter only in a very small number of workloads. Organizations that need maximum performance can go their own way (and do), using non-standard libraries and non-standard tool configurations. This is natural, and it must be clearly understood.









An ABI change will affect a relatively larger number of users. I suspect that a significant share of them do not realize how strongly they depend on the ABI. In Google's server ecosystem almost everything is built from source, there are few external dependencies, and we have a better than average ability to carry out large-scale refactoring. Yet even for us, the recent ABI-breaking changes to the standard library cost 5-10 engineer-years.

The total cost of breaking ABI backward compatibility across the entire C++ ecosystem can be conservatively estimated in “engineer-millennia”: coordinating a rebuild with every provider of plug-ins, .so files or DLLs in the world will require enormous human resources. Combined with the ecosystem split caused by C++20 modules, changing the ABI within the C++23 development and rollout timeframe could lead to a hard split of the ecosystem.









There are many questions that this discussion cannot answer. How long can we go on before changing the ABI turns from merely useful into a critical necessity? If we explicitly choose to maintain ABI stability, how expensive will the change be if and when such a critical need arises? If security issues like Spectre and Meltdown require a change to the calling convention, how much will it cost C++ to get over that hurdle? What share of developers use C++ because we claim to put performance above everything else? Worse: how long can C++ claim to be the fastest language while declining to make such optimizations?







If we consciously cannot or will not change the ABI, that decision must be stated loudly. We must say clearly that this is a language that puts ABI stability above the last few percent of performance. I would argue that in practice this has been the case for the past several years. We need to let users know what to expect from us, and that libraries like Boost, Folly or Abseil are expected to be the right choice when performance is needed. This does nothing, however, for ABI-related constraints in the language itself, such as the cost of passing unique_ptr. The standard library remains important in this development model: it is what we use for compatibility and stability. This may require a change in focus and direction: we may want to design for flexibility under changing conditions rather than for raw performance.







If we hold that performance matters more than ABI stability, we must decide right away exactly when we will “break” backward compatibility, and do everything possible to help the ecosystem accept the change. We must also declare clearly and loudly that this is the path we are taking. Keep in mind that the more time passes between such changes, the more expensive they become, because unmaintained dependencies on the ABI keep accumulating. Our implementers have made it very clear that the compatibility-breaking changes in C++11 were painful and expensive. The desire to avoid repeating such costs is natural, but we have to choose how: either we avoid them by never changing the ABI, or we make each change less costly.







In essence, there are three possibilities for WG21:







  1. Decide in which release the ABI will be changed, whether C++23 or C++26. Warn people, and immediately start developing tools and diagnostics that help identify what will break. Focus on more consistent practices and tooling to support future ABI changes. It is not in any single implementer's interest to expose its users to the consequences of an ABI change if other implementations do not, so changing the ABI should be a coordinated effort for the benefit of future users. Ideally, everything should break, to make it clear that code compiled in C++23 mode is incompatible with code compiled in earlier modes. If some can get by without rebuilding while others get errors at link time or at run time, that will only add confusion and frustration.
  2. Decide that we are committed to ABI stability, formalizing today's practice. For many years implementers of the standard have had an effective veto over ABI-breaking changes: we already rank ABI backward compatibility above design cleanliness and performance. If we acknowledge this and tell users clearly, the ecosystem will be better off. Additional libraries will become more valuable to those who need to squeeze out the last drops of performance but do not need the stability the standard provides. Other performance-oriented languages may challenge our position in the future.
  3. Fail to choose a direction and keep the status quo. For me this is the worst-case scenario: we keep implicitly giving priority to ABI backward compatibility. We say “performance” and vote “binary compatibility”. Such dissonance harms the ecosystem and signals a lack of agreement on the language's priorities. I sincerely hope that, through the efforts of implementers and the DG, we will reach the necessary consensus.


I believe option No. 1 is the better fit for users who need maximum performance, but it carries an enormous cost for the ecosystem and could fragment the language in the future. Option No. 2 is the boring, responsible and decent choice: it means sadly admitting that we have painted ourselves into a corner and trying to minimize the resulting losses. Choosing option No. 3 means not governing at all, and I pray we avoid it: any explicit choice is better than the current dissonance and the inability to agree on long-term goals.



I understand that we arrived at our present position through many small acts of seemingly reasonable inaction. Over the past 10 years no single change has come up that could justify breaking binary compatibility, yet the implicit policy of maintaining backward compatibility has become destructive for the ecosystem. And by adopting such a policy explicitly, we open another door for C++ to gradually leave the stage: you cannot be a systems-oriented, performance-oriented language while leaving so much room for a faster one. In theory, each vendor could decide to “break” the ABI in any future release, but the general mood seems to be otherwise. I am convinced that discussion and consensus between the implementers of the standard and WG21 are required: which priorities should we choose?







