In continuation of the topic I want to share my code, which overtakes std::sort()
from the current versions of the GNU C ++ Library and (approximately, there is no exact data) repeats the result of "Sort Alexandrescu" with CppCon 2019 .
In
(non- C++
) C++
it took a reasonable effort to get the sort no worse than std::sort()
to get rid of the overhead of using the library qsort()
. Including, therefore, use macros instead of templates.
In turn, if you sort the “mice” and not the “elephants”, then the costs of qsort()
quite large: extra address arithmetic and an indirect call to the comparator function.
According to the available information, this combination of algorithms and implementation features is faster than many other options in a practical sense:
C++
It is very likely that this option is slightly faster and / or slightly slower than the vast majority of sorts, but finding out is literally a titanic work that I can not afford.
It is interesting if someone compares this option with the current options in Tarantool, PostgreSQL, SQLite and MySQL. I hope kaamos will not be able to pass by with its SysBench .
In the first comment from RPG18, there was a link to a recent performance by Andrei Alexandrescu “Speed Is Found In The Minds of People" , where he brings to a fairly similar idea, but goes to heap_sort closer to the final.
The performance seemed a little prolonged to me (if olegbunin had given 90 minutes at least once ...), but the numbers are not enough. In particular, I want to see the sorting behavior with increasing N
, since an increase in the QuickSort completion threshold leads to acceleration at large sizes and deceleration to small ones, etc.
Nevertheless, judging by the figures that Alexandrescu cites, the described option suddenly gives a similar acceleration. However, until I found the code shown to Alexandrescu in its finished form, to "take and compare", and there is no time to encode by video (write if you find or do it).
The theoretical and ideological side of the "algorithm" is quite simple:
Log2(N)
It is worth noting that depending on the processor architecture and application conditions, you can increase the gain on long sequences up to 10-15% by choosing SORT_THRESHOLD
within 128-256. However, this slows down the processing of sequences with a reverse order and close to it.
Nevertheless, this is a good bonus if you understand that the reverse order is unlikely in your data, or if you can cheaply detect such cases and execute a branch with a small SORT_THRESHOLD
.
PS
The described sorting option was “redone” for my libmdbx project (fast embedded key-value database with ACID), in which the other day README and API description were updated (actually rewritten). Therefore, I will be grateful both for correcting typos and for advice and suggestions. Himself, as a rule, is not visible lack of some information.