Home

Awesome

Intro

This document describes a stable bottom-up adaptive merge sort named twinsort.

The primary goal is to provide a simplified alternative to quadsort.

The algorithm is about 150 lines of sparse C code that should be relatively easy to understand and port.

Twin swap

The twin_swap function is a pre-sorting routine which turns the array into sorted blocks of 2 elements. 8 9 4 1 2 3 6 0 5 7 becomes 8 9 1 4 2 3 0 6 5 7.

The twin swap routine also contains a reverse order run detector, so 6 5 4 3 2 1 is sorted into 1 2 3 4 5 6 rather than 5 6 3 4 1 2.

The swap and the run detection are carried out simultaneously, subsequently the reverse run detector has very little overhead.

Tail sort

The tail_sort function is a bottom-up merge sort which uses at most n / 2 swap memory.

It merges by copying the right block to swap memory and merging the tails of each block. The routine skips unnecessary merge operations and boundary checks.

The tail merge routine also contains a forward run detector with minimal overhead.

Performance

Overall twinsort is slightly faster than Timsort, most notably on random data.

Big O

NameBestAverageWorstMemoryStablePartitioning
twinsortnn log nn log nnyesno

Twinsort makes n comparisons when the data is already sorted or reverse sorted.

Data types

Twinsort only supports integers, this to keep things simple.

Visualization

In the visualization below eight sorts are performed on various distributions. These distributions are: Random, Ascending, Ascending Saw, Generic, Descending, Descending Saw, Random Tail, and Wave order.

The upper half shows the swap memory and the bottom half shows the main memory. Colors are used to differentiate between skip, swap, and merge operations.

tailsort visualization

Benchmarks

The following benchmark is based on the wolfsort benchmark on WSL gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1).

The source code was compiled using g++ -O3 -w -fpermissive bench.c. Each test was ran 100 times and both the best and average run time is reported. stablesort is c++'s std::stablesort.

NameItemsTypeBestAverageRepetitionsDistribution
stablesort100000i320.0060840.0061741random order
quadsort100000i320.0058380.0059201random order
timsort100000i320.0075560.0076771random order
twinsort100000i320.0062650.0063381random order
stablesort100000i320.0006580.0007351ascending
quadsort100000i320.0000570.0000571ascending
timsort100000i320.0000450.0000451ascending
twinsort100000i320.0001260.0001301ascending
stablesort100000i320.0013600.0014671ascending saw
quadsort100000i320.0008520.0008741ascending saw
timsort100000i320.0008510.0008741ascending saw
twinsort100000i320.0008830.0009091ascending saw
stablesort100000i320.0038910.0039711generic order
quadsort100000i320.0037630.0038551generic order
timsort100000i320.0054630.0055691generic order
twinsort100000i320.0040900.0041541generic order
stablesort100000i320.0009040.0009411descending order
quadsort100000i320.0000660.0000671descending order
timsort100000i320.0001010.0001031descending order
twinsort100000i320.0000730.0000741descending order
stablesort100000i320.0010540.0010951descending saw
quadsort100000i320.0004880.0005131descending saw
timsort100000i320.0004640.0004781descending saw
twinsort100000i320.0004150.0004271descending saw
stablesort100000i320.0020400.0021391random tail
quadsort100000i320.0015570.0015881random tail
timsort100000i320.0019790.0020341random tail
twinsort100000i320.0017300.0017641random tail
stablesort100000i320.0009780.0010201wave order
quadsort100000i320.0008370.0009101wave order
timsort100000i320.0008340.0010811wave order
twinsort100000i320.0009470.0009961wave order