Home

Awesome

Accompanying code for paper “Multiway Powersort”

This is the accompanying C++ code for the article “Multiway Powersort” by William Cawley Gelling, Markus Nebel, Benjamin Smith and Sebastian Wild. It contains efficient implementations of nearly-optimal adaptive mergesort variants, in particular Powersort and Peeksort as well as variants thereof for 4-way merging.

To reproduce the raw data used in the running-time study in the paper, run the experiments.sh Bash script. The code assumes a typical Unix development environment and uses cmake. The code is optimized for compilation with g++ (from the GNU Compiler Collection), but should also work with clang++ (from LLVM).

The cachegrind cache simulations need valgrind and callgrind to be installed.

The largest inputs (100M 16-byte objects) use up to 10 GB of main memory (buffers are not shared across algorithms); comment these runs out in experiments.sh if you do not have sufficient memory. The full set of experiments runs for ~1 day.

Figures

All figures in the paper are created using Wolfram Mathematica; the code is in the notebook create-plots.nb.

A quick description on how to manually recreate the plots when Mathematica is not an option appears below.

Figure 3 (running time over n, ints, random runs)

Uses the averages and standard deviations from times-runs-int.out.

Figure 4 (running time distribution, 10^7 ints, random runs)

Uses the raw data from times-runs-10m-int-dist-xxxxx.csv (where xxxxx is the timestamp of the run and a parameter summary). The charts are created using Mathematica's DistributionChart command.

Figure 5 (running time over n, records, random runs)

As Fig. 3, but with the data from times-runs-l+p.out.

Figure 6 (running time over n, ints, random permutations)

As Fig. 3, but but with the data from times-rp-int.out.

Figure 7 (scanned elements over n, random runs)

Uses the columns merge-cost and buffer-cost from times-runs-*-cmps.csv files; the scanned element count is computed as:

scanned elements = 2 * merge-cost + 2 * buffer-cost

Then the averages and standard deviations are plotted.

Figure 8 (merge cost over n, random runs)

Like Fig. 7, but using only column merge-cost.

Table 1 (cachegrind results)

The numbers in Table 1 can be read off using cg_annotate or KCachegrind.

List of implemented sorting algorithms

All algorithms are implemented as a C++ template. Apart from the iterator type, most have further parameters to fine-tune the algorithm.