# warmup_stats
warmup_stats is a collection of scripts for analysing the output of VM
benchmarking. The benchmarking itself can be performed by any system,
e.g. Krun.
## Build

Run `./build.sh` to build `warmup_stats`.
## Basic usage

Users should directly call `bin/warmup_stats`, which is a front-end to the
other scripts in `bin/`. `warmup_stats` takes either CSV files or Krun results
files as input. As output, it can create HTML or LaTeX / PDF tables and diffs,
or PDF plots.
## Terminology

`warmup_stats` uses the following terminology:

- A *process execution* is the execution of a single operating system process.
  In other words, it is equivalent to running a program from the command line
  and waiting for it to terminate.
- An *in-process iteration* is a single iteration of a benchmark within a
  process execution. In other words, a single process execution executes many
  in-process iterations (see the sketch below).
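To make the distinction concrete, here is a minimal sketch of a timing harness
(hypothetical, not part of `warmup_stats`: the file name, workload, and
iteration count are all assumptions for illustration). One run of the script
below is a single process execution; each pass through its timing loop is one
in-process iteration.

```python
# harness.py -- each run of this script is one *process execution*.
# Hypothetical sketch: the workload and iteration count are illustrative.
import time

def benchmark():
    sum(i * i for i in range(100_000))  # stand-in workload

N_ITERS = 2000  # number of *in-process iterations* per process execution
times = []
for _ in range(N_ITERS):
    start = time.monotonic()
    benchmark()
    times.append(time.monotonic() - start)  # per-iteration time in seconds

print(",".join(f"{t:.6f}" for t in times))
```

Running `python3 harness.py` several times would yield several process
executions, each containing 2000 in-process iterations.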
## CSV format

The `bin/warmup_stats` script can take CSV files as input. The format is as
follows. The first row must contain a header for each required column: the
first column must be the process execution index (conventionally a number
0...n, though this is not enforced), the second column the benchmark name, and
n subsequent columns for the n in-process iterations run (each a time in
seconds). Each of these columns can be given arbitrary names, but the order is
vital, and the number of in-process iterations must be the same for each
process execution. An example is as follows:
```csv
process_exec_idx, bench_name, 0, 1, 2, ...
0, spectral norm, 0.2, 0.1, 0.4, ...
1, spectral norm, 0.3, 0.15, 0.2, ...
```
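For illustration, a file in this format could be generated with a short Python
script such as the following sketch (hypothetical and not part of
`warmup_stats`; the `runs` data and output file name are assumptions). Note
the check that every process execution supplies the same number of in-process
iteration times:

```python
import csv

# Hypothetical raw data: one list of per-iteration times (in seconds) per
# process execution; all lists must be of equal length.
runs = [
    ("spectral norm", [0.2, 0.1, 0.4]),
    ("spectral norm", [0.3, 0.15, 0.2]),
]

with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    n_iters = len(runs[0][1])
    assert all(len(times) == n_iters for _, times in runs)
    # Header row: index column, benchmark name, then one (arbitrarily
    # named) column per in-process iteration.
    writer.writerow(["process_exec_idx", "bench_name"]
                    + [str(i) for i in range(n_iters)])
    for idx, (bench, times) in enumerate(runs):
        writer.writerow([idx, bench] + times)
```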
When processing CSV with `warmup_stats`, the `--language`, `--vm`, and
`--uname` flags must be specified so that plots can contain the relevant
information, though users can pass arbitrary data to each flag. Note that
these flags are not needed with Krun results files.
## Creating plots

The `--output-plots <file.pdf>` flag converts input data into visual plots.

If the input files are in CSV format, `bin/warmup_stats` also needs the names
of the language and VM under test, and the output of `uname -a` on the machine
the benchmarks were run on.
Example usage:

```sh
bin/warmup_stats --output-plots plots.pdf -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --output-plots plots.pdf results.json.bz2
```
## Creating tables

The `--output-table <file>` flag converts input data into an HTML table or a
LaTeX / PDF table. Conversion to PDF requires `pdflatex` to be installed.

If the input files are in CSV format, `bin/warmup_stats` also needs the names
of the language and VM under test, and the output of `uname -a` on the machine
the benchmarks were run on.
Example usage (LaTeX / PDF):

```sh
bin/warmup_stats --tex --output-table table.tex -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --tex --output-table table.tex results.json.bz2
```

Example usage (HTML):

```sh
bin/warmup_stats --html --output-table table.html -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --html --output-table table.html results.json.bz2
```
By default, `warmup_stats` produces high quality statistics, which can take
considerable time to compute. If you want to experiment quickly, you can use
the `--quality low` switch: this makes `warmup_stats` run considerably
quicker, but leads to lower quality (and thus less reliable) statistics.
Although the differences are often fairly minor, we do not encourage the use
of `--quality low` when formally publishing benchmark results.
## Creating diffs

Benchmarking is often performed in order to test whether a change to a given
VM improves or worsens its performance. Unfortunately, the difference between
benchmark performance before and after a change is rarely simple. Users will
often want to produce a detailed comparison of two sets of results (as in the
tables above) in order to get a deeper insight into the effects of their
changes.

The `--output-diff` flag converts data from exactly two input files into an
HTML table or a LaTeX / PDF table. Conversion to PDF requires `pdflatex` to be
installed.

If the input files are in CSV format, `bin/warmup_stats` also needs the names
of the language and VM under test, and the output of `uname -a` on the machine
the benchmarks were run on.
Example usage (LaTeX / PDF):

```sh
bin/warmup_stats --tex --output-diff diff.tex -l javascript -v V8 -u "`uname -a`" before.csv after.csv
bin/warmup_stats --tex --output-diff diff.tex before.json.bz2 after.json.bz2
```

Example usage (HTML):

```sh
bin/warmup_stats --html --output-diff diff.html -l javascript -v V8 -u "`uname -a`" before.csv after.csv
bin/warmup_stats --html --output-diff diff.html before.json.bz2 after.json.bz2
```
The resulting table will contain results from the `after.{csv,json.bz2}` file,
compared against the `before.{csv,json.bz2}` file. VMs and benchmarks that do
not appear in both results files will be omitted from the table.