# Automatic benchmark generation suite
Suite to generate benchmarks across different WaterLily versions, Julia versions (using juliaup), backends, cases, and case sizes using the `benchmark.sh` script.
## TL;DR
Default benchmark for quick validations on the TGV and jelly test cases:
```sh
sh benchmark.sh
julia --project compare.jl
```
**Note for Ubuntu users:** The default shell (`/bin/sh`) in Ubuntu points to `dash`, not `bash`. Executing `sh benchmark.sh` therefore calls `dash` and results in a bad substitution error (`benchmark.sh: 2: Bad substitution`). The simple workaround is to call the shell script directly with `./benchmark.sh` followed by the desired arguments. This applies to all the examples below.
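For example, on Ubuntu the TL;DR commands above would be run as:

```sh
# Call the script directly so it runs under bash instead of dash
# (if needed, make it executable first: chmod +x benchmark.sh)
./benchmark.sh
julia --project compare.jl
```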
## More detailed usage example
```sh
sh benchmark.sh -v "release 1.11" -t "1 4" -b "Array CuArray" -c "tgv jelly" -p "6,7 5,6" -s "100 100" -ft "Float32 Float64"
julia --project compare.jl --data_dir="data" --plot_dir="plots" --patterns=["tgv","jelly"] --sort=1
```
This runs both the TGV and jelly benchmarks (`-c`) for the current state of the WaterLily code available in `$WATERLILY_DIR`, on 2 different Julia versions (the latest release and the latest 1.11, which need to be available in juliaup) and 3 different backends (CPUx01, CPUx04, CUDA). Note that `$WATERLILY_DIR` needs to be available as an environment variable; otherwise the argument `-wd "my/waterlily/dir"` can be specified. The case size `-p`, number of time steps `-s`, and float type `-ft` are (ordered) bash arrays which need to have the same length as `-c`, and they specify these parameters for each benchmark case, respectively.
The default benchmark launch (`sh benchmark.sh`) is equivalent to:
```sh
sh benchmark.sh -wd "$WATERLILY_DIR" -w "" -v "release" -t "4" -b "Array CuArray" -c "tgv jelly" -p "6,7 5,6" -s "100 100" -ft "Float32 Float32"
```
Note that `-w` or `--waterlily` can be used to pass different WaterLily versions using commit hashes, tags, or branch names, e.g. `-w "master v1.2.0"`. An empty `-w` argument will benchmark the current state of `$WATERLILY_DIR`.
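For example, the following command (the ref names here are placeholders) would benchmark the `master` branch and a `v1.2.0` tag on the default cases:

```sh
# "master" and "v1.2.0" are hypothetical refs; any commit hash, tag, or branch
# available in the $WATERLILY_DIR repository can be listed in -w
sh benchmark.sh -w "master v1.2.0" -c "tgv jelly" -p "6,7 5,6" -s "100 100" -ft "Float32 Float32"
```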
Benchmarks are then post-processed using the `compare.jl` script. Plots can be generated by passing the `--plot_dir` argument. Note that `--patterns` can be passed to post-process only certain benchmarks. Alternatively, benchmark files can also be passed directly as arguments with
```sh
julia --project compare.jl --plot_dir="plots" $(find data/ \( -name "tgv*json" -o -name "jelly*json" \) -printf "%T@ %Tc %p\n" | sort -n | awk '{print $7}')
```
## Usage information
The accepted command line arguments are (parentheses for the short version):
- Backend arguments: `--waterlily` (`-w`), `--waterlily_dir` (`-wd`), `--data_dir` (`-dd`), `--versions` (`-v`), `--backends` (`-b`), `--threads` (`-t`). Respectively: list of WaterLily git hashes/tags/branches to test, WaterLily.jl directory, path to store the benchmark data, Julia versions, backend types, and number of threads (when `--backends` contains `Array`). The `-w`, `-v`, `-b`, and `-t` arguments accept a list of different parameters, for example `-w "fae590d e22ad41" -v "1.8.5 1.9.4" -b "Array CuArray" -t "1 6"`, which would generate benchmarks for all these combinations of parameters.
- Case arguments: `--cases` (`-c`), `--log2p` (`-p`), `--max_steps` (`-s`), `--ftype` (`-ft`). The `--cases` argument specifies which cases to benchmark, and it can again be a list of different cases. The name of each case needs to be defined in benchmark.jl, for example `tgv` or `jelly`. The currently available cases are `"tgv sphere cylinder donut jelly"`. Hence, to add a new case, first define a function that returns a `Simulation` in benchmark.jl (see the sketch after this list), and then it can be selected with the `--cases` (`-c`) list argument. Case size, number of time steps, and float data type are then defined for each case (`-p`, `-s`, `-ft`, respectively). All case arguments must have the same length, since the elements at a given position of each array together define a single case.
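As an illustration, a new case definition could look roughly like the sketch below. The exact signature that benchmark.jl expects is an assumption here (the function name, its arguments, and the body geometry are hypothetical); check the existing cases in that file for the real conventions.

```julia
using WaterLily

# Hypothetical case function for benchmark.jl: returns a WaterLily `Simulation`.
# `p` would come from --log2p, `backend` from --backends (Array or CuArray),
# and `T` from --ftype (Float32 or Float64).
function mycase(p, backend; T=Float32)
    L = 2^p                          # grid size from the log2 resolution
    U, Re = 1.0, 250                 # characteristic velocity and Reynolds number
    # signed-distance body: a sphere of radius L/8 centred in the cubic domain
    body = AutoBody((x, t) -> √sum(abs2, x .- L/2) - L/8)
    Simulation((L, L, L), (U, 0, 0), L; ν=U*L/Re, body=body, T=T, mem=backend)
end
```

Once defined (and registered in benchmark.jl), the new case could then be selected with, e.g., `./benchmark.sh -c "mycase" -p "5,6" -s "100" -ft "Float32"`.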
The following command
```sh
sh benchmark.sh -v "release" -t "1 3 6" -b "Array CuArray" -c "tgv sphere" -p "6,7,8 5,6" -s "10 100" -ft "Float64 Float32"
```
would run benchmarks with 4 backends: CPUx01 (serial), CPUx03, CPUx06, and GPU. Additionally, two cases would be tested, `tgv` and `sphere`, each with its own sizes, number of time steps, and float type. This would result in 1 Julia version x (3 Array + 1 CuArray) backends x (3 TGV sizes + 2 sphere sizes) = 20 benchmarks.
Benchmarks are saved in JSON format with the following nomenclature: `casename_sizes_maxsteps_ftype_backend_waterlilyHEADhash_juliaversion.json`. Benchmarks can finally be compared using `compare.jl` as follows
```sh
julia --project compare.jl benchmark_1.json benchmark_2.json benchmark_3.json ...
```
or by using pattern syntax
```sh
julia --project compare.jl --data_dir="data" --patterns=["tgv*CPU"]
```
for which only TGV benchmarks on a CPU backend found in the `"data"` directory would be processed. The following syntax would produce equivalent results:
```sh
julia --project compare.jl $(find data -name "tgv*CPU.json" -printf "%T@ %Tc %p\n" | sort -n | awk '{print $7}') --sort=1
```
which takes the `tgv` JSON files, sorts them by creation time, and passes them as arguments to the `compare.jl` program. Finally, note that the first benchmark passed as an argument is taken as the reference to compute the speed-ups of the other benchmarks: `speedup_x = time(benchmark_1) / time(benchmark_x)`. The `--sort=<1 to 8>` argument can also be used when running the comparison; it sorts the benchmark table rows by the values in the column whose index is passed as argument, and `--sort=1` corresponds to sorting by backend. The baseline row is highlighted in blue, and the fastest run in a table is highlighted in green.