Automatic benchmark generation suite

Suite to generate benchmarks across different WaterLily versions, Julia versions (using juliaup), backends, cases, and case sizes using the benchmark.sh script.

TL;DR

Default benchmark for quick validations on the TGV and Jelly test cases.

sh benchmark.sh
julia --project compare.jl

NOTE for Ubuntu users: the default shell (/bin/sh) in Ubuntu points to dash, not bash. Running sh benchmark.sh therefore invokes dash and fails with a "Bad substitution" error (benchmark.sh: 2: Bad substitution). The simple workaround is to call the shell script directly, ./benchmark.sh, followed by the desired arguments. This applies to all the examples below.
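
A guard near the top of a script can make this failure mode easier to spot. The following is a minimal sketch and is not part of benchmark.sh; the function name running_under_bash is hypothetical. It relies on the fact that bash sets the BASH_VERSION variable, while dash leaves it unset.

```shell
#!/bin/sh
# Hypothetical helper (not part of benchmark.sh): report whether the current
# interpreter is bash. bash sets BASH_VERSION; dash leaves it unset.
running_under_bash() {
    [ -n "${BASH_VERSION:-}" ]
}

if running_under_bash; then
    echo "bash detected: bash array syntax is available"
else
    echo "not bash: call the script directly as ./benchmark.sh"
fi
```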

More detailed usage example

sh benchmark.sh -v "release 1.11" -t "1 4" -b "Array CuArray" -c "tgv jelly" -p "6,7 5,6" -s "100 100" -ft "Float32 Float64"
julia --project compare.jl --data_dir="data" --plot_dir="plots" --patterns=["tgv","jelly"] --sort=1

This runs both the TGV and jelly benchmarks (-c) at the current state of the WaterLily checkout in $WATERLILY_DIR, with 2 different Julia versions (the latest release and the latest 1.11, both of which need to be available in juliaup) and 3 different backends (CPUx01, CPUx04, CUDA). $WATERLILY_DIR needs to be set as an environment variable; otherwise, the argument -wd "my/waterlily/dir" can be specified. The case sizes -p, number of time steps -s, and float type -ft are (ordered) bash arrays which must have the same length as -c, with each entry specifying the corresponding benchmark case. The default benchmark launch (sh benchmark.sh) is equivalent to:

sh benchmark.sh -wd "$WATERLILY_DIR" -w "" -v "release" -t "4" -b "Array CuArray" -c "tgv jelly" -p "6,7 5,6" -s "100 100" -ft "Float32 Float32"

Note that -w or --waterlily can be used to pass different WaterLily versions by using commit hashes, tags, or branch names, e.g. -w "master v1.2.0". An empty -w argument will benchmark the current state of $WATERLILY_DIR.

Benchmarks are then post-processed using the compare.jl script. Plots can be generated by passing the --plot_dir argument. Note that --patterns can be passed to post-process only certain benchmarks. Alternatively, benchmark files can also be passed directly as arguments with

julia --project compare.jl --plot_dir="plots" $(find data/ \( -name "tgv*json" -o -name "jelly*json" \) -printf "%T@ %Tc %p\n" | sort -n | awk '{print $7}')

Usage information

The accepted command line arguments are (parenthesis for short version):

Backend arguments: --waterlily(-w), --waterlily_dir(-wd), --data_dir(-dd), --versions(-v), --backends(-b), --threads(-t). Respectively: list of WaterLily git hashes/tags/branches to test, WaterLily.jl directory, path to store the benchmark data, Julia versions, backend types, number of threads (when --backends contains Array). The --waterlily argument and the latter 3 arguments accept a list of different parameters, for example:
```
-w "fae590d e22ad41" -v "1.8.5 1.9.4" -b "Array CuArray" -t "1 6"
```
would generate benchmarks for all combinations of these parameters.

Case arguments: --cases(-c), --log2p(-p), --max_steps(-s), --ftype(-ft). The --cases argument specifies which cases to benchmark, and it can again be a list of different cases. The case names need to be defined in benchmark.jl, for example tgv or jelly. The currently available cases are "tgv sphere cylinder donut jelly". Hence, to add a new case, first define the function that returns a Simulation in benchmark.jl, and then call it using the --cases(-c) list argument. Case size, number of time steps, and float data type are then defined for each case (-p, -s, -ft, respectively). All case arguments must have equal length, since the i-th element of each array describes a different aspect of the i-th case.
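
The equal-length requirement on the case arguments can be checked before launching a long run. The following is a minimal sketch and is not taken from benchmark.sh; the helper names count_words and same_length are hypothetical, and it assumes the lists are passed as space-separated strings, as in the examples in this document.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of benchmark.sh): verify that the
# space-separated case lists all contain the same number of entries.

# count_words LIST: print the number of whitespace-separated entries in LIST.
count_words() {
    # Unquoted expansion splits LIST on whitespace; $# is then the entry count.
    set -- $1
    echo "$#"
}

# same_length CASES SIZES STEPS FTYPES: succeed only if every list has as
# many entries as CASES.
same_length() {
    n=$(count_words "$1")
    for list in "$2" "$3" "$4"; do
        [ "$(count_words "$list")" -eq "$n" ] || return 1
    done
}

# Values taken from the default launch described in this document.
if same_length "tgv jelly" "6,7 5,6" "100 100" "Float32 Float32"; then
    echo "case arguments are consistent"
else
    echo "case arguments differ in length" >&2
fi
```

Note that each entry of -p, such as "6,7", counts as a single element here, because the comma-separated sizes belong to one case.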

The following command

sh benchmark.sh -v "release" -t "1 3 6" -b "Array CuArray" -c "tgv sphere" -p "6,7,8 5,6" -s "10 100" -ft "Float64 Float32"

would run benchmarks on 4 backends: CPUx01 (serial), CPUx03, CPUx06, GPU. Additionally, two cases would be benchmarked, tgv and sphere, each with its own sizes, number of time steps, and float type. This results in 1 Julia version x (3 Array + 1 CuArray) backends x (3 TGV sizes + 2 sphere sizes) = 20 benchmarks.
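
The counting rule above can be reproduced with a short script, which may help estimate the cost of a launch beforehand. This is a sketch under stated assumptions, not code from benchmark.sh: the helper names count_items and count_benchmarks are hypothetical, a single WaterLily state is assumed (the -w list is ignored), and every backend other than Array is assumed to contribute exactly one backend.

```shell
#!/bin/sh
# Hypothetical estimate (not part of benchmark.sh) of how many benchmarks a
# launch produces, following the counting rule described in this document.

# count_items LIST SEP: print the number of SEP-separated entries in LIST.
count_items() {
    printf '%s\n' "$1" | awk -v sep="$2" '{ print split($0, parts, sep) }'
}

# count_benchmarks VERSIONS THREADS BACKENDS SIZES: print the total count.
count_benchmarks() {
    n_versions=$(count_items "$1" " ")
    n_threads=$(count_items "$2" " ")
    n_backends=0
    for backend in $3; do
        if [ "$backend" = "Array" ]; then
            # Each thread count is its own CPU backend (CPUx01, CPUx03, ...).
            n_backends=$((n_backends + n_threads))
        else
            n_backends=$((n_backends + 1))
        fi
    done
    n_sizes=0
    for sizes in $4; do
        # One entry per case, e.g. "6,7,8"; each comma-separated value is a size.
        n_sizes=$((n_sizes + $(count_items "$sizes" ",")))
    done
    echo $((n_versions * n_backends * n_sizes))
}

count_benchmarks "release" "1 3 6" "Array CuArray" "6,7,8 5,6"   # prints 20
```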

Benchmarks are saved in JSON format with the following nomenclature: casename_sizes_maxsteps_ftype_backend_waterlilyHEADhash_juliaversion.json. Finally, benchmarks can be compared using compare.jl as follows

julia --project compare.jl benchmark_1.json benchmark_2.json benchmark_3.json ...

or by using pattern syntax

julia --project compare.jl --data_dir="data" --patterns=["tgv*CPU"]

for which only TGV benchmarks on a CPU backend found in the "data" directory would be processed. The following syntax would produce equivalent results:

julia --project compare.jl $(find data -name "tgv*CPU.json" -printf "%T@ %Tc %p\n" | sort -n | awk '{print $7}') --sort=1

This takes the tgv JSON files, sorts them by modification time, and passes them as arguments to the compare.jl program. Finally, note that the first benchmark passed as argument is taken as the reference to compute the speed-ups of the other benchmarks: speedup_x = time(benchmark_1) / time(benchmark_x). The --sort=<1 to 8> argument can also be used when running the comparison. It sorts the benchmark table rows by the values in the column whose index is passed as argument; --sort=1 corresponds to sorting by backend. The baseline row is highlighted in blue, and the fastest run in a table is highlighted in green.
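
The speed-up rule can be illustrated with a few lines of shell. This is only a sketch of the formula, not how compare.jl is implemented: the function name speedups is hypothetical, and the timings are made-up values chosen for illustration.

```shell
#!/bin/sh
# Hypothetical illustration (not compare.jl itself) of the speed-up rule:
# the first timing is the reference, and speedup_x = time_1 / time_x.

# speedups T1 T2 ...: print one speed-up per line, relative to T1.
speedups() {
    printf '%s\n' "$@" | awk 'NR == 1 { ref = $1 } { printf "%.2f\n", ref / $1 }'
}

# Made-up timings in seconds, for illustration only.
speedups 2.0 1.0 0.5
# prints:
# 1.00
# 2.00
# 4.00
```

The reference benchmark always gets a speed-up of 1.00, so the order in which files are passed to compare.jl determines the baseline.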