The Darmstadt Automotive Parallel HeterogeNEous (DAPHNE) Benchmark-Suite

This suite contains automotive benchmarks for evaluating heterogeneous, parallel programming models. The benchmarks are extracted from the Autoware project and are intended to represent parallelizable workloads from the automotive field.

It contains three kernels (euclidean_cluster, ndt_matching and points2image) and six different implementations of each kernel:

- Serial version
- OpenMP version
- OpenMP Offloading version
- CUDA version
- OpenCL version
- OpenCL version using atomics

Build the benchmarks with one of the following:

a) the top-level Makefile. By default it builds all architectures; it also lets you choose among opencl, cpu, openmp, and cuda.

b) a Makefile within one of the architecture folders in the src directory.

c) a Makefile in a specific kernel directory (ndt_matching, points2image, or euclidean_clustering).

Depending on the platform and environment, it may be necessary to configure path variables in the Makefile.
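The three build entry points above could be wrapped in a small shell helper like the one below. This is only a sketch: the function name `daphne_build` and the `DRY_RUN` switch are made up for illustration, and the paths `src/<architecture>` and `src/<architecture>/<kernel>` are assumptions about the folder layout that should be checked against the actual checkout.

```shell
# Sketch: pick one of the three build entry points (a, b, c).
# Assumptions (not confirmed by this README): architecture folders are
# src/<arch>, kernel folders are src/<arch>/<kernel>.
# Set DRY_RUN=1 to print the make command instead of running it.
daphne_build() {
    arch="$1"     # optional: opencl, cpu, openmp, or cuda
    kernel="$2"   # optional: a kernel directory name, e.g. points2image
    if [ -z "$arch" ]; then
        set -- make                          # a) top-level Makefile, default target
    elif [ -z "$kernel" ]; then
        set -- make -C "src/$arch"           # b) Makefile of one architecture folder
    else
        set -- make -C "src/$arch/$kernel"   # c) Makefile of one kernel directory
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Called from the top-level directory, `daphne_build` builds everything, `daphne_build cuda` builds one architecture, and `daphne_build openmp points2image` builds a single kernel. Running with `DRY_RUN=1` first shows which directory would be used without starting a build.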

To run a benchmark, execute the ./kernel executable in each testcase directory.
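To run several benchmarks in one go, a loop along the following lines may help. It is a sketch under the assumption that every built test case directory contains an executable named `kernel` that is started from inside its own directory; the function name `run_kernels` is hypothetical.

```shell
# Sketch: run every executable named "kernel" found below a root directory.
# Each benchmark is started from its own directory, because it is assumed
# to look for its input data relative to the current working directory.
run_kernels() {
    root="${1:-.}"
    find "$root" -type f -name kernel | sort | while IFS= read -r exe; do
        [ -x "$exe" ] || continue          # skip files that are not executable
        dir=$(dirname "$exe")
        echo "== $dir"
        # The subshell keeps the directory change local to this iteration.
        (cd "$dir" && ./kernel </dev/null) || echo "FAILED: $dir"
    done
}
```

Calling `run_kernels` without an argument searches the current directory; passing a path restricts the run to that subtree.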

Input and golden reference data reside in the subdirectory 'data'. This package provides the "minimal" data set. Additional data sets ("small", "medium" and "full") with different numbers and sizes of test cases are available for download from TODO. Note that you need to unzip the files in the "data" directory before you run the kernels.

FPGA Implementation

For the points2image benchmark, this release also contains an OpenCL implementation targeting Xilinx FPGAs (Zynq Ultrascale+ ZCU102) with Xilinx SDSoC/SDAccel. Because of the peculiarities of the setup, the long compilation times, and the additional licenses required, building for FPGA is not included in the top-level Makefile.

Compatibility Overview

The benchmarks have been developed with a large number of platforms in mind. However, out-of-the-box compatibility cannot be guaranteed. Some platforms require manual configuration or do not support a given technology, so the following table lists the results to expect on three reference platforms.

| Benchmark | Consumer Desktop | Linux Workstation | Nvidia Jetson TX 2 |
|---|---|---|---|
| CPU/points2image | ok | ok | ok |
| CPU/euclidean_cluster | ok | ok | ok |
| CPU/ndt_mapping | ok | ok | ok |
| Cuda/points2image | ok | ok (1) | ok (2) |
| Cuda/euclidean_cluster | | ok (1) | ok (2) |
| Cuda/ndt_mapping | ok (4) | ok (4)(1) | ok (4)(2) |
| OpenCl/points2image | ok | ok | ok (5) |
| OpenCl/euclidean_cluster | ok | ok | ok (5) |
| OpenCl/ndt_mapping | ok (4) | ok (4) | not supported (5)(6) |
| OpenCl-atomic/points2image | ok | ok | ok (5) |
| OpenCl-atomic/euclidean_cluster | ok | ok | ok (5) |
| OpenCl-atomic/ndt_mapping | ok (4) | ok (4) | ok (5)(4) |
| OpenMP/points2image | ok | ok | ok |
| OpenMP/euclidean_cluster | ok | ok | ok |
| OpenMP/ndt_mapping | ok | ok | ok (4) |
| OpenMP-offload/points2image | ok (CPU) | ok (CPU), ok (CUDA) | ok (CPU) |
| OpenMP-offload/euclidean_cluster | ok (CPU) | ok (CPU), ok (CUDA) | not supported (8) |
| OpenMP-offload/ndt_mapping | not supported (CPU)(7) | ok (CPU), ok (CUDA)(4) | not supported (8) |

(1) Compute Capability set to 6.0 or lower

(2) Compute Capability set to 6.2 or lower

(3) Results outside error tolerances

(4) Results not accurate but inside error tolerances

(5) Running on POCL with CUDA support

(6) cl_khr_int64_base_atomics not supported

(7) internal compiler error

(8) undeclared omp_target functions

Data

The test data (input and reference output data) is available in four different sizes: minimal, small, medium, and full.

To use the data, unpack the archive for the chosen size. Each archive contains input and output data for all three benchmarks.

Publications about/using DAPHNE