Awesome
bfs-sycl-fpga
This work has been submitted accepted to 12th International Workshop on OpenCL and SYCL (IWOCL24)
The Breadth-First Search implementations memoryBFS and streamingBFS using Intel oneAPI (SYCL2020) on Intel FPGAs
- memoryBFS: We applied the typical optimisations proposed in the official guidelines alongside an automatic cache to achieve proper pipelining and improve random memory accesses performance. However, limitations occurred with fine-grained parallelism, and it was competitive only to some related work that utilised hardware-description languages or established high-level synthesis tools.
- streamingBFS: We added bit-level representations of data in memory, banking in on-chip memory, and fine-grained control over parallel data streams to achieve higher throughput.
Authors : Kaan Olgu & Tobias Kenter
Producing and Converting Datasets
Tested with Python 3.9.6
python -m venv venv-graphgen
. venv-graphgen/bin/activate
pip install -U pip
pip install numpy networkit pandas
pip install cython;
cd scripts
python genGraph.py rmat 24 16
If you have datasets ready as text file, convert graph text files to the bin files :
python generator.py (dataset name) (partition) (nnz/row)
python generator.py wiki-Talk 1 nnz
Build for Hardware Execution
cd *BFS
mkdir build
cd build
# for memoryBFS:
cmake .. -DFPGA_DEVICE=$AOCL_BOARD_PACKAGE_ROOT:$FPGA_BOARD_NAME -DNUM_COMPUTE_UNITS=4
# for streamingBFS
cmake .. -DFPGA_DEVICE=$AOCL_BOARD_PACKAGE_ROOT:$FPGA_BOARD_NAME -DNUM_COMPUTE_UNITS=4 -DK_MEMORY_CACHE=131072
make fpga
Build for Hardware Emulation
cd bfs_*cu
mkdir build
cd build
# for memoryBFS:
cmake .. -DFPGA_DEVICE=$AOCL_BOARD_PACKAGE_ROOT:$FPGA_BOARD_NAME -DNUM_COMPUTE_UNITS=4
# for streamingBFS
cmake .. -DFPGA_DEVICE=$AOCL_BOARD_PACKAGE_ROOT:$FPGA_BOARD_NAME -DNUM_COMPUTE_UNITS=4 -DK_MEMORY_CACHE=131072
make fpga_emu
AOCL Profiler
cd memoryBFS
aocl profile -output-dir /path/to/memoryBFS/aocl/ ./build/bfs.fpga (GraphName) (Partition) (RootNode)
Optimisations
Acknowledgements
Citing
Please cite our work via this reference:
Kaan Olgu, Tobias Kenter, Jose Nunez-Yanez, and Simon McIntosh-Smith. 2024. Optimisation and Evaluation of Breadth First Search with oneAPI/SYCL on Intel FPGAs: from Describing Algorithms to Describing Architectures. In International Workshop on OpenCL and SYCL (IWOCL ’24), April 8–11, 2024, Chicago, IL, USA. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3648115.3648134