# gpuocelot
NOTE: this fork is not actively maintained anymore.
## About
This is a fork of the gpuocelot code to make it work on a modern system, in combination with the CUDA library for Julia.
As far as the functionality and tests in CUDA.jl go, this is a drop-in replacement for the official CUDA driver.
Short overview of changes compared to upstream:
- Fix compilation on a modern system (`yyparse` hack)
- Support for Clang
- Add/extend/fix certain API calls (`cuLaunchKernel`, `cuCtxDestroy`, `cuLaunch`, `cuDeviceGetName`)
- Small bugfixes
Note that this fork happened before gpuocelot was available on GitHub, so there might be some differences between the two code bases.
## Requirements
- SCons
- CUDA toolkit <= 5.0 (you only need the toolkit, not the actual driver)
- Appropriate GCC for the CUDA toolkit you picked
- LLVM 3.5
- Boost
## Compilation
Make sure you do a recursive check-out, so the hydrazine submodule is checked out too.
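For example, a minimal sketch (replace `<repository-url>` with the remote you actually use):

```sh
# Clone with submodules so hydrazine is checked out as well
git clone --recursive <repository-url> gpuocelot
# Or, if you already cloned without --recursive:
git submodule update --init --recursive
```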
Example compilation:
```sh
cd $(CHECKOUT_DIR)
CUDA_BIN_PATH=/opt/cuda-5.0/bin CUDA_LIB_PATH=/opt/cuda-5.0/lib CUDA_INC_PATH=/opt/cuda-5.0/include \
CC=clang CXX=clang++ python2 build.py \
    --install -p $(PREFIX) -j$(JOBS)
```
Note: if your main `gcc` binary is not compatible with your CUDA toolkit version (for example, CUDA 5.0 requires gcc <= 4.6), you will need to edit `ocelot/scripts/build_environment.py` and change the two `nvcc` invocations to include `-ccbin=gcc-4.6` (or something similar).
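As a rough sketch, a change along these lines could be applied with `sed`; the exact pattern depends on how the `nvcc` invocations appear in the script, so inspect them first:

```sh
# Illustrative only: prepend -ccbin=gcc-4.6 to the nvcc calls in the build script.
# Verify the resulting invocations by hand before building.
sed -i 's/nvcc /nvcc -ccbin=gcc-4.6 /g' ocelot/scripts/build_environment.py
```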
Note: due to restrictions of the build system, make sure you only use absolute paths when passing information through environment variables, and always build with `--install`.
Note: LLVM needs to be at version 3.5. If your distribution provides some other version, install and compile LLVM from source and point the build system to that installation's `llvm-config` binary by setting the `LLVM_CONFIG` environment variable before invoking `build.py`.
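For instance, assuming a custom LLVM 3.5 build was installed under `/opt/llvm-3.5` (the prefix is just an example):

```sh
# Point the build system at the custom LLVM 3.5 installation
export LLVM_CONFIG=/opt/llvm-3.5/bin/llvm-config
python2 build.py --install -p $(PREFIX) -j$(JOBS)
```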
## Usage
Compile and install gpuocelot into a directory you can load libraries from.
Next, either rename `libocelot.so` to `libcuda.so`, or use something that knows about gpuocelot (like CUDA.jl does). After that, you can use the available symbols just as if it were the official NVIDIA implementation.
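For example, a minimal sketch assuming gpuocelot was installed under `$(PREFIX)` (adjust the paths to your setup); a symlink avoids overwriting any real `libcuda.so`:

```sh
# Expose gpuocelot under the name the CUDA driver API expects
ln -s $(PREFIX)/lib/libocelot.so $(PREFIX)/lib/libcuda.so
# Make sure applications pick up this library first
export LD_LIBRARY_PATH=$(PREFIX)/lib:$LD_LIBRARY_PATH
```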