llama.cpp.zig

llama.cpp bindings and utilities for Zig. Currently targets Zig 0.11.x; there is a good chance nightly works as well (0.12.0-dev.1856+94c63f31f when I checked), using the same branch. Only a few places needed patching, where @hasDecl was enough to support both versions.

Example usage

Clone: git clone --recursive https://github.com/Deins/llama.cpp.zig.git

  1. Download a llama.cpp-supported model (usually in *.gguf format). For example this one.
  2. Build and run with:
zig build run-simple -Doptimize=ReleaseFast -- --model_path path_to/model.gguf --prompt "Hello! I am AI, and here are the 10 things I like to think about:"

See examples/simple.zig

CPP samples

A subset of the llama.cpp samples has been included in the build scripts. Use the -Dcpp_samples option to install them,
or run them directly, for example: zig build run-cpp-main -Dclblast -Doptimize=ReleaseFast -- -m path/to/model.gguf -p "hello my name is"

CLBlast acceleration

CLBlast is supported by building it from source with Zig. At the moment only the OpenCL backend has been tested. The CUDA backend is not finished, as I don't have NVIDIA hardware; pull requests are welcome.

Build:

Ideally, just zig build -Dclblast ... should work out of the box if you have GPUOpen/ocl installed. For other configurations you will need to find where the OpenCL headers and libraries are located and pass them in explicitly: zig build -Dclblast -Dopencl_includes="/my/path" -Dopencl_libs="/my/path/". Auto-detection might be improved in the future; let me know which OpenCL SDK you use.
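The two cases above can be sketched as follows (the SDK path is a placeholder for wherever your OpenCL SDK actually lives, not a real default):

```shell
# Case 1: OpenCL SDK is found automatically (e.g. GPUOpen/ocl installed)
zig build -Dclblast -Doptimize=ReleaseFast

# Case 2: point the build at a custom SDK location
# (the paths below are placeholders)
zig build -Dclblast -Doptimize=ReleaseFast \
    -Dopencl_includes="/path/to/opencl/include" \
    -Dopencl_libs="/path/to/opencl/lib"
```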

Selecting GPU

With the OpenCL backend the main_gpu parameter is ignored. Instead, set the platform and device IDs via the GGML_OPENCL_PLATFORM and GGML_OPENCL_DEVICE environment variables. A utility is available to print all detected OpenCL devices: zig build -Dclblast run-opencl_devices
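For example, selecting platform 0, device 1 might look like this (the IDs here are placeholders; print the real ones with the run-opencl_devices utility first):

```shell
# Pick an OpenCL platform/device by ID (values are placeholders;
# list the actual IDs with: zig build -Dclblast run-opencl_devices)
export GGML_OPENCL_PLATFORM=0
export GGML_OPENCL_DEVICE=1

# Then run any sample as usual, e.g.:
# zig build run-simple -Dclblast -Doptimize=ReleaseFast -- \
#     --model_path path_to/model.gguf --prompt "Hello!"
```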