
Llama.cpp Build Parameters

Llama.cpp was built from git hash: a40f2b656fab364ce0aff98dbefe9bd9c3721cc9

It was built with the following commands:

mkdir build
cd build/
cmake .. -DLLAMA_CUBLAS=ON -DLLAMA_CUDA_DMMV_X=64 -DLLAMA_CUDA_MMV_Y=2 -DLLAMA_CUDA_F16=true -DBUILD_SHARED_LIBS=ON
cd ..
cmake --build build --config Release -j --verbose

The resulting .so or .lib file was then copied into the Libraries directory, and all the .h files were copied into the Includes directory. On Windows, you should also put build/bin/llama.dll into the Binaries/Win64 directory.
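
The copy step can be sketched as a small shell helper. This is only an illustration under assumptions: the function name install_llama_artifacts and its two arguments (the llama.cpp checkout and the plugin root) are hypothetical, the library is assumed to be named libllama.so or llama.lib somewhere under build/, and "all the .h files" is read as the headers at the top level of the checkout. Adjust the paths if your build places files elsewhere.

```shell
# Hypothetical helper: copy llama.cpp build outputs into the plugin layout.
# $1 = llama.cpp checkout, $2 = plugin root (both are assumptions).
install_llama_artifacts() {
    src="$1"
    plugin="$2"
    mkdir -p "$plugin/Libraries" "$plugin/Includes" "$plugin/Binaries/Win64"

    # The shared library (Linux) or import library (Windows), wherever
    # the build placed it under build/.
    find "$src/build" -type f \( -name 'libllama.so' -o -name 'llama.lib' \) \
        -exec cp {} "$plugin/Libraries/" \;

    # Headers from the top level of the checkout (assumed location).
    find "$src" -maxdepth 1 -type f -name '*.h' \
        -exec cp {} "$plugin/Includes/" \;

    # Windows only: the runtime DLL goes into Binaries/Win64.
    if [ -f "$src/build/bin/llama.dll" ]; then
        cp "$src/build/bin/llama.dll" "$plugin/Binaries/Win64/"
    fi
}
```

It would be called as, for example, install_llama_artifacts ./llama.cpp ./UELlama, where both paths are placeholders for your own directories.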

You will need to have CUDA 12.2 installed, or you will get an error when loading the "UELlama" module. This is because llama.dll was compiled against that CUDA version. If you want to use a different version, you will need to recompile the binary.
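
Before loading the plugin, it may help to confirm which CUDA toolkit is installed. The following is a rough sketch, assuming nvcc is on the PATH and prints its usual "release X.Y" line; the function names cuda_release and check_cuda are hypothetical. Note that it only checks the toolkit that nvcc belongs to, which is not necessarily the same as the CUDA runtime libraries the module finds at load time.

```shell
# CUDA version the bundled binary was compiled against, per this README.
REQUIRED_CUDA="12.2"

# Extract the "release X.Y" number from `nvcc --version` output on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Compare the installed toolkit version with the required one.
check_cuda() {
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not found on PATH; is the CUDA toolkit installed?"
        return 1
    fi
    found=$(nvcc --version | cuda_release)
    if [ "$found" = "$REQUIRED_CUDA" ]; then
        echo "CUDA $found matches"
    else
        echo "CUDA $found found, expected $REQUIRED_CUDA"
        return 1
    fi
}
```

Running check_cuda returns a non-zero status when the versions differ, so it can be used as a guard in a setup script.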