zig_gpt2

GPT-2 inference engine written in Zig. Generation time: ~28ms per token.

Features:

How to Run:

Download the GPT-2 checkpoint from OpenAI.

python3 download_weights.py
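A minimal sketch of what this download step amounts to, fetching the 124M checkpoint files from OpenAI's public blob storage. The repo's actual download_weights.py may differ in details; the models/124M output directory here is an assumption.

import os
import urllib.request

MODEL = "124M"
BASE = f"https://openaipublic.blob.core.windows.net/gpt-2/models/{MODEL}"
FILES = [
    "checkpoint", "encoder.json", "hparams.json",
    "model.ckpt.data-00000-of-00001", "model.ckpt.index",
    "model.ckpt.meta", "vocab.bpe",
]

# Fetch each checkpoint file into a local models/124M directory.
os.makedirs(f"models/{MODEL}", exist_ok=True)
for name in FILES:
    print(f"downloading {name} ...")
    urllib.request.urlretrieve(f"{BASE}/{name}", os.path.join("models", MODEL, name))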

Build the Zig binary and run it with a prompt to generate completions:

zig build -Doptimize=ReleaseFast
./zig-out/bin/zig_gpt2 "Marcus Aurelius said"
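For orientation, here is a rough Python sketch of the autoregressive loop an engine like this runs. The forward function is a hypothetical stand-in for the Zig transformer pass; in the real binary, each such pass is what accounts for the quoted ~28ms per token.

import numpy as np

VOCAB_SIZE = 50257  # GPT-2 BPE vocabulary size

def forward(token_ids):
    """Stand-in for the model: returns fake logits over the vocabulary."""
    rng = np.random.default_rng(len(token_ids))
    return rng.standard_normal(VOCAB_SIZE)

def generate(prompt_ids, n_tokens):
    ids = list(prompt_ids)
    for _ in range(n_tokens):
        logits = forward(ids)                # one full pass over the current context
        ids.append(int(np.argmax(logits)))   # greedy decoding: most likely next token
    return ids

print(generate([50256], n_tokens=5))  # 50256 = <|endoftext|>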

How to Test:

Generate test data by forwarding random tensors through PyTorch ops.

python3 generate_test_data.py
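A minimal sketch of how such test vectors could be produced, assuming inputs and outputs are dumped as raw float32 buffers. The op list, shapes, and file names below are illustrative guesses, not the repo's actual layout.

import os
import torch

torch.manual_seed(0)
os.makedirs("test_data", exist_ok=True)

x = torch.randn(4, 768)  # random input tensor (768 = GPT-2 124M hidden size)
ops = {
    "gelu": torch.nn.GELU(approximate="tanh"),  # GPT-2 uses the tanh approximation
    "layer_norm": torch.nn.LayerNorm(768),
    "softmax": torch.nn.Softmax(dim=-1),
}

for name, op in ops.items():
    y = op(x)
    # Raw float32 dumps are easy to read back from Zig for comparison.
    x.numpy().tofile(f"test_data/{name}_input.bin")
    y.detach().numpy().tofile(f"test_data/{name}_output.bin")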

Run the tests, which verify that the Zig ops produce the same output as PyTorch.

zig build test
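Conceptually, each test reads a dumped input, runs the corresponding Zig op, and checks the result against the PyTorch output within a small tolerance. Below is a Python rendering of that check for GELU, using the hypothetical file names from the sketch above and GPT-2's tanh-approximate GELU; the 1e-5 tolerance is an assumption, not the repo's actual value.

import numpy as np

x = np.fromfile("test_data/gelu_input.bin", dtype=np.float32)
expected = np.fromfile("test_data/gelu_output.bin", dtype=np.float32)

# Recompute GELU (tanh approximation, as used by GPT-2) and compare elementwise.
actual = 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

assert np.allclose(actual, expected, atol=1e-5), "outputs diverge beyond tolerance"
print("gelu ok")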

TODO

Implementation:

Efficiency: