
llm.rs

Migration of Karpathy's llm.c project into Rust

The development steps taken to migrate llm.c into Rust

train_gpt2.c from Karpathy's llm.c project was translated to Rust using c2rust.

The c2rust transpilation succeeded, but it turned every for loop into a while loop.

Using GPT-4, we converted all the while loops back into for loops.
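To illustrate the while-to-for step, here is a minimal sketch. The function names and the element-wise add are illustrative assumptions, not code taken from this repository, and real c2rust output also relies on raw pointers and unsafe blocks, which are left out here for readability.

```rust
// Shape of loop a C-to-Rust transpiler tends to emit:
// an explicit counter that is tested and incremented by hand.
// Assumes inp1 and inp2 are at least as long as out.
fn residual_forward_while(out: &mut [f32], inp1: &[f32], inp2: &[f32]) {
    let n = out.len();
    let mut i = 0;
    while i < n {
        out[i] = inp1[i] + inp2[i];
        i += 1;
    }
}

// The same computation with the counter folded back into a for loop.
fn residual_forward_for(out: &mut [f32], inp1: &[f32], inp2: &[f32]) {
    for i in 0..out.len() {
        out[i] = inp1[i] + inp2[i];
    }
}

fn main() {
    let a = [1.0_f32, 2.0, 3.0];
    let b = [0.5_f32, 0.5, 0.5];
    let mut from_while = [0.0_f32; 3];
    let mut from_for = [0.0_f32; 3];
    residual_forward_while(&mut from_while, &a, &b);
    residual_forward_for(&mut from_for, &a, &b);
    // Both loop styles must produce identical results.
    assert_eq!(from_while, from_for);
    println!("{:?}", from_for);
}
```

The two functions behave identically; the for version simply removes the manual counter bookkeeping, which makes the later move to iterators easier.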

Furthermore, using Mate, we converted some of these for loops into iterator chains using the Rayon library.
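The for-to-iterator step might look roughly like the sketch below. The function names and the row-scaling operation are hypothetical, chosen only to show the pattern, and the sketch uses standard-library iterators so that it compiles without extra dependencies. With Rayon added as a dependency and `use rayon::prelude::*` in scope, the sequential `chunks_mut` and `iter` calls would be replaced by `par_chunks_mut` and `par_iter` so that rows are processed in parallel.

```rust
// Index-based loop over the rows of a flat (rows x cols) buffer.
// Assumes data.len() is a multiple of cols and scale has one entry per row.
fn scale_rows_for(data: &mut [f32], cols: usize, scale: &[f32]) {
    let rows = data.len() / cols;
    for r in 0..rows {
        for c in 0..cols {
            data[r * cols + c] *= scale[r];
        }
    }
}

// The same computation as an iterator chain: each chunk is one row,
// paired with its scale factor. Rows are independent of each other,
// which is what makes this shape a candidate for Rayon.
fn scale_rows_iter(data: &mut [f32], cols: usize, scale: &[f32]) {
    data.chunks_mut(cols)
        .zip(scale.iter())
        .for_each(|(row, &s)| {
            for x in row.iter_mut() {
                *x *= s;
            }
        });
}

fn main() {
    let mut a = vec![1.0_f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    let mut b = a.clone();
    let scale = [2.0_f32, 10.0];
    scale_rows_for(&mut a, 3, &scale);
    scale_rows_iter(&mut b, 3, &scale);
    // Both versions must agree element for element.
    assert_eq!(a, b);
    println!("{:?}", b);
}
```

Splitting the buffer into non-overlapping mutable chunks lets the borrow checker verify that no two rows alias, so the parallel variant needs no locks or unsafe code.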

Currently, the project is undergoing manual updates to find performance improvements.

The Rust implementation is still slower than the C version, as the following benchmarks show:

| CPU | C | Rust | C++ | Mojo |
|---|---|---|---|---|
| Intel Core i7-9700 8-core | 2.447s | 2.530s | | |
| Intel Xeon E5-2690 v3 12-core | 2.110s | 2.439s | 1.037s | 6.190s |

LLM Training Results

Install the Python dependencies, output the tokenized dataset, and load the weights:

make setup

Run the training script:

make train

This runs cargo build --release in the llm-rs Cargo project, then copies the resulting binary into the main project folder.