Minimal GPT-NeoX-20B

This is a fairly minimal implementation of GPT-NeoX-20B in PyTorch. It is meant primarily as an educational/reference implementation rather than an optimized or full-featured one.

GPT-NeoX-20B is a 20B-parameter autoregressive Transformer model developed by EleutherAI with the support of CoreWeave, trained using the GPT-NeoX library.

Some notes about the model:

Setup

Installation

Install PyTorch for your CUDA version, then install the remaining dependencies from requirements.txt (essentially just tokenizers).

pip install -r requirements.txt
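As a quick sanity check after installation, the sketch below reports whether the two packages named above can be found by the Python import system. The helper name `missing_packages` is hypothetical and not part of this repository; it uses only the standard library, so it runs even when the install failed.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package `names` that cannot be imported."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["torch", "tokenizers"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("torch and tokenizers are importable.")
```

Once torch imports cleanly, `torch.cuda.is_available()` tells you whether a CUDA device is visible to PyTorch.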

Download weights

Following the NeoX guide, download the model weights and tokenizer JSON file with the following command:

wget --cut-dirs=5 -nH -r --no-parent --reject "index.html*" https://mystic.the-eye.eu/public/AI/models/GPT-NeoX-20B/slim_weights/ -P 20B_checkpoints

You can also manually download them from here. Because of the size of the model, the weights are split across multiple files, following the DeepSpeed save format.
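Since the download consists of many files, it can help to confirm that something actually landed on disk before loading the model. The following is a minimal sketch, assuming the files were saved under `20B_checkpoints` (the `-P` target of the wget command above). The helper `summarize_checkpoint` is hypothetical, and it does not know how many files or bytes to expect; it only reports what it finds.

```python
from pathlib import Path


def summarize_checkpoint(directory):
    """Return (file_count, total_bytes) for all regular files under `directory`."""
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    ckpt_dir = Path("20B_checkpoints")  # assumed download location
    if ckpt_dir.is_dir():
        count, total = summarize_checkpoint(ckpt_dir)
        print(f"{count} files, {total / 1e9:.1f} GB")
    else:
        print(f"{ckpt_dir} not found")
```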

Generate text

Here is some sample code to generate text. Note that since we are decoding greedily with no fancy tricks, generations tend to contain a fair amount of repetition.

import minimal20b
import torch

# Load the model weights from the downloaded checkpoint directory onto one GPU.
model = minimal20b.create_model(
    "/path/to/20B_checkpoints/global_step150000",
    use_cache=True,
    device="cuda:0",
)
# Load the tokenizer from the JSON file shipped with the weights.
tokenizer = minimal20b.create_tokenizer(
    "/path/to/20B_checkpoints/20B_tokenizer.json",
)
# Disable autograd bookkeeping, since we only run inference.
with torch.inference_mode():
    minimal20b.greedy_generate_text(
        model, tokenizer,
        "GPTNeoX20B is a 20B-parameter autoregressive Transformer model developed by EleutherAI.",
        max_seq_len=100,
    )
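To see why greedy decoding tends to repeat itself, here is a toy sketch in plain Python. It is not this repository's implementation: `greedy_decode`, `toy_scores`, and the `TOY_BIGRAMS` table are hypothetical stand-ins for the model, and the sketch assumes `max_seq_len` counts the prompt tokens as well. Because the top-scoring token is always chosen, the sequence falls into a loop as soon as it revisits a context it has seen before.

```python
def greedy_decode(next_token_scores, prompt, max_seq_len):
    """Extend `prompt` one token at a time, always taking the top-scoring token."""
    tokens = list(prompt)
    while len(tokens) < max_seq_len:
        scores = next_token_scores(tokens)
        tokens.append(max(scores, key=scores.get))  # argmax, no sampling
    return tokens


# Toy stand-in for a language model: scores depend only on the last token.
TOY_BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"on": 0.9, "down": 0.1},
    "on": {"the": 0.8, "a": 0.2},
}


def toy_scores(tokens):
    """Return the score table for the next token given the last token."""
    return TOY_BIGRAMS[tokens[-1]]


print(" ".join(greedy_decode(toy_scores, ["the"], max_seq_len=9)))
# prints: the cat sat on the cat sat on the
```

Sampling-based decoding would occasionally pick "dog" or "ran" and break the cycle; greedy decoding never does.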

Evaluation

To run evaluation with the LM-eval-harness, you will need to install some additional dependencies (mostly just the eval harness library):

pip install -r scripts/eval/requirements.txt

Most datasets are automatically downloaded via Hugging Face datasets, but if you are evaluating on lambada, you will need to separately download the data.

mkdir -p data/lambada
wget http://eaidata.bmk.sh/data/lambada_test.jsonl -O data/lambada/lambada_test.jsonl
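Before starting a long evaluation run, you may want to confirm that the downloaded file is valid JSON Lines. The sketch below counts the records and lists the field names it encounters. The helper `inspect_jsonl` is hypothetical, and it makes no assumption about which fields the LAMBADA file contains.

```python
import json
import os


def inspect_jsonl(path):
    """Return (record_count, sorted field names seen) for a JSON Lines file."""
    count, fields = 0, set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)  # raises ValueError on a malformed line
            count += 1
            if isinstance(record, dict):
                fields.update(record)
    return count, sorted(fields)


if __name__ == "__main__":
    path = "data/lambada/lambada_test.jsonl"
    if os.path.exists(path):
        print(inspect_jsonl(path))
    else:
        print(f"{path} not found")
```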

Then run the evaluation with the following command:

python scripts/eval/eval_harness.py \
    --model_path /path/to/20B_checkpoints/global_step150000 \
    --tokenizer_path /path/to/20B_checkpoints/20B_tokenizer.json \
    --tasks lambada,anli_r1,anli_r2,anli_r3,wsc,winogrande,hellaswag,piqa
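
| Task       | Metric          | NeoX Impl (2 GPU) | This Repo (1 GPU) |
|------------|-----------------|-------------------|-------------------|
| anli_r1    | acc             | 0.3270            | 0.3300            |
|            | acc_stderr      | 0.0148            | 0.0149            |
| anli_r2    | acc             | 0.3410            | 0.3420            |
|            | acc_stderr      | 0.0150            | 0.0150            |
| anli_r3    | acc             | 0.3567            | 0.3617            |
|            | acc_stderr      | 0.0138            | 0.0139            |
| hellaswag  | acc             | 0.5351            | 0.5335            |
|            | acc_stderr      | 0.0050            | 0.0050            |
|            | acc_norm        | 0.7140            | 0.7126            |
|            | acc_norm_stderr | 0.0045            | 0.0045            |
| lambada    | acc             | 0.7211            | 0.7223            |
|            | acc_stderr      | 0.0062            | 0.0062            |
|            | ppl             | 3.6760            | 3.6559            |
|            | ppl_stderr      | 0.0760            | 0.0757            |
| piqa       | acc             | 0.7748            | 0.7758            |
|            | acc_stderr      | 0.0097            | 0.0097            |
|            | acc_norm        | 0.7786            | 0.7856            |
|            | acc_norm_stderr | 0.0097            | 0.0096            |
| winogrande | acc             | 0.6598            | 0.6598            |
|            | acc_stderr      | 0.0133            | 0.0133            |
| wsc        | acc             | 0.5096            | 0.4808            |
|            | acc_stderr      | 0.0493            | 0.0492            |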
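One rough way to read the table is to ask whether the two implementations differ by more than the reported standard error. The sketch below copies the numbers from the table and checks each metric against the smaller of the two standard errors. The one-standard-error threshold and the helper `within_one_stderr` are assumptions made for illustration; this is an informal check, not a formal significance test.

```python
# (task, metric): (neox_value, neox_stderr, repo_value, repo_stderr), from the table.
RESULTS = {
    ("anli_r1", "acc"): (0.3270, 0.0148, 0.3300, 0.0149),
    ("anli_r2", "acc"): (0.3410, 0.0150, 0.3420, 0.0150),
    ("anli_r3", "acc"): (0.3567, 0.0138, 0.3617, 0.0139),
    ("hellaswag", "acc"): (0.5351, 0.0050, 0.5335, 0.0050),
    ("hellaswag", "acc_norm"): (0.7140, 0.0045, 0.7126, 0.0045),
    ("lambada", "acc"): (0.7211, 0.0062, 0.7223, 0.0062),
    ("lambada", "ppl"): (3.6760, 0.0760, 3.6559, 0.0757),
    ("piqa", "acc"): (0.7748, 0.0097, 0.7758, 0.0097),
    ("piqa", "acc_norm"): (0.7786, 0.0097, 0.7856, 0.0096),
    ("winogrande", "acc"): (0.6598, 0.0133, 0.6598, 0.0133),
    ("wsc", "acc"): (0.5096, 0.0493, 0.4808, 0.0492),
}


def within_one_stderr(neox, neox_se, repo, repo_se):
    """True if the two values differ by at most the smaller standard error."""
    return abs(neox - repo) <= min(neox_se, repo_se)


for (task, metric), row in RESULTS.items():
    status = "ok" if within_one_stderr(*row) else "DIFFERS"
    print(f"{task:<11} {metric:<9} {status}")
```

With the numbers in the table, every row passes this check, including wsc, where the gap is the largest in absolute terms but the standard error is also the largest.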