BLOOM parallel test

DIRTY solution

Install the transformers branch thomas/dirty_bloom_tp:

pip install -e git+https://github.com/huggingface/transformers.git@thomas/dirty_bloom_tp#egg=transformers

Alternatively, for the custom kernels:

git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout thomas/add_custom_kernels
python setup.py build_ext --inplace # Might have to edit `setup.py` to remove the torch import
pip install -e .
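
To verify which transformers checkout Python is actually importing after the editable install (a generic check, nothing branch-specific):

python -c "import transformers; print(transformers.__file__)"

The printed path should point into the cloned repository rather than into site-packages.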

RUN

This requires redis to be installed on the machine. Redis is the easiest way to broadcast requests over pub/sub to all of the worker processes without causing too many issues for NCCL or for the webserver's threading/circuit-breaking model.
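
For illustration, here is a minimal sketch of that pub/sub hand-off using redis-py; the channel name and payload shape are hypothetical, not the actual protocol used by server.py and generate.py:

import json

import redis

r = redis.Redis(host="localhost", port=6379)

def webserver_side(prompt):
    # HTTP server side: publish each request once; Redis fans it out to
    # every subscribed worker process. (Channel name is hypothetical.)
    r.publish("requests", json.dumps({"inputs": prompt, "max_new_tokens": 20}))

def worker_side():
    # Worker side: every torch.distributed rank subscribes to the same
    # channel, so all ranks receive the same request and can enter the
    # NCCL collectives together.
    sub = r.pubsub()
    sub.subscribe("requests")
    for message in sub.listen():
        if message["type"] == "message":  # skip the subscribe confirmation
            return json.loads(message["data"])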

Launch the sharded model workers (one process per GPU), then start the webserver:

python -m torch.distributed.run --nproc_per_node=8 generate.py --name bigscience/bloom --max-input-tokens=1000 --save-path=/data/models/
python server.py

USE

curl -X POST -d '{"inputs": "This is a test", "parameters": {"max_new_tokens": 20, "temperature": 0.4}}' http://localhost:8000/generate -H "content-type: application/json"
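
The same request from Python with the requests library, mirroring the curl call above:

import requests

resp = requests.post(
    "http://localhost:8000/generate",
    json={
        "inputs": "This is a test",
        "parameters": {"max_new_tokens": 20, "temperature": 0.4},
    },
)
print(resp.json())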