Awesome
OpenVINO Japanese chatbot demo (youri-7b-chat)
Description
This is an LLM Japanese chatbot demo program using Intel OpenVINO toolkit. The demo uses rinna/youri-7b-chat
model developed by Rinna Co.,Ltd.
This program can use either one of 'CPU', 'GPU.0' (integrated GPU), and 'GPU.1' Intel discrete GPU for inferencing.
Requirement
- At least main memory of 32GB. 64GB is recommended.
Install prerequisites
Recommend to use Python virtual env.<br> You need to have Python installed.
# (optional) create a python venv and enable it.
python -m venv venv
(win) venv/Scripts/activate
(Lnx) source venv/bin/activate
pip install -r requirements0-uninstall.txt
pip install -r requirements1-install.txt
Files
file | description |
---|---|
config.yml | The system setting file. You can specify huggingface model cache directory and device for inferencing |
youri-7b-chat-openvino.py | WebUI chatbot demo using rinna/youri-7b-chat model. This program uses OpenVINO as inference engine.<br>You need to run model_download.py to download and convert the model into OpenVINO IR before you run this demo. |
model_download.py | This program downloads rinna/youri-7b-chat model from huggingface and convert it into FP16 OpenVINO IR model.<br>The converted model will be stored in ./youri-7b-chat/FP16/ directory. Also, the original model downloaded from hugging face will be stored in ./cache/huggingface/hub/ directory. |
benchmark_pyt.py | Simple benchmark program using PyTorch. |
benchmark_ov.py | Simple benchmark program using OpenVINO. |
Demo screenshot
Test environment
- System 1
- Software
- Windows 11 23H2
- OpenVINO 2023.1.0
- Hardware
- CPU: Core i7-10700K
- MEM: 64GB
- GPU: Intel A380 (discrete GPU is optional)
- Software
- System 2
- Software
- Ubuntu 22.04.3
- OpenVINO 2023.1.0
- Hardware
- CPU: Xeon Platinum 8358
- MEM: 256GB
- Software
Special thanks
Rinna Co.,Ltd - Original developer of rinna/youri-7b-chat model.