OpenVINO Japanese chatbot demo (youri-7b-chat)

Description

This is a Japanese LLM chatbot demo program built on the Intel OpenVINO toolkit. The demo uses the rinna/youri-7b-chat model developed by Rinna Co.,Ltd. The program can run inference on any one of 'CPU', 'GPU.0' (integrated GPU), or 'GPU.1' (Intel discrete GPU).
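Before choosing a device it can help to check which ones OpenVINO actually sees on the machine. The following is a minimal sketch, not code from this repository: `list_devices` and `pick_device` are hypothetical helper names, and it assumes a recent OpenVINO release where `import openvino as ov` exposes `ov.Core` (older releases use `from openvino.runtime import Core`). It also assumes that a machine with a single GPU reports it as 'GPU', which OpenVINO treats as an alias of 'GPU.0'.

```python
def list_devices():
    """Return the device names OpenVINO reports on this machine."""
    # Imported lazily so pick_device() can be used without OpenVINO installed.
    import openvino as ov
    return ov.Core().available_devices


def pick_device(requested, available):
    """Return `requested` if it is available, otherwise fall back to 'CPU'.

    Assumption: 'GPU' and 'GPU.0' name the same device.
    """
    def canonical(name):
        return "GPU.0" if name == "GPU" else name

    names = {canonical(d) for d in available}
    return requested if canonical(requested) in names else "CPU"
```

For example, `pick_device("GPU.1", list_devices())` selects the discrete GPU when it is present and falls back to the CPU otherwise.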

Requirement

Install prerequisites

Python must be installed. Using a Python virtual environment is recommended.

# (Optional) Create a Python virtual environment and activate it.
python -m venv venv
# Windows
venv\Scripts\activate
# Linux
source venv/bin/activate

pip install -r requirements0-uninstall.txt
pip install -r requirements1-install.txt
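After the install step, a quick import check can confirm that the environment is usable. This is only a sketch: the module list below is an assumption (the authoritative list is whatever requirements1-install.txt contains), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Assumed module names; adjust to match requirements1-install.txt.
EXPECTED_MODULES = ["openvino", "optimum", "transformers"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing modules:", missing_modules(EXPECTED_MODULES) or "none")
```

If any names are listed, the virtual environment is probably not active or the install step did not complete.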

Files

| File | Description |
|---|---|
| config.yml | The system settings file. It specifies the Hugging Face model cache directory and the device used for inference. |
| youri-7b-chat-openvino.py | WebUI chatbot demo using the rinna/youri-7b-chat model, with OpenVINO as the inference engine.<br>Run model_download.py first to download the model and convert it into OpenVINO IR. |
| model_download.py | Downloads the rinna/youri-7b-chat model from Hugging Face and converts it into an FP16 OpenVINO IR model.<br>The converted model is stored in the ./youri-7b-chat/FP16/ directory. The original model downloaded from Hugging Face is stored in the ./cache/huggingface/hub/ directory. |
| benchmark_pyt.py | Simple benchmark program using PyTorch. |
| benchmark_ov.py | Simple benchmark program using OpenVINO. |
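The download-and-convert step performed by model_download.py can be sketched roughly as follows. This is not the script's actual code: it assumes the conversion goes through Optimum Intel (`OVModelForCausalLM` with `export=True`), and the function names `ir_output_dir` and `download_and_convert` are hypothetical. Only the model ID and the two directory paths come from the table above.

```python
from pathlib import Path

MODEL_ID = "rinna/youri-7b-chat"


def ir_output_dir(model_id=MODEL_ID, precision="FP16", root="."):
    """Build the IR output directory, e.g. youri-7b-chat/FP16 for the defaults."""
    return Path(root) / model_id.split("/")[-1] / precision


def download_and_convert(model_id=MODEL_ID, cache_dir="./cache/huggingface/hub"):
    """Download the model, export it to OpenVINO IR, and save it with FP16 weights."""
    # Heavy imports stay local so ir_output_dir() works without these packages.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    out_dir = ir_output_dir(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id, cache_dir=cache_dir)
    # export=True converts the PyTorch weights to IR; compile=False skips
    # compiling for a device, which is not needed just to save the model.
    model = OVModelForCausalLM.from_pretrained(
        model_id, export=True, compile=False, cache_dir=cache_dir
    )
    model.half()  # convert weights to FP16 before saving
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir
```

Calling `download_and_convert()` downloads several gigabytes and needs enough RAM to hold a 7B-parameter model, so it is best run once before starting the demo.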

Demo screenshot

example

Test environment

Special thanks

Rinna Co.,Ltd - Original developer of the rinna/youri-7b-chat model.