<div align="center"> <h1 align="center">RAG-GPT</h1> Quickly launch an intelligent customer service system with Flask, LLM, RAG, including frontend, backend, and admin console. <br/> <a href="https://langchain-bot.open-sora.ai/" target="_blank"> Live Demo </a> <br/> <img style="display: block; margin: auto; width: 70%;" src="./doc/rag_overview.jpg"> </div>

Online Retrieval Architecture

<div align="center"> <img style="display: block; margin: auto; width: 100%;" src="./doc/online_retrieve.jpg"> </div>

Deploy the RAG-GPT Service

Step 1: Download repository code

Clone the repository:

git clone https://github.com/open-kf/rag-gpt.git && cd rag-gpt

Step 2: Configure variables of .env

Before starting the RAG-GPT service, you need to modify the relevant configurations so that the program can initialize correctly.

Using OpenAI as the LLM base

cp env_of_openai .env

The variables in .env

LLM_NAME="OpenAI"
OPENAI_API_KEY="xxxx"
GPT_MODEL_NAME="gpt-4o-mini"
MIN_RELEVANCE_SCORE=0.4
BOT_TOPIC="xxxx"
URL_PREFIX="http://127.0.0.1:7000/"
USE_PREPROCESS_QUERY=1
USE_RERANKING=1
USE_DEBUG=0
USE_LLAMA_PARSE=0
LLAMA_CLOUD_API_KEY="xxxx"
USE_GPT4O=0
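As a hedged sketch of how a service might consume these settings (the variable names come from the file above, but the loader below is illustrative, not RAG-GPT's actual code), the `0`/`1` flags and quoted strings can be parsed with a few lines of standard-library Python:

```python
def load_env(text: str) -> dict:
    """Parse simple KEY="value" lines from a .env file body."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

def flag(env: dict, name: str, default: str = "0") -> bool:
    """The .env file encodes booleans as "0"/"1"."""
    return env.get(name, default) == "1"

sample = '''
LLM_NAME="OpenAI"
GPT_MODEL_NAME="gpt-4o-mini"
MIN_RELEVANCE_SCORE=0.4
USE_RERANKING=1
USE_DEBUG=0
'''
env = load_env(sample)
print(env["LLM_NAME"], float(env["MIN_RELEVANCE_SCORE"]), flag(env, "USE_RERANKING"))
```

In practice a library such as python-dotenv does this job; the point here is only that `MIN_RELEVANCE_SCORE` is a float threshold while `USE_*` variables are boolean switches.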

Using ZhipuAI as the LLM base

If you cannot use OpenAI's API services, consider using ZhipuAI as an alternative.

cp env_of_zhipuai .env

The variables in .env

LLM_NAME="ZhipuAI"
ZHIPUAI_API_KEY="xxxx"
GLM_MODEL_NAME="glm-4-air"
MIN_RELEVANCE_SCORE=0.4
BOT_TOPIC="xxxx"
URL_PREFIX="http://127.0.0.1:7000/"
USE_PREPROCESS_QUERY=1
USE_RERANKING=1
USE_DEBUG=0
USE_LLAMA_PARSE=0
LLAMA_CLOUD_API_KEY="xxxx"

Using DeepSeek as the LLM base

If you cannot use OpenAI's API services, consider using DeepSeek as an alternative.

[!NOTE] DeepSeek does not provide an Embedding API, so here we use ZhipuAI's Embedding API.

cp env_of_deepseek .env

The variables in .env

LLM_NAME="DeepSeek"
ZHIPUAI_API_KEY="xxxx"
DEEPSEEK_API_KEY="xxxx"
DEEPSEEK_MODEL_NAME="deepseek-chat"
MIN_RELEVANCE_SCORE=0.4
BOT_TOPIC="xxxx"
URL_PREFIX="http://127.0.0.1:7000/"
USE_PREPROCESS_QUERY=1
USE_RERANKING=1
USE_DEBUG=0
USE_LLAMA_PARSE=0
LLAMA_CLOUD_API_KEY="xxxx"

Using Moonshot as the LLM base

If you cannot use OpenAI's API services, consider using Moonshot as an alternative.

[!NOTE] Moonshot does not provide an Embedding API, so here we use ZhipuAI's Embedding API.

cp env_of_moonshot .env

The variables in .env

LLM_NAME="Moonshot"
ZHIPUAI_API_KEY="xxxx"
MOONSHOT_API_KEY="xxxx"
MOONSHOT_MODEL_NAME="moonshot-v1-8k"
MIN_RELEVANCE_SCORE=0.4
BOT_TOPIC="xxxx"
URL_PREFIX="http://127.0.0.1:7000/"
USE_PREPROCESS_QUERY=1
USE_RERANKING=1
USE_DEBUG=0
USE_LLAMA_PARSE=0
LLAMA_CLOUD_API_KEY="xxxx"

Using local LLMs

If your knowledge base involves sensitive information and you prefer not to use cloud-based LLMs, consider using Ollama to deploy large models locally.

[!NOTE] First, refer to the Ollama documentation to install Ollama, then download the embedding model mxbai-embed-large and an LLM such as llama3 (e.g. `ollama pull mxbai-embed-large` and `ollama pull llama3`).

cp env_of_ollama .env

The variables in .env

LLM_NAME="Ollama"
OLLAMA_MODEL_NAME="xxxx"
OLLAMA_BASE_URL="http://127.0.0.1:11434"
MIN_RELEVANCE_SCORE=0.4
BOT_TOPIC="xxxx"
URL_PREFIX="http://127.0.0.1:7000/"
USE_PREPROCESS_QUERY=1
USE_RERANKING=1
USE_DEBUG=0
USE_LLAMA_PARSE=0
LLAMA_CLOUD_API_KEY="xxxx"

Step 3: Deploy RAG-GPT

Deploy RAG-GPT using Docker

[!NOTE] When deploying with Docker, pay special attention to the host portion of URL_PREFIX in the .env file. If using Ollama, also check the host of OLLAMA_BASE_URL. Both must use the actual IP address of the host machine, not 127.0.0.1, since containers cannot reach the host through its loopback address.

docker-compose up --build
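The host-address fix from the note above can be applied mechanically. This is a minimal sketch (the helper name `set_env_host` and the example IP are illustrative) that replaces 127.0.0.1 with the host machine's IP in the two URL variables that containers must reach:

```python
import re

def set_env_host(env_text: str, host_ip: str) -> str:
    """Replace 127.0.0.1 with the host machine's IP in URL_PREFIX
    and OLLAMA_BASE_URL, the variables containers must reach."""
    def fix(match):
        key, url = match.group(1), match.group(2)
        return f'{key}={url.replace("127.0.0.1", host_ip)}'
    pattern = r'^(URL_PREFIX|OLLAMA_BASE_URL)=(.*)$'
    return re.sub(pattern, fix, env_text, flags=re.MULTILINE)

env = 'URL_PREFIX="http://127.0.0.1:7000/"\nOLLAMA_BASE_URL="http://127.0.0.1:11434"'
print(set_env_host(env, "192.168.1.50"))
```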

Deploy RAG-GPT from source code

[!NOTE] Please use Python version 3.10.x or above.
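A quick way to confirm your interpreter meets this requirement before installing anything (a small convenience check, not part of RAG-GPT itself):

```python
import sys

def check_python(required=(3, 10)) -> bool:
    """Return True when the running interpreter meets the requirement."""
    return sys.version_info[:2] >= required

if not check_python():
    print("Warning: RAG-GPT requires Python 3.10 or above; "
          f"found {sys.version.split()[0]}")
```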

Set up the Python running environment

It is recommended to install Python-related dependencies in a Python virtual environment to avoid affecting dependencies of other projects.

Create and activate a virtual environment

If you have not yet created a virtual environment, you can create one with the following command:

python3 -m venv myenv

After creation, activate the virtual environment:

source myenv/bin/activate
Install dependencies with pip

Once the virtual environment is activated, you can use pip to install the required dependencies.

pip install -r requirements.txt
Create SQLite Database

The RAG-GPT service uses SQLite as its storage database. Before starting the RAG-GPT service, you need to execute the following command to initialize the database and add the default configuration for the admin console.

python3 create_sqlite_db.py
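Conceptually, an init script like this creates the tables and seeds the default admin account. The sketch below is illustrative only: the table and column names are assumptions, not RAG-GPT's actual schema, though the default credentials match those documented for the admin console:

```python
import sqlite3

def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a minimal admin-account table and seed the default
    login. A real script should store a password hash, not plaintext."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS admin_account (
            username TEXT PRIMARY KEY,
            password TEXT NOT NULL
        )
    """)
    conn.execute(
        "INSERT OR IGNORE INTO admin_account VALUES (?, ?)",
        ("admin", "open_kf_AIGC@2024"),
    )
    conn.commit()
    return conn

conn = init_db()
row = conn.execute("SELECT username FROM admin_account").fetchone()
print(row)
```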
Start the service

If you have completed the steps above, you can start the RAG-GPT service with either of the following commands:

python3 rag_gpt_app.py

or

sh start.sh

Configure the admin console

Log in to the admin console

Access the admin console at http://your-server-ip:7000/open-kf-admin/ to reach the login page. The default username and password are admin and open_kf_AIGC@2024 (defined in create_sqlite_db.py).

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-2.jpg"> </div>

After logging in successfully, you will be able to see the configuration page of the admin console.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-3.jpg"> </div>

On the page http://your-server-ip:7000/open-kf-admin/#/, you can set the following configurations:

Import your data

Import websites

After you submit a website URL, the server crawls it to retrieve the list of all page URLs. You can then select the page URLs you need for the knowledge base (all are selected by default). The initial Status is Recorded.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-12.jpg"> </div>

Refresh the page http://your-server-ip:7000/open-kf-admin/#/source in your browser to track the progress of URL processing. After a page has been crawled and its embeddings computed and stored, the corresponding Size appears in the admin console and the Status is updated to Trained.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-13.jpg"> </div>

Clicking a webpage's URL reveals how many sub-pages it was divided into and the text size of each sub-page.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-10.jpg"> </div>

Clicking a sub-page shows its full text content, which is very helpful for verifying retrieval quality during testing.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-11.jpg"> </div>

Import isolated URLs

Collect the URLs of the pages you need. You can submit up to 10 URLs at a time, and they can come from different domains.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-14.jpg"> </div>
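If you have more than 10 URLs to import, they have to be split into batches. A small sketch of that client-side chunking (the helper name `batched` and the example URLs are illustrative; the limit of 10 comes from the admin console):

```python
from itertools import islice

MAX_URLS_PER_BATCH = 10  # limit enforced by the admin console

def batched(urls, size=MAX_URLS_PER_BATCH):
    """Yield successive batches of at most `size` URLs."""
    it = iter(urls)
    while chunk := list(islice(it, size)):
        yield chunk

urls = [f"https://example.com/page{i}" for i in range(23)]
batches = list(batched(urls))
print([len(b) for b in batches])
```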

Import local files

Upload the required local files. You can upload up to 10 files at a time, and each file must not exceed 30MB. The following file types are currently supported: [".txt", ".md", ".pdf", ".epub", ".mobi", ".html", ".docx", ".pptx", ".xlsx", ".csv"].

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-15.jpg"> </div>
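The upload constraints above (at most 10 files, 30MB each, a fixed extension whitelist) can be checked before uploading. This is an illustrative pre-flight sketch, not RAG-GPT's own validation code:

```python
from pathlib import Path

ALLOWED_EXTS = {".txt", ".md", ".pdf", ".epub", ".mobi",
                ".html", ".docx", ".pptx", ".xlsx", ".csv"}
MAX_FILE_BYTES = 30 * 1024 * 1024  # 30MB per file
MAX_FILES = 10                     # per upload batch

def validate_upload(paths):
    """Return (ok, reason) for a prospective batch of local files."""
    if len(paths) > MAX_FILES:
        return False, f"at most {MAX_FILES} files per upload"
    for p in map(Path, paths):
        if p.suffix.lower() not in ALLOWED_EXTS:
            return False, f"unsupported file type: {p.name}"
        if p.exists() and p.stat().st_size > MAX_FILE_BYTES:
            return False, f"file too large: {p.name}"
    return True, "ok"

print(validate_upload(["notes.md", "report.pdf"]))
print(validate_upload(["image.png"]))
```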

Test the chatbot

After importing website data in the admin console, you can try the chatbot service at http://your-server-ip:7000/open-kf-chatbot/.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-6.jpg"> </div>
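Before opening the chatbot in a browser, you can verify that the service is reachable at all. A minimal sketch, assuming only that the chatbot page from the link above returns HTTP 200 when the service is up (replace the address with your actual server):

```python
import urllib.request
import urllib.error

def service_up(url: str, timeout: float = 3.0) -> bool:
    """Return True when the page responds with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

# Replace with your actual server address before running.
print(service_up("http://127.0.0.1:7000/open-kf-chatbot/"))
```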

Embed on your website

Through the admin console link http://your-server-ip:7000/open-kf-admin/#/embed, you can find a detailed tutorial for embedding the chatbot into your website via an iframe.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-7.jpg"> </div> <div align="center"> <br/> <a href="https://docs.openim.io/" target="_blank"> OpenIM chatbot </a> <br/> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-8.jpg"> </div>

Dashboard of users' historical requests

Through the admin console link http://your-server-ip:7000/open-kf-admin/#/dashboard, you can view the historical request records of all users within a specified time range.

<div align="center"> <img style="display: block; margin: auto; width: 70%;" src="./doc/screenshot-9.jpg"> </div>

The frontend of admin console and chatbot

The RAG-GPT service integrates two frontend modules; their source code information is as follows:

admin console

Code Repository

An intuitive web-based admin interface for the Smart QA Service, offering comprehensive control over content, configuration, and user interactions. It enables effortless management of the knowledge base, real-time monitoring of queries and feedback, and continuous improvement based on user insights.

chatbot

Code Repository

An HTML5 interface for the Smart QA Service, designed for easy integration into websites via an iframe. It gives users direct access to a tailored knowledge base without leaving your site, supporting immediate query resolution.