ID-based RAG FastAPI
Overview
This project integrates LangChain with FastAPI in an asynchronous, scalable manner, providing a framework for document indexing and retrieval using PostgreSQL/pgvector.
Files are organized into embeddings by file_id. The primary use case is integration with LibreChat, but this simple API can be used for any ID-based use case.
The main reason to use the ID approach is to work with embeddings on a file-level. This makes for targeted queries when combined with file metadata stored in a database, such as is done by LibreChat.
The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.
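To make the file-level idea concrete, here is a minimal in-memory sketch of an ID-scoped query. It does not use this project's code, LangChain, or pgvector; the `Chunk` class, the `query_file` helper, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    file_id: str            # ID of the file this chunk was cut from
    text: str
    embedding: list[float]  # toy 2-D vectors; real embeddings are much longer


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query_file(store: list[Chunk], file_id: str, query_vec: list[float], k: int = 2) -> list[Chunk]:
    """Return the k chunks of a single file that are most similar to the query."""
    # Restrict the search to one file first, then rank only those chunks.
    candidates = [c for c in store if c.file_id == file_id]
    return sorted(candidates, key=lambda c: cosine(c.embedding, query_vec), reverse=True)[:k]


store = [
    Chunk("file-a", "pgvector setup", [1.0, 0.0]),
    Chunk("file-a", "docker compose", [0.6, 0.8]),
    Chunk("file-b", "unrelated notes", [1.0, 0.0]),
]
top = query_file(store, "file-a", [1.0, 0.0], k=1)
print([c.text for c in top])
```

Even though the chunk in file-b matches the query vector exactly, it is never considered, because the ID filter runs before the similarity ranking.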
Features
- Document Management: Methods for adding, retrieving, and deleting documents.
- Vector Store: Uses LangChain's vector store for efficient document retrieval.
- Asynchronous Support: Offers async operations for enhanced performance.
Setup
Getting Started
- Configure the .env file based on the section below
- Set up the pgvector database:
  - Run an existing PSQL/PGVector setup, or,
  - Docker: docker compose up (also starts RAG API)
    - or, use docker just for DB: docker compose -f ./db-compose.yaml up
- Run API:
  - Docker: docker compose up (also starts PSQL/pgvector)
    - or, use docker just for RAG API: docker compose -f ./api-compose.yaml up
  - Local:
    - Make sure to set DB_HOST to the correct database hostname
    - Run the following commands (preferably in a virtual environment):
pip install -r requirements.txt
uvicorn main:app
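When running the API locally, a wrong DB_HOST is an easy mistake to make. The following is a small stand-alone sketch, not part of this project, that checks whether a TCP connection to the database can be opened before the API is started. The `db_reachable` helper and the fallback values `localhost` and `5432` are assumptions for a local setup.

```python
import os
import socket


def db_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False


if __name__ == "__main__":
    # Fallback values are assumptions for a local setup, not project defaults.
    host = os.environ.get("DB_HOST", "localhost")
    port = int(os.environ.get("DB_PORT", "5432"))
    print(f"{host}:{port} reachable: {db_reachable(host, port)}")
```

This only confirms that something is listening on that host and port; it does not verify credentials or that the pgvector extension is installed.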
Environment Variables
The application is configured through the following environment variables. Those marked (Optional) can be omitted:
- RAG_OPENAI_API_KEY: The API key for OpenAI API Embeddings (if using default settings).
  - Note: OPENAI_API_KEY will work, but RAG_OPENAI_API_KEY will override it in order to not conflict with the LibreChat setting.
- RAG_OPENAI_BASEURL: (Optional) The base URL for your OpenAI API Embeddings.
- RAG_OPENAI_PROXY: (Optional) Proxy for OpenAI API Embeddings.
- VECTOR_DB_TYPE: (Optional) Select the vector database type. Defaults to pgvector.
- POSTGRES_DB: (Optional) The name of the PostgreSQL database, used when VECTOR_DB_TYPE=pgvector.
- POSTGRES_USER: (Optional) The username for connecting to the PostgreSQL database.
- POSTGRES_PASSWORD: (Optional) The password for connecting to the PostgreSQL database.
- DB_HOST: (Optional) The hostname or IP address of the PostgreSQL database server.
- DB_PORT: (Optional) The port number of the PostgreSQL database server.
- RAG_HOST: (Optional) The hostname or IP address where the API server will run. Defaults to "0.0.0.0".
- RAG_PORT: (Optional) The port number where the API server will run. Defaults to port 8000.
- JWT_SECRET: (Optional) The secret key used for verifying JWT tokens for requests.
  - The secret is only used for verification. This basic approach assumes a signed JWT from elsewhere.
  - Omit to run the API without requiring authentication.
- COLLECTION_NAME: (Optional) The name of the collection in the vector store. Default value is "testcollection".
- CHUNK_SIZE: (Optional) The size of the chunks for text processing. Default value is "1500".
- CHUNK_OVERLAP: (Optional) The overlap between chunks during text processing. Default value is "100".
- RAG_UPLOAD_DIR: (Optional) The directory where uploaded files are stored. Default value is "./uploads/".
- PDF_EXTRACT_IMAGES: (Optional) A boolean value indicating whether to extract images from PDF files. Default value is "False".
- DEBUG_RAG_API: (Optional) Set to "True" to show more verbose logging output in the server console, and to enable the PostgreSQL database routes.
- CONSOLE_JSON: (Optional) Set to "True" to log as JSON for Cloud Logging aggregations.
- EMBEDDINGS_PROVIDER: (Optional) Either "openai", "bedrock", "azure", "huggingface", "huggingfacetei" or "ollama", where "huggingface" uses sentence_transformers. Defaults to "openai".
- EMBEDDINGS_MODEL: (Optional) Set a valid embeddings model to use from the configured provider.
  - Defaults:
    - openai: "text-embedding-3-small"
    - azure: "text-embedding-3-small" (will be used as your Azure Deployment)
    - huggingface: "sentence-transformers/all-MiniLM-L6-v2"
    - huggingfacetei: "http://huggingfacetei:3000". Hugging Face TEI uses the model defined on TEI service launch.
    - ollama: "nomic-embed-text"
    - bedrock: "amazon.titan-embed-text-v1"
- RAG_AZURE_OPENAI_API_VERSION: (Optional) The version of the Azure OpenAI API. Default is 2023-05-15.
- RAG_AZURE_OPENAI_API_KEY: (Optional) The API key for the Azure OpenAI service.
  - Note: AZURE_OPENAI_API_KEY will work, but RAG_AZURE_OPENAI_API_KEY will override it in order to not conflict with the LibreChat setting.
- RAG_AZURE_OPENAI_ENDPOINT: (Optional) The endpoint URL for the Azure OpenAI service, including the resource.
  - Example: https://YOUR_RESOURCE_NAME.openai.azure.com
  - Note: AZURE_OPENAI_ENDPOINT will work, but RAG_AZURE_OPENAI_ENDPOINT will override it in order to not conflict with the LibreChat setting.
- HF_TOKEN: (Optional) If needed for the huggingface option.
- OLLAMA_BASE_URL: (Optional) Defaults to http://ollama:11434.
- ATLAS_SEARCH_INDEX: (Optional) The name of the vector search index if using Atlas MongoDB. Defaults to vector_index.
- MONGO_VECTOR_COLLECTION: Deprecated for MongoDB, please use ATLAS_SEARCH_INDEX and COLLECTION_NAME.
- AWS_DEFAULT_REGION: (Optional) Defaults to us-east-1.
- AWS_ACCESS_KEY_ID: (Optional) Needed for bedrock embeddings.
- AWS_SECRET_ACCESS_KEY: (Optional) Needed for bedrock embeddings.
Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.
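As a rough illustration of how the defaults and overrides listed above fit together, the sketch below reads a subset of the variables from the environment. It is not the project's actual configuration code: the `load_settings` and `_as_bool` helpers are hypothetical, the accepted truthy strings are an assumption, and the "False" default for DEBUG_RAG_API is assumed because the list does not state one.

```python
import os
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumption: "true", "1" and "yes" (any case) count as enabled.
    return value.strip().lower() in {"1", "true", "yes"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read a subset of the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        # RAG_OPENAI_API_KEY wins over OPENAI_API_KEY when both are set.
        "openai_api_key": env.get("RAG_OPENAI_API_KEY") or env.get("OPENAI_API_KEY"),
        "vector_db_type": env.get("VECTOR_DB_TYPE", "pgvector"),
        "rag_host": env.get("RAG_HOST", "0.0.0.0"),
        "rag_port": int(env.get("RAG_PORT", "8000")),
        "collection_name": env.get("COLLECTION_NAME", "testcollection"),
        "chunk_size": int(env.get("CHUNK_SIZE", "1500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "100")),
        "upload_dir": env.get("RAG_UPLOAD_DIR", "./uploads/"),
        "pdf_extract_images": _as_bool(env.get("PDF_EXTRACT_IMAGES", "False")),
        # Assumed default; the variable list does not state one.
        "debug": _as_bool(env.get("DEBUG_RAG_API", "False")),
        "embeddings_provider": env.get("EMBEDDINGS_PROVIDER", "openai"),
        # No default: leaving JWT_SECRET unset means no authentication is required.
        "jwt_secret": env.get("JWT_SECRET"),
    }


settings = load_settings({"OPENAI_API_KEY": "a", "RAG_OPENAI_API_KEY": "b"})
print(settings["openai_api_key"], settings["chunk_size"])
```

Passing a plain dictionary instead of `os.environ` makes the precedence rules easy to check without touching the real environment.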
Use Atlas MongoDB as Vector Database
Instead of the default pgvector, you can use Atlas MongoDB as the vector database. To do so, set the following environment variables:
VECTOR_DB_TYPE=atlas-mongo
ATLAS_MONGO_DB_URI=<mongodb+srv://...>
COLLECTION_NAME=<vector collection>
ATLAS_SEARCH_INDEX=<vector search index>
The ATLAS_MONGO_DB_URI can be the same as or different from the one used by LibreChat. Even if it is the same, the $COLLECTION_NAME collection must be a completely new one, separate from all collections used by LibreChat. In addition, create a vector search index on that collection (remember to name it $ATLAS_SEARCH_INDEX) with the following JSON:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "file_id",
      "type": "filter"
    }
  ]
}
Follow one of the four documented methods to create the vector index.
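The numDimensions value in the index definition has to match the output size of the embedding model in use, so it changes if EMBEDDINGS_MODEL does. The sketch below generates the same JSON for a given dimension. The `vector_index_definition` helper is hypothetical, and the sizes in `KNOWN_DIMENSIONS` are assumptions to verify against each provider's documentation.

```python
import json

# Output sizes for two of the default models. These numbers are assumptions
# to verify against each provider's documentation before creating the index.
KNOWN_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
}


def vector_index_definition(num_dimensions: int) -> dict:
    """Build the vector search index definition for a given embedding size."""
    return {
        "fields": [
            {
                "numDimensions": num_dimensions,
                "path": "embedding",
                "similarity": "cosine",
                "type": "vector",
            },
            # Filter field that allows queries to be scoped to a single file.
            {"path": "file_id", "type": "filter"},
        ]
    }


definition = vector_index_definition(KNOWN_DIMENSIONS["text-embedding-3-small"])
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into whichever index creation method you follow.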
Cloud Installation Settings:
AWS:
Make sure your RDS Postgres instance adheres to this requirement:
The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.
To set up RDS Postgres with RAG API, you can follow these steps:
- Create an RDS Instance/Cluster using the provided AWS Documentation.
- Log in to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.
- The login is via the Master User.
- Create a dedicated database for rag_api:
create database rag_api;
- Create a dedicated user\role for that database:
create role rag;
- Switch to the database you just created:
\c rag_api
- Enable the Vector extension:
create extension vector;
- Use the documentation provided above to set up the connection string to the RDS Postgres Instance\Cluster.
Notes:
- Even though you log in as the Master user, that user does not have full superuser privileges, which is why the following command cannot be used:
create role x with superuser;
- If you do not enable the extension yourself, the rag_api service will throw an error saying it cannot create the extension, because of the privilege limitation noted above.
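For the connection-string step in the list above, the following sketch assembles a PostgreSQL URL from the POSTGRES_* and DB_* variables. The `postgres_url` helper, the `postgresql://` scheme, the fallback port 5432, and the example hostname are assumptions for illustration; the API may build its connection string differently.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def postgres_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Assemble a PostgreSQL connection URL from environment-style settings."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters such as "@" or "/" stay valid in a URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "5432")  # assumed fallback: the usual PostgreSQL port
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


example = postgres_url({
    "POSTGRES_USER": "rag",
    "POSTGRES_PASSWORD": "p@ss/word",
    "DB_HOST": "mydb.example.us-east-1.rds.amazonaws.com",  # hypothetical RDS endpoint
    "POSTGRES_DB": "rag_api",
})
print(example)
```

Encoding the password matters with generated RDS credentials, which often contain characters that would otherwise break URL parsing.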
Dev notes:
Installing pre-commit formatter
Run the following commands to install the pre-commit formatter, which uses the black code formatter:
pip install pre-commit
pre-commit install