
Deploying AutoRAG with Kotaemon Tutorial

In this tutorial, we’ll show you how to deploy AutoRAG with Kotaemon to create a working chat UI. By the end, you’ll be able to use a RAG pipeline optimized by AutoRAG through a chat interface.

Watch the Result Video

Tutorial Outline

Optimize RAG using AutoRAG.
Run the API server from the optimized RAG.
Deploy the AutoRAG x Kotaemon web app on fly.io.
Connect and use the API server in the web app.

Prerequisites

Git installed on your system
Homebrew (for macOS users)
fly.io account
Completion of optimization using AutoRAG

Step 1: Optimizing RAG with AutoRAG

First, use AutoRAG to find an optimized RAG pipeline. Check out this tutorial for instructions on optimizing with AutoRAG.

Step 2: Running the AutoRAG API Server

To run the AutoRAG API server locally, use the following command:

autorag run_api --trial_dir /trial/dir/0 --host 0.0.0.0 --port 8000

The trial directory is a subdirectory that AutoRAG creates inside your project directory after optimization. Its name is typically a number (the 0 in the command above). Pass the path of the trial you want to use as the backend for the chat interface.
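If you are not sure which trial directories exist, a small helper like the one below can list them. This is a minimal sketch, assuming that trial folders are the subdirectories whose names consist only of digits; the function name and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# List trial directories (subfolders whose names are purely numeric)
# inside an AutoRAG project directory, sorted numerically.
# Assumption: the project directory path is supplied by the caller.
list_trial_dirs() {
  local project_dir="$1"
  local entry name
  for entry in "$project_dir"/*/; do
    [ -d "$entry" ] || continue        # skip when the glob matches nothing
    name="$(basename "$entry")"
    case "$name" in
      ''|*[!0-9]*) ;;                  # not purely numeric: ignore
      *) printf '%s\n' "$name" ;;
    esac
  done | sort -n
}

# Example (hypothetical path):
# list_trial_dirs ./my_project
```

The name it prints is the last path component to pass to --trial_dir.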

AutoRAG uses ngrok to make the API server publicly accessible. When the server starts, the public URL appears in the logs:

INFO     [api.py:199] >> Public API URL: https://8a31-14-52-132-205.ngrok-free.app

NGrok URL

Note down the URL displayed in the terminal. You will enter it in Kotaemon’s settings later.
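If the log scrolls by too quickly, you could save the server output to a file and pull the URL out of it. The sketch below assumes the URL always has the https://<subdomain>.ngrok-free.app form seen in the log line; the function name and the server.log file name are hypothetical.

```shell
#!/usr/bin/env bash
# Print the first ngrok URL found on standard input.
# Assumption: the URL looks like https://<subdomain>.ngrok-free.app.
extract_ngrok_url() {
  grep -oE 'https://[A-Za-z0-9-]+\.ngrok-free\.app' | head -n 1
}

# Example (hypothetical log file name):
# autorag run_api --trial_dir /trial/dir/0 --host 0.0.0.0 --port 8000 2>&1 | tee server.log
# extract_ngrok_url < server.log
```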

Step 3: Deploying AutoRAG-Kotaemon

First, clone the AutoRAG Kotaemon repository:

git clone https://github.com/vkehfdl1/AutoRAG-web-kotaemon.git
cd AutoRAG-web-kotaemon

Then set up Fly.io:

Install the Fly.io CLI tool:

brew install flyctl

The command above is for macOS users with Homebrew. For other operating systems, refer to the Fly.io installation instructions.

Authenticate with Fly.io:

fly auth login

Deploy on Fly.io:

fly launch

Fly.io deployment

Set up the deployment as shown above. You can set Region, Name, etc., as desired.

Note: The initial deployment may take around 10-15 minutes.

A minimum of 1 GB of memory is also recommended for smooth operation.

Once the deployment finishes, the CLI prints the Fly URL. If you don’t see it there, you can find it in the Fly.io dashboard. Opening the URL shows Kotaemon’s initial setup screen.
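The two checks mentioned above (finding the app's address and meeting the 1 GB memory recommendation) can also be done from the CLI. The wrappers below are a minimal sketch: the function names are hypothetical, and they assume flyctl is installed, you are logged in, and you run them from the directory containing the fly.toml that fly launch created.

```shell
#!/usr/bin/env bash
# Post-deployment helpers. Run them from the directory that contains fly.toml.
# Assumption: flyctl is installed and you are logged in.

# Show app details; the output includes the app's hostname.
show_app() {
  fly status
}

# Set each machine's memory to 1024 MB (the 1 GB minimum recommended above).
set_min_memory() {
  fly scale memory 1024
}

# Example:
# show_app
# set_min_memory
```

Changing memory may change what Fly.io bills you, so check their pricing before scaling up.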

Step 5: Configuring Kotaemon

Kotaemon initial setup

On first launch, you’ll see the initial setup screen shown above. Here, you can enter your OpenAI API key or Cohere API key, or proceed without one by pressing the red button.

Without an API key, the “Automatic Conversation Title” feature is unavailable. If you are working with private data, skip the API key and proceed to the next step by pressing the red button.

Kotaemon login

Next, you’ll see the login screen. For the first run, enter admin as both the ID and the password.

After logging in, open the Settings tab at the top left and go to the Reasoning settings tab.

API Endpoint setup

In the AutoRAG API Endpoint URL field, enter the API server URL you noted down earlier. Make sure it ends with .app, with no trailing / after it.
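To avoid a typo in the endpoint, you could clean up the URL before pasting it. This is a minimal sketch with a hypothetical function name: it strips trailing slashes and rejects anything that does not end with .app. The commented-out reachability check assumes the AutoRAG API server answers GET /version, so verify that path against the AutoRAG API documentation before relying on it.

```shell
#!/usr/bin/env bash
# Strip trailing slashes and check that the URL ends with .app.
# Prints the cleaned URL, or returns non-zero if it does not look right.
normalize_endpoint() {
  local url="$1"
  while [ "${url%/}" != "$url" ]; do
    url="${url%/}"                     # drop one trailing slash per pass
  done
  case "$url" in
    *.app) printf '%s\n' "$url" ;;
    *) echo "unexpected endpoint URL: $url" >&2; return 1 ;;
  esac
}

# Example with the URL from the log in Step 2:
# normalize_endpoint 'https://8a31-14-52-132-205.ngrok-free.app/'

# Optional reachability check (assumed endpoint path, see the note above):
# curl -fsS "$(normalize_endpoint 'https://8a31-14-52-132-205.ngrok-free.app/')/version"
```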

Finally, press the Save Changes button!

Step 6: Try It Out

Now you can use the optimized RAG pipeline with Kotaemon as shown below!

Kotaemon chat interface

Stopping Deployment on Fly.io

Since Fly.io is a paid service, it’s best to stop the deployment when it is not in use.

To stop an application on Fly.io, use:

fly scale count 0
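Scaling to zero stops the app's machines without deleting the app, so it can be brought back later by scaling up again. The wrappers below are a minimal sketch with hypothetical function names; they assume you run them from the directory containing fly.toml and that a single machine is enough for your use.

```shell
#!/usr/bin/env bash
# Stop all machines for the app (same command as above).
stop_app() {
  fly scale count 0
}

# Bring one machine back when you want to use the chat UI again.
start_app() {
  fly scale count 1
}

# Example:
# stop_app
# start_app
```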