<div align="center">

kotaemon

An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind.

<a href="https://trendshift.io/repositories/11607" target="_blank"><img src="https://trendshift.io/api/badge/repositories/11607" alt="Cinnamon%2Fkotaemon | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

Live Demo | Online Install | User Guide | Developer Guide | Feedback | Contact

Python 3.10+ Code style: black <a href="https://github.com/Cinnamon/kotaemon/pkgs/container/kotaemon" target="_blank"> <img src="https://img.shields.io/badge/docker_pull-kotaemon:latest-brightgreen" alt="docker pull ghcr.io/cinnamon/kotaemon:latest"></a> download <a href='https://huggingface.co/spaces/cin-model/kotaemon-demo'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a> <a href="https://hellogithub.com/en/repository/d3141471a0244d5798bc654982b263eb" target="_blank"><img src="https://abroad.hellogithub.com/v1/widgets/recommend.svg?rid=d3141471a0244d5798bc654982b263eb&claim_uid=RLiD9UZ1rEHNaMf&theme=small" alt="Featured|HelloGitHub" /></a>

</div> <!-- start-intro -->

Introduction

This project serves as a functional RAG UI for both end users who want to do QA on their documents and developers who want to build their own RAG pipeline. <br>

+----------------------------------------------------------------------------+
| End users: Those who use apps built with `kotaemon`.                       |
| (You use an app like the one in the demo above)                            |
|     +----------------------------------------------------------------+     |
|     | Developers: Those who build with `kotaemon`.                   |     |
|     | (You have `import kotaemon` somewhere in your project)         |     |
|     |     +----------------------------------------------------+     |     |
|     |     | Contributors: Those who make `kotaemon` better.    |     |     |
|     |     | (You make PRs to this repo)                        |     |     |
|     |     +----------------------------------------------------+     |     |
|     +----------------------------------------------------------------+     |
+----------------------------------------------------------------------------+

For end users

For developers

Key Features

Installation

If you are not a developer and just want to use the app, please check out our easy-to-follow User Guide. Download the .zip file from the latest release to get all the newest features and bug fixes.

System requirements

  1. Python >= 3.10
  2. Docker: optional, if you install with Docker
  3. Unstructured if you want to process files other than .pdf, .html, .mhtml, and .xlsx documents. Installation steps differ depending on your operating system. Please visit the link and follow the specific instructions provided there.
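
Before installing, the Python requirement above can be checked with a few lines of standard-library code. This is only a minimal sketch: the function name `meets_python_requirement` is an illustrative assumption and is not part of `kotaemon`.

```python
import sys

# Minimum version taken from the system requirements listed above.
MINIMUM_PYTHON = (3, 10)


def meets_python_requirement(version=None, minimum=MINIMUM_PYTHON):
    """Return True if `version` (default: the running interpreter) is new enough.

    Hypothetical helper for illustration; only major and minor are compared.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_python_requirement():
        raise SystemExit(
            f"Python {MINIMUM_PYTHON[0]}.{MINIMUM_PYTHON[1]}+ is required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    print("Python version OK")
```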

With Docker (recommended)

  1. We provide both lite and full versions of the Docker image. The full version also installs the extra unstructured packages, so it supports additional file types (.doc, .docx, ...) at the cost of a larger Docker image. The lite image should work well for most users.

    • To use the lite version.

      docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-lite
      
    • To use the full version.

      docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-full
      
  2. We currently support and test two platforms: linux/amd64 and linux/arm64 (for newer Macs). You can specify the platform by passing --platform in the docker run command. For example:

    # To run docker with platform linux/arm64
    docker run \
    -e GRADIO_SERVER_NAME=0.0.0.0 \
    -e GRADIO_SERVER_PORT=7860 \
    -p 7860:7860 -it --rm \
    --platform linux/arm64 \
    ghcr.io/cinnamon/kotaemon:main-lite
    
  3. Once everything is set up correctly, you can go to http://localhost:7860/ to access the WebUI.

  4. We use GHCR to store Docker images. All images can be found here.
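
The commands above differ only in the image tag and the optional --platform flag. As a rough sketch, a small script can assemble the same argument list. The helper name `docker_run_args` is an assumption made for illustration; the flags, environment variables, and image tags are the ones shown in the steps above.

```python
import shlex

IMAGE = "ghcr.io/cinnamon/kotaemon"


def docker_run_args(variant="lite", platform=None):
    """Build the `docker run` argument list shown in the steps above.

    variant:  "lite" or "full" (maps to the main-lite / main-full tags).
    platform: e.g. "linux/arm64"; left out of the command when None.
    Hypothetical helper for illustration only.
    """
    if variant not in ("lite", "full"):
        raise ValueError(f"unknown variant: {variant!r}")
    args = [
        "docker", "run",
        "-e", "GRADIO_SERVER_NAME=0.0.0.0",
        "-e", "GRADIO_SERVER_PORT=7860",
        "-p", "7860:7860", "-it", "--rm",
    ]
    if platform is not None:
        args += ["--platform", platform]
    args.append(f"{IMAGE}:main-{variant}")
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(docker_run_args("lite", platform="linux/arm64")))
```

To actually start the container, the returned list could be passed to `subprocess.run`, assuming Docker is installed and on the PATH.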

Without Docker

  1. Clone and install the required packages in a fresh Python environment.

    # optional (setup env)
    conda create -n kotaemon python=3.10
    conda activate kotaemon
    
    # clone this repo
    git clone https://github.com/Cinnamon/kotaemon
    cd kotaemon
    
    pip install -e "libs/kotaemon[all]"
    pip install -e "libs/ktem"
    
  2. Create a .env file in the root of this project, using .env.example as a template.

    The .env file serves use cases where users want to pre-configure the models before starting the app (e.g. deploying the app on HF hub). It is only used to populate the db once, on the first run, and is ignored in subsequent runs.

  3. (Optional) To enable the in-browser PDF_JS viewer, download PDF_JS_DIST and extract it to libs/ktem/ktem/assets/prebuilt.

<img src="https://raw.githubusercontent.com/Cinnamon/kotaemon/main/docs/images/pdf-viewer-setup.png" alt="pdf-setup" width="300">
  4. Start the web server:

    python app.py
    
    • The app will be automatically launched in your browser.
    • The default username and password are both admin. You can set up additional users directly through the UI.

  5. In the Resources tab, check LLMs and Embeddings and ensure that your api_key value is set correctly from your .env file. If it is not set, you can set it there.
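
Since the .env file is read only on the first run, it can help to confirm that the keys you care about are filled in before starting the app. The following is a minimal sketch, not a full dotenv parser. The function names and the key `EXAMPLE_API_KEY` are hypothetical; use the actual key names from .env.example.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Simplified on purpose: no multi-line values, no variable expansion.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes, if any.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env = parse_env(Path(".env").read_text(encoding="utf-8"))
    # EXAMPLE_API_KEY is a hypothetical name used for illustration.
    print(missing_keys(env, ["EXAMPLE_API_KEY"]))
```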

Setup GraphRAG

[!NOTE] Official MS GraphRAG indexing only works with the OpenAI or Ollama API. We recommend that most users use the NanoGraphRAG implementation for straightforward integration with Kotaemon.

<details> <summary>Setup Nano GRAPHRAG</summary> </details> <details> <summary>Setup LIGHTRAG</summary> </details> <details> <summary>Setup MS GRAPHRAG</summary> </details>

Setup Local Models (for local/private RAG)

See Local model setup.

Setup multimodal document parsing (OCR, table parsing, figure extraction)

These options are available:

Select corresponding loaders in Settings -> Retrieval Settings -> File loader

Customize your application

flowsettings.py

This file contains the configuration of your application. You can use the example here as the starting point.

<details> <summary>Notable settings</summary>
# setup your preferred document store (with full-text search capabilities)
KH_DOCSTORE=(Elasticsearch | LanceDB | SimpleFileDocumentStore)

# setup your preferred vectorstore (for vector-based search)
KH_VECTORSTORE=(ChromaDB | LanceDB | InMemory | Qdrant)

# Enable / disable multimodal QA
KH_REASONINGS_USE_MULTIMODAL=True

# Setup your new reasoning pipeline or modify existing one.
KH_REASONINGS = [
    "ktem.reasoning.simple.FullQAPipeline",
    "ktem.reasoning.simple.FullDecomposeQAPipeline",
    "ktem.reasoning.react.ReactAgentPipeline",
    "ktem.reasoning.rewoo.RewooAgentPipeline",
]
</details>
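
The entries in KH_REASONINGS are dotted paths of the form `package.module.ClassName`. A common way to turn such a string into a class is `importlib`. The sketch below shows that general technique and is not necessarily how ktem loads its pipelines; `resolve_dotted_path` is a hypothetical helper name.

```python
from importlib import import_module


def resolve_dotted_path(path):
    """Import `package.module.Name` and return the object called `Name`.

    Hypothetical helper illustrating the general technique.
    """
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Name', got {path!r}")
    return getattr(import_module(module_name), attr)


if __name__ == "__main__":
    # A standard-library path is used so the sketch runs without ktem installed.
    print(resolve_dotted_path("collections.OrderedDict"))
```

With ktem installed, the same call could be tried on one of the paths listed above, for example "ktem.reasoning.simple.FullQAPipeline".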

.env

This file provides another way to configure your models and credentials.

<details> <summary>Configure model via the .env file</summary> </details>

Adding your own RAG pipeline

Custom Reasoning Pipeline

  1. Check the default pipeline implementation here. You can make quick adjustments to how the default QA pipeline works.
  2. Add a new .py implementation in libs/ktem/ktem/reasoning/ and then include it in flowsettings to enable it in the UI.
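
Step 2 can be sketched as a flowsettings.py excerpt. The module `my_pipeline` and the class `MyPipeline` are hypothetical names chosen for illustration; the four default entries are the ones listed under "Notable settings".

```python
# flowsettings.py (excerpt)

# Default reasoning pipelines, as listed under "Notable settings".
KH_REASONINGS = [
    "ktem.reasoning.simple.FullQAPipeline",
    "ktem.reasoning.simple.FullDecomposeQAPipeline",
    "ktem.reasoning.react.ReactAgentPipeline",
    "ktem.reasoning.rewoo.RewooAgentPipeline",
]

# Hypothetical: assumes libs/ktem/ktem/reasoning/my_pipeline.py defines MyPipeline.
CUSTOM_REASONING = "ktem.reasoning.my_pipeline.MyPipeline"

# Guard against adding the same entry twice if this file is edited again later.
if CUSTOM_REASONING not in KH_REASONINGS:
    KH_REASONINGS.append(CUSTOM_REASONING)
```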

Custom Indexing Pipeline

(More instructions are a work in progress.)

<!-- end-intro -->

Citation

Please cite this project as:

@misc{kotaemon2024,
    title = {Kotaemon - An open-source RAG-based tool for chatting with any content.},
    author = {The Kotaemon Team},
    year = {2024},
    howpublished = {\url{https://github.com/Cinnamon/kotaemon}},
}

Star History

<a href="https://star-history.com/#Cinnamon/kotaemon&Date"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Cinnamon/kotaemon&type=Date&theme=dark" /> <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Cinnamon/kotaemon&type=Date" /> <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Cinnamon/kotaemon&type=Date" /> </picture> </a>

Contribution

Since our project is actively being developed, we greatly value your feedback and contributions. Please see our Contributing Guide to get started. Thank you to all our contributors!

<a href="https://github.com/Cinnamon/kotaemon/graphs/contributors"> <img src="https://contrib.rocks/image?repo=Cinnamon/kotaemon" /> </a>