Tiny Semantic Caching

Description:

Semantic caching is an in-memory database that supports semantic search (vector search). It can be used in many different applications, such as RAG (Retrieval-Augmented Generation), database assistants, and more. Designing high-performance applications that use LLMs requires handling issues like time complexity and avoiding repeated calls. Semantic caching saves time and computational resources when designing such applications. Tiny Semantic Caching is a project that uses Ollama and vector search in DuckDB to build a complete semantic caching cycle.
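The cycle described above can be sketched in a few lines of plain Python. This is a toy illustration, not the project's implementation: the character-frequency embedding stands in for a real model (e.g. nomic-embed-text served by Ollama), and the in-memory list stands in for the DuckDB-backed vector store.

```python
import math
from typing import Optional

def embed(text: str) -> list:
    # Toy embedding: normalized character-frequency vector (stand-in for a real model).
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in alphabet]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: list, b: list) -> float:
    # Vectors are already unit-normalized, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    def __init__(self, threshold: float = 0.95):
        self.threshold = threshold
        self.entries = []  # list of (vector, prompt, answer)

    def search(self, text: str) -> Optional[str]:
        """Return a cached answer if a semantically similar prompt exists."""
        vec = embed(text)
        best = max(self.entries, key=lambda e: cosine(vec, e[0]), default=None)
        if best and cosine(vec, best[0]) >= self.threshold:
            return best[2]          # cache hit: skip the expensive LLM call
        return None                 # cache miss: caller queries the LLM, then insert()s

    def insert(self, text: str, answer: str) -> None:
        self.entries.append((embed(text), text, answer))

    def refresh(self) -> None:
        self.entries.clear()

cache = SemanticCache()
cache.insert("what is semantic caching", "cached answer")
print(cache.search("what is semantic caching?"))  # prints "cached answer"
```

A near-duplicate prompt lands above the similarity threshold and is served from the cache, so the LLM is never called twice for the same question.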

Prerequisites:

  1. Python and Poetry
  2. Ollama (to pull and serve the embedding model)
  3. Docker and Docker Compose (optional, for the containerized setup)

Project Setup:

poetry install
ollama pull nomic-embed-text

make sure to update the model name and embedding size in the .env file if you use a different embedding model.
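For example, nomic-embed-text produces 768-dimensional vectors, so the defaults might look like the snippet below. The variable names here are assumptions; match them to whatever the project's .env actually defines.

```
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_SIZE=768
```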

## use this directly
poetry run uvicorn main:app --reload

## or use this to activate the environment first
poetry shell
## then run the API
uvicorn main:app --reload

use the following URL to test the functionalities: http://localhost:8000/docs

## build the images
docker-compose build
## run the docker-compose file
docker-compose up -d
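
A compose file along these lines is what the commands above expect. The service names, ports, and build context shown here are illustrative assumptions, not the project's actual scripts/docker-compose.yml:

```yaml
services:
  ollama:
    image: ollama/ollama      # serves the embedding model on its default port
    ports:
      - "11434:11434"
  api:
    build: .                  # the FastAPI app from this repository
    ports:
      - "8000:8000"
    depends_on:
      - ollama
```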

How it Works

There are four different functionalities:

  1. vectorize (GET): converts the passed text to a vector using the embedding model.

  2. insertion (POST): inserts text and its embedding into the caching database.

  3. search (POST): searches for similar or identical text based on the passed text. The text is vectorized, then searched for in the caching database, and finally inserted.

  4. refresh (DELETE): clears all records from the database.
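A small client sketch for the endpoints listed above, using only the standard library. The paths match the functionality names, but the payload field names are assumptions; confirm the real schemas at http://localhost:8000/docs.

```python
import json
from typing import Optional
from urllib import request

BASE = "http://localhost:8000"  # the server started with `uvicorn main:app`

def build_request(method: str, path: str, payload: Optional[dict] = None) -> request.Request:
    # Build a JSON request for the caching API without sending it.
    data = json.dumps(payload).encode() if payload is not None else None
    return request.Request(BASE + path, data=data, method=method,
                           headers={"Content-Type": "application/json"})

def call(method: str, path: str, payload: Optional[dict] = None):
    # Send the request and decode the JSON response (requires the server to be running).
    with request.urlopen(build_request(method, path, payload)) as resp:
        return json.loads(resp.read())

# Example flow (assumed payload shape, server must be up):
#   call("POST", "/insertion", {"text": "What is semantic caching?"})
#   call("POST", "/search", {"text": "what's semantic caching"})
#   call("DELETE", "/refresh")
```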

Usage:

## go to the scripts directory
cd scripts
## run the docker compose file
docker-compose up -d