<h1 align="center">Perplexity-Inspired LLM Answer Engine</h1> <div> <div align="center"> <a href="https://twitter.com/dev__digest"> <img src="https://img.shields.io/badge/X/Twitter-000000?style=for-the-badge&logo=x&logoColor=white" /> </a> <a href="https://www.youtube.com/@developersdigest"> <img src="https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white" /> </a> </div> <div align="center"> <a href="https://trendshift.io/repositories/8642" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8642" alt="developersdigest%2Fllm-answer-engine | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> </div> </div> <div align="center"> <img src="https://media0.giphy.com/media/v1.Y2lkPTc5MGI3NjExcjVodHcyZWd0MDJtd2RiN2xqbGdtOTdrYzZiMnhlMmZidDRzYm15dSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/PXkHjFlbgty03C6TAL/giphy.gif"><br> </div>

This repository contains the code and instructions needed to build a sophisticated answer engine that leverages the capabilities of Groq, Mistral AI's Mixtral, Langchain.JS, Brave Search, Serper API, and OpenAI. Designed to efficiently return sources, answers, images, videos, and follow-up questions based on user queries, this project is an ideal starting point for developers interested in natural language processing and search technologies.

YouTube Tutorials

<div style="display: flex; justify-content: center; align-items: center;"> <a href="https://youtu.be/43ZCeBTcsS8"> <img src="https://img.youtube.com/vi/43ZCeBTcsS8/0.jpg" alt="Tutorial 2" style="width: 24%; height: auto;"> </a> <a href="https://youtu.be/kFC-OWw7G8k"> <img src="https://img.youtube.com/vi/kFC-OWw7G8k/0.jpg" alt="Tutorial 1" style="width: 24%; height: auto;"> </a> <a href="https://youtu.be/kV2U7ttqE-g"> <img src="https://img.youtube.com/vi/kV2U7ttqE-g/0.jpg" alt="Tutorial 3" style="width: 24%; height: auto;"> </a> <a href="https://youtu.be/3_aNVu6EU3Y"> <img src="https://img.youtube.com/vi/3_aNVu6EU3Y/0.jpg" alt="Tutorial 4" style="width: 24%; height: auto;"> </a> </div>

Technologies Used

Getting Started

Prerequisites

Prerequisites for Non-Docker Installation

Prerequisites for Docker Installation

Obtaining API Keys

Quick Clone and Deploy

Simple, easy, fast, and free: deploy to Vercel.

Make sure to fill in all the API keys required for the installation.

Deploy with Vercel

Installation

  1. Clone the repository:
    git clone https://github.com/developersdigest/llm-answer-engine.git
    
  2. Move into the directory:
    cd llm-answer-engine
    

Docker Installation

  1. Edit the docker-compose.yml file and add your API keys (see the sketch at the end of this section)

  2. Running the Server

To start the server, execute:

docker compose up -d # for Docker Compose v2

or

docker-compose up -d # for Docker Compose v1

The server will be listening on the specified port.
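For reference, here is a minimal sketch of what the environment section of docker-compose.yml might look like. The key names follow the .env example in the non-Docker instructions below; the service name and the rest of the file's layout are assumptions, so match them to the actual docker-compose.yml in the repository.

```yaml
# Illustrative sketch only; the real docker-compose.yml may name its
# service differently and include other settings.
services:
  app:
    environment:
      - OPENAI_API_KEY=your_openai_api_key
      - GROQ_API_KEY=your_groq_api_key
      - BRAVE_SEARCH_API_KEY=your_brave_search_api_key
      - SERPER_API=your_serper_api_key
```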

Non-Docker Installation

  1. Install the required dependencies:
    npm install
    
    or
    bun install
    
  2. Create a .env file in the root of your project and add your API keys:
    OPENAI_API_KEY=your_openai_api_key
    GROQ_API_KEY=your_groq_api_key
    BRAVE_SEARCH_API_KEY=your_brave_search_api_key
    SERPER_API=your_serper_api_key
    
  3. Running the Server

To start the server, execute:

npm run dev

or

bun run dev

The server will be listening on the specified port.

Editing the Configuration

The configuration file is app/config.tsx. There you can modify values such as the model selection, the Ollama toggles, and the RAG tuning parameters referenced in the Ollama section below.
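As a rough sketch of the shape of that file, the snippet below shows the kinds of values involved. Only useOllamaInference and useOllamaEmbeddings are named elsewhere in this README; the remaining field names are illustrative, so check app/config.tsx for the actual keys.

```tsx
// app/config.tsx -- illustrative sketch, not the actual file contents.
export const config = {
  // Ollama toggles (referenced in the Ollama section below).
  useOllamaInference: false,  // use a local Ollama model for text generation
  useOllamaEmbeddings: false, // use a local Ollama model for embeddings

  // Hypothetical model selection and RAG tuning values. Lowering the RAG
  // values reduces time-to-first-token when running models locally.
  inferenceModel: 'mixtral-8x7b-32768',
  textChunkSize: 1000,
  numberOfSimilarityResults: 4,
};
```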

Function Calling Support (Beta)

Function calling is currently supported for a limited set of capabilities.

Ollama Support (Partially supported)

Currently, streaming text responses are supported for Ollama, but follow-up questions are not yet supported.

Embeddings are supported; however, time-to-first-token can be quite long when using both a local embedding model and a local model for streaming inference. I recommend decreasing some of the RAG values specified in the app/config.tsx file to reduce the time-to-first-token when using Ollama.

To get started, make sure Ollama is running with your chosen model on your local machine, set that model in the config, and set useOllamaInference and/or useOllamaEmbeddings to true.

Note: When 'useOllamaInference' is set to true, the model will be used for text generation, but the follow-up questions inference step will be skipped when using Ollama.

More info: https://ollama.com/blog/openai-compatibility
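For context, the snippet below is a minimal sketch of what Ollama's OpenAI-compatible endpoint looks like from TypeScript. It is not code from this repository, and the model name is simply whichever model you have pulled locally.

```ts
import OpenAI from 'openai';

// Ollama serves an OpenAI-compatible API on localhost:11434.
const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required by the client library, ignored by Ollama
});

// Stream a chat completion from a locally pulled model (e.g. `ollama pull mistral`).
const stream = await client.chat.completions.create({
  model: 'mistral',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```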

Roadmap

Example

Backend + Node Only Express API

Watch the Express tutorial here for a detailed guide on setting up and running this project. In addition to the Next.JS version of the project, there is a backend-only version that uses Node.js and Express, located in the 'express-api' directory. It is a standalone version of the project that can be used as a reference for building a similar API, and the readme file in that directory explains how to run it.

Upstash Redis Rate Limiting

Watch the Upstash Redis Rate Limiting tutorial here for a detailed guide on setting up and running this project. Upstash Redis is a serverless Redis service with a free tier that you can use to add rate limiting to your application. It provides a simple interface for configuring and managing rate limits: you can set limits on the number of requests per user, IP address, or other criteria, which helps prevent abuse and keeps your application from being overwhelmed with requests.
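As a minimal sketch of the idea using the @upstash/ratelimit package: the 10-requests-per-10-seconds limit and the use of the client IP as the identifier are illustrative choices, not necessarily what this project configures.

```ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

// Sliding window: at most 10 requests per 10 seconds per identifier.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(), // reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN
  limiter: Ratelimit.slidingWindow(10, '10 s'),
});

export async function checkRateLimit(ip: string): Promise<boolean> {
  const { success } = await ratelimit.limit(ip);
  return success; // false means the caller should respond with HTTP 429
}
```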

Contributing

Contributions to the project are welcome. Feel free to fork the repository, make your changes, and submit a pull request. You can also open issues to suggest improvements or report bugs.

License

This project is licensed under the MIT License.

Star History Chart

I'm the developer behind Developers Digest. If you find my work helpful or enjoy what I do, consider supporting me.