SearchAugmentedLLM

SearchAugmentedLLM empowers Large Language Models (LLMs) with relevant, up-to-date information from the web.
Given a user query, it performs a Google search, processes the top search results, chunks the content, ranks by relevance, and returns the most pertinent text to provide context for improved LLM responses. This tool is ideal for Retrieval Augmented Generation (RAG) applications.
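
The chunk, rank, and trim steps described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's own code: the function names are hypothetical, the search and download steps are left out, and simple word overlap stands in for the real relevance ranking.

```python
import re

def chunk_text(text, max_char=450):
    """Greedily pack whole sentences into chunks of up to max_char characters.

    A single sentence longer than max_char is kept as its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_char or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

def score(query, chunk):
    """Count the distinct query words that also appear in the chunk."""
    query_words = set(re.findall(r"\w+", query.lower()))
    chunk_words = set(re.findall(r"\w+", chunk.lower()))
    return len(query_words & chunk_words)

def build_context(query, pages, max_char=450, max_characters_output=2500):
    """Chunk every page, sort chunks by score, and trim the joined result."""
    chunks = [c for page in pages for c in chunk_text(page, max_char)]
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return "\n".join(ranked)[:max_characters_output]
```

The returned string is what would be placed in the LLM prompt as context; the `max_char` and `max_characters_output` names mirror the API parameters of the same name.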

Features

API

This project is designed to be used via a REST API. You can run it either on localhost or on an external host.

API Parameters

When making an HTTP request to the API, you can pass the following parameters (GET and POST are supported):

All parameters are optional except query

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| query | The search query | Yes | |
| time_out | Maximum time (seconds) for a request to each link | No | 5 |
| max_results | Maximum number of Google Search results | No | 5 |
| max_chunks | Maximum number of chunks to generate | No | 100 |
| do_rerank | Rerank results for better quality (requires Cohere API key) | No | true |
| max_seq | Maximum word length inside a chunk (longer sequences are removed) | No | 51 |
| min_char | Minimum number of characters per chunk | No | 300 |
| max_char | Maximum number of characters per chunk (must be > min_char + max_seq) | No | 450 |
| max_characters_output | Maximum number of characters in the output | No | 2500 |
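
As a rough client-side illustration of the parameters listed above, the following Python sketch builds a GET request URL. The base URL `http://localhost:8000/` and the helper name `build_request_url` are assumptions, not part of the project; adjust them to wherever the API is actually served. The response format is not documented here, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode

# Parameter names copied from the parameter list above.
ALLOWED_PARAMS = {
    "query", "time_out", "max_results", "max_chunks", "do_rerank",
    "max_seq", "min_char", "max_char", "max_characters_output",
}

def build_request_url(base_url, query, **options):
    """Return a GET URL for the API (hypothetical helper)."""
    unknown = set(options) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # Check the documented constraint, falling back to the documented defaults.
    min_char = options.get("min_char", 300)
    max_char = options.get("max_char", 450)
    max_seq = options.get("max_seq", 51)
    if max_char <= min_char + max_seq:
        raise ValueError("max_char must be greater than min_char + max_seq")
    # Note: how the server expects booleans such as do_rerank to be encoded
    # is not specified here, so pass them as strings if needed.
    params = {"query": query, **options}
    return f"{base_url}?{urlencode(params)}"

url = build_request_url("http://localhost:8000/", "hello world", max_results=3)
print(url)
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.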

Google CSE API Key

To search using Google, you need a Google CSE (Custom Search Engine) API key and a CX ID.

First, create a custom search engine in the Google CSE Panel.

Copy your CX ID, then go to this page on Google Developers and click Get a Key to obtain your API key. Rename the .env.example file to .env and put your CX ID and API key in the appropriate variables.
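
To check that the key and CX ID work, you can query Google's Custom Search JSON API directly. The sketch below only builds the request URL; the helper name `build_cse_url` is hypothetical, and this is not necessarily how the project itself issues the search.

```python
from urllib.parse import urlencode

# Public endpoint of the Google Custom Search JSON API.
GOOGLE_CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_cse_url(api_key, cx, query, max_results=5):
    """Build a search URL from the API key, CX ID, and query (hypothetical helper)."""
    params = {
        "key": api_key,      # API key obtained via "Get a Key"
        "cx": cx,            # CX ID of the custom search engine
        "q": query,          # search terms
        "num": max_results,  # results per request; the API accepts 1 to 10
    }
    return f"{GOOGLE_CSE_ENDPOINT}?{urlencode(params)}"

print(build_cse_url("YOUR_API_KEY", "YOUR_CX_ID", "retrieval augmented generation"))
```

Opening the printed URL in a browser or with an HTTP client should return JSON search results if both values are valid.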

Rerank With Cohere

To rerank you will also need to configure a Cohere API key in .env.

Get your Cohere API key here: https://dashboard.cohere.com/api-keys
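
For orientation, a rerank call to Cohere can be sketched as a plain HTTP request. This is an assumption-laden illustration, not the project's code: the helper name is hypothetical, and the endpoint (`https://api.cohere.com/v1/rerank`) and model name (`rerank-english-v3.0`) reflect Cohere's public API as understood at the time of writing, so verify both against the Cohere API reference.

```python
import json
from urllib.request import Request

def build_rerank_request(api_key, query, documents, top_n=5,
                         model="rerank-english-v3.0"):
    """Prepare (but do not send) a POST request for Cohere's rerank endpoint."""
    body = {
        "model": model,          # assumed model name; check Cohere's docs
        "query": query,          # the user's search query
        "documents": documents,  # list of text chunks to reorder
        "top_n": top_n,          # how many of the best chunks to return
    }
    return Request(
        "https://api.cohere.com/v1/rerank",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_rerank_request("YOUR_COHERE_KEY", "what is RAG?",
                               ["chunk one", "chunk two"], top_n=1)
print(request.full_url, request.get_method())
```

Passing the prepared request to `urllib.request.urlopen` sends it; the reply is JSON and should be parsed according to Cohere's documentation.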

Cybersecurity

This project was developed mainly for home use via localhost. If you want to run it on a public host, it is recommended to add an access-restriction layer, such as a login.
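
One simple form such a restriction layer could take is a shared secret token checked on every request. The Python sketch below only illustrates the idea; the function name, the use of a bearer token, and the header layout are assumptions and not something the project provides.

```python
import hmac

def is_authorized(headers, expected_token):
    """Return True only if the request carries the expected bearer token."""
    supplied = headers.get("Authorization", "")
    prefix = "Bearer "
    # Reject when no token is configured or the header is missing or malformed.
    if not expected_token or not supplied.startswith(prefix):
        return False
    token = supplied[len(prefix):]
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))

print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))
```

A check like this would run before the search is performed, with the expected token stored alongside the other secrets rather than in the source code.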

License

MIT - This project is licensed under the MIT License.
Please note that this project is currently in beta and is provided "as is" without warranty of any kind.

Acknowledgements

This project leverages the following resources:

Readability PHP library by FiveFilters - https://github.com/fivefilters/readability.php

Thanks to FiveFilters for their valuable work!

Cohere API: Used for re-ranking content.

Google CSE: Used for searching the web.