Ollama Grid Search: Instantly Evaluate Multiple LLMs and Prompts.

This project automates the process of selecting the best models, prompts, or inference parameters for a given use case, letting you iterate over their combinations and visually inspect the results.

It assumes Ollama is installed and serving endpoints, either on localhost or on a remote server.

Here's what an experiment for a simple prompt, tested on 3 different models, looks like:

<img src="./screenshots/main.png?raw=true" alt="Main Screenshot" width="720">

(For a more in-depth look at an evaluation process assisted by this tool, please check https://dezoito.github.io/2023/12/27/rust-ollama-grid-search.html).

Installation

Check the project's releases page, or the links on the sidebar.

Features

Grid Search (or something similar...)

Technically, the term "grid search" refers to iterating over a series of different model hyperparameters to optimize model performance, but that usually means parameters like batch_size, learning_rate, or number_of_epochs, which are more commonly tuned during training.

But the concept here is similar:

Let's define a selection of models, a prompt, and some parameter combinations:

<img src="./screenshots/gridparams-animation.gif?raw=true" alt="gridparams" width="400">

The prompt will be submitted once for each parameter value, for each one of the selected models, generating a set of responses.
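Conceptually, the iteration above is a cartesian product of the selected models and parameter values. Here is a minimal sketch of that idea (a hypothetical illustration only, not the app's actual Rust implementation; the model names, prompt, and temperature values are made-up examples):

```python
from itertools import product

# Hypothetical experiment definition
models = ["llama3.2", "gemma2", "mistral"]
temperatures = [0.2, 0.7, 1.0]  # one varying inference parameter
prompt = "Summarize the plot of Hamlet in one sentence."

# One generation request per (model, parameter value) pair,
# as described above: 3 models x 3 temperatures = 9 responses.
runs = [
    {"model": m, "prompt": prompt, "options": {"temperature": t}}
    for m, t in product(models, temperatures)
]
```

Each entry in `runs` corresponds to one response cell in the results grid shown in the screenshot.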

A/B Testing

Similarly, you can perform A/B tests by selecting different models and comparing results for the same prompt/parameter combination, or by testing different prompts under similar configurations:

<img src="./screenshots/ab-animation.gif?raw=true" alt="A/B testing" width="720">

<small>Comparing the results of different prompts for the same model</small>

Prompt Archive

You can save and manage your prompts (we want to make prompts compatible with Open WebUI).

<img src="./screenshots/prompt-archive.png?raw=true" alt="Prompt Archive" width="720">

You can autocomplete prompts by typing "/" (also inspired by Open WebUI):

<img src="./screenshots/autocomplete.gif?raw=true" alt="Autocomplete" width="720">

Experiment Logs

You can list, inspect, or download your experiments:

<img src="./screenshots/experiments.png?raw=true" alt="Experiments" width="720">

Future Features

Contributing

Development

  1. Make sure you have Rust installed.

  2. Clone the repository (or a fork):

    git clone https://github.com/dezoito/ollama-grid-search.git
    cd ollama-grid-search

  3. Install the frontend dependencies:

    cd <project root>
    # I'm using bun to manage dependencies,
    # but feel free to use yarn or npm
    bun install

  4. Make sure rust-analyzer is configured to run Clippy when checking code.

    If you are running VS Code, add this to your settings.json file:

    {
       ...
       "rust-analyzer.check.command": "clippy",
    }

    (or, better yet, just use the settings file provided with the code)

  5. Run the app in development mode:

    cd <project root>/
    bun tauri dev

  6. Go grab a cup of coffee because this may take a while.

Citations

The following works and theses have cited this repository:

Inouye, D., Lindo, L., Lee, R., & Allen, E. Applied Auto-tuning on LoRA Hyperparameters. Computer Science and Engineering Senior Theses, Santa Clara University, 2024. https://scholarcommons.scu.edu/cgi/viewcontent.cgi?article=1271&context=cseng_senior

Thank you!

Huge thanks to @FabianLars, @peperroni21 and @TomReidNZ.