
# AIKit ✨

<p align="center"> <img src="./website/static/img/logo.png" width="200"><br> </p>

AIKit is a comprehensive platform to quickly get started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

- **Inference**: host and serve LLMs behind an OpenAI API compatible endpoint, usable from any OpenAI API compatible client.
- **Fine-tuning**: fine-tune models and package them as container images (see the AIKit website for details).

👉 For full documentation, please see the AIKit website!

## Features

- 🐳 Get up and running locally with Docker, no GPU required
- 🚀 OpenAI API compatible endpoint that works as a drop-in replacement for any OpenAI API compatible client
- 📦 Pre-made model images you can run out of the box
- 🎮 Optional GPU acceleration with NVIDIA CUDA
- 🛠️ Build your own model images and fine-tune models

## Quick Start

You can get started with AIKit quickly on your local machine without a GPU!

```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b
```

After running this, navigate to http://localhost:8080/chat to access the WebUI!
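
As a quick sanity check, you can also list the models the server exposes. A minimal sketch, assuming the container above is running (the `/v1/models` path follows the OpenAI API convention):

```bash
# List the models served by the running container via the OpenAI-compatible endpoint
curl http://localhost:8080/v1/models
```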

### API

AIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```

Output should be similar to:

```jsonc
{
  // ...
  "model": "llama-3.1-8b-instruct",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
      }
    }
  ],
  // ...
}
```

That's it! 🎉 The API is OpenAI compatible, so this is a drop-in replacement for any OpenAI API compatible client.
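
For example, many OpenAI clients can be pointed at the local endpoint with nothing more than environment variables. A minimal sketch, assuming a client that follows the official OpenAI Python SDK's conventions (AIKit does not require an API key, so any placeholder value works):

```bash
# Point an OpenAI-compatible client at the local AIKit endpoint.
# OPENAI_BASE_URL and OPENAI_API_KEY follow the official OpenAI Python SDK's conventions;
# other clients may use different configuration names.
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=placeholder
```

With these set, existing OpenAI-based scripts and tools should be able to talk to the local model without code changes.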

## Pre-made Models

AIKit comes with pre-made models that you can use out-of-the-box!

If a specific model isn't included, you can always create your own images and host them in a container registry of your choice!
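
As a rough sketch, building and running a custom image could look like the following. The `aikitfile.yaml` model definition and the `my-model` tag are placeholders here; see the AIKit website for the actual file format:

```bash
# Build a custom model image from a model definition file (placeholder names)
docker buildx build . -t my-model -f aikitfile.yaml --load

# Run it the same way as the pre-made images
docker run -d --rm -p 8080:8080 my-model
```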

### CPU

| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |

### NVIDIA CUDA

> [!NOTE]
> To enable GPU acceleration, please see GPU Acceleration. Please note that the only difference between the CPU and GPU sections is the `--gpus all` flag in the command, which enables GPU acceleration.

| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
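
If you are unsure whether your Docker setup can see the GPU, a quick check like the one below can help. This is a sketch assuming the NVIDIA Container Toolkit is installed; the CUDA image tag is illustrative:

```bash
# Confirm containers can access the GPU before running the GPU-enabled images above
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```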

## What's next?

👉 For more information, including how to fine-tune models or create your own images, please see the AIKit website!