
vnc-lm

<sub>11-05-2024: Added support for vision on hosted APIs </sub> <br> <sub>11-01-2024: Added support for hosted APIs </sub> <br> <sub>10-27-2024: Added prompt refining </sub>

Introduction

vnc-lm is a Discord bot with ollama, OpenRouter, Mistral, Cohere, and GitHub Models API integration.

Load and manage language models through local or hosted API endpoints. Configure parameters, branch conversations, and refine prompts to improve responses.

[Demo GIF: web scraping] <br> <sup>Web scraping</sup> <br> [Demo GIF: model pulling with ollama] <br> <sup>Model pulling with ollama</sup>

Features

Model Management

Load models using the /model command. The bot sends notifications upon successful model loading. Local models can be removed with the remove parameter. Download new models by sending a model tag link in Discord.

https://ollama.com/library/llama3.2:1b-instruct-q8_0
https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/blob/main/Llama-3.2-1B-Instruct-Q8_0.gguf

🚧 Model downloading and removal are turned off by default and can be enabled in the .env.

Configure model behavior by adjusting the num_ctx (context length), system_prompt (base instructions), and temperature (response randomness) parameters.
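
As a sketch of how these parameters fit together (this is illustrative, not the bot's actual code): `num_ctx` and `temperature` travel under `options` in an ollama `/api/chat` request body, while the system prompt becomes the first message. The helper name `buildChatRequest` is hypothetical.

```typescript
// Illustrative sketch: mapping the three /model parameters onto an
// ollama /api/chat request body. Not the bot's real implementation.
interface ModelOptions {
  num_ctx?: number;     // context window size in tokens
  temperature?: number; // response randomness (higher = more random)
}

function buildChatRequest(
  model: string,
  prompt: string,
  systemPrompt: string,
  options: ModelOptions
) {
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt }, // base instructions
      { role: "user", content: prompt },
    ],
    options, // ollama reads num_ctx and temperature from here
  };
}

const req = buildChatRequest(
  "llama3.2:1b-instruct-q8_0",
  "Summarize this file.",
  "Be concise.",
  { num_ctx: 2048, temperature: 0.4 }
);
```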

QoL Improvements

Messages longer than 1500 characters are automatically paginated during generation. Message streaming is available with ollama; the other APIs return responses in a single update without streaming. The context window accepts text files, web links, and images. Vision is available only through the supported hosted APIs, and even there not all models support it. Models running locally need OCR to extract text from images. Deploy with Docker for a simplified setup.
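
The pagination idea can be sketched roughly like this (an assumed implementation, not the bot's actual chunking code in src/managers): split a long response into pages no longer than the character limit, preferring to break at a newline.

```typescript
// Hypothetical sketch of message pagination: split text into pages of at
// most `limit` characters, breaking at the last newline when possible.
function paginate(text: string, limit = 1500): string[] {
  const pages: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer the last newline before the limit; otherwise cut hard.
    let cut = rest.lastIndexOf("\n", limit);
    if (cut <= 0) cut = limit;
    pages.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, "");
  }
  if (rest.length > 0) pages.push(rest);
  return pages;
}
```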

Switch conversations by selecting rejoin conversation from the context menu. Branch conversations from any message. Messages are cached and organized in bot_cache.json. Messages deleted in Discord are also deleted from the cache. The entrypoint.sh script maintains conversation history across Docker container restarts.
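
Conceptually, branching works if every cached message records its parent, so a branch is just a path back to the conversation's first message. The field names below are illustrative, not the actual bot_cache.json schema.

```typescript
// Hypothetical sketch of conversation branching. A branch from any message
// is recovered by walking parent links back to the root.
interface CachedMessage {
  id: string;
  parentId: string | null; // null for the first message in a conversation
  content: string;
}

// Rebuild the message chain leading up to `fromId` — conceptually what
// resuming a conversation from an arbitrary message requires.
function branchHistory(
  cache: Map<string, CachedMessage>,
  fromId: string
): CachedMessage[] {
  const history: CachedMessage[] = [];
  let current = cache.get(fromId);
  while (current) {
    history.unshift(current);
    current = current.parentId ? cache.get(current.parentId) : undefined;
  }
  return history;
}
```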

💡 Message stop to end message generation early.

Edit your last prompt to refine the model's response. The bot generates a new response using your edited prompt, replacing the previous output.

Requirements

Docker: a platform for building, sharing, and running containerized applications. It handles the environment setup so you can focus on running the bot.

Supported APIs

| Provider | Description |
| --- | --- |
| ollama | Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models. |
| OpenRouter | A unified interface for LLMs. Find the best models and prices for your prompts, with the latest state-of-the-art models from OpenAI, Anthropic, Google, and Meta. |
| Mistral | A research lab building leading open-source models. La Plateforme lets developers and enterprises build products powered by Mistral's open-source and commercial LLMs. |
| Cohere | Builds natural language processing and generation into your product with a few lines of code, covering classification, semantic search, paraphrasing, summarization, and content generation. |
| GitHub Models | Find and experiment with AI models for free. When ready for production, switch to a token from a paid Azure account. |

💡 Each API offers a free tier.

Environment Configuration

git clone https://github.com/jake83741/vnc-lm.git
cd vnc-lm

Rename .env.example to .env.

Configure the following fields in the .env:

TOKEN: Discord bot token from the Discord Developer Portal. Set required bot permissions. <br>
OLLAMAURL: ollama server URL. See the API documentation. For Docker: http://host.docker.internal:11434 <br>
NUM_CTX: Context window size. Default: 2048 <br>
TEMPERATURE: Response randomness. Default: 0.4 <br>
KEEP_ALIVE: Model retention time in memory. Default: 45m <br>
CHARACTER_LIMIT: Page embed character limit. Default: 1500 <br>
API_RESPONSE_UPDATE_FREQUENCY: API response chunk size before message updates. Low values trigger Discord throttling. Default: 10 <br>
ADMIN: Discord user ID for model management permissions <br>
REQUIRE_MENTION: Toggle bot mention requirement. Default: false <br>
USE_OCR: Toggle OCR. Default: false <br>
OPENROUTER: OpenRouter API key from the OpenRouter Dashboard <br>
OPENROUTER_MODELS: Comma-separated OpenRouter model list <br>
MISTRAL_API_KEY: Mistral API key from the Mistral Dashboard <br>
MISTRAL_MODELS: Comma-separated Mistral model list <br>
COHERE_API_KEY: Cohere API key from the Cohere Dashboard <br>
COHERE_MODELS: Comma-separated Cohere model list <br>
GITHUB_API_KEY: GitHub API key from the GitHub Models Dashboard <br>
GITHUB_MODELS: Comma-separated GitHub model list <br>
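
A minimal .env for a local-only ollama setup might look like this (the token and ID values are placeholders; hosted-API keys are simply omitted):

```
TOKEN=your-discord-bot-token
OLLAMAURL=http://host.docker.internal:11434
NUM_CTX=2048
TEMPERATURE=0.4
KEEP_ALIVE=45m
CHARACTER_LIMIT=1500
API_RESPONSE_UPDATE_FREQUENCY=10
ADMIN=your-discord-user-id
REQUIRE_MENTION=false
USE_OCR=false
```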

Docker Installation (Preferred)

docker compose up --build

💡 Send /help for instructions on how to use the bot.

Manual Installation

<details> <br>
npm install
npm run build
npm start
</details>

Usage

[Screenshot: the /model command] <br> Use /model to load, configure, and remove models. Quickly adjust model behavior using the optional parameters num_ctx, system_prompt, and temperature. Note that num_ctx only works with local ollama models. <br> <br>

[Screenshot: prompt refining] <br> Refine prompts to modify model responses. Each refinement generates a new response that overwrites the previous one. Multiple refinements are supported. The latest prompt version is saved in bot_cache.json. <br> <br>

[Screen recording: sending an image to a vision model] <br> Send images to vision-enabled models to process visual content alongside text. Images are included directly in the conversation context. Images are encoded in Base64 before being sent to the API. <br> <br>
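
The Base64 step is standard Node Buffer usage; a sketch (the helper name `encodeImage` is illustrative, not the bot's API):

```typescript
import { readFile } from "node:fs/promises";

// Illustrative sketch: hosted vision APIs generally expect image bytes
// as a Base64 string (often wrapped in a data URL).
async function encodeImage(path: string): Promise<string> {
  const bytes = await readFile(path);
  return bytes.toString("base64");
}

// The same Buffer API works on in-memory data:
const encoded = Buffer.from("example image bytes").toString("base64");
```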

[Screenshot: Rejoin Conversation] <br> Access Rejoin Conversation in Discord's context menu to resume from any message. Hop between conversations while maintaining context. Create new conversation branches as needed. Continue conversations using different models and parameter settings.

Tree Diagram

.
├── LICENSE
├── README.md
├── docker-compose.yaml
├── dockerfile
├── .env.example
├── package.json
├── screenshots
├── src
│   ├── api-connections
│   │   ├── config
│   │   │   └── models.ts
│   │   ├── factory.ts
│   │   ├── index.ts
│   │   ├── interfaces
│   │   │   ├── base-client.ts
│   │   │   └── model-manager.ts
│   │   ├── models.ts
│   │   └── provider
│   │       ├── hosted
│   │       │   └── client.ts
│   │       └── ollama
│   │           └── client.ts
│   ├── bot.ts
│   ├── commands
│   │   ├── command-registry.ts
│   │   ├── help-command.ts
│   │   ├── model-command.ts
│   │   ├── optional-params
│   │   │   └── remove.ts
│   │   └── rejoin-conversation.ts
│   ├── managers
│   │   ├── cache
│   │   │   ├── entrypoint.sh
│   │   │   ├── index.ts
│   │   │   ├── manager.ts
│   │   │   └── store.ts
│   │   ├── generation
│   │   │   ├── chunk.ts
│   │   │   ├── create.ts
│   │   │   └── preprocessing.ts
│   │   ├── message
│   │   │   └── manager.ts
│   │   └── pages
│   │       └── manager.ts
│   ├── services
│   │   ├── ocr.ts
│   │   └── scraper.ts
│   └── utilities
│       ├── constants.ts
│       ├── index.ts
│       ├── settings.ts
│       └── types.ts
└── tsconfig.json

Dependencies

<details> <br>
{
  "dependencies": {
    "@azure-rest/ai-inference": "latest",
    "@azure/core-auth": "latest",
    "@mozilla/readability": "^0.5.0",
    "@types/xlsx": "^0.0.35",
    "axios": "^1.7.2",
    "cohere-ai": "^7.14.0",
    "discord.js": "^14.15.3",
    "dotenv": "^16.4.5",
    "jsdom": "^24.1.3",
    "puppeteer": "^22.14.0",
    "sharp": "^0.33.5",
    "tesseract.js": "^5.1.0"
  },
  "devDependencies": {
    "@types/axios": "^0.14.0",
    "@types/dotenv": "^8.2.0",
    "@types/jsdom": "^21.1.7",
    "@types/node": "^18.15.25",
    "@types/pdf-parse": "^1.1.4",
    "typescript": "^5.1.3"
  }
}
</details>

Notes

<details> <br>
  1. Set higher num_ctx values when using attachments with large amounts of text.
  2. Vision models may have difficulty with follow-up questions.
</details>

License

This project is licensed under the MIT License.