Ollama Copilot

Ollama Copilot is a Windows Forms UI for Ollama on Windows.

Copilot responses can be automatically forwarded to other applications, just like paid copilots.

Ollama Copilot also includes speech-to-text, text-to-speech, and OCR, all built on free, open-source software.

Check out Releases for the latest installer.

Screenshots

(screenshot)

Videos

Playlist

Overview of Ollama Copilot

<a target="_blank" href="https://www.youtube.com/watch?v=4mKgcgBTwCo"><img src="https://img.youtube.com/vi/4mKgcgBTwCo/0.jpg"/></a>

Ollama Copilot v1.0.0

<a target="_blank" href="https://www.youtube.com/watch?v=Jh6jCRSlclk"><img src="https://img.youtube.com/vi/Jh6jCRSlclk/0.jpg"/></a>

YouTube Transcripts v1.0.1

<a target="_blank" href="https://www.youtube.com/watch?v=lY-6ZdsuHS8"><img src="https://img.youtube.com/vi/lY-6ZdsuHS8/0.jpg"/></a>

Speech to Text v1.0.2

<a target="_blank" href="https://www.youtube.com/watch?v=BBn0__pgHIA"><img src="https://img.youtube.com/vi/BBn0__pgHIA/0.jpg"/></a>

Text to Speech v1.0.3

<a target="_blank" href="https://www.youtube.com/watch?v=rVadgr7rnVk"><img src="https://img.youtube.com/vi/rVadgr7rnVk/0.jpg"/></a>

Optical Character Recognition v1.0.4

<a target="_blank" href="https://www.youtube.com/watch?v=WHM-Bc9k04E"><img src="https://img.youtube.com/vi/WHM-Bc9k04E/0.jpg"/></a>

Dependencies

Visual Studio Build Dependencies

(screenshots)

Feature Dependencies

Ollama with Windows preview

ollama run llama3
ollama run llama2
ollama run qwen:4b
ollama run llava
ollama run phi3
ollama run gemma
ollama rm gemma
ollama run gemma:2b
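
Once a model has been pulled, Ollama serves it over a local HTTP API on port 11434 (the same port mapped in the Docker command below). As a minimal sketch, not part of Ollama Copilot itself, a prompt can be sent to the `/api/generate` endpoint like this; the model name and prompt are illustrative:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(model: str, prompt: str) -> str:
    """POST a prompt to a locally running Ollama server and return the reply text."""
    body = json.dumps(build_generate_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a model pulled (e.g. `ollama run llama3` above), calling `ask_ollama("llama3", "Say hello")` returns the model's reply as a string.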

Ollama with Docker

(screenshot)

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama2
docker exec -it ollama ollama run llava
docker exec -it ollama ollama run gemma
docker exec -it ollama ollama run mixtral

Launch Whisper Server to enable local dictation

WSL2

sudo apt-get update
sudo apt install python3-pip
sudo apt install uvicorn
pip3 install FastAPI[all]
pip3 install uvloop
pip3 install numpy
sudo apt-get install curl
sudo apt-get install ffmpeg
pip3 install ffmpeg
pip3 install scipy
pip3 install git+https://github.com/openai/whisper.git
python3 -m uvicorn WhisperServer:app --reload --port 11437
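
Once the server is running on port 11437, audio can be sent to it as a multipart upload. The sketch below uses only the standard library; the endpoint path (`/whisper`) and form field name (`file`) are assumptions, since the actual routes are defined in `WhisperServer.py`:

```python
import urllib.request
import uuid

WHISPER_URL = "http://localhost:11437/whisper"  # endpoint path is an assumption


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/mpeg\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str) -> str:
    """Upload an audio file to the local Whisper server and return its response."""
    with open(path, "rb") as f:
        body, ctype = build_multipart("file", path, f.read())
    req = urllib.request.Request(WHISPER_URL, data=body, headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```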

(screenshot)

Useful information on Whisper model sizes.

Available models and languages

There are five model sizes, four with English-only versions, offering speed and accuracy tradeoffs. Below are the names of the available models and their approximate memory requirements and inference speed relative to the large model; actual speed may vary depending on many factors including the available hardware.

| Size   | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|--------|------------|--------------------|--------------------|---------------|----------------|
| tiny   | 39 M       | tiny.en            | tiny               | ~1 GB         | ~32x           |
| base   | 74 M       | base.en            | base               | ~1 GB         | ~16x           |
| small  | 244 M      | small.en           | small              | ~2 GB         | ~6x            |
| medium | 769 M      | medium.en          | medium             | ~5 GB         | ~2x            |
| large  | 1550 M     | N/A                | large              | ~10 GB        | 1x             |
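
The trade-off in the table can be turned into a small helper that picks the largest model fitting the available VRAM. This is a sketch using the approximate figures above, not anything Ollama Copilot itself provides:

```python
# Approximate VRAM requirements (GB) from the table above, smallest to largest.
WHISPER_MODELS = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def pick_whisper_model(vram_gb: float, english_only: bool = False) -> str:
    """Return the largest Whisper model whose approximate VRAM need fits vram_gb."""
    chosen = None
    for name, need in WHISPER_MODELS:
        if need <= vram_gb:
            chosen = name
    if chosen is None:
        raise ValueError("Not enough VRAM for any Whisper model")
    # 'large' has no English-only variant; the others add a '.en' suffix.
    if english_only and chosen != "large":
        chosen += ".en"
    return chosen
```

For example, `pick_whisper_model(6)` returns `"medium"`, since `large` needs roughly 10 GB.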

Test the Whisper model's MP3-to-text conversion

python3 WhisperTest.py audio.mp3

Launch Pyttsx3 Server to enable text-to-speech

Windows

(screenshot)

pip3 install uvicorn
pip3 install FastAPI[all]
pip3 install pyttsx3
python3 -m uvicorn Pyttsx3Server:app --reload --port 11438

Speech Commands

Launch Tesseract-OCR server for real-time OCR

Windows

pip3 install uvicorn
pip3 install FastAPI[all]
pip3 install pytesseract
python3 -m uvicorn TesseractOCRServer:app --reload --port 11439 --log-level error