
Learning Ollama

Just one of the things I'm learning. https://github.com/hchiam/learning

Ollama makes it easy to run LLMs offline/locally/privately on your computer.

download the app from the ollama.com website to install the ollama command
ollama run llama2 automatically downloads the llama2 model if needed, and lets you talk with the model offline directly in the terminal, with the ollama app running in the background
- (if the app isn't running, start the server yourself in another terminal window: ollama serve)
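
The background server (started by the app or by ollama serve) is also what other programs talk to, over a local HTTP API. A minimal sketch of that, assuming the default address http://localhost:11434 and the /api/generate endpoint; the helper names build_generate_request and ask are made up for illustration:

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama2", url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama2"):
    """Send the prompt to the local server and return the reply text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]

# Usage (needs the ollama app or `ollama serve` running):
#   print(ask("Say hello in five words."))
```

With "stream" set to False the server should send back one JSON object instead of a stream of partial replies, which keeps the sketch short.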

in .bash_profile: alias llm='ollama run llama2'
to read a local file: ollama run llama2 "Summarize this file: $(cat scratchpad.md)"
or with the custom alias: llm "Summarize this file: $(cat scratchpad.md)"
or run ollama with python code: https://github.com/hchiam/local-llms-analyse-finance/blob/main/categorize_expenses.ipynb
but overall, it seems:
- running llama2 on ollama has limited speed (unless your computer has GPUs that Ollama can use for GPU acceleration)
- running llama2 on ollama has limited prompt/input size (the model's context window), so longer inputs need to be chopped up
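
A rough sketch of the "chopping up" idea from the last point, in plain Python. The function name chunk_text and the default sizes are assumptions for illustration, and character counts are only a crude stand-in for the model's real token limit:

```python
def chunk_text(text, max_chars=2000, overlap=200):
    """Split text into pieces of at most max_chars characters.

    Consecutive pieces share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two pieces.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be at least 0 and less than max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this piece already reaches the end of the text
    return chunks
```

Each piece could then be summarized separately (for example with llm "Summarize this: ..."), and the partial summaries summarized once more at the end.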

pip install langchain
# langchain_community might need to be installed separately: pip3 install langchain-community langchain-core

ollama run gemma2 # 5.4 GB when I tried it
ollama pull nomic-embed-text # 274 MB
ollama list

python ollama_gemma2_rag_debugging.py

python ollama_gemma2_rag_simple.py
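
The contents of those two scripts aren't shown here, so this is not what they do. It is only an independent sketch of the retrieval step a RAG setup with an embedding model typically involves: embed the text chunks and the question, keep the chunks closest to the question by cosine similarity, and paste them into the prompt. It assumes the local server's /api/embeddings endpoint with nomic-embed-text; the names ollama_embed, top_chunks and build_prompt are made up for illustration:

```python
import json
import math
import urllib.request

# Assumption: Ollama's default local address and its embeddings endpoint.
EMBED_URL = "http://localhost:11434/api/embeddings"


def ollama_embed(text, model="nomic-embed-text"):
    """Ask the local server for the embedding vector of `text`."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        EMBED_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(question, chunks, embed=ollama_embed, k=2):
    """Return the k chunks whose embeddings are closest to the question's."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, embed=ollama_embed, k=2):
    """Put the best-matching chunks in front of the question."""
    context = "\n\n".join(top_chunks(question, chunks, embed, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The resulting prompt can then be passed to gemma2, e.g. ollama run gemma2 "...". The embed parameter is there so the ranking logic can be tried with a stand-in function before the server is involved.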