LLM Zoomcamp
<p align="center"> <img src="images/llm-zoomcamp.jpg" /> </p>

LLM Zoomcamp - a free online course about real-life applications of LLMs. In 10 weeks you will learn how to build an AI system that answers questions about your knowledge base.
<p align="center"> <a href="https://airtable.com/appPPxkgYLH06Mvbw/shr7WtxHEPXxaui0Q"><img src="https://user-images.githubusercontent.com/875246/185755203-17945fd1-6b64-46f2-8377-1011dcb1a444.png" height="50" /></a> </p>

- Give us a star to support the course!
- Register in DataTalks.Club's Slack
- Join the #course-llm-zoomcamp channel
- Join the course Telegram channel with announcements
- The videos are published on DataTalks.Club's YouTube channel in the course playlist
- Frequently asked technical questions
- Course Calendar
2025 cohort
- Start date: TBA (Spring-Summer 2025)
Self-paced mode
- You can watch the course at your own pace
- Just follow the modules and watch the videos
- Don't forget to do the homework to make sure you have learned the material
- We strongly suggest doing a project and then sharing it in Slack to ask for feedback
Pre-requisites
- Comfortable with programming and Python
- Comfortable with the command line
- Docker
- No previous exposure to AI or ML is required
Syllabus
We encourage Learning in Public
Pre-course workshops
Implement a search engine: Video, code
1. Introduction to LLMs and RAG
- LLMs and RAG
- Preparing the environment
- Retrieval and the basics of search
- OpenAI API
- Simple RAG with OpenAI
- Text search with Elasticsearch
2. Open-source LLMs
- Getting an environment with a GPU
- Open-source models from HuggingFace Hub
- Running LLMs on a CPU with Ollama
- Creating a simple UI with Streamlit
3. Vector databases
- Vector search
- Creating and indexing embeddings
- Vector search with Elasticsearch
- Offline evaluation of retrieval
Workshop: dlt
4. Evaluation and monitoring
- Offline evaluation of RAG
- Cosine and LLM-as-a-Judge metrics
- Tracking chat history and user feedback
- Creating dashboards with Grafana for visualization
5. LLM orchestration and ingestion
- Ingesting data with Mage
6. Best practices
- Techniques to improve the RAG pipeline
- Hybrid search
- Document reranking
- Hybrid search with LangChain
7. Bonus: End-to-End project example (Optional)
- Building an end-to-end fitness assistant project
- Examples of pre-processing text datasets
LLM Zoomcamp 2024 Competition
Hands-on project
<p align="center"> <a href="https://airtable.com/appPPxkgYLH06Mvbw/shr7WtxHEPXxaui0Q"><img src="https://user-images.githubusercontent.com/875246/185755203-17945fd1-6b64-46f2-8377-1011dcb1a444.png" height="50" /></a> </p>

Instructors
Asking questions
The best way to get support is to use DataTalks.Club's Slack. Join the #course-llm-zoomcamp channel.
To make discussions in Slack more organized:
- Follow these recommendations when asking for help
- Read the DataTalks.Club community guidelines
Supporters and partners
Thanks to the course sponsors for making it possible to run this course.
<p align="center"> <a href="https://mage.ai/"> <img height="120" src="https://github.com/DataTalksClub/data-engineering-zoomcamp/raw/main/images/mage.svg"> </a> </p>
<p align="center"> <a href="https://dlthub.com/"> <img height="80" src="https://github.com/DataTalksClub/data-engineering-zoomcamp/raw/main/images/dlthub.png"> </a> </p>
<p align="center"> <a href="https://saturncloud.io/"> <img height="120" src="images/saturn-cloud.png"> </a> </p>

Do you want to support our course and our community? Please reach out to alexey@datatalks.club