Awesome ChadGPT
<img src="https://static.wikia.nocookie.net/virgin-vs-chad/images/7/79/Chad.png/revision/latest?cb=20201214194847" width="200">
Run llama.cpp distributed over MPI
Central Iowa Linux Users Group <br> 19 April 2023 <br>
News for April 2023
- CFP open for the 2023 Linux Plumbers Conference (November 13-15, Richmond, VA, USA).
- StableLM released today.
- Rust Foundation trademark-policy meltdown - Primeagen stream - Oxide Computer discussion.
Presentation: Self-hosting large language models
- What is a large language model (LLM)?
- What can I do with an LLM?
What is a large language model (LLM)?
- An LLM is a data structure: given k tokens, it outputs a probability distribution over the next token (see the next-token sketch after this list).
- ChatGPT-4 is approaching the Library of Congress in training-data size. We are running out of data.
- New data will probably come from reflection - prompting an LLM with its own output for deeper insight.
- BabyGPT - a three-bit LLM.
- Tokenize - train - run inference with the trained model - quantize the model to shrink its size (quantization sketch below).
- OpenAI tiktoken - a high-performance tokenizer (usage example below).
- LLM training can cost millions - OpenAI burned spare GPU capacity at Azure WDM (West Des Moines) after the Bitcoin crash as a tax write-off.
- LLM training and inference are mostly tensor (matrix) operations (attention sketch below).
- Once you have an LLM, fine-tuning can cost as little as $3.
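To make the first bullet concrete: an LLM maps a context of k token IDs to a probability distribution over its vocabulary. A minimal sketch, assuming a toy six-word vocabulary and a stand-in for the real network (every name here is made up for illustration):

```python
import numpy as np

# Toy vocabulary and a fake "model": a function from context to logits.
# A real LLM computes the logits with a transformer; the interface is the same.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def next_token_probs(context_ids, logits_fn):
    """Given k token IDs, return P(next token) over the whole vocabulary."""
    logits = logits_fn(context_ids)           # shape: (len(VOCAB),)
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()

# Stand-in for a trained network: random scores.
rng = np.random.default_rng(0)
fake_logits = lambda ctx: rng.standard_normal(len(VOCAB))

context = [VOCAB.index("the"), VOCAB.index("cat")]   # k = 2 tokens
probs = next_token_probs(context, fake_logits)
for tok, p in sorted(zip(VOCAB, probs), key=lambda t: -t[1]):
    print(f"{tok:>4s}  {p:.3f}")
```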
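The "quantize to shrink size" step in the pipeline bullet boils down to storing weights in fewer bits. This is a conceptual NumPy sketch of symmetric 4-bit quantization, not llama.cpp's actual q4_0 format (which, roughly, does the same thing per small block of weights, each with its own scale):

```python
import numpy as np

def quantize_q4_symmetric(weights: np.ndarray):
    """Map float weights to 4-bit integers in [-8, 7] with one shared scale."""
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, s = quantize_q4_symmetric(w)
print("original :", np.round(w, 3))
print("restored :", np.round(dequantize(q, s), 3))   # close, but lossy
print("4 bits per weight (plus a shared scale) vs 32 bits for float32")
```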
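Basic usage of OpenAI's tiktoken tokenizer mentioned above (`pip install tiktoken`); `cl100k_base` is the encoding used by the ChatGPT-era models:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Central Iowa Linux Users Group"
ids = enc.encode(text)
print(ids)                 # list of integer token IDs
print(enc.decode(ids))     # round-trips back to the original string
print(len(ids), "tokens for", len(text), "characters")
```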
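For the "mostly tensor operations" bullet: a single self-attention head is just a few matrix multiplications and a softmax. A minimal NumPy sketch (no masking, no batching):

```python
import numpy as np

def attention_head(x, Wq, Wk, Wv):
    """One self-attention head: three projections, one softmax, two matmuls."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv           # queries, keys, values per token
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # similarity of every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                           # weighted mix of value vectors

rng = np.random.default_rng(1)
seq_len, d_model, d_head = 4, 8, 8
x = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
print(attention_head(x, Wq, Wk, Wv).shape)       # (4, 8): one vector per token
```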
What can I do with an LLM?
- ChatGPT-4 demo - GitHub Copilot demo if you want.
- whisper.cpp - uses OpenAI Whisper to transcribe audio to text. WANTED: live transcription for meetings (transcription sketch after this list).
- AutoGPT - BabyAGI - use GPT and scripts to drive other GPTs and scripts (agent-loop sketch below).
- Reddit's GPT communities have good weekly briefings.
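For the whisper.cpp item: whisper.cpp is the C++ port of OpenAI's Whisper models. The sketch below shows the same transcription workflow through the reference `openai-whisper` Python package; `meeting.wav` is a placeholder filename:

```python
# pip install openai-whisper   (whisper.cpp runs the same models from C++)
import whisper

model = whisper.load_model("base.en")        # small English-only model
result = model.transcribe("meeting.wav")     # placeholder audio file
print(result["text"])                        # full transcript as one string

# Rough meeting minutes: each segment carries start/end timestamps.
for seg in result["segments"]:
    print(f'[{seg["start"]:7.2f}s] {seg["text"].strip()}')
```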
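For the AutoGPT / BabyAGI item: the core idea is a loop that feeds the model's own output back in as the next prompt. A purely hypothetical skeleton - `call_llm` is a stand-in for whatever API or local model you would actually use, and the real projects add task queues, tools, and memory around this loop:

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (OpenAI API, llama.cpp, ...).
    Here it returns a canned answer so the loop is runnable as-is."""
    return "Break the goal into sub-tasks, finish the first one, then reply DONE."

def agent_loop(goal: str, steps: int = 5) -> list[str]:
    """Minimal AutoGPT-style loop: the model's own output becomes part of
    the next prompt until it signals that it is finished."""
    history: list[str] = []
    for _ in range(steps):
        prompt = (
            f"Goal: {goal}\n"
            f"Previous thoughts: {history[-3:]}\n"
            "What is the next concrete step? End with DONE when finished."
        )
        thought = call_llm(prompt)
        history.append(thought)
        if "DONE" in thought:     # let the model decide when to stop
            break
    return history

print(agent_loop("summarize this week's Linux kernel mailing list"))
```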
Run your own LLM - llama.cpp and web-llm
- llama.cpp - from the author of whisper.cpp and built on the same ggml library - the most widely used C++ code for hosting your own LLM (usage sketch after this list).
- Hugging Face - hosts open models as Git LFS repositories (download sketch below).
- web-llm - uses WebGPU to run the LLM in your browser.
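llama.cpp itself is a command-line C++ program; the community `llama-cpp-python` bindings wrap the same engine. A sketch assuming you already have a quantized model file - the path is a placeholder and the exact API may differ between versions:

```python
# pip install llama-cpp-python   (bindings around the llama.cpp engine)
from llama_cpp import Llama

# Placeholder path to a quantized LLaMA model downloaded separately.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

out = llm(
    "Q: What does the Central Iowa Linux Users Group do? A:",
    max_tokens=64,
    stop=["Q:"],      # stop before the model starts a new question
)
print(out["choices"][0]["text"].strip())
```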
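Hugging Face model repos are ordinary Git repositories with the large weight files stored in Git LFS, so you can clone them with git-lfs installed or let the `huggingface_hub` helper fetch everything; `gpt2` below is just a small public example repo:

```python
# pip install huggingface_hub
from huggingface_hub import snapshot_download

# Downloads every file in the repo (weights live in Git LFS) into the local
# cache and returns the directory path.
local_dir = snapshot_download(repo_id="gpt2")
print("model files are in:", local_dir)
```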
Linux Predictions
- Linux 7 will ship an LLM in various sizes and an SMT solver to prove responses correct.
- cgroups v3 - closer to AWS Zelkova and AWS IAM.
- The kernel LLM will be used as a dictionary for data compression.
- Oxide Computer-sized racks will have distributed Linux schedulers. Kubernetes goes extinct.
- More systems code, such as compilers, will run on GPUs.