Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

The Large Language Models Survey repository is a comprehensive compendium dedicated to the exploration and understanding of Large Language Models (LLMs). It gathers research papers, blog posts, tutorials, code examples, and other resources that trace the progression, methodologies, and applications of LLMs. The repo is intended for AI researchers, data scientists, and enthusiasts interested in the advancements and inner workings of LLMs, and we encourage contributions from the wider community to promote collaborative learning and keep pushing the boundaries of LLM research.

Timeline of LLMs

Figure: evolution of large language models over time.

List of LLMs

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | Licence | Try it |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T5 | 2019/10 | T5 & Flan-T5, Flan-T5-xxl (HF) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 0.06 - 11 | 512 | Apache 2.0 | T5-Large |
| UL2 | 2022/10 | UL2 & Flan-UL2, Flan-UL2 (HF) | UL2 20B: An Open Source Unified Language Learner | 20 | 512, 2048 | Apache 2.0 | |
| Cohere | 2022/06 | Checkpoint Code | - | 54 | 4096 | Model | Website |
| Cerebras-GPT | 2023/03 | Cerebras-GPT | Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models (Paper) | 0.111 - 13 | 2048 | Apache 2.0 | Cerebras-GPT-1.3B |
| Open Assistant (Pythia family) | 2023/03 | OA-Pythia-12B-SFT-8, OA-Pythia-12B-SFT-4, OA-Pythia-12B-SFT-1 | Democratizing Large Language Model Alignment | 12 | 2048 | Apache 2.0 | Pythia-2.8B |
| Pythia | 2023/04 | pythia 70M - 12B | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 0.07 - 12 | 2048 | Apache 2.0 | |
| Dolly | 2023/04 | dolly-v2-12b | Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM | 3, 7, 12 | 2048 | MIT | |
| DLite | 2023/05 | dlite-v2-1_5b | Announcing DLite V2: Lightweight, Open LLMs That Can Run Anywhere | 0.124 - 1.5 | 1024 | Apache 2.0 | DLite-v2-1.5B |
| RWKV | 2021/08 | RWKV, ChatRWKV | The RWKV Language Model (and my LM tricks) | 0.1 - 14 | infinity (RNN) | Apache 2.0 | |
| GPT-J-6B | 2021/06 | GPT-J-6B, GPT4All-J | GPT-J-6B: 6B JAX-Based Transformer | 6 | 2048 | Apache 2.0 | |
| GPT-NeoX-20B | 2022/04 | GPT-NEOX-20B | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | 20 | 2048 | Apache 2.0 | |
| Bloom | 2022/11 | Bloom | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | 176 | 2048 | OpenRAIL-M v1 | |
| StableLM-Alpha | 2023/04 | StableLM-Alpha | Stability AI Launches the First of its StableLM Suite of Language Models | 3 - 65 | 4096 | CC BY-SA-4.0 | |
| FastChat-T5 | 2023/04 | fastchat-t5-3b-v1.0 | We are excited to release FastChat-T5: our compact and commercial-friendly chatbot! | 3 | 512 | Apache 2.0 | |
| h2oGPT | 2023/05 | h2oGPT | Building the World's Best Open-Source Large Language Model: H2O.ai's Journey | 12 - 20 | 256 - 2048 | Apache 2.0 | |
| MPT-7B | 2023/05 | MPT-7B, MPT-7B-Instruct | Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | 7 | 84k (ALiBi) | Apache 2.0, CC BY-SA-3.0 | |
| PanGu-Σ | 2023/03 | PanGU | Model | 1085 | - | Model | Page |
| RedPajama-INCITE | 2023/05 | RedPajama-INCITE | Releasing 3B and 7B RedPajama-INCITE family of models including base, instruction-tuned & chat models | 3 - 7 | 2048 | Apache 2.0 | RedPajama-INCITE-Instruct-3B-v1 |
| OpenLLaMA | 2023/05 | open_llama_3b, open_llama_7b, open_llama_13b | OpenLLaMA: An Open Reproduction of LLaMA | 3, 7, 13 | 2048 | Apache 2.0 | OpenLLaMA-7B-Preview_200bt |
| Falcon | 2023/05 | Falcon-180B, Falcon-40B, Falcon-7B | The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only | 180, 40, 7 | 2048 | Apache 2.0 | |
| MPT-30B | 2023/06 | MPT-30B, MPT-30B-instruct | MPT-30B: Raising the bar for open-source foundation models | 30 | 8192 | Apache 2.0, CC BY-SA-3.0 | MPT 30B inference code using CPU |
| LLaMA 2 | 2023/07 | LLaMA 2 Weights | Llama 2: Open Foundation and Fine-Tuned Chat Models | 7 - 70 | 4096 | Custom (free for under 700M monthly active users; outputs may not be used to train other LLMs besides LLaMA and its derivatives) | HuggingChat |
| OpenLM | 2023/09 | OpenLM 1B, OpenLM 7B | Open LM: a minimal but performative language modeling (LM) repository | 1, 7 | 2048 | MIT | |
| Mistral 7B | 2023/09 | Mistral-7B-v0.1, Mistral-7B-Instruct-v0.1 | Mistral 7B | 7 | 4096 - 16K (sliding window) | Apache 2.0 | Mistral Transformer |
| OpenHermes | 2023/09 | OpenHermes-7B, OpenHermes-13B | Nous Research | 7, 13 | 4096 | MIT | OpenHermes-V2 finetuned on Mistral 7B |
| SOLAR | 2023/12 | Solar-10.7B | Upstage | 10.7 | 4096 | Apache 2.0 | |
| phi-2 | 2023/12 | phi-2 2.7B | Microsoft | 2.7 | 2048 | MIT | |
| SantaCoder | 2023/01 | santacoder | SantaCoder: don't reach for the stars! | 1.1 | 2048 | OpenRAIL-M v1 | SantaCoder |
| StarCoder | 2023/05 | starcoder | StarCoder: A State-of-the-Art LLM for Code; StarCoder: May the source be with you! | 1.1 - 15 | 8192 | OpenRAIL-M v1 | |
| StarChat Alpha | 2023/05 | starchat-alpha | Creating a Coding Assistant with StarCoder | 16 | 8192 | OpenRAIL-M v1 | |
| Replit Code | 2023/05 | replit-code-v1-3b | Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit | 2.7 | infinity? (ALiBi) | CC BY-SA-4.0 | Replit-Code-v1-3B |
| CodeGen2 | 2023/04 | codegen2 1B-16B | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | 1 - 16 | 2048 | Apache 2.0 | |
| CodeT5+ | 2023/05 | CodeT5+ | CodeT5+: Open Code Large Language Models for Code Understanding and Generation | 0.22 - 16 | 512 | BSD-3-Clause | CodeT5+-6B |
| XGen-7B | 2023/06 | XGen-7B-8K-Base | Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length | 7 | 8192 | Apache 2.0 | |
| CodeGen2.5 | 2023/07 | CodeGen2.5-7B-multi | CodeGen2.5: Small, but mighty | 7 | 2048 | Apache 2.0 | |
| DeciCoder-1B | 2023/08 | DeciCoder-1B | Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation | 1.1 | 2048 | Apache 2.0 | DeciCoder Demo |
| Code Llama | 2023/08 | Inference Code for CodeLlama models | Code Llama: Open Foundation Models for Code | 7 - 34 | 4096 | Model | HuggingChat |
| Sparrow | 2022/09 | Inference Code | Code | 70 | 4096 | Model | Webpage |
| Mistral | 2023/09 | Inference Code | Code | 7 | 8000 | Model | Webpage |
| Koala | 2023/04 | Inference Code | Code | 13 | 4096 | Model | Webpage |
| PaLM 2 | 2023/05 | N/A | Google AI | N/A (not disclosed) | N/A | N/A | N/A |
| Tongyi Qianwen | 2024 | N/A | Alibaba Cloud | N/A | N/A | N/A | N/A |
| Cohere Command | 2024 | N/A | Cohere | 6 - 52 | N/A | N/A | N/A |
| Vicuna 33B | 2023/06 | N/A | LMSYS | 33 | N/A | N/A | N/A |
| Guanaco-65B | 2023/05 | N/A | University of Washington | 65 | N/A | N/A | N/A |
| Amazon Q | 2024 | N/A | AWS | N/A | N/A | N/A | N/A |
| Falcon 180B | 2023/09 | Falcon-180B | Technology Innovation Institute | 180 | N/A | Falcon-180B TII licence | N/A |
| Yi 34B | 2023/11 | N/A | 01.AI | 34 | Up to 32K | N/A | N/A |
| Mixtral 8x7B | 2023/12 | Mixtral 8x7B | Mistral AI | 46.7 (12.9 active per token) | N/A | Apache 2.0 | N/A |
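
Most of the open checkpoints above are hosted on the Hugging Face Hub and can be tried in a few lines with the `transformers` library. The sketch below loads a Flan-T5 checkpoint from the first row of the table; the model id `google/flan-t5-large`, the prompt, and the generation settings are illustrative choices, not recommendations from the survey.

```python
# Minimal sketch: load one of the open checkpoints listed above with
# Hugging Face transformers. Flan-T5 is an encoder-decoder model, hence
# AutoModelForSeq2SeqLM; the model id, prompt, and generation settings
# are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "google/flan-t5-large"  # ~0.8B params, 512-token context (see table)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Flan-T5 is instruction-tuned, so a plain natural-language prompt works.
inputs = tokenizer("Translate to German: The weather is nice today.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Decoder-only entries such as Pythia, MPT, or Falcon load the same way via `AutoModelForCausalLM`, and the larger checkpoints (roughly 30B parameters and up) generally require quantization or multiple GPUs.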

If you find our survey useful for your research, please cite the following paper:

@article{hadi2024large,
  title={Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects},
  author={Hadi, Muhammad Usman and Al Tashi, Qasem and Shah, Abbas and Qureshi, Rizwan and Muneer, Amgad and Irfan, Muhammad and Zafar, Anas and Shaikh, Muhammad Bilal and Akhtar, Naveed and Wu, Jia and others},
  journal={Authorea Preprints},
  year={2024},
  publisher={Authorea}
}