Awesome Text2SQL🎉🎉🎉

License: MIT

English | 中文版 | Paper

Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.

🌱 How to Contribute

We warmly welcome contributions from everyone, whether you've found a typo or a bug, have a suggestion, or want to share a resource related to LLM+Text2SQL. For detailed contribution guidelines, please see our CONTRIBUTING.md file.

🔔 Leaderboard

| Rank | WikiSQL | Spider<br/>Exact Match (EM) | Spider<br/>Exact Execution (EX) | BIRD<br/>Valid Efficiency Score (VES) | BIRD<br/>Execution Accuracy (EX) |
| --- | --- | --- | --- | --- | --- |
| 🏆1 | 93.0 <br/>(2021/05-SeaD+Execution-Guided Decoding) | 81.5 <br/>(2023/11-MiniSeek) | 91.2 <br/>(2023/11-MiniSeek) | 80.40 <br/>(2024/05-ExSL + granite-20b-code) | 71.83 <br/>(2024/07-Distillery + GPT-4o) |
| 🥈2 | 92.7 <br/>(2021/03-SDSQL+Execution-Guided Decoding) | 74.0 <br/>(2022/09-Graphix-3B + PICARD) | 86.6 <br/>(2023/08-DAIL-SQL + GPT-4 + Self-Consistency) | 77.74 <br/>(2024/07-Distillery + GPT-4o) | 70.37 <br/>(2024/05-ExSL + granite-34b-code) |
| 🥉3 | 92.5 <br/>(2020/11-IE-SQL+Execution-Guided Decoding) | 73.9 <br/>(2022/09-CatSQL + GraPPa) | 86.2 <br/>(2023/08-DAIL-SQL + GPT-4) | 76.11 <br/>(2024/07-RECAP + Gemini) | 69.03 <br/>(2024/07-RECAP + Gemini) |
| 4 | 92.2 <br/>(2020/03-HydraNet+Execution-Guided Decoding) | 73.1 <br/>(2022/09-SHiP + PICARD) | 85.6 <br/>(2023/10-DPG-SQL + GPT-4 + Self-Correction) | 73.24 <br/>(2024/07-ByteBrain) | 68.87 <br/>(2024/07-ByteBrain) |
| 5 | 91.9 <br/>(2020/12-BRIDGE+Execution-Guided Decoding) | 72.9 <br/>(2022/05-G³R + LGESQL + ELECTRA) | 85.3 <br/>(2023/04-DIN-SQL + GPT-4) | 72.78 <br/>(2024/05-ExSL + granite-20b-code) | 67.86 <br/>(2024/05-ExSL + granite-20b-code) |
| 6 | 91.8 <br/>(2019/08-X-SQL+Execution-Guided Decoding) | 72.4 <br/>(2022/08-RESDSQL+T5-1.1-lm100k-xl) | 83.9 <br/>(2023/07-Hindsight Chain of Thought with GPT-4) | 72.63 <br/>(2024/05-CHESS) | 66.69 <br/>(2024/05-CHESS) |
| 7 | 91.4 <br/>(2021/03-SDSQL) | 72.4 <br/>(2022/05-T5-SR) | 82.3 <br/>(2023/06-C3 + ChatGPT + Zero-Shot) | 71.35 <br/>(2024/01-MCS-SQL + GPT-4) | 65.45 <br/>(2024/01-MCS-SQL + GPT-4) |
| 8 | 91.1 <br/>(2020/12-BRIDGE) | 72.2 <br/>(2022/12-N-best List Rerankers + PICARD) | 80.8 <br/>(2023/07-Hindsight Chain of Thought with GPT-4 and Instructions) | 69.56 <br/>(2024/04-GRA-SQL) | 65.34 <br/>(2024/07-Insights AI) |
| 9 | 91.0 <br/>(2021/04-Text2SQLGen + EG) | 72.1 <br/>(2021/09-S²SQL + ELECTRA) | 79.9 <br/>(2023/02-RESDSQL-3B + NatSQL) | 68.90 <br/>(2024/02-PB-SQL) | 64.95 <br/>(2024/04-OpenSearch-SQL,v1 + GPT-4) |
| 10 | 90.5 <br/>(2020/11-SeqGenSQL+EG) | 72.0 <br/>(2023/02-RESDSQL-3B + NatSQL) | 78.5 <br/>(2022/11-SeaD + PQL) | 68.82 <br/>(2024/07-Insights AI) | 64.84 <br/>(2024/02-PB-SQL v1) |

📜 Contents

👋 Introduction

📖 Survey

💬 Classic Model

🔥 Base Model

💡 Fine-tuning

💪 Dataset

🌈 Evaluation Index

📦 Libraries

🔧 Practice Project

🔗 Citation

If you find Text2SQL useful for your research or development, please cite the following <a href="https://arxiv.org/abs/2406.11434" target="_blank">paper</a>:

@misc{zhou2024dbgpthub,
      title={DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models}, 
      author={Fan Zhou and Siqiao Xue and Danrui Qi and Wenhui Shi and Wang Zhao and Ganglin Wei and Hongyang Zhang and Caigai Jiang and Gangwei Jiang and Zhixuan Chu and Faqiang Chen},
      year={2024},
      eprint={2406.11434},
      archivePrefix={arXiv},
      primaryClass={cs.DB}
}

🤝 Friendship Links