Awesome ChatGPT-related Papers
This is a list of ChatGPT-related papers. Any feedback is welcome.
Table of Contents
- Survey paper
- Instruction tuning
- Reinforcement learning from human feedback
- Evaluation
- Large Language Model
- External tools
- MoE/Routing
- Technical report of open/proprietary model
- Misc.
Survey paper
Instruction tuning
- Finetuned Language Models Are Zero-Shot Learners
- Scaling Instruction-Finetuned Language Models
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
- Self-Instruct: Aligning Language Models with Self-Generated Instructions [github]
- Stanford Alpaca: An Instruction-following LLaMA Model [github]
- Dolly: Democratizing the magic of ChatGPT with open models [blog] [blog]
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality [github] [website]
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions [github]
- Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
- LIMA: Less Is More for Alignment
- Enhancing Chat Language Models by Scaling High-quality Instructional Conversations [github]
- How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources [github]
- Faith and Fate: Limits of Transformers on Compositionality
- SAIL: Search-Augmented Instruction Learning
- The False Promise of Imitating Proprietary LLMs
- Instruction Mining: High-Quality Instruction Data Selection for Large Language Models
- SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF (EMNLP2023 Findings)
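The recipe shared by most of the instruction-tuning papers above is plain supervised fine-tuning on (instruction, response) pairs rendered into a prompt template. Below is a minimal sketch, assuming an Alpaca-style template and a small placeholder model; both are illustrative choices, not taken from any specific paper listed here.

```python
# Minimal sketch of Alpaca-style supervised instruction tuning:
# format (instruction, response) pairs into one prompt string and
# fine-tune a causal LM on the concatenated text.
# Model name and prompt template are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; the papers above use LLaMA-class models
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def format_example(instruction: str, response: str) -> str:
    # Simplified Alpaca-style prompt template.
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n{response}"
    )

examples = [("List three primary colors.", "Red, yellow, and blue.")]

model.train()
for instruction, response in examples:
    batch = tokenizer(format_example(instruction, response), return_tensors="pt")
    # Standard causal-LM objective: labels are the input ids themselves.
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Implementations such as Stanford Alpaca commonly mask the prompt tokens so that only the response contributes to the loss; the sketch trains on the full sequence for brevity.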
Reinforcement learning from human feedback
- Fine-Tuning Language Models from Human Preferences [github] [blog]
- Training language models to follow instructions with human feedback [github] [blog]
- WebGPT: Browser-assisted question-answering with human feedback [blog]
- Improving alignment of dialogue agents via targeted human judgements
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- OpenAssistant Conversations -- Democratizing Large Language Model Alignment [github]
- Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
- Preference Ranking Optimization for Human Alignment
- Training Language Models with Language Feedback (ACL2022 WS)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
- A General Theoretical Paradigm to Understand Learning from Human Preferences
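For orientation, the Direct Preference Optimization paper above replaces the usual reward-model-plus-RL step with a single classification-style loss over preference pairs. Transcribed from the paper (notation may differ slightly):

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

Here y_w and y_l are the preferred and dispreferred completions for prompt x, π_ref is the frozen SFT reference policy, σ is the logistic function, and β controls how far π_θ may drift from π_ref.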
Evaluation
- How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
- A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
- Is ChatGPT a Good Causal Reasoner? A Comprehensive Evaluation
- Is ChatGPT a Good Recommender? A Preliminary Study
- Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness
- Semantic Compression With Large Language Models
- Human-like Summarization Evaluation with ChatGPT
- Sentence Simplification via Large Language Models
- Capabilities of GPT-4 on Medical Challenge Problems
- Do Multilingual Language Models Think Better in English?
- ChatGPT or Grammarly? Evaluating ChatGPT on Grammatical Error Correction Benchmark
- ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
- Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks
- Can ChatGPT Reproduce Human-Generated Labels? A Study of Social Computing Tasks
- Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks
- Is GPT-3 a Good Data Annotator? (ACL2023)
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
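Several of the entries above (most directly the MT-Bench / Chatbot Arena paper) evaluate models by asking a stronger LLM to act as the judge. A minimal pairwise-judge sketch follows, assuming the OpenAI Python client; the judge model name and prompt wording are illustrative, not the exact MT-Bench prompt.

```python
# Minimal sketch of pairwise LLM-as-a-judge evaluation in the spirit of
# MT-Bench / Chatbot Arena: ask a strong model which of two answers is better.
# Judge model and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def judge(question: str, answer_a: str, answer_b: str) -> str:
    prompt = (
        "You are an impartial judge. Given the user question and two answers, "
        "reply with 'A', 'B', or 'tie' for the better answer.\n\n"
        f"Question: {question}\n\nAnswer A: {answer_a}\n\nAnswer B: {answer_b}"
    )
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder judge model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

print(judge("What causes tides?", "The Moon's gravity.", "Wind patterns."))
```

The MT-Bench paper reports position bias in such judges, so pairwise judgments are usually run with both answer orderings and the results aggregated.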
Large Language Model
External tools
- Toolformer: Language Models Can Teach Themselves to Use Tools
- Large Language Models as Tool Makers
- CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
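Toolformer-style tool use, as in the first paper of this section, has the model emit inline API calls that are executed and spliced back into the text. Below is a toy sketch of that post-processing step, with a deliberately restricted calculator as an assumed example tool; the bracketed call/result notation only loosely follows the examples in the Toolformer paper.

```python
# Toy sketch of Toolformer-style tool execution: find inline calls such as
# "[Calculator(400/1400)]" in generated text, run the tool, and splice the
# result back in. The executor and notation here are illustrative assumptions.
import re

def run_calculator(expression: str) -> str:
    # Extremely restricted arithmetic evaluator for the sketch.
    if not re.fullmatch(r"[\d\.\+\-\*/\(\) ]+", expression):
        raise ValueError("unsupported expression")
    return f"{eval(expression):.2f}"

def execute_tool_calls(text: str) -> str:
    # Replace every "[Calculator(...)]" span with "[Calculator(...) -> result]".
    def repl(match: re.Match) -> str:
        expr = match.group(1)
        return f"[Calculator({expr}) -> {run_calculator(expr)}]"
    return re.sub(r"\[Calculator\(([^)]*)\)\]", repl, text)

print(execute_tool_calls("That is [Calculator(400/1400)] of the total."))
# -> "That is [Calculator(400/1400) -> 0.29] of the total."
```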
MoE/Routing
- Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
- Mixtral of Experts
- Knowledge Fusion of Large Language Models (ICLR2024)
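As background for the Mixtral entry above, a sparse MoE layer routes each token to a small subset of expert feed-forward networks chosen by a learned gate. A minimal top-2 routing sketch in PyTorch, with illustrative dimensions and no load-balancing loss:

```python
# Minimal sketch of token-level top-2 expert routing: a linear gate scores
# the experts, the two highest-scoring experts process each token, and their
# outputs are combined with renormalized gate weights. Dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model: int = 64, d_ff: int = 256, n_experts: int = 8):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        scores = self.gate(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(2, dim=-1)  # top-2 experts per token
        weights = F.softmax(weights, dim=-1)   # renormalize over the chosen 2
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(5, 64)
print(Top2MoE()(tokens).shape)  # torch.Size([5, 64])
```

Production MoE layers add an auxiliary load-balancing loss and capacity limits so tokens are spread evenly across experts; the explicit loop over experts here is for clarity rather than speed.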
Technical report of open/proprietary model
- Llama 2: Open Foundation and Fine-Tuned Chat Models
- Qwen Technical Report
- Nemotron-4 15B Technical Report
- Nemotron-4 340B Technical Report
- PaLM 2 Technical Report