Large Language Models for Data Annotation: A Survey
-
This is a curated list of papers on LLMs for data annotation,
maintained by Zhen Tan (ztan36@asu.edu) and Alimohammad Beigi (abeigi@asu.edu).
-
To add new entries, please open a pull request that follows the existing format.
-
This list serves as a complement to the survey below.
[Large Language Models for Data Annotation: A Survey]
<div align=center><img src="https://github.com/Zhen-Tan-dmml/LLM4Annotation/blob/main/figure/figure.png" width="500" /></div>
If you find this repo helpful, we would appreciate it if you could cite our survey.
@misc{tan2024large,
  title={Large Language Models for Data Annotation: A Survey},
  author={Zhen Tan and Alimohammad Beigi and Song Wang and Ruocheng Guo and Amrita Bhattacharjee and Bohan Jiang and Mansooreh Karami and Jundong Li and Lu Cheng and Huan Liu},
  year={2024},
  eprint={2402.13446},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
LLM-Based Data Annotation
Manually Engineered Prompt
-
[EACL 2024] GPTs Are Multilingual Annotators for Sequence Generation Tasks. [pdf] [code]
-
[arXiv 2023] AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. [pdf]
-
[arXiv 2023] RAFT: Reward Ranked FineTuning for Generative Foundation Model Alignment. [pdf]
-
[arXiv 2023] Small Models are Valuable Plug-ins for Large Language Models. [pdf] [code]
-
[arXiv 2022] Language Models in the Loop: Incorporating Prompting into Weak Supervision. [pdf]
-
[EMNLP 2022] ZeroGen: Efficient Zero-shot Learning via Dataset Generation. [pdf] [code]
-
[NAACL-HLT 2022] Learning To Retrieve Prompts for In-Context Learning. [pdf] [code]
-
[EMNLP 2021] Constrained Language Models Yield Few-Shot Semantic Parsers. [pdf] [code]
Alignment via Pairwise Feedback
-
[ACL 2023] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers. [pdf] [code]
-
[arXiv 2023] Direct Preference Optimization: Your Language Model is Secretly a Reward Model. [pdf]
-
[NeurIPS 2022] Fine-tuning language models to find agreement among humans with diverse preferences. [pdf]
-
[arXiv 2022] Improving alignment of dialogue agents via targeted human judgements. [pdf]
-
[arXiv 2022] Teaching language models to support answers with verified quotes. [pdf] [data]
-
[NeurIPS 2020] Learning to summarize with human feedback. [pdf] [code]
-
[arXiv 2019] Fine-Tuning Language Models from Human Preferences. [pdf] [code]
Assessing LLM-Generated Annotations
Evaluating LLM-Generated Annotations
-
[EACL 2024] GPTs Are Multilingual Annotators for Sequence Generation Tasks. [pdf] [code]
-
[arXiv 2023] AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. [pdf]
-
[arXiv 2023] Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks. [pdf]
-
[NAACL 2022] LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework. [pdf] [code]
-
[EMNLP 2022] Large Language Models are Few-Shot Clinical Information Extractors. [pdf] [data]
-
[arXiv 2022] Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor. [pdf] [code]
-
[arXiv 2020] The Turking Test: Can Language Models Understand Instructions? [pdf]
Data Selection via Active Learning
-
[EMNLP 2023] FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models. [pdf] [code]
-
[EMNLP 2023] Active Learning Principles for In-Context Learning with Large Language Models. [pdf]
-
[IUI 2023] ScatterShot: Interactive In-context Example Curation for Text Transformation. [pdf] [code]
-
[ICML 2023] Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning. [pdf] [code]
-
[arXiv 2023] Large Language Models as Annotators: Enhancing Generalization of NLP Models at Minimal Cost. [pdf]
-
[arXiv 2022] Active learning helps pretrained models learn the intended task. [pdf] [code]
-
[EACL 2021] Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates. [pdf]
Learning with LLM-Generated Annotations
Target Domain Inference: Direct Utilization of Annotations
-
[ECIR 2024] Large Language Models are Zero-Shot Rankers for Recommender Systems. [pdf] [code]
-
[arXiv 2023] Causal Reasoning and Large Language Models: Opening a New Frontier for Causality. [pdf]
-
[ACL 2022] An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels. [pdf] [code]
-
[TMLR 2022] Emergent Abilities of Large Language Models. [pdf]
-
[NeurIPS 2022] Large Language Models are Zero-Shot Reasoners. [pdf]
-
[arXiv 2022] Visual Classification via Description from Large Language Models. [pdf]
-
[PMLR 2021] Learning Transferable Visual Models From Natural Language Supervision. [pdf] [code]
-
[EMNLP 2019] Language Models as Knowledge Bases? [pdf] [code]
Knowledge Distillation: Bridging LLM and task-specific models
-
[EACL 2024] GPTs Are Multilingual Annotators for Sequence Generation Tasks. [pdf] [code]
-
[EMNLP 2023] Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents. [pdf] [code]
-
[ACL 2023] Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes. [pdf] [code]
-
[ACL 2023] GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo. [pdf] [code]
-
[ACL 2023] GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model. [pdf] [code]
-
[EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models. [pdf] [code]
-
[arXiv 2023] Specializing Smaller Language Models towards Multi-Step Reasoning. [pdf]
-
[arXiv 2023] Knowledge Distillation of Large Language Models. [pdf] [code]
-
[arXiv 2023] Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events. [pdf]
-
[arXiv 2023] Web Content Filtering through knowledge distillation of Large Language Models. [pdf]
-
[ICLR 2022] Knowledge Distillation of Large Language Models. [pdf] [code]
-
[arXiv 2022] Teaching Small Language Models to Reason. [pdf]
Harnessing LLM Annotations for Fine-Tuning and Prompting
In-Context Learning (ICL)
-
[EMNLP 2023] Active Learning Principles for In-Context Learning with Large Language Models. [pdf]
-
[ACL 2023] Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models. [pdf]
-
[ICLR 2022] Finetuned Language Models Are Zero-Shot Learners. [pdf] [code]
-
[ICLR 2022] Selective Annotation Makes Language Models Better Few-Shot Learners. [pdf] [code]
-
[NAACL 2022] Improving In-Context Few-Shot Learning via Self-Supervised Training. [pdf]
-
[arXiv 2022] Instruction Induction: From Few Examples to Natural Language Task Descriptions. [pdf] [code]
-
[NeurIPS 2020] Language Models are Few-Shot Learners. [pdf]
Chain-of-Thought Prompting (CoT)
-
[ICLR 2023] Automatic chain of thought prompting in large language models. [pdf] [code]
-
[ACL 2023] SCOTT: Self-Consistent Chain-of-Thought Distillation. [pdf]
-
[arXiv 2023] Specializing Smaller Language Models towards Multi-Step Reasoning. [pdf]
-
[NeurIPS 2022] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. [pdf]
-
[NeurIPS 2022] Large Language Models are Zero-Shot Reasoners. [pdf]
-
[arXiv 2022] Rationale-augmented ensembles in language models. [pdf]
-
[ACL 2020] A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers. [pdf] [code]
-
[NAACL 2019] CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge. [pdf] [code]
Instruction Tuning (IT)
-
[ACL 2023] Crosslingual Generalization through Multitask Finetuning. [pdf] [code]
-
[ACL 2023] SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions. [pdf] [code]
-
[ACL 2023] Can Large Language Models Be an Alternative to Human Evaluations? [pdf]
-
[arXiv 2023] LLaMA: Open and Efficient Foundation Language Models. [pdf] [code]
-
[arXiv 2022] Teaching language models to support answers with verified quotes. [pdf] [data]
-
[arXiv 2022] Scaling instruction-finetuned language models. [pdf] [code]
-
[EMNLP 2022] Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. [pdf] [code]
-
[NeurIPS 2020] Language Models are Few-Shot Learners. [pdf]
-
Stanford Alpaca: An Instruction-following LLaMA Model. [HTML] [code]
Alignment Tuning (AT)
-
[PMLR 2023] Pretraining Language Models with Human Preferences. [pdf] [code]
-
[ICLR 2023] Offline RL for Natural Language Generation with Implicit Language Q Learning. [pdf] [code]
-
[arXiv 2023] Chain of hindsight aligns language models with feedback. [pdf] [code]
-
[arXiv 2023] GPT-4 Technical Report. [pdf]
-
[arXiv 2023] Llama 2: Open Foundation and Fine-Tuned Chat Models. [pdf] [code]
-
[arXiv 2023] RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback. [pdf]
-
[NeurIPS 2022] Training language models to follow instructions with human feedback. [pdf]
-
[arXiv 2022] Teaching language models to support answers with verified quotes. [pdf] [data]
-
[arXiv 2019] Fine-Tuning Language Models from Human Preferences. [pdf] [code]
-
[arXiv 2019] CTRL: A Conditional Transformer Language Model for Controllable Generation. [pdf] [code]
-
[NeurIPS 2017] Deep Reinforcement Learning from Human Preferences. [pdf]
Surveys
-
[ACM 2023] Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. [pdf]
-
[arXiv 2023] A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models. [pdf] [repo]
-
[arXiv 2023] A Survey of Large Language Models. [pdf] [repo]
-
[arXiv 2022] A Survey on In-context Learning. [pdf]
-
[arXiv 2022] A Comprehensive Survey on Instruction Following. [pdf] [repo]