<!-- omit in toc -->
<h1 align="center"> Awesome Instruction Learning </h1>

<p align="center">
  <a href="https://github.com/RenzeLou/awesome-instruction-learning"><img src="https://awesome.re/badge.svg" alt="Awesome" /></a>
  <a href="https://github.com/RenzeLou/awesome-instruction-learning#-star-history"><img src="https://img.shields.io/github/stars/RenzeLou/awesome-instruction-learning?style=social" alt="Stars" /></a>
</p>
<p align="center">
  <a href="https://github.com/RenzeLou/awesome-instruction-learning/commits/main"><img src="https://img.shields.io/github/last-commit/RenzeLou/awesome-instruction-learning?color=#00FA9A" alt="Commit" /></a>
  <a href="https://github.com/RenzeLou/awesome-instruction-learning/blob/main/count_number.py"><img src="https://img.shields.io/badge/PaperNumber-199-blue" alt="PaperNumber" /></a>
  <a href="https://github.com/RenzeLou/awesome-instruction-learning/pulls"><img src="https://img.shields.io/badge/PRs-Welcome-red" alt="PullRequests" /></a>
</p>

<p align="center">
  🔥🔥🔥 An awesome reading list of <b>Instruction Tuning and Following</b>, including <em>papers</em> and <em>datasets</em>.
</p>

<p align="center">
  <i> Explore our latest survey update! Feel free to dive in and discover the improvements we've made: <a href="https://arxiv.org/abs/2303.10475"> <b>Latest Survey</b> </a> </i>
</p>

<!-- https://drive.google.com/file/d/1vrx3BSkHlkNO6_nP9G9l9Ape7vEoTOdf/view?usp=sharing -->
<!-- What is instruction learning? Why instruction learning? -->
<!-- TODO: add a "must read" section to select a core subset of instruction tuning papers -->

<!-- omit in toc -->
## ❤️ Contribution
This repository is currently maintained by <ins>Renze Lou @ PennState</ins> and <ins>Kai Zhang @ OhioState</ins>. We appreciate any contributions ❤️.

<!-- **<font color='red'>Work still in progress</font>**, **we appreciate any suggestions and contributions** ❤️. -->

If you have any suggestions or find any missed papers, feel free to reach out or submit a pull request:
- Use the following markdown format:

  **Paper Title.** *Author 1, Author 2, and Author 3.* <ins>Conference/Journal/Preprint</ins> Year. [[pdf](link)]; [[other resources](link)].

  <!-- >1. **Paper Title.** *Author 1, Author 2, and Author 3.* Conference/Journal/Preprint Year. [[pdf](link)]. -->

- If one preprint paper has multiple versions, please use the earliest submitted year.

- Display the papers in descending order by year (the latest first).
## 🥳 Citation

Find this repository helpful? Please consider citing our paper.
<!-- *(**Note that the current version of our survey is only a draft, and we are still working on it.** The first readable version is arriving soon.)* -->

    @article{lou2023instruction,
      title={A Comprehensive Survey on Instruction Following},
      author={Lou, Renze and Zhang, Kai and Yin, Wenpeng},
      journal={arXiv preprint arXiv:2303.10475},
      year={2023}
    }
<!-- omit in toc -->
## Table of Contents
- [1. Introduction](#1-introduction)
- [2. Surveys and Tutorials](#2-surveys-and-tutorials)
- [3. Corpora](#3-corpora)
- [4. Taxonomy](#4-taxonomy)
- [5. Analyses](#5-analyses)
- [6. Applications](#6-applications)
- [7. Extended Reading](#7-extended-reading)
## 1. Introduction
<div align="center">
  <img src=./resources/introduction.png width=85% title="Instruction Learning vs. Full Supervised Learning" />
</div>

Why instruction-driven learning instead of example-driven learning?
- **Affordable.** Conventional example-driven supervised learning usually requires extensive labeled examples 💰 for each <ins>downstream</ins> task, while instruction learning may need only one instruction and just a few examples 🤩 per <ins>downstream</ins> task.
- **One model, all tasks.** An ideal AI system should be able to quickly understand and handle various new tasks.
- **A promising research direction.** Traditional example-driven supervised learning uses labeled instances to convey the task semantics, i.e., models are trained to recover the original task meaning by observing numerous examples. So why not directly use the task instruction, which already carries the essential task semantics?
## 2. Surveys and Tutorials
<!-- There are several awesome surveys and tutorials on textual instruction learning. -->
<!-- To our knowledge, our survey is the first one to provide a comprehensive and broader overview of the field of instruction learning. -->

We attach a label to each survey to distinguish its focus: `overview` marks papers with a more comprehensive perspective, while the other labels mark papers that are specific to a certain kind of in-context instruction, namely `prompt`, few-shot `demonstrations`, and CoT `reasoning`.
- **A Comprehensive Survey on Instruction Following.** *Renze Lou, Kai Zhang, and Wenpeng Yin.* <ins>Preprint</ins> 2023. [pdf]; [paper list].
- **Learning from Task Instructions.** *Wenpeng Yin, Qinyuan Ye, Pengfei Liu, Xiang Ren, and Hinrich Schütze.* <ins>EMNLP Tutorial</ins> 2023. [pdf].
- **Nature Language Reasoning, A Survey.** *Fei Yu, Hongbo Zhang, and Benyou Wang.* <ins>Preprint</ins> 2023. [pdf]; [paper list].
- **Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing.** *Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig.* <ins>ACM Computing Surveys</ins> 2023. [pdf]; [website].
- **A Survey on In-context Learning.** *Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, and Zhifang Sui.* <ins>Preprint</ins> 2022. [pdf].
- **Towards Reasoning in Large Language Models: A Survey.** *Jie Huang, and Kevin Chen-Chuan Chang.* <ins>Preprint</ins> 2022. [pdf]; [paper list].
- **Reasoning with Language Model Prompting: A Survey.** *Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, and Huajun Chen.* <ins>Preprint</ins> 2022. [pdf]; [paper list].
## 3. Corpora
High-quality datasets are the key factor for successful instruction tuning, which is why we place the corpora section here to emphasize their importance.

We carefully designed the following table to make it easy to reference, and we keep it up to date. We hope it can contribute to future research on instruction tuning.

(Some rows come from Longpre et al., thanks for their great work ❤️.)
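Most of the corpora above release their data as instruction-to-response records. As a rough illustration (not the exact schema or prompt template of any particular dataset in the table), here is a minimal sketch of an Alpaca-style record and how such a record is typically flattened into a single training prompt; the field names and template are assumptions for illustration only.

```python
# Minimal sketch: a common Alpaca-style record layout (instruction / input / output)
# and how one record is flattened into a single training prompt before fine-tuning.
# Field names and the prompt template below are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class InstructionExample:
    instruction: str  # natural-language task description
    input: str        # optional task input (may be empty)
    output: str       # reference response


def to_training_prompt(ex: InstructionExample) -> str:
    """Render one record into a prompt-response string (illustrative template)."""
    if ex.input:
        return (
            "Below is an instruction that describes a task, paired with an input.\n\n"
            f"### Instruction:\n{ex.instruction}\n\n"
            f"### Input:\n{ex.input}\n\n"
            f"### Response:\n{ex.output}"
        )
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{ex.instruction}\n\n"
        f"### Response:\n{ex.output}"
    )


example = InstructionExample(
    instruction="Classify the sentiment of the given movie review.",
    input="I love this movie.",
    output="positive",
)
print(to_training_prompt(example))
```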
<table id="copora-table" style="height: 353px; width: 690px;" width="629"> <tbody> <tr style="height: 37px;"> <td style="height: 47px; width: 124.992px; text-align: left;" rowspan="2"><strong>Name </strong></td> <td style="height: 47px; width: 61.2891px; text-align: right;" rowspan="2"><strong>Release</strong></td> <td style="height: 47px; width: 85.1875px; text-align: center;" rowspan="2"><strong>Data/Code</strong></td> <td style="height: 37px; width: 144.289px; text-align: center;" colspan="2"><strong>Scale</strong></td> <td style="height: 47px; width: 109.258px; text-align: center;" rowspan="2"><strong>Language</strong></td> <td style="width: 124.984px; text-align: center; height: 47px;" rowspan="2"><strong>Annotator</strong></td> </tr> <tr style="height: 10px;"> <td style="height: 10px; width: 60.7969px; text-align: right;"><strong>#Tasks</strong></td> <td style="height: 10px; width: 77.4922px; text-align: right;"><strong>#Ins. (K)</strong></td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2005.00700.pdf">UnifiedQA</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">05/2020</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/allenai/unifiedqa">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">46</td> <td style="height: 18px; width: 77.4922px; text-align: right;">750</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2104.08835.pdf">CrossFit</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/INK-USC/CrossFit">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">159</td> <td style="height: 18px; width: 77.4922px; text-align: right;">71,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2104.08773.pdf">Natural Inst. 
v1</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://instructions.apps.allenai.org/">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">61</td> <td style="height: 18px; width: 77.4922px; text-align: right;">620</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2109.01652.pdf">Flan 2021</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">09/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/google-research/FLAN/tree/main#flan-2021">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">62</td> <td style="height: 18px; width: 77.4922px; text-align: right;">4,400</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2202.01279.pdf">P3</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/bigscience/P3">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">62</td> <td style="height: 18px; width: 77.4922px; text-align: right;">12,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2110.15943.pdf">MetaICL</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/facebookresearch/MetaICL">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">142</td> <td style="height: 18px; width: 77.4922px; text-align: right;">3,500</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://openreview.net/pdf?id=Vzh1BFUCiIX">ExMix</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">11/2021</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/google-research/text-to-text-transfer-transformer">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">107</td> <td style="height: 18px; width: 77.4922px; 
text-align: right;">500</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"> <p><a href="https://arxiv.org/pdf/2204.07705.pdf">SuperNI</a></p> <p><a href="https://arxiv.org/pdf/2204.07705.pdf">(Natural Inst. v2)</a></p> </td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://instructions.apps.allenai.org/">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">1,613</td> <td style="height: 18px; width: 77.4922px; text-align: right;">5,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/multilingual-red" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2210.02414.pdf">GLM</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/THUDM/GLM-130B">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">77</td> <td style="height: 18px; width: 77.4922px; text-align: right;">12,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/bilingual-yellow" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2301.13688.pdf">Flan 2022</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/google-research/FLAN/tree/main/flan/v2">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">1,836</td> <td style="height: 18px; width: 77.4922px; text-align: right;">15,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/multilingual-red" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2211.01786.pdf">xP3</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">11/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/bigscience/xP3">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">71</td> <td style="height: 18px; width: 77.4922px; text-align: right;">81,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/multilingual-red" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a 
href="https://arxiv.org/pdf/2212.09689.pdf">Unnatural Inst.</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">12/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/orhonovich/unnatural-instructions">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">117</td> <td style="height: 18px; width: 77.4922px; text-align: right;">64</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;"> <p>π€ InstructGPT<sub>002</sub></p> <p><sub><code>text-davinci-002</code></sub></p> </td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2212.10560.pdf">Self-Instruct</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">12/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/yizhongw/self-instruct">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">/</td> <td style="height: 18px; width: 77.4922px; text-align: right;">82</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;"> <p>π€ GPT-3 </p> <p><code><sub>davinci</sub></code></p> </td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2212.12017.pdf">OPT-IML</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">12/2022</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;">/</td> <td style="height: 18px; width: 60.7969px; text-align: right;">2,207</td> <td style="height: 18px; width: 77.4922px; text-align: right;">18,000</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/multilingual-red" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://crfm.stanford.edu/2023/03/13/alpaca.html">Alpaca</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">03/2023</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/tatsu-lab/stanford_alpaca">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">/</td> <td style="height: 18px; width: 77.4922px; text-align: right;">52</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;"> <p>π€ InstructGPT<sub>003</sub></p> <p><sub><code>text-davinci-003</code></sub></p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.01196.pdf">Baize</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a 
href="https://github.com/project-baize/baize-chatbot/tree/main/data">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">100</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://bair.berkeley.edu/blog/2023/04/03/koala/">Koala</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;">/</td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">/</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf">GPT4All</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">808</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> <p>π€ ChatGPT</p> </td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.03277.pdf">Alpaca-gpt4</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">/</td> <td style="height: 18px; width: 77.4922px; text-align: right;">113</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/bilingual-yellow" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;"> <p>π€ GPT-4 </p> <p><sub><code>gpt-4</code></sub></p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://vicuna.lmsys.org/">Vicuna</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;">/</td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">76</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td 
style="width: 124.984px; text-align: center;"> <p>β Human</p> <p>π€ ChatGPT</p> </td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm">Dolly</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://github.com/databrickslabs/dolly/tree/master/data">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">/</td> <td style="height: 18px; width: 77.4922px; text-align: right;">15</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;">β Human</td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://drive.google.com/file/d/10iR5hKwFqAKhL3umx8muOWSRm7hs5FqX/view">Oasst</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/OpenAssistant/oasst1">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">84</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/multilingual-red" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;">β Human</td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.08460.pdf">LongForm</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/akoksal/LongForm">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">27</td> <td style="width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> <p>π€ InstructGPT<sub>003</sub></p> <p><sub><code>text-davinci-003</code></sub></p> </td> </tr> <tr style="height: 18px;"> <td style="height: 18px; width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.07995.pdf">Symbolic-Instruct</a></td> <td style="height: 18px; width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="height: 18px; width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/sail/symbolic-instruction-tuning">Link</a></td> <td style="height: 18px; width: 60.7969px; text-align: right;">/</td> <td style="height: 18px; width: 77.4922px; text-align: right;">796</td> <td style="height: 18px; width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></td> <td style="width: 124.984px; text-align: center; height: 18px;"> <p>β Human</p> <p>Synthetic Examples</p> </td> </tr> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.14402.pdf">LaMini</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: 
underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/MBZUAI/LaMini-instruction">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">2,580</td> <td style="width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2304.12244.pdf">WizardLM</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/nlpxucan/WizardLM">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">196</td> <td style="width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2305.09857.pdf">COEDIT</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">05/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/vipulraheja/coedit">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">82</td> <td style="width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> <!-- <p>collecting from existing text-editing datasets</p> --> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2305.14233.pdf">UltraChat</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">05/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/stingning/ultrachat">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">1,500</td> <td style="width: 109.258px; text-align: center;"> <p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> <p><img src="https://img.shields.io/badge/dialogue-%E2%9C%94-lightgreen" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2305.14045.pdf">CoT Collection</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">05/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/kaistAI/CoT-Collection">Link</a></td> <td style="width: 60.7969px; text-align: right;">1,060</td> <td style="width: 77.4922px; text-align: right;">1,880</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ Codex</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2305.14327.pdf">Dynosaur</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: 
underline;">05/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://dynosaur-it.github.io/">Link</a></td> <td style="width: 60.7969px; text-align: right;">5,740</td> <td style="width: 77.4922px; text-align: right;">801</td> <td style="width: 109.258px; text-align: center;"><img src="https://img.shields.io/badge/monolingual-informational" alt="" /> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://renzelou.github.io/Muffin/">MUFFIN</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/Reza8848/MUFFIN_68k">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">68</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> <p>π€ GPT-4 </p> <p>β Human</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2310.19651.pdf">Dynamics-of-Instruction</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">10/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/ChiyuSONG/dynamics-of-instruction-tuning">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">40</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2311.13246.pdf">CoachLM</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">11/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/lunyiliu/CoachLM">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">2</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>β Human</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2312.15685.pdf">DEITA</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">12/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://github.com/hkust-nlp/deita">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">10</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/pdf/2312.14187.pdf">WaveCoder</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">12/2023</span></td> <td style="width: 85.1875px; text-align: center;"><a href="">Link</a></td> <td 
style="width: 60.7969px; text-align: right;">4 code-related tasks</td> <td style="width: 77.4922px; text-align: right;">20</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ ChatGPT</p> <p>π€ GPT-4</p> </td> </tr> <tr> <td style="width: 124.992px; text-align: left;"><a href="https://arxiv.org/abs/2404.02823">Conifer</a></td> <td style="width: 61.2891px; text-align: right;"><span style="text-decoration: underline;">04/2024</span></td> <td style="width: 85.1875px; text-align: center;"><a href="https://huggingface.co/datasets/ConiferLM/Conifer">Link</a></td> <td style="width: 60.7969px; text-align: right;">/</td> <td style="width: 77.4922px; text-align: right;">13</td> <td style="width: 109.258px; text-align: center;"><p><img src="https://img.shields.io/badge/monolingual-informational" alt="" /></p> </td> <td style="width: 124.984px; text-align: center;"> <p>π€ GPT-4</p> </td> </tr> </tbody> </table> <!-- Some Notes for the alpaca: 1. The stanford-alpaca paper is not yet published. It is mainly based on the data generation pipline of Self-Instruct, where the main difference is the author uses InstructGPT-3.5 (text-davinci-003) to replace the GPT-3 (davinci). Besides, they also change the prompt, decoding strategy, and remove the cls tasks discrimination. 2. Alpaca-gpt4 is based on alpaca, the 52k english instructions (w/ optional inputs) are directly collected from alpaca. The main differences are: (a) using ChatGPT to translate 52k English instructions to parallel Chinese instructions (w/ optional inputs); (b) using GPT-4 to replace GPT-3.5 to annotate the outputs of these bilingual instructions; (c) additionally adopting the data generation pipline of Unnatural Instructions with GPT-4 as the annotation model. --> <!-- Since I have already displayed the following data-related papers in the table above, I will not list them explicitly here. 1. **Self-Instruct: Aligning Language Model with Self Generated Instructions.** *Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, and Hannaneh Hajishirzi.* <ins>Preprint</ins> 2022. [[pdf](https://arxiv.org/pdf/2212.10560.pdf)]; [[corpus](https://github.com/yizhongw/self-instruct)]. 2. **Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor.** *Or Honovich, Thomas Scialom, Omer Levy, and Timo Schick.* <ins>Preprint</ins> 2022. [[pdf](https://arxiv.org/pdf/2212.09689.pdf)]; [[corpus](https://github.com/orhonovich/unnatural-instructions)]. 3. **Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks.** *Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, and et al.* <ins>EMNLP</ins> 2022. [[pdf](https://arxiv.org/pdf/2204.07705.pdf)]; [[corpus](https://instructions.apps.allenai.org/)]. 4. **Cross-Task Generalization via Natural Language Crowdsourcing Instructions.** *Swaroop Mishra, Daniel Khashabi, Chitta Baral, and Hannaneh Hajishirzi.* <ins>ACL</ins> 2022. [[pdf](https://aclanthology.org/2022.acl-long.244.pdf)]; [[corpus](https://instructions.apps.allenai.org/)]. 5. **PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.** *Stephen Bach, Victor Sanh, Zheng Xin Yong, and et al.* <ins>ACL</ins> 2022. 
[[pdf](https://aclanthology.org/2022.acl-demo.9.pdf)]; [[toolkit](https://github.com/bigscience-workshop/promptsource)]; [[corpus](https://huggingface.co/datasets/bigscience/P3)]. -->

## 4. Taxonomy
In our paper, we divide the textual instructions into three categories.
### 4.1 Entailment-oriented Instruction
Entailment-oriented instruction treats the original task input as the premise and reformulates the task output as a hypothesis, unifying conventional classification problems into a textual entailment paradigm. For example, instead of classifying `I love this movie` into a `positive` label, the entailment-oriented paradigm asks whether the premise `I love this movie` entails the hypothesis `Is it sentiment-positive?`.
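To make the reformulation concrete, below is a minimal sketch (not taken from any specific paper in this list) that recasts sentiment classification as entailment with an off-the-shelf NLI model through the Hugging Face `transformers` zero-shot-classification pipeline; the model name and hypothesis template are illustrative choices.

```python
# Minimal sketch: sentiment classification recast as textual entailment with an
# off-the-shelf NLI model (requires `pip install transformers torch`).
# The model name and hypothesis template are illustrative assumptions.
from transformers import pipeline

# An NLI model fine-tuned on MNLI acts as a universal "entailment" classifier.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

premise = "I love this movie."                          # original task input
labels = ["positive", "negative"]                       # original task outputs
template = "The sentiment of this movie review is {}."  # output rewritten as a hypothesis

result = classifier(premise, candidate_labels=labels, hypothesis_template=template)
print(result["labels"][0])  # label whose hypothesis is most entailed, e.g., "positive"
```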
- **A Universal Discriminator for Zero-Shot Generalization.** *Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, and Zhilin Yang.* <ins>ACL</ins> 2023. [pdf]; [code].
- **ConEntail: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining.** *Ranran Haoran Zhang, Aysa Xuemo Fan, and Rui Zhang.* <ins>EACL</ins> 2023. [pdf]; [code].
- **OpenStance: Real-world Zero-shot Stance Detection.** *Hanzi Xu, Slobodan Vucetic, and Wenpeng Yin.* <ins>CoNLL</ins> 2022. [pdf]; [code].
- **Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference.** *Bangzheng Li, Wenpeng Yin, and Muhao Chen.* <ins>TACL</ins> 2022. [pdf]; [code].
- **Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning.** *Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, and Eneko Agirre.* <ins>Findings of NAACL</ins> 2022. [pdf]; [code].
- **Label Verbalization and Entailment for Effective Zero and Few-Shot Relation Extraction.** *Oscar Sainz, Oier Lopez de Lacalle, Gorka Labaka, Ander Barrena, and Eneko Agirre.* <ins>EMNLP</ins> 2021. [pdf]; [code].
- **Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections.** *Ruiqi Zhong, Kristy Lee, Zheng Zhang, and Dan Klein.* <ins>Findings of EMNLP</ins> 2021. [pdf]; [code].
- **Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System.** *Congying Xia, Wenpeng Yin, Yihao Feng, and Philip Yu.* <ins>NAACL</ins> 2021. [pdf]; [code].
- **ExpBERT: Representation Engineering with Natural Language Explanations.** *Shikhar Murty, Pang Wei Koh, and Percy Liang.* <ins>ACL</ins> 2020. [pdf]; [code].
- **Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach.** *Wenpeng Yin, Jamaal Hay, and Dan Roth.* <ins>EMNLP</ins> 2019. [pdf]; [website].
### 4.2 PLM-oriented Instruction
PLM-oriented instruction (i.e., prompts) constructs cloze-style inputs to steer pre-trained language models (PLMs) toward the desired responses. Here, we display several representative works on PLM-oriented instruction learning. For more works, please refer to this repository and this survey.
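As a concrete illustration of the cloze-style paradigm, below is a minimal sketch in the spirit of pattern-verbalizer methods; the `roberta-base` model, the pattern, and the verbalizer are all illustrative choices, not taken from the papers below.

```python
# Minimal sketch: a cloze-style (PLM-oriented) prompt with a verbalizer
# (requires `pip install transformers torch`).
# The model, pattern, and verbalizer are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")

review = "I love this movie."
# Pattern: wrap the task input in a cloze template containing the mask token.
prompt = f"{review} It was {fill_mask.tokenizer.mask_token}."

# Verbalizer: map label words predicted at the mask position back to task labels.
verbalizer = {"great": "positive", "terrible": "negative"}

predictions = fill_mask(prompt, targets=list(verbalizer.keys()))
best = max(predictions, key=lambda p: p["score"])
print(verbalizer[best["token_str"].strip()])  # e.g., "positive"
```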
- **How Does In-Context Learning Help Prompt Tuning?** *Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, and Mohit Iyyer.* <ins>Preprint</ins> 2023. [pdf].
- **Demystifying Prompts in Language Models via Perplexity Estimation.** *Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, and Luke Zettlemoyer.* <ins>Preprint</ins> 2022. [pdf].
- **RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning.** *Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, and et al.* <ins>EMNLP</ins> 2022. [pdf]; [code].
- **PPT: Pre-trained Prompt Tuning for Few-shot Learning.** *Yuxian Gu, Xu Han, Zhiyuan Liu, and Minlie Huang.* <ins>ACL</ins> 2022. [pdf]; [code].
- **P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks.** *Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, and Jie Tang.* <ins>ACL</ins> 2022. [pdf]; [code].
- **KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction.** *Xiang Chen, Ningyu Zhang, Xin Xie, and et al.* <ins>WWW</ins> 2022. [pdf]; [code].
- **GPT Understands, Too.** *Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, and Jie Tang.* <ins>Preprint</ins> 2021. [pdf]; [code].
- **Few-Shot Text Generation with Natural Language Instructions.** *Timo Schick and Hinrich Schütze.* <ins>EMNLP</ins> 2021. [pdf]; [code].
- **It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners.** *Timo Schick and Hinrich Schütze.* <ins>NAACL</ins> 2021. [pdf]; [code].
- **Learning How to Ask: Querying LMs with Mixtures of Soft Prompts.** *Guanghui Qin and Jason Eisner.* <ins>NAACL</ins> 2021. [pdf]; [code].
- **Prefix-Tuning: Optimizing Continuous Prompts for Generation.** *Xiang Lisa Li and Percy Liang.* <ins>ACL</ins> 2021. [pdf]; [code].
- **Making Pre-trained Language Models Better Few-shot Learners.** *Tianyu Gao, Adam Fisch, and Danqi Chen.* <ins>ACL</ins> 2021. [pdf]; [code].
- **Template-Based Named Entity Recognition Using BART.** *Leyang Cui, Yu Wu, Jian Liu, Sen Yang, and Yue Zhang.* <ins>Findings of ACL</ins> 2021. [pdf]; [code].
- **Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference.** *Timo Schick and Hinrich Schütze.* <ins>EACL</ins> 2021. [pdf]; [code].
- **Language Models are Unsupervised Multitask Learners.** *Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.* <ins>Preprint</ins> 2019. [pdf].
### 4.3 Human-oriented Instruction
Human-oriented instruction is originally designed for humans to understand a task and annotate data, such as the Amazon MTurk instructions, which provide sufficient information about the task (e.g., a detailed definition).
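For illustration, the sketch below assembles a Super-NaturalInstructions-style prompt that pairs a human-written task definition with a few demonstrations before the query instance; the concrete definition, demonstrations, and layout are illustrative assumptions rather than the exact format of any dataset above.

```python
# Minimal sketch: assembling a Super-NaturalInstructions-style prompt that pairs a
# human-written task definition with a few demonstrations and a query instance.
# The definition, demonstrations, and layout are illustrative assumptions.
def build_instruction_prompt(definition, demos, query):
    parts = [f"Definition: {definition}", ""]
    for i, (x, y) in enumerate(demos, start=1):
        parts += [f"Example {i}-", f"Input: {x}", f"Output: {y}", ""]
    parts += ["Now complete the following instance:", f"Input: {query}", "Output:"]
    return "\n".join(parts)


prompt = build_instruction_prompt(
    definition="Given a movie review, label its sentiment as positive or negative.",
    demos=[("I love this movie.", "positive"), ("The plot was a mess.", "negative")],
    query="A touching story with brilliant acting.",
)
print(prompt)  # feed this prompt to any instruction-tuned model
```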
- **Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors.** *Kai Zhang, Bernal Jiménez Gutiérrez, and Yu Su.* <ins>Findings of ACL</ins> 2023. [pdf]; [code].
- **Symbol tuning improves in-context learning in language models.** *Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, and et al.* <ins>Preprint</ins> 2023. [pdf].
- **Small Models are Valuable Plug-ins for Large Language Models.** *Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, and Julian McAuley.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **How Many Data Samples is an Additional Instruction Worth?** *Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, and Chitta Baral.* <ins>Findings of EACL</ins> 2023. [pdf]; [code].
- **In-Context Instruction Learning.** *Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, and Minjoon Seo.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis.** *Kevin Scaria, Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, and Chitta Baral.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation.** *Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, and Matthew Peters.* <ins>Preprint</ins> 2022. [pdf].
- **Boosting Natural Language Generation from Instructions with Meta-Learning.** *Budhaditya Deb, Guoqing Zheng, and Ahmed Hassan Awadallah.* <ins>Preprint</ins> 2022. [pdf].
- **GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models.** *Archiki Prasad, Peter Hase, Xiang Zhou, and Mohit Bansal.* <ins>Preprint</ins> 2022. [pdf]; [code].
- **ConTinTin: Continual Learning from Task Instructions.** *Wenpeng Yin, Jia Li, and Caiming Xiong.* <ins>ACL</ins> 2022. [pdf].
- **InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning.** *Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, and Jeffrey P. Bigham.* <ins>EMNLP</ins> 2022. [pdf]; [code].
- **Learning to Generate Task-Specific Adapters from Task Description.** *Qinyuan Ye and Xiang Ren.* <ins>ACL</ins> 2021. [pdf]; [code]. <!-- TODO -->
- **The Turking Test: Can Language Models Understand Instructions?** *Avia Efrat and Omer Levy.* <ins>Preprint</ins> 2020. [pdf].
## 5. Analyses
### 5.1 Scale
Model scale and task scale are both found to be important for instruction-based fine-tuning: a larger model generally brings better generalization, and so does a larger number of training tasks. However, some works raise objections (e.g., Jang et al. and Wang et al.).
- **Exploring the Benefits of Training Expert Language Models over Instruction Tuning.** *Joel Jang, Seungone Kim, Seonghyeon Ye, and et al.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.** *Shayne Longpre, Le Hou, Tu Vu, and et al.* <ins>Preprint</ins> 2023. [pdf]; [code]; [corpus].
- **UL2: Unifying Language Learning Paradigms.** *Yi Tay, Mostafa Dehghani, Vinh Q. Tran, and et al.* <ins>Preprint</ins> 2022. [pdf]; [checkpoint].
- **OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization.** *Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, and et al.* <ins>Preprint</ins> 2022. [pdf].
- **Scaling Instruction-Finetuned Language Models.** *Hyung Won Chung, Le Hou, Shayne Longpre, and et al.* <ins>Preprint</ins> 2022. [pdf]; [checkpoint].
- **Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization.** *Yuxian Gu, Pei Ke, Xiaoyan Zhu, and Minlie Huang.* <ins>EMNLP</ins> 2022. [pdf]; [code].
- **Emergent Abilities of Large Language Models.** *Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, and et al.* <ins>TMLR</ins> 2022. [pdf].
- **Multitask Prompted Training Enables Zero-Shot Task Generalization.** *Victor Sanh, Albert Webson, Colin Raffel, and et al.* <ins>ICLR</ins> 2022. [pdf]; [checkpoint]; [corpus].
- **Finetuned Language Models are Zero-Shot Learners.** *Jason Wei, Maarten Bosma, Vincent Zhao, and et al.* <ins>ICLR</ins> 2022. [pdf]; [code].
- **Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks.** *Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, and Heng Ji.* <ins>Preprint</ins> 2022. [pdf]; [code].
- **ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization.** *Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, and Zhilin Yang.* <ins>Preprint</ins> 2022. [pdf].
- **The Power of Scale for Parameter-Efficient Prompt Tuning.** *Brian Lester, Rami Al-Rfou, and Noah Constant.* <ins>EMNLP</ins> 2021. [pdf]; [code].
### 5.2 Explainability
We exhibit works that focus on the interpretability and reliability of instruction learning, i.e., explaining when and why instructions take effect.
- **What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning.** *Jane Pan, Tianyu Gao, Howard Chen, and Danqi Chen.* <ins>Findings of ACL</ins> 2023. [pdf]; [code].
- **REV: Information-Theoretic Evaluation of Free-Text Rationales.** *Hanjie Chen, Faeze Brahman, Xiang Ren, and et al.* <ins>ACL</ins> 2023. [pdf]; [code].
- **Interpretability at Scale: Identifying Causal Mechanisms in Alpaca.** *Zhengxuan Wu, Atticus Geiger, Christopher Potts, and Noah D. Goodman.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning.** *Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, and William Yang Wang.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **The Learnability of In-Context Learning.** *Noam Wies, Yoav Levine, and Amnon Shashua.* <ins>Preprint</ins> 2023. [pdf].
- **Why think step-by-step? Reasoning emerges from the locality of experience.** *Ben Prystawski, and Noah D. Goodman.* <ins>Preprint</ins> 2023. [pdf].
- **Larger language models do in-context learning differently.** *Jerry Wei, Jason Wei, Yi Tay, and et al.* <ins>Preprint</ins> 2023. [pdf].
- **What learning algorithm is in-context learning? Investigations with linear models.** *Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, and Denny Zhou.* <ins>ICLR</ins> 2023. [pdf]; [code].
- **Can language models learn from explanations in context?** *Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, and et al.* <ins>Findings of EMNLP</ins> 2022. [pdf].
- **Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?** *Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, and Luke Zettlemoyer.* <ins>EMNLP</ins> 2022. [pdf]; [code].
- **Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts.** *Daniel Khashabi, Xinxi Lyu, Sewon Min, and et al.* <ins>NAACL</ins> 2022. [pdf]; [code].
- **Do Prompt-Based Models Really Understand the Meaning of Their Prompts?** *Albert Webson and Ellie Pavlick.* <ins>NAACL</ins> 2022. [pdf]; [code].
- **Reframing Instructional Prompts to GPTk's Language.** *Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, and Hannaneh Hajishirzi.* <ins>Findings of ACL</ins> 2022. [pdf]; [code].
- **What Makes Good In-Context Examples for GPT-3?** *Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen.* <ins>ACL Workshop</ins> 2022. [pdf]; [code].
- **Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.** *Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp.* <ins>ACL</ins> 2022. [pdf].
- **Calibrate Before Use: Improving Few-shot Performance of Language Models.** *Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh.* <ins>ICML</ins> 2021. [pdf]; [code].
### 5.3 Robustness and Safety
- **Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection.** *Jun Yan, Vikas Yadav, Shiyang Li, and et al.* <ins>Workshop @ NeurIPS</ins> 2023. [pdf].
- **Evaluating the Zero-shot Robustness of Instruction-tuned Language Models.** *Jiuding Sun, Chantal Shaib, and Byron C. Wallace.* <ins>Preprint</ins> 2023. [pdf].
- **Poisoning Language Models During Instruction Tuning.** *Alexander Wan, Eric Wallace, Sheng Shen, and Dan Klein.* <ins>ICML</ins> 2023. [pdf]; [code].
- **Multi-step Jailbreaking Privacy Attacks on ChatGPT.** *Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng, and Yangqiu Song.* <ins>Preprint</ins> 2023. [pdf].
- **More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models.** *Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **Robustness of Learning from Task Instructions.** *Jiasheng Gu, Hanzi Xu, Liangyu Nie, and Wenpeng Yin.* <ins>Preprint</ins> 2022. [pdf].
- **Learning from Task Descriptions.** *Orion Weller, Nicholas Lourie, Matt Gardner, and Matthew E. Peters.* <ins>EMNLP</ins> 2020. [pdf]; [code]; [corpus].
### 5.4 Evaluation
Stop using old-school automatic metrics to evaluate your instruction-tuned system; try more advanced methods to do it comprehensively!
- **Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2.** *Hamish Ivison, Yizhong Wang, Valentina Pyatkin, and et al.* <ins>Preprint</ins> 2023. [pdf]; [model&data].
- **How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources.** *Yizhong Wang, Hamish Ivison, Pradeep Dasigi, and et al.* <ins>NeurIPS Datasets and Benchmarks</ins> 2023. [pdf]; [code].
- **Instruction-following Evaluation through Verbalizer Manipulation.** *Shiyang Li, Jun Yan, Hai Wang, Zheng Tang, Xiang Ren, Vijay Srinivasan, and Hongxia Jin.* <ins>Preprint</ins> 2023. [pdf].
- **INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models.** *Yew Ken Chia, Pengfei Hong, Lidong Bing, and Soujanya Poria.* <ins>Preprint</ins> 2023. [pdf]; [code]; [leaderboard].
### 5.5 Negation
Negation expressions, such as `do not` and `avoid doing`, are difficult for models to correctly understand and follow.
- **Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts.** *Joel Jang, Seonghyeon Ye, and Minjoon Seo.* <ins>ICML Workshop</ins> 2023. [pdf].
- **Understanding by Understanding Not: Modeling Negation in Language Models.** *Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, and et al.* <ins>NAACL</ins> 2021. [pdf]; [code].
### 5.6 Complexity
These papers focus on enhancing the complexity of instructions to improve model competence: the more complex the data in the instruction-tuning mixture, the more capable the resulting model tends to be.
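As a rough sketch of this idea (in the spirit of WizardLM's Evol-Instruct, but not its official implementation), an LLM can be prompted to rewrite a seed instruction into a more complex variant; the prompt template and the `complete` callable below are hypothetical placeholders.

```python
# Minimal sketch of instruction "evolving" in the spirit of WizardLM's Evol-Instruct:
# an LLM rewrites a seed instruction into a more complex variant, and the evolved
# instructions are added to the tuning mixture. The prompt template and the
# `complete` callable are hypothetical placeholders, not an official implementation.
EVOLVE_TEMPLATE = (
    "Rewrite the following instruction so that it is harder to follow, for example by "
    "adding constraints, requiring multi-step reasoning, or demanding a specific output "
    "format. Keep it answerable.\n\n"
    "Original instruction:\n{instruction}\n\n"
    "Evolved instruction:"
)


def evolve(instruction, complete):
    """`complete` is any callable that maps a prompt string to an LLM completion."""
    return complete(EVOLVE_TEMPLATE.format(instruction=instruction)).strip()


seed = "Classify the sentiment of a movie review."
# evolved = evolve(seed, complete=my_llm)  # my_llm is a placeholder LLM client
```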
- **WizardLM: Empowering large language models to follow complex instructions.** *Xu, Can and Sun, Qingfeng and Zheng, Kai and Geng, Xiubo and Zhao, Pu and Feng, Jiazhan and Tao, Chongyang and Jiang, Daxin.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **Orca: Progressive learning from complex explanation traces of GPT-4.** *Mukherjee, Subhabrata and Mitra, Arindam and Jawahar, Ganesh and Agarwal, Sahaj and Palangi, Hamid and Awadallah, Ahmed.* <ins>Preprint</ins> 2023. [pdf].
- **A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment.** *Zhao, Yingxiu and Yu, Bowen and Hui, Binyuan and Yu, Haiyang and Huang, Fei and Li, Yongbin and Zhang, Nevin L.* <ins>Preprint</ins> 2023. [pdf]; [code].
### 5.7 Other Papers
- **Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions.** *Mihir Parmar, Swaroop Mishra, Mor Geva, and Chitta Baral.* <ins>EACL</ins> 2023. [pdf]; [code].
- **Instruction Tuned Models are Quick Learners.** *Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, et al.* <ins>Preprint</ins> 2023. [pdf]; [code].
- **Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning.** *Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin Raffel.* <ins>NeurIPS</ins> 2022. [pdf]; [code].
- **A Survey of NLP-Related Crowdsourcing HITs: what works and what does not.** *Jessica Huynh, Jeffrey Bigham, and Maxine Eskenazi.* <ins>Preprint</ins> 2021. [pdf].
## 6. Applications
### 6.1 Human-Computer Interaction
Instructions are used in various human-computer interaction (HCI) tasks, such as virtual assistants, chatbots, etc.
-
Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing. Tuhin Chakrabarty, Vishakh Padmakumar, and He He. <ins>EMNLP</ins> 2022. [pdf]; [code].
-
HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models. Swaroop Mishra, and Elnaz Nouri. <ins>Preprint</ins> 2022. [pdf].
-
EditEval: An Instruction-Based Benchmark for Text Improvements. Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang, and et al. <ins>Preprint</ins> 2022. [pdf]; [code]; [website].
-
Communicating Natural Programs to Humans and Machines. Sam Acquaviva, Yewen Pu, Marta Kryven, and et al. <ins>NeurIPS Workshop</ins> 2022. [pdf]; [code].
-
Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations. Toby Jia-Jun Li, Tom Mitchell, and Brad Myers. <ins>ACL Demo</ins> 2020. [pdf]; [code]; [video].
-
Multi-Modal Interactive Task Learning from Demonstrations and Natural Language Instructions. Toby Jia-Jun Li. <ins>UIST</ins> 2020. [pdf]; [code].
-
Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following. David Gaddy, and Dan Klein. <ins>ACL</ins> 2019. [pdf].
-
VirtualHome: Simulating Household Activities via Programs. Xavier Puig, Kevin Ra, Marko Boben, and et al. <ins>CVPR</ins> 2018. [pdf]; [website].
-
Natural Language Communication with Robots. Yonatan Bisk, Deniz Yuret, and Daniel Marcu. <ins>NAACL</ins> 2016. [pdf]; [website].
-
Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World. Jayant Krishnamurthy, and Thomas Kollar. <ins>TACL</ins> 2013. [pdf]; [code].
-
Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions. Yoav Artzi, and Luke Zettlemoyer. <ins>TACL</ins> 2013. [pdf].
-
Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision. Joohyun Kim, and Raymond Mooney. <ins>EMNLP</ins> 2012. [pdf].
-
A joint model of language and perception for grounded attribute learning. Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, and Dieter Fox. <ins>ICML</ins> 2012. [pdf].
-
Learning to Interpret Natural Language Instructions. Monica Babeş-Vroman, James MacGlashan, Ruoyuan Gao, and et al. <ins>ACL Workshop</ins> 2012. [pdf].
-
Fast Online Lexicon Learning for Grounded Language Acquisition. David Chen. <ins>ACL</ins> 2012. [pdf].
-
Learning to Win by Reading Manuals in a Monte-Carlo Framework. S.R.K. Branavan, David Silver, and Regina Barzilay. <ins>ACL</ins> 2011. [pdf]; [website].
-
Learning from natural instructions. Dan Goldwasser, and Dan Roth. <ins>IJCAI</ins> 2011. [pdf].
-
Learning to Interpret Natural Language Navigation Instructions from Observations. David L. Chen and Raymond J. Mooney. <ins>AAAI</ins> 2011. [pdf].
-
Approaching the Symbol Grounding Problem with Probabilistic Graphical Models. Stefanie Tellex, Thomas Kollar, Steven Dickerson, and et al. <ins>AAAI</ins> 2011. [pdf].
-
Driving Semantic Parsing from the World's Response. James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth. <ins>CoNLL</ins> 2010. [pdf].
-
Learning to Follow Navigational Directions. Adam Vogel, and Daniel Jurafsky. <ins>ACL</ins> 2010. [pdf].
-
Reading between the Lines: Learning to Map High-Level Instructions to Commands. S.R.K. Branavan, Luke Zettlemoyer, and Regina Barzilay. <ins>ACL</ins> 2010. [pdf]; [website].
-
Reading to Learn: Constructing Features from Semantic Abstracts. Jacob Eisenstein, James Clarke, Dan Goldwasser, and Dan Roth. <ins>EMNLP</ins> 2009. [pdf]; [website].
-
Learning Semantic Correspondences with Less Supervision. Percy Liang, Michael Jordan, and Dan Klein. <ins>ACL</ins> 2009. [pdf].
-
Reinforcement Learning for Mapping Instructions to Actions. S.R.K. Branavan, Harr Chen, Luke Zettlemoyer, and Regina Barzilay. <ins>ACL</ins> 2009. [pdf]; [website].
-
Learning to sportscast: a test of grounded language acquisition. David L. Chen and Raymond J. Mooney. <ins>ICML</ins> 2008. [pdf].
-
Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer. Gregory Kuhlmann, Peter Stone, Raymond Mooney, and Jude Shavlik. <ins>AAAI Workshop</ins> 2004. [pdf]; [website].
6.2 Data and Feature Augmentation
Some instructions (e.g., label explanations) are also used for automatic annotation (i.e., data augmentation) or for enriching features.
-
One Embedder, Any Task: Instruction-Finetuned Text Embeddings. Hongjin Su, Weijia Shi, Jungo Kasai, and et al. <ins>Preprint</ins> 2022. [pdf]; [website].
-
Prompt Consistency for Zero-Shot Task Generalization. Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, and Graham Neubig. <ins>Findings of EMNLP</ins> 2022. [pdf]; [code].
-
Teaching Machine Comprehension with Compositional Explanations. Qinyuan Ye, Xiao Huang, Elizabeth Boschee, and Xiang Ren. <ins>Findings of EMNLP</ins> 2020. [pdf]; [code].
-
Learning from Explanations with Neural Execution Tree. Ziqi Wang, Yujia Qin, Wenxuan Zhou, Jun Yan, Qinyuan Ye, Leonardo Neves, Zhiyuan Liu, and Xiang Ren. <ins>ICLR</ins> 2020. [pdf]; [website].
-
Training Classifiers with Natural Language Explanations. Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré. <ins>ACL</ins> 2018. [pdf]; [code].
-
Zero-shot Learning of Classifiers from Natural Language Quantification. Shashank Srivastava, Igor Labutov, and Tom Mitchell. <ins>ACL</ins> 2018. [pdf].
-
Joint Concept Learning and Semantic Parsing from Natural Language Explanations. Shashank Srivastava, Igor Labutov, and Tom Mitchell. <ins>EMNLP</ins> 2017. [pdf].
6.3 General-purpose Language Models
General-purpose language models, e.g., ChatGPT, are among the most attractive applications of instruction learning, as they can align nicely with human values.
-
Sparks of Artificial General Intelligence: Early experiments with GPT-4. Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, and et al. <ins>Preprint</ins> 2023. [pdf].
-
GPT-4 Technical Report. OpenAI. <ins>Preprint</ins> 2023. [pdf]; [blog].
-
The Wisdom of Hindsight Makes Language Models Better Instruction Followers. Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, and Joseph E. Gonzalez. <ins>Preprint</ins> 2023. [pdf]; [code].
-
Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models. Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, and Bryan Catanzaro. <ins>Preprint</ins> 2023. [pdf].
-
Training language models to follow instructions with human feedback. Long Ouyang, Jeffrey Wu, Xu Jiang, and et al. <ins>NeurIPS</ins> 2022. [pdf].
6.4 Other Papers
-
GPTScore: Evaluate as You Desire. Jinlan Fu, See-Kiong Ng, Zhengbao Jiang, and Pengfei Liu. <ins>Preprint</ins> 2023. [pdf]; [code].
-
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning. Zhiyang Xu, Ying Shen, and Lifu Huang. <ins>Preprint</ins> 2022. [pdf].
-
Task-aware Retrieval with Instructions. Akari Asai, Timo Schick, Patrick Lewis, and et al. <ins>Preprint</ins> 2022. [pdf]; [code].
-
UnifiedABSA: A Unified ABSA Framework Based on Multi-task Instruction Tuning. Zengzhi Wang, Rui Xia, and Jianfei Yu. <ins>Preprint</ins> 2022. [pdf].
-
In-Context Learning for Few-Shot Dialogue State Tracking. Yushi Hu, Chia-Hsuan Lee, Tianbao Xie, Tao Yu, Noah A. Smith, and Mari Ostendorf. <ins>Findings of EMNLP</ins> 2022. [pdf]; [code].
-
Few-shot Learning with Multilingual Language Models. Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, and et al. <ins>EMNLP</ins> 2022. [pdf]; [code].
-
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. Tianbao Xie, Chen Henry Wu, Peng Shi, and et al. <ins>EMNLP</ins> 2022. [pdf]; [code]; [website].
-
In-BoXBART: Get Instructions into Biomedical Multi-Task Learning. Mihir Parmar, Swaroop Mishra, Mirali Purohit, Man Luo, M. Hassan Murad, and Chitta Baral. <ins>Findings of NAACL</ins> 2022. [pdf]; [code].
7. π Extended Reading
We also share some other awesome papers that might inspire future work.
7.1 Instruction Induction
-
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners. Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, and Minjoon Seo. <ins>Preprint</ins> 2022. [pdf]; [code].
-
Instruction Induction: From Few Examples to Natural Language Task Descriptions. Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy. <ins>Preprint</ins> 2022. [pdf]; [code].
-
Learning to Decompose and Organize Complex Tasks. Yi Zhang, Sujay Kumar Jauhar, Julia Kiseleva, Ryen White, and Dan Roth. <ins>NAACL</ins> 2021. [pdf]; [corpus].
-
Analogous Process Structure Induction for Sub-event Sequence Prediction. Hongming Zhang, Muhao Chen, Haoyu Wang, Yangqiu Song, and Dan Roth. <ins>EMNLP</ins> 2020. [pdf]; [code].
7.2 ChatGPT-related Papers
Nowadays, ChatGPT is a superstar π in the NLP community. Since there is no official paper for ChatGPT, we share some frontier works that can provide deep insights into it.
-
When do you need Chain-of-Thought Prompting for ChatGPT? Jiuhai Chen, Lichang Chen, Heng Huang, and Tianyi Zhou. <ins>Preprint</ins> 2023. [pdf].
-
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models. Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, and Karthik Narasimhan. <ins>Preprint</ins> 2023. [pdf].
-
Is ChatGPT a General-Purpose Natural Language Processing Task Solver? Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, and Diyi Yang. <ins>Preprint</ins> 2023. [pdf].
-
How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection. Biyang Guo, Xin Zhang, Ziyuan Wang, and et al. <ins>Preprint</ins> 2023. [pdf]; [corpus].
-
ChatGPT: Jack of all trades, master of none. Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, and et al. <ins>Preprint</ins> 2023. [pdf].
-
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective. Jindong Wang, Xixu Hu, Wenxin Hou, and et al. <ins>Preprint</ins> 2023. [pdf]; [code].
7.3 Human Feedback vs. Model Feedback
-
Aligning Large Language Models through Synthetic Feedback. Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, and Minjoon Seo. <ins>Preprint</ins> 2023. [pdf].
-
LIMA: Less Is More for Alignment. Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, and et al. <ins>Preprint</ins> 2023. [pdf].
-
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. Zhiqing Sun, Yikang Shen, Qinhong Zhou, and et al. <ins>Preprint</ins> 2023. [pdf]; [code].
-
Chain of Hindsight Aligns Language Models with Feedback. Hao Liu, Carmelo Sferrazza, and Pieter Abbeel. <ins>Preprint</ins> 2023. [pdf]; [code].
-
Pretraining Language Models with Human Preferences. Tomasz Korbak, Kejian Shi, Angelica Chen, and et al. <ins>Preprint</ins> 2023. [pdf].
-
Constitutional AI: Harmlessness from AI Feedback. Yuntao Bai, Saurav Kadavath, Sandipan Kundu, and et al. <ins>Preprint</ins> 2022. [pdf]; [corpus].
-
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. Yuntao Bai, Andy Jones, Kamal Ndousse, and et al. <ins>Preprint</ins> 2022. [pdf]; [corpus].
7.4 Scalable Oversight and Alignment
-
Measuring Progress on Scalable Oversight for Large Language Models. Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, and et al. <ins>Preprint</ins> 2022. [pdf].
-
Aligning AI With Shared Human Values. Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, and Jacob Steinhardt. <ins>ICLR</ins> 2021. [pdf].
7.5 Other Papers
-
Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models. Kaitlyn Zhou, Dan Jurafsky, and Tatsunori Hashimoto. <ins>Preprint</ins> 2023. [pdf].
-
The Capacity for Moral Self-Correction in Large Language Models. Deep Ganguli, Amanda Askell, Nicholas Schiefer, and et al. <ins>Preprint</ins> 2023. [pdf].
-
Large Language Models Can Be Easily Distracted by Irrelevant Context. Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed Chi, Nathanael Schärli, and Denny Zhou. <ins>Preprint</ins> 2023. [pdf]; [corpus].
-
Language Models (Mostly) Know What They Know. Saurav Kadavath, Tom Conerly, Amanda Askell, and et al. <ins>Preprint</ins> 2022. [pdf].
<!-- omit in toc -->