Awesome Self-Correction LLMs Papers
This is a collection of research papers on self-correcting large language models with automated feedback.
Our survey paper: Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies. Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang
Content
<table>
<tr><td colspan="2"><a href="#training-time-correction">1. Training-Time Correction</a></td></tr>
<tr>
  <td><a href="#rlhf-strategy">1.1 RLHF Strategy</a></td>
  <td><a href="#fine-tuning-strategy">1.2 Fine-tuning Strategy</a></td>
</tr>
<tr>
  <td><a href="#self-training-strategy">1.3 Self-Training Strategy</a></td>
</tr>
<tr><td colspan="2"><a href="#generation-time-correction">2. Generation-Time Correction</a></td></tr>
<tr>
  <td><a href="#re-ranking-strategy">2.1 Re-Ranking Strategy</a></td>
  <td><a href="#feedback-guided-strategy">2.2 Feedback-guided Strategy</a></td>
</tr>
<tr><td colspan="2"><a href="#post-hoc-correction">3. Post-hoc Correction</a></td></tr>
<tr>
  <td><a href="#self-refine-strategy">3.1 Self-Refine Strategy</a></td>
  <td><a href="#external-feedback-strategy">3.2 External Feedback Strategy</a></td>
</tr>
<tr>
  <td><a href="#model-debate-strategy">3.3 Model-Debate Strategy</a></td>
</tr>
</table>

Training-Time Correction
RLHF Strategy
- Training Language Models to Follow Instructions with Human Feedback. Advances in Neural Information Processing Systems (NeurIPS), 2022. paper
  Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, Ryan Lowe
- Fine-Grained Human Feedback Gives Better Rewards for Language Model Training. arXiv, 2023. paper
  Zeqiu Wu, Yushi Hu, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, Hannaneh Hajishirzi
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. arXiv, 2022. paper
  Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Ben Mann, Jared Kaplan
- Improving Alignment of Dialogue Agents via Targeted Human Judgments. arXiv, 2022. paper
  Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, Rachel Foley, Susannah Young, Iason Gabriel, William Isaac, John Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, Geoffrey Irving
Fine-tuning Strategy
- Training Language Models with Language Feedback at Scale. arXiv, 2023. paper
  Jérémy Scheurer, Jon Ander Campos, Tomasz Korbak, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez
- Continually Improving Extractive QA via Human Feedback. arXiv, 2023. paper
  Ge Gao, Hung-Ting Chen, Yoav Artzi, Eunsol Choi
- Chain of Hindsight Aligns Language Models with Feedback. arXiv, 2023. paper
  Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- QUARK: Controllable Text Generation with Reinforced Unlearning. Advances in Neural Information Processing Systems (NeurIPS), 2022. paper
  Ximing Lu, Sean Welleck, Jack Hessel, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi
- SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization. Annual Meeting of the Association for Computational Linguistics (ACL), 2021. paper
  Yixin Liu, Pengfei Liu
- BERTTune: Fine-Tuning Neural Machine Translation with BERTScore. Annual Meeting of the Association for Computational Linguistics (ACL), 2021. paper
  Inigo Jauregi Unanue, Jacob Parnell, Massimo Piccardi
Self-Training Strategy
- STaR: Bootstrapping Reasoning With Reasoning. Advances in Neural Information Processing Systems (NeurIPS), 2022. paper
  Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah Goodman
- SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
- Constitutional AI: Harmlessness from AI Feedback. arXiv, 2022. paper
  Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan
- Language Model Self-improvement by Reinforcement Learning Contemplation. arXiv, 2023. paper
  Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu
- Large Language Models Can Self-Improve. arXiv, 2022. paper
  Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
- AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback. arXiv, 2023. paper
  Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto
Generation-Time Correction
Re-Ranking Strategy
- Large Language Models are Better Reasoners with Self-Verification. arXiv, 2023. paper
  Yixuan Weng, Minjun Zhu, Fei Xia, Bin Li, Shizhu He, Kang Liu, Jun Zhao
- CodeT: Code Generation with Generated Tests. International Conference on Learning Representations (ICLR), 2023. paper
  Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen
- LEVER: Learning to Verify Language-to-Code Generation with Execution. International Conference on Machine Learning (ICML), 2023. paper
  Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin
- Rethinking with Retrieval: Faithful Large Language Model Inference. arXiv, 2022. paper
  Hangfeng He, Hongming Zhang, Dan Roth
- INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback. arXiv, 2023. paper
  Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei Li
- High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metric. Transactions of the Association for Computational Linguistics (TACL), 2022. paper
  Markus Freitag, David Grangier, Qijun Tan, Bowen Liang
- Making Language Models Better Reasoners with Step-Aware Verifier. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen
Feedback-guided Strategy
- Let's Verify Step by Step. arXiv, 2023. paper
  Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe
- Diffusion-LM Improves Controllable Text Generation. Advances in Neural Information Processing Systems (NeurIPS), 2022. paper
  Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- FUDGE: Controlled Text Generation With Future Discriminators. North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2021. paper
  Kevin Yang, Dan Klein
- Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. paper
  Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Generating Natural Language Proofs with Verifier-Guided Search. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. paper
  Kaiyu Yang, Jia Deng, Danqi Chen
- Discriminator-Guided Multi-step Reasoning with Language Models. arXiv, 2023. paper
  Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang
- Solving Math Word Problems via Cooperative Reasoning Induced Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Yongfeng Huang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang
- Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. paper
  Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi
- Faithful Reasoning Using Large Language Models. arXiv, 2022. paper
  Antonia Creswell, Murray Shanahan
- Reasoning with Language Model is Planning with World Model. arXiv, 2023. paper
  Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, Zhiting Hu
- Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding. arXiv, 2023. paper
  Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv, 2023. paper
  Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Post-hoc Correction
Self-Refine Strategy
- Self-Refine: Iterative Refinement with Self-Feedback. arXiv, 2023. paper
  Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark
- Self-Verification Improves Few-Shot Clinical Information Extraction. arXiv, 2023. paper
  Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon
- Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv, 2023. paper
  Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
- Iterative Translation Refinement with Large Language Models. arXiv, 2023. paper
  Pinzhen Chen, Zhicheng Guo, Barry Haddow, Kenneth Heafield
- Leveraging GPT-4 for Automatic Translation Post-Editing. arXiv, 2023. paper
  Vikas Raunak, Amr Sharaf, Hany Hassan Awadallah, Arul Menezes
- Language Models can Solve Computer Tasks. arXiv, 2023. paper
  Geunwoo Kim, Pierre Baldi, Stephen McAleer
- SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation. Blog Post, 2023. website
  Seonghyeon Ye, Yongrae Jo, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Minjoon Seo
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models. arXiv, 2023. paper
  Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update. arXiv, 2023. paper
  Zhi Gao, Yuntao Du, Xintong Zhang, Xiaojian Ma, Wenjuan Han, Song-Chun Zhu, Qing Li
External Feedback Strategy
- Re3: Generating Longer Stories With Recursive Reprompting and Revision. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. paper
  Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning. Advances in Neural Information Processing Systems (NeurIPS), 2022. paper
  Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C.H. Hoi
- REFINER: Reasoning Feedback on Intermediate Representations. arXiv, 2023. paper
  Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West, Boi Faltings
- RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Afra Feyza Akyurek, Ekin Akyurek, Ashwin Kalyan, Peter Clark, Derry Tanti Wijaya, Niket Tandon
- Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Hao Yan, Saurabh Srivastava, Yintao Tai, Sida I. Wang, Wen-tau Yih, Ziyu Yao
- Baldur: Whole-Proof Generation and Repair with Large Language Models. arXiv, 2023. paper
  Emily First, Markus N. Rabe, Talia Ringer, Yuriy Brun
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. arXiv, 2023. paper
  Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios. arXiv, 2023. paper
  I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
- RARR: Researching and Revising What Language Models Say, Using Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. paper
  Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejasvi Chaganty, Yicheng Fan, Vincent Y. Zhao, Ni Lao, Hongrae Lee, Da-Cheng Juan, Kelvin Guu
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. arXiv, 2023. paper
  Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao
- Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models. arXiv, 2023. paper
  Miaoran Li, Baolin Peng, Zhu Zhang
- Improving Language Models via Plug-and-Play Retrieval Feedback. arXiv, 2023. paper
  Wenhao Yu, Zhihan Zhang, Zhenwen Liang, Meng Jiang, Ashish Sabharwal
- Demystifying GPT Self-Repair for Code Generation. arXiv, 2023. paper
  Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama
- Self-Edit: Fault-Aware Code Editor for Code Generation. arXiv, 2023. paper
  Kechi Zhang, Zhuo Li, Jia Li, Ge Li, Zhi Jin
- Teaching Large Language Models to Self-Debug. arXiv, 2023. paper
  Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- SelfEvolve: A Code Evolution Framework via Large Language Models. arXiv, 2023. paper
  Shuyang Jiang, Yuhao Wang, Yu Wang
- Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning. arXiv, 2023. paper
  Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- Self-Critiquing Models for Assisting Human Evaluators. arXiv, 2022. paper
  William Saunders, Catherine Yeh, Jeff Wu, Steven Bills, Long Ouyang, Jonathan Ward, Jan Leike
- ALGO: Synthesizing Algorithmic Programs with Generated Oracle Verifiers. arXiv, 2023. paper
  Kexun Zhang, Danqing Wang, Jingtao Xia, William Yang Wang, Lei Li
- A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification. arXiv, 2023. paper
  Yiannis Charalambous, Norbert Tihanyi, Ridhi Jain, Youcheng Sun, Mohamed Amine Ferrag, Lucas C. Cordeiro
- Generating Sequences by Learning to Self-Correct. International Conference on Learning Representations (ICLR), 2023. paper
  Sean Welleck, Ximing Lu, Peter West, Faeze Brahman, Tianxiao Shen, Daniel Khashabi, Yejin Choi
- MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. paper
  Deepak Nathani, David Wang, Liangming Pan, William Yang Wang
- LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. paper
  Thomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, Nanyun Peng
Model-Debate Strategy
- Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv, 2023. paper
  Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- LM vs LM: Detecting Factual Errors via Cross Examination. arXiv, 2023. paper
  Roi Cohen, May Hamri, Mor Geva, Amir Globerson
- Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback. arXiv, 2023. paper
  Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- PRD: Peer Rank and Discussion Improve Large Language Model-based Evaluations. arXiv, 2023. paper
  Ruosen Li, Teerth Patel, Xinya Du
Contribution
Contributors
<!-- <a href="https://github.com/teacherpeterpan/self-correction-llm-papers/graphs/contributors"> <img src="https://contrib.rocks/image?repo=teacherpeterpan/self-correction-llm-papers" /> </a> -->
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
  <tbody>
    <tr>
      <td align="center" valign="top" width="120px">
        <a href="http://www.liangmingpan.com/">
          <img src="https://github.com/teacherpeterpan.png" width="80px;" alt="Liangming Pan"/>
          <br />
          <sub><b>Liangming Pan</b></sub>
        </a>
      </td>
      <td align="center" valign="top" width="120px">
        <a href="https://xinyuanlu00.github.io/">
          <img src="https://github.com/XinyuanLu00.png" width="80px;" alt="Xinyuan Lu"/>
          <br />
          <sub><b>Xinyuan Lu</b></sub>
        </a>
      </td>
    </tr>
  </tbody>
</table>

Acknowledgement
- We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.
- If you encounter any problems, please contact Liangming Pan directly or open an issue in the GitHub repo.