ReliableLM4Code

This repository extends our recent work, "Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey" and "Large language models for software engineering: A systematic literature review". It includes the supporting material for our research and a curated collection of LM4Code papers and other resources (datasets, tutorials, etc.). The focus is primarily on papers that use pre-trained models, especially large language models, to improve the reliability of language models in Software Engineering research.

For more details, please visit this site.

Modern language models (LMs) have been successfully employed in source code generation and understanding, leading to a significant increase in research on learning-based code intelligence, such as automated bug repair and test case generation. Despite their great potential, language models for code intelligence (LM4Code) are susceptible to pitfalls that hinder realistic performance and further impact their reliability and applicability in real-world deployment. Such challenges drive the need for a comprehensive understanding: not just identifying these issues but delving into their possible implications and existing solutions to build more reliable language models tailored to code intelligence. Based on a well-defined systematic research approach, we conducted an extensive literature review to uncover the pitfalls inherent in LM4Code, identifying 67 primary studies from top-tier venues. After carefully examining these studies, we designed a taxonomy of pitfalls in LM4Code research and conducted a systematic study to summarize the issues, implications, current solutions, and challenges of different pitfalls for LM4Code systems. We developed a comprehensive classification scheme that dissects pitfalls across four crucial aspects: data collection and labeling, system design and learning, performance evaluation, and deployment and maintenance. Through this study, we aim to provide a roadmap for researchers and practitioners, facilitating their understanding and use of LM4Code in reliable and trustworthy ways.

Please feel free to send a pull request to add papers and relevant content that are not listed here. We have uploaded our complete paper lists, with detailed review information, to Google Drive.

Content

Papers

Data Collection and Labeling

Unbalanced Distribution

Label Errors

Data Noise

System Design and Learning

Data Snooping

Spurious Correlations

Inappropriate Model Design

Performance Evaluation

Inappropriate Baseline

Inappropriate Evaluation Dataset

Low Reproducibility

Inappropriate Performance Measures

Deployment and Maintenance

Real-World Constraints

Attack Threats

Security Concerns in Generated Code

Language Models for Code Intelligence

Decoder-only Models

GPT-1

GPT-2

GPT-3

Codex

GPT-NeoX

GPT-Neo

CodeGen

InstructGPT

CodeGeeX

GPT-J

LLaMA

ChatGPT

StableLM-Alpha

InCoder

GPT-4

WizardCoder

PanGu-Coder

OPT

StarCoder

SantaCoder

PaLM

Vicuna

Flan-UL2

CPM-Bee

MT-NLG

GLM

YaLM

Alpaca

RWKV-4

Sparrow

Falcon

Code Llama

RedPajama-INCITE

DeciCoder-1B

OpenLLaMA

CodeGPT

Encoder-only Models

BERT

ALBERT

RoBERTa

CodeBERT

GraphCodeBERT

Encoder-decoder Models

AlphaCode

T5

CodeT5

CodeT5+

UniXcoder

PLBART

CodeReviewer

Relevant Surveys on LM4Code

General Surveys on AI4SE

General Surveys on LLM

Repositories and Resources for LM4Code

Repositories and Resources for LLM

Benchmarks

Bug Repair

Defects4J

ManyBugs/IntroClass

BugAID

CoCoNut

QuixBugs

Bugs.jar

BugsInPy

DeepFix

Code Generation/Synthesis

CONCODE

HumanEval

MBPP/MathQA-Python

Code Summarization

CODE-NN

TL-CodeSum

CodeSearchNet

Citation

If you find this repository useful, please cite our survey papers:

@article{she2023pitfalls,
  title={Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey},
  author={She, Xinyu and Liu, Yue and Zhao, Yanjie and He, Yiling and Li, Li and Tantithamthavorn, Chakkrit and Qin, Zhan and Wang, Haoyu},
  journal={arXiv preprint arXiv:2310.17903},
  year={2023}
}

@article{hou2023large,
  title={Large language models for software engineering: A systematic literature review},
  author={Hou, Xinyi and Zhao, Yanjie and Liu, Yue and Yang, Zhou and Wang, Kailong and Li, Li and Luo, Xiapu and Lo, David and Grundy, John and Wang, Haoyu},
  journal={arXiv preprint arXiv:2308.10620},
  year={2023}
}