
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

<div align="center"> <a href="https://haitianliu22.github.io/code-scope-benchmark/"><img src="./images/leaderboard.png">Leaderboard</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://arxiv.org/abs/2311.08588">📄 Paper</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://huggingface.co/datasets/WeixiangYan/CodeScope">🤗 Access from HuggingFace datasets</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://drive.google.com/file/d/1kg3KICQZekpaQyCAGt_qTPR6ag5MBT_y/view?usp=sharing"><img src="./images/google_drive.png"> Access from Google Drive datasets</a> </div> <br>

CodeScope is an execution-based, multilingual, multitask, multidimensional benchmark for comprehensively evaluating LLM capabilities on coding tasks. It covers 43 programming languages and 8 coding tasks, and evaluates LLM coding performance along three dimensions: difficulty, efficiency, and length.

🌈 Update

Datasets

🤗 Hugging Face or <img src="./images/google_drive.png"> Google Drive or GitHub Data
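As a minimal sketch of fetching the Hugging Face copy programmatically, the snippet below downloads the dataset repository files with `huggingface_hub.snapshot_download`. The repository ID comes from the Hugging Face link at the top of this page; the helper name `download_codescope` and the default `local_dir` are illustrative assumptions, and the layout of the downloaded files is not described here, so inspect the folder after downloading.

```python
# Repository ID taken from the Hugging Face link on this page.
REPO_ID = "WeixiangYan/CodeScope"


def download_codescope(local_dir: str = "./codescope_data") -> str:
    """Download the dataset repository and return the local folder path.

    Hypothetical helper, not part of the CodeScope codebase.
    """
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repo lives under huggingface.co/datasets/
        local_dir=local_dir,
    )


# Example (requires network access and `pip install huggingface_hub`):
# path = download_codescope()
# print(path)
```

Downloading the raw files avoids assuming any particular dataset configuration or split names.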

Code

CodeScope evaluates LLMs' code understanding and code generation abilities across eight coding tasks.

Code Understanding

  1. Code Summarization
  2. Code Smell
  3. Code Review
  4. Automated Testing

Code Generation

  1. Program Synthesis
  2. Code Translation
  3. Code Repair
  4. Code Optimization
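For scripts that loop over tasks, the two categories and eight tasks listed above can be captured in a small mapping. This is only an illustrative sketch: the task and category names come from this README, while `CODESCOPE_TASKS`, `category_of`, and `all_tasks` are hypothetical names that do not come from the CodeScope codebase.

```python
# Task taxonomy as listed in this README; identifiers are hypothetical.
CODESCOPE_TASKS = {
    "Code Understanding": [
        "Code Summarization",
        "Code Smell",
        "Code Review",
        "Automated Testing",
    ],
    "Code Generation": [
        "Program Synthesis",
        "Code Translation",
        "Code Repair",
        "Code Optimization",
    ],
}


def all_tasks() -> list[str]:
    """Return all eight task names in README order."""
    return [task for tasks in CODESCOPE_TASKS.values() for task in tasks]


def category_of(task: str) -> str:
    """Return the category a task belongs to, or raise KeyError if unknown."""
    for category, tasks in CODESCOPE_TASKS.items():
        if task in tasks:
            return category
    raise KeyError(f"unknown CodeScope task: {task!r}")
```

For example, `category_of("Code Repair")` returns `"Code Generation"`.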

Citation

Please cite the paper if you use the data or code from CodeScope.

@misc{yan2023codescope,
      title={CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation},
      author={Weixiang Yan and Haitian Liu and Yunkun Wang and Yunzhe Li and Qian Chen and Wen Wang and Tingyu Lin and Weishan Zhao and Li Zhu and Shuiguang Deng and Hari Sundaram},
      year={2023},
      eprint={2311.08588},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact

For questions, please feel free to reach out via email at weixiangyan@ucsb.edu.