CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation

<div align="center"> <a href="https://yuchen814.github.io/CodeTransOcean/"><img src="./images/leaderboard6.png" alt="Leaderboard">Leaderboard</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://arxiv.org/pdf/2310.04951.pdf">📄 Paper</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://huggingface.co/datasets/WeixiangYan/CodeTransOcean">🤗 Access from HuggingFace datasets</a> &nbsp;&nbsp;|&nbsp;&nbsp; <a href="https://drive.google.com/file/d/1xw6Edqf_nknKoei_LC49n4EtvNQezKGe/view?usp=sharing"><img src="./images/Google_Drive_Logo_16px.png" alt="Google Drive"> Access from Google Drive datasets</a> </div> <br>

CodeTransOcean is a large-scale, comprehensive benchmark that supports the largest variety of programming languages for code translation. It consists of three novel multilingual datasets: MultilingualTrans, which supports translation between multiple popular programming languages; NicheTrans, for translating between niche programming languages and popular ones; and LLMTrans, for evaluating the executability of code translated by large language models (LLMs). CodeTransOcean also includes a novel cross-framework dataset, DLTrans, for translating deep learning code across different frameworks.

<div align="center"> <img src="./images/codetransocean.png"> </div>

Datasets

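The datasets are available from <a href="https://huggingface.co/datasets/WeixiangYan/CodeTransOcean">🤗 Hugging Face</a> or <a href="https://drive.google.com/file/d/1xw6Edqf_nknKoei_LC49n4EtvNQezKGe/view?usp=sharing"><img src="./images/Google_Drive_Logo_16px.png" alt="Google Drive"> Google Drive</a>.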

Code

The experiments on the MultilingualTrans, NicheTrans, and DLTrans datasets use CodeT5+; the code is in the CodeT5+ file.

The experiments on the LLMTrans dataset use GPT-3.5; the code is in the ChatGPT file.
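LLMTrans is described above as evaluating the executability of translated code. As a rough illustration of what such a check can look like, here is a minimal sketch that is not taken from the ChatGPT file: it assumes the translated target language is Python, and the names `is_executable` and `execution_rate` are hypothetical helpers introduced only for this example. The metric actually used in the paper may be defined differently.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def is_executable(python_source: str, timeout_s: float = 10.0) -> bool:
    """Hypothetical helper: True if the snippet exits with status 0 in time."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "translated.py"
        script.write_text(python_source, encoding="utf-8")
        try:
            # Run the snippet in a separate interpreter so that crashes and
            # infinite loops do not affect the evaluating process.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
    return result.returncode == 0


def execution_rate(snippets: list[str]) -> float:
    """Hypothetical helper: fraction of snippets that run successfully."""
    if not snippets:
        return 0.0
    return sum(is_executable(s) for s in snippets) / len(snippets)


# Assumed example inputs: the first runs cleanly, the second raises NameError.
samples = ["print(sum(range(5)))", "print(undefined_name)"]
print(execution_rate(samples))  # prints 0.5
```

Model-generated code is untrusted input, so a real evaluation would likely run it inside a container or other sandbox rather than directly on the host.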

Citation

Please cite the paper if you use the data or code from CodeTransOcean.

@article{yan2023codetransocean,
  title={CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation},
  author={Yan, Weixiang and Tian, Yuchen and Li, Yunzhe and Chen, Qian and Wang, Wen},
  journal={arXiv preprint arXiv:2310.04951},
  year={2023}
}

Contact

For questions, please feel free to reach out via email at yanweixiang.ywx@gmail.com.