<p align="center"> <img src="assets/scieval.jpeg" style="width: 50%;" id="title-icon"> </p> <p align="center"> 🌐 <a href="https://bai-scieval.duiopen.com/#/" target="_blank">Website</a> • 🤗 <a href="https://huggingface.co/datasets/OpenDFM/BAI-SciEval" target="_blank">Hugging Face</a> </p>

Description

SciEval is an evaluation benchmark for large language models in the scientific domain. It consists of approximately 18,000 objective evaluation questions and a few subjective questions, covering the fundamental scientific fields of chemistry, physics, and biology. The benchmark assesses how well large language models understand and generate scientific content along four dimensions: basic knowledge, knowledge application, scientific calculation, and research ability.

Files Description

[{
    "id": "5534a4ef45aea8a6f1835750a54c01d0",
    "pred": "C"
}]
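
The sample above appears to show a prediction file: a JSON array of objects, each pairing a question `id` with a predicted answer `pred`. The following is a minimal sketch of producing such a file, assuming predictions are held in a dict that maps question ids to answers. The helper names `build_predictions` and `dump_predictions` and the output file name `predictions.json` are hypothetical and do not come from this repository.

```python
import json


def build_predictions(answers):
    """Turn an {id: answer} mapping into a list of {"id", "pred"} records."""
    return [{"id": qid, "pred": pred} for qid, pred in answers.items()]


def dump_predictions(records):
    """Serialize the records as strict JSON text."""
    return json.dumps(records, indent=4, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical input: one answer keyed by the question id from the sample.
    answers = {"5534a4ef45aea8a6f1835750a54c01d0": "C"}
    text = dump_predictions(build_predictions(answers))
    # "predictions.json" is an assumed file name, not one the repository defines.
    with open("predictions.json", "w", encoding="utf-8") as f:
        f.write(text)
```

Generating the file with `json.dumps` rather than by hand keeps the output strict JSON, so any standard parser can read it back.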

Reference

If you use any source code or datasets included in this repository in your work, please cite the corresponding paper. The BibTeX entry is listed below:

@article{sun2023scieval,
  title={SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research},
  author={Sun, Liangtai and Han, Yang and Zhao, Zihan and Ma, Da and Shen, Zhennan and Chen, Baocai and Chen, Lu and Yu, Kai},
  journal={arXiv preprint arXiv:2308.13149},
  year={2023}
}