DSBench

This repo provides the source code for our paper, DSBench: How Far Are Data Science Agents from Becoming Data Science Experts? [PDF][Twitter]. If you discuss or use DSBench in your research, please cite us!

@misc{jing2024dsbenchfardatascience,
      title={DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?}, 
      author={Liqiang Jing and Zhehui Huang and Xiaoyang Wang and Wenlin Yao and Wenhao Yu and Kaixin Ma and Hongming Zhang and Xinya Du and Dong Yu},
      year={2024},
      eprint={2409.07703},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2409.07703}, 
}

Overview

DSBench is a benchmark for evaluating data science agents on realistic data analysis and data modeling tasks collected from ModelOff and Kaggle. Given a task instruction (which may contain images and tables) and the accompanying data files, a data science agent must generate a solution that resolves the described task.
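As a rough illustration of the input/output contract described above, here is a minimal sketch in Python. The names `Task`, `Agent`, and `EchoAgent` are hypothetical and are not part of this repo's code; the actual task format is documented in the per-task readme files.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol


@dataclass
class Task:
    """Hypothetical container for one task (NOT the repo's actual schema)."""
    instruction: str  # natural-language task description
    # Images or tables that accompany the instruction, if any.
    attachments: list[Path] = field(default_factory=list)
    # Data files the agent may read to solve the task.
    data_files: list[Path] = field(default_factory=list)


class Agent(Protocol):
    """Hypothetical interface: an agent maps a task to a solution string."""
    def solve(self, task: Task) -> str: ...


class EchoAgent:
    """Trivial stand-in agent that only reports what it was given."""
    def solve(self, task: Task) -> str:
        return f"Received {len(task.data_files)} data file(s) for: {task.instruction}"


task = Task(instruction="Predict sales", data_files=[Path("train.csv")])
print(EchoAgent().solve(task))
```

A real agent would replace `EchoAgent.solve` with logic that reads the data files and returns an answer (data analysis) or a prediction file (data modeling).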

<p align="center"> <img src="figures/overview.svg"> </p>

Set Up

For evaluation, you should install the Python packages in the requirments.txt file.
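One possible way to script this step is sketched below. It assumes the requirements file sits in the current working directory under the name written above; the helper names `build_pip_command` and `install_requirements` are illustrative and do not come from this repo.

```python
import subprocess
import sys
from pathlib import Path


def build_pip_command(requirements_path):
    """Build a pip command that installs into the running interpreter's environment."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements_path)]


def install_requirements(requirements_path="requirments.txt"):
    """Install packages from the given file (default name copied from this README)."""
    path = Path(requirements_path)
    if not path.is_file():
        raise FileNotFoundError(f"Requirements file not found: {path}")
    # check=True raises CalledProcessError if pip exits with a non-zero status.
    subprocess.run(build_pip_command(path), check=True)
```

Calling `install_requirements()` from the repo root has the same effect as running `pip install -r` on that file by hand. Invoking pip through `sys.executable` keeps the install tied to the interpreter that will later run the evaluation.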

Usage

  1. Clone this repo.
  2. Install all the requirements listed in Set Up.
  3. For evaluation on the data analysis task, refer to ./data_analysis/readme.md.
  4. For evaluation on the data modeling task, refer to ./data_modeling/readme.md.

Results

<p align="center"> <img src="figures/result1.svg"> </p> <p align="center"> <img src="figures/result2.svg" width="600"> </p>

Disclaimer

The dataset provided is intended solely for educational and research purposes, with the goal of fostering research in related areas. Users of this dataset are required to adhere to the following guidelines: