
ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness

<img src="./assets/ReCEvalOverview.png" alt="teaser image" width="750"/>

Dependencies

This code is written using PyTorch and Hugging Face's Transformers library. Running ReCEval requires access to GPUs; the evaluation is lightweight, so a single GPU should suffice. Please install the Entailment Bank and GSM-8K datasets separately. To use the human-judgment datasets for GSM-8K and to run the baselines, please follow the setup procedure in ROSCOE (preferably in a separate environment).

Installation

The simplest way to run our code is to start with a fresh environment.

conda create -n ReCEval python=3.9
source activate ReCEval
pip install -r requirements.txt
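After installing the requirements, you can sanity-check the environment before running anything GPU-bound. The snippet below is a minimal sketch (the `check_environment` helper is illustrative, not part of this repository): it verifies that `torch` and `transformers` are importable and reports whether a CUDA GPU is visible, since one GPU suffices for evaluation.

```python
import importlib.util

def check_environment():
    """Report whether the core dependencies are importable and a GPU is visible.

    Illustrative helper, not part of the ReCEval codebase.
    """
    status = {}
    # Check that the two core packages are installed without importing heavy modules.
    for pkg in ("torch", "transformers"):
        status[pkg] = importlib.util.find_spec(pkg) is not None
    # Only query CUDA availability if torch is actually installed.
    if status["torch"]:
        import torch
        status["cuda"] = torch.cuda.is_available()
    else:
        status["cuda"] = False
    return status

print(check_environment())
```

If `cuda` comes back `False` on a GPU machine, double-check that the installed PyTorch build matches your CUDA driver version.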

Running Evaluation

Reference

Please cite our paper if you use our repository in your work:


@article{Prasad2023ReCEval,
  title         = {ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness},
  author        = {Archiki Prasad and Swarnadeep Saha and Xiang Zhou and Mohit Bansal},
  year          = {2023},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  eprint        = {2304.10703}
}