<h1 align="center"> <a href="https://duckai.org"><img src="https://raw.githubusercontent.com/TheDuckAI/duck_ai_website/main/public/static/images/twitter-card.png" alt="duckai logo" width="150"></a> <br/> Advanced Reasoning Benchmark <br/> </h1> <p align="center"> <a href="https://arxiv.org/abs/2307.13692"><img src="https://img.shields.io/badge/arXiv-2307.13692-red.svg" alt="arXiv"></a> <img src="https://github.com/TheDuckAI/arb/actions/workflows/lint.yml/badge.svg" alt="Lint Status"> <img src="https://img.shields.io/badge/license-MIT-blue?style=flat-square"> </p> <h5 align="center">A <a href="https://duckai.org/" target="_blank">DuckAI</a> project in collaboration with the Georgia Institute of Technology, ETH Zürich, Nomos AI, Stanford University Center for Legal Informatics, and the Mila - Quebec AI Institute</h5>

Abstract

ARB is a novel benchmark dataset of advanced reasoning problems, designed to evaluate LLMs on text comprehension and expert domain reasoning. It presents a more challenging test than prior benchmarks, with questions that probe deeper knowledge of mathematics, physics, biology, chemistry, and law.

API Usage

Endpoint URL: https://advanced-reasoning-benchmark.netlify.app/api/ <br/> The documentation for the complete REST API of the ARB dataset is here.
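
As a rough illustration of calling the endpoint above, the following Python sketch builds a request URL and decodes a response using only the standard library. The helper names `build_url` and `fetch_json` are hypothetical, the route `example-route` is a placeholder rather than a real ARB route, and the sketch assumes the API returns JSON; consult the API documentation for the actual routes and response shapes.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base endpoint as given in this README.
BASE_URL = "https://advanced-reasoning-benchmark.netlify.app/api/"


def build_url(*segments: str) -> str:
    """Join path segments onto the base endpoint, percent-encoding each one.

    Hypothetical helper: the segments are whatever route the API
    documentation specifies; none are hard-coded here.
    """
    cleaned = [s.strip("/") for s in segments]
    path = "/".join(quote(s, safe="") for s in cleaned if s)
    return BASE_URL + path


def fetch_json(url: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON.

    Assumption: the endpoint responds with a UTF-8 JSON document.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `fetch_json(build_url("example-route"))`, with `example-route` replaced by a route taken from the API documentation.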

Copyright © 2023 DuckAI