# MARVEL_AVR

GitHub repo for MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning. <br> Website: MARVEL <br> Hugging Face: MARVEL
## TL;DR
MARVEL is a new comprehensive benchmark that evaluates multi-modal large language models' (MLLMs') abstract reasoning abilities across six core knowledge patterns and five task configurations, revealing a significant performance gap between humans and SOTA MLLMs.
## Folders

- `Json_data`: a sub-folder for each puzzle image and its corresponding label JSON file.
- `Panel_data`: images of each individual panel in each puzzle.
- `Marvel`: images of all 770 puzzles, covering six core knowledge patterns, geometric and abstract shapes, and five task configurations.
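A minimal sketch for enumerating the puzzle images, assuming the folder layout above; the image file extensions are an assumption, not confirmed by the repo description:

```python
import os

def list_puzzles(marvel_dir: str) -> list[str]:
    """Return sorted puzzle image filenames from the Marvel folder.

    The .png/.jpg extensions are an assumption about how the
    770 puzzle images are stored.
    """
    return sorted(
        f for f in os.listdir(marvel_dir) if f.endswith((".png", ".jpg"))
    )
```

For example, `list_puzzles("Marvel")` should return 770 filenames if run from the repository root.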
## Marvel

`marvel_label.json` fields:
- `id`: image ID
- `pattern`: general concept pattern
- `task_configuration`: task configuration of the puzzle
- `avr_question`: abstract visual reasoning question
- `answer`: answer to the AVR question
- `explanation`: explanation of the pattern within the puzzle
- `f_perception_question`: fine-grained perception question annotated by humans
- `f_perception_answer`: answer to the fine-grained perception question
- `f_perception_distractor`: distractor for the fine-grained perception question
- `c_perception_question_tuple`: coarse-grained perception questions based on the puzzle's panels:
  (
    "coarse-grained perception question on the puzzle's context part",
    "coarse-grained perception question on the puzzle's choices part",
    "coarse-grained perception question on the whole puzzle"
  )
- `c_perception_answer_tuple`: answers to the questions above:
  (
    context part panel number,
    choices part panel number,
    whole puzzle panel number
  )
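A short sketch for loading the label file and summarizing a record, assuming `marvel_label.json` is a list of per-puzzle objects with the fields listed above; the `summarize` helper is hypothetical, not part of the repo:

```python
import json

def load_labels(path: str) -> list[dict]:
    """Load marvel_label.json; assumes it holds a list of puzzle records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def summarize(entry: dict) -> str:
    """One-line summary of a puzzle record (hypothetical helper)."""
    return (
        f"puzzle {entry['id']}: pattern={entry['pattern']}, "
        f"task={entry['task_configuration']}, answer={entry['answer']}"
    )
```

Usage would look like `[summarize(e) for e in load_labels("Marvel/marvel_label.json")]`, with the path adjusted to wherever the label file actually lives.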
## Citation

If you find MARVEL useful for your work, please cite:
```bibtex
@article{jiang2024marvel,
  title={MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning},
  author={Jiang, Yifan and Zhang, Jiarui and Sun, Kexuan and Sourati, Zhivar and Ahrabian, Kian and Ma, Kaixin and Ilievski, Filip and Pujara, Jay},
  journal={arXiv preprint arXiv:2404.13591},
  year={2024}
}
```