SHIELD<img src="logo/logo.png" alt="Logo" width="30" height="30">: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models

<p align="center"> <a href="https://img.shields.io/badge/version-v0.1.0-blue"> <img alt="version" src="https://img.shields.io/badge/version-v0.1.0-blue?color=FF8000?color=009922" /> </a> <a > <img alt="Status-building" src="https://img.shields.io/badge/Status-building-blue" /> </a> <a > <img alt="PRs-Welcome" src="https://img.shields.io/badge/PRs-Welcome-red" /> </a> <br /> </p> <p align="center"> <img src="logo/logo.png" style="width: 200px" align=center> </p>

Overview

In this paper, we conduct a detailed comparison of two multimodal large language models: Google's Gemini <img src="logo/Gemini.png" alt="Gemini" width="30" height="30"> and OpenAI's GPT-4V(ision) <img src="logo/GPT-4V.png" alt="GPT-4V" width="30" height="30">. We use zero-shot and one-shot prompting, together with Chain of Thought (COT) prompting, to analyze how both models perform on face anti-spoofing (FAS) and face forgery detection tasks. We also introduce a novel MCOT method, which we have empirically validated to significantly improve detection accuracy.
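To make the difference between the zero-shot, one-shot, and COT settings concrete, the following is a minimal sketch of how such prompts could be assembled. It is not the benchmark's actual code: the question wording, the `Example` class, the `build_prompt` function, and the message-part dictionary layout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical wording; the prompts used in the benchmark may differ.
QUESTION = "Is this image of a real face? Please answer yes or no."
COT_SUFFIX = "Describe the image and explain your reasoning before giving the final answer."


@dataclass
class Example:
    """A labeled reference image for the one-shot setting (hypothetical type)."""
    image_path: str
    answer: str  # known label for the reference image, e.g. "yes" or "no"


def build_prompt(query_image: str,
                 example: Optional[Example] = None,
                 use_cot: bool = False) -> List[dict]:
    """Return an ordered list of image and text parts for one query."""
    parts: List[dict] = []
    if example is not None:
        # One-shot: show a single labeled image before the query image.
        parts.append({"type": "image", "path": example.image_path})
        parts.append({"type": "text", "text": f"{QUESTION} Answer: {example.answer}"})
    # Zero-shot uses only the query image and the question.
    parts.append({"type": "image", "path": query_image})
    question = QUESTION + (" " + COT_SUFFIX if use_cot else "")
    parts.append({"type": "text", "text": question})
    return parts
```

Calling `build_prompt("query.jpg")` gives the zero-shot form, passing an `Example` gives the one-shot form, and `use_cot=True` appends the reasoning request in either case. The resulting parts would still need to be converted to whatever request format the chosen model API expects.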

Release

🔗 Citation

If you find our work helpful, please cite:

@article{shi2024shield,
  title={SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models},
  author={Shi, Yichen and Gao, Yuhao and Lai, Yingxin and Wang, Hongyang and Feng, Jun and He, Lei and Wan, Jun and Chen, Changsheng and Yu, Zitong and Cao, Xiaochun},
  journal={arXiv preprint arXiv:2402.04178},
  year={2024}
}