Awesome-MLLM-Safety

A collection (no longer updated) of papers related to the safety of Multimodal Large Language Models (MLLMs).

We follow the definition of safety from the paper <b><q>Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions</q></b>: <blockquote>Safety is defined as stopping models from following malicious instructions and <b>generating toxic content</b>.</blockquote>

<details> <summary>The scope of our collection.</summary> <ul> <li> Robustness-related wrong predictions and downstream applications (e.g., robotic/medical/legal/financial domains, anomaly detection, fake news detection) are not covered. </li> <li> We care about the safety of <b>MLLMs</b>, excluding other models such as text-to-image models. </li> <li> We mainly focus on <b>images and text</b>, with little coverage of other modalities such as audio and video. </li> </ul> </details>

If you find any important work missing, it would be super helpful to let me know (isXinLiu@gmail.com). Thanks!

If you find our survey useful for your research, please consider citing:

@article{liu:arxiv2024,
  title={Safety of Multimodal Large Language Models on Images and Text},
  author={Liu, Xin and Zhu, Yichen and Lan, Yunshi and Yang, Chao and Qiao, Yu},
  journal={arXiv preprint arXiv:2402.00357},
  year={2024}
}

Common terminologies related to safety: <img src='./assets/terminology.jpeg' width='100%'>

Taxonomy of the safety of MLLMs on images and text: <img src='./assets/taxonomy.jpg' width='100%'>

Table of Contents


- Evaluation
- Attack
- Defense
- Other