JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models

Introduction

Welcome to JailbreakZoo, a dedicated repository focused on jailbreaking large models (LMs), encompassing both large language models (LLMs) and vision-language models (VLMs). This project explores the vulnerabilities, exploitation methods, and defense mechanisms associated with these advanced AI models. Our goal is to foster a deeper understanding of, and raise awareness about, the security of large-scale AI systems.

Our website can be found here

Our paper can be found here

Timeline

The entries in this repository are organized chronologically by publication date.

:fire::fire::fire: <span style="font-size:xx-large;">Latest update: September 01, 2024</span> :fire::fire::fire:

Contents

Contributing

We welcome contributions from the community! Whether you are interested in adding new research, improving existing documentation, or sharing your own jailbreak or defense strategies, your insights are valuable to us. Please check our Contribution Guidelines for more information on how to get involved.

License and Citation

This project is available under the MIT License. Please refer to our citation guidelines if you wish to reference our work in your research or publications.

Thank you for visiting JailbreakZoo. We hope this repository serves as a valuable resource in your exploration of large model security.

Acknowledgement

Special thanks to our notable contributors: Haibo Jin, Leyang Hu, Xinuo Li, Peiyan Zhang, Chonghan Chen, Jun Zhuang, and Haohan Wang.

*Contributors are listed in partial order.

Reference

@article{jin2024jailbreakzoo,
  title={JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models},
  author={Jin, Haibo and Hu, Leyang and Li, Xinuo and Zhang, Peiyan and Chen, Chonghan and Zhuang, Jun and Wang, Haohan},
  journal={arXiv preprint arXiv:2407.01599},
  year={2024}
}