# LLM Security and Privacy

A curated list of papers and tools covering LLM threats and vulnerabilities from both a security and a privacy standpoint. Summaries, key takeaways, and additional details for each paper can be found in the paper-summaries folder.

The main.bib file contains up-to-date citations for the papers listed here.
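
For reference, entries in main.bib follow standard BibTeX conventions. The sketch below uses the USENIX Security 2021 paper from the table as an example; the entry key and exact field layout are illustrative assumptions, not copied from the actual file.

```bibtex
% Illustrative sketch only: the entry key (carlini2021extracting) and field
% formatting are assumptions; the real record in main.bib may differ.
@inproceedings{carlini2021extracting,
  title     = {Extracting Training Data from Large Language Models},
  author    = {Carlini, Nicholas and Tram{\`e}r, Florian and Wallace, Eric and
               Jagielski, Matthew and others},
  booktitle = {30th USENIX Security Symposium (USENIX Security 21)},
  year      = {2021}
}
```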

<p align="center">
  <img src="./images/taxonomy.png" alt="A taxonomy of security and privacy threats against deep learning models and, consequently, LLMs" style="width:100%">
  <b>Overview Figure:</b> A taxonomy of current security and privacy threats against deep learning models and, consequently, Large Language Models (LLMs).
</p>

## Table of Contents

- [Papers](#papers)
- [Frameworks & Taxonomies](#frameworks--taxonomies)
- [Tools](#tools)
- [News Articles, Blog Posts, and Talks](#news-articles-blog-posts-and-talks)
- [Contributing](#contributing)
- [Contact](#contact)

## Papers

| No. | Paper Title | Venue | Year | Category | Code | Summary |
|-----|-------------|-------|------|----------|------|---------|
| 1. | InjectAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | pre-print | 2024 | Prompt Injection | N/A | TBD |
| 2. | LLM Agents can Autonomously Hack Websites | pre-print | 2024 | Applications | N/A | TBD |
| 3. | An Overview of Catastrophic AI Risks | pre-print | 2023 | General | N/A | TBD |
| 4. | Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities | pre-print | 2023 | General | N/A | TBD |
| 5. | LLM Censorship: A Machine Learning Challenge or a Computer Security Problem? | pre-print | 2023 | General | N/A | TBD |
| 6. | Beyond the Safeguards: Exploring the Security Risks of ChatGPT | pre-print | 2023 | General | N/A | TBD |
| 7. | Prompt Injection attack against LLM-integrated Applications | pre-print | 2023 | Prompt Injection | N/A | TBD |
| 8. | Identifying and Mitigating the Security Risks of Generative AI | pre-print | 2023 | General | N/A | TBD |
| 9. | PassGPT: Password Modeling and (Guided) Generation with Large Language Models | ESORICS | 2023 | Applications | Code | TBD |
| 10. | Harnessing GPT-4 for generation of cybersecurity GRC policies: A focus on ransomware attack mitigation | Computers & Security | 2023 | Applications | N/A | TBD |
| 11. | Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection | pre-print | 2023 | Prompt Injection | Code | TBD |
| 12. | Examining Zero-Shot Vulnerability Repair with Large Language Models | IEEE S&P | 2023 | Applications | N/A | TBD |
| 13. | LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins | pre-print | 2023 | General | N/A | TBD |
| 14. | Chain-of-Verification Reduces Hallucination in Large Language Models | pre-print | 2023 | Hallucinations | N/A | TBD |
| 15. | Pop Quiz! Can a Large Language Model Help With Reverse Engineering? | pre-print | 2022 | Applications | N/A | TBD |
| 16. | Extracting Training Data from Large Language Models | USENIX Security | 2021 | Data Extraction | Code | TBD |
| 17. | Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications | pre-print | 2024 | Prompt Injection | Code | TBD |
| 18. | CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization | EMNLP | 2021 | Hallucinations | Code | TBD |

## Frameworks & Taxonomies

## Tools

## News Articles, Blog Posts, and Talks

## Contributing

If you would like to contribute to this repository, please see CONTRIBUTING.md for the contribution guidelines.

A list of current contributors can be found HERE.

## Contact

For any questions regarding this repository and/or potential (research) collaborations, please contact Briland Hitaj.