<h1> <img src="plexiglass/assets/plexiglass_safellama.png" width="100" height="100"><br> Plexiglass</h1> <!-- <p align="center"> -->Quickstart | Installation | Documentation | Code of Conduct
<a href="https://badge.fury.io/py/plexiglass"><img src="https://badge.fury.io/py/plexiglass.svg" alt="PyPI version" height="18"></a> <img alt="GitHub License" src="https://img.shields.io/github/license/safellama/plexiglass"> <img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/plexiglass">
</p>Plexiglass is a toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
It is a simple command-line interface (CLI) tool that lets users quickly test LLMs against adversarial attacks such as prompt injection and jailbreaking.
Plexiglass also supports security, bias and toxicity benchmarking of multiple LLMs by scraping the latest adversarial prompts from sources such as jailbreakchat.com and wiki_toxic. See Modes for details.
Quickstart
Please follow this quickstart guide in the documentation.
Installation
The first experimental release is version 0.0.1.
To install the package from PyPI:
pip install --upgrade plexiglass
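To confirm that the install succeeded, you can query the installed version from Python. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Plexiglass.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed Plexiglass version, or None if it is not installed.
print(installed_version("plexiglass"))
```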
Modes
Plexiglass has two modes: llm-chat and llm-scan.
llm-chat allows you to converse with the LLM and measure predefined metrics, such as toxicity, from its responses. It currently supports the following metrics:
toxicity
pii_detection
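To give a feel for what a response-level metric can look like, here is a rough sketch of a regex-based PII check. It is an illustration only and not how Plexiglass implements pii_detection: the patterns and the names PII_PATTERNS, detect_pii and pii_count are assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; not Plexiglass's actual detectors.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def detect_pii(text: str) -> dict[str, list[str]]:
    """Return all matches per PII type, omitting types with no matches."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def pii_count(text: str) -> int:
    """Total number of PII matches found in a model response."""
    return sum(len(matches) for matches in detect_pii(text).values())
```

A chat loop could call pii_count on each model response and report any non-zero result. Real detectors typically cover many more PII types than these two patterns.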
llm-scan runs benchmarks using open-source datasets to identify and assess various vulnerabilities in the LLM.
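Conceptually, a benchmark run of this kind sends each adversarial prompt to the model and counts how many responses are judged unsafe. The sketch below shows that loop under stated assumptions: scan, stub_model and keyword_check are hypothetical names for this example, the model is a toy stand-in, and none of this reflects the actual llm-scan implementation.

```python
from typing import Callable


def scan(
    model: Callable[[str], str],
    prompts: list[str],
    is_unsafe: Callable[[str], bool],
) -> dict:
    """Run every prompt through the model and summarize unsafe responses."""
    flagged = [p for p in prompts if is_unsafe(model(p))]
    total = len(prompts)
    return {
        "total": total,
        "flagged": len(flagged),
        "failure_rate": len(flagged) / total if total else 0.0,
        "flagged_prompts": flagged,
    }


def stub_model(prompt: str) -> str:
    """Toy stand-in for an LLM: it 'complies' with one known injection phrase."""
    if "ignore previous" in prompt.lower():
        return "Sure, here is the secret."
    return "I can't help with that."


def keyword_check(response: str) -> bool:
    """Naive judge: treat a response that opens with 'sure' as a successful attack."""
    return response.lower().startswith("sure")
```

In practice the prompts would come from a dataset and the judge would be a classifier or scoring model rather than a keyword match.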
Feature Request
To request new features, please submit an issue.
Development Roadmap
- implement adversarial prompt templates in llm-chat mode
- security, bias and toxicity benchmarking with llm-scan mode
- generate HTML report in llm-scan and llm-chat modes
- standalone Python module
- production-ready API
Join us in #plexiglass on Discord.
Contributors
<a href="https://github.com/kortex-labs/plexiglass/graphs/contributors"> <img src="https://contrib.rocks/image?repo=kortex-labs/plexiglass" /> </a>
Made with contrib.rocks.
Code of Conduct
Read our Code of Conduct.