
<div align=center> <img src="https://github.com/3DAgentWorld/Toolkit-for-Prompt-Compression/blob/main/imgs/logo_trans.png" width="600" height="150">

</div>

# PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models


## 🎉 News

### 📄 Technical Report

You can find more details about PCToolkit in our <a href='https://arxiv.org/abs/2403.17411'>technical report</a>.


## Introduction

Prompt compression is an innovative method for efficiently condensing input prompts while preserving essential information. To facilitate quick-start services, user-friendly interfaces, and compatibility with common datasets and metrics, we present the Prompt Compression Toolkit (PCToolkit). This toolkit is a unified plug-and-play solution for compressing prompts in Large Language Models (LLMs), featuring cutting-edge prompt compressors, diverse datasets, and metrics for comprehensive performance evaluation. PCToolkit has a modular design, allowing for easy integration of new datasets and metrics through portable and user-friendly interfaces. Below, we outline the key components and functionalities of PCToolkit.

We conducted evaluations of the compressors within PCToolkit across various natural language tasks, including reconstruction, summarization, mathematical problem-solving, question answering, few-shot learning, synthetic tasks, code completion, boolean expressions, multiple choice questions, and lies recognition.

The overall architecture of PCToolkit is shown below:

<div align=center> <img src="https://github.com/3DAgentWorld/Toolkit-for-Prompt-Compression/blob/main/imgs/architecture.png" width="739" height="380.5"> </div>


## Key Features of PCToolkit

(i) State-of-the-art and reproducible methods. Encompassing a wide array of mainstream compression techniques, PCToolkit offers a unified interface for various compression methods (compressors). Notably, PCToolkit incorporates a total of five distinct compressors, namely <a href='https://arxiv.org/abs/2310.06201'>Selective Context</a>, <a href='https://arxiv.org/abs/2310.05736'>LLMLingua</a>, <a href='https://arxiv.org/abs/2310.06839'>LongLLMLingua</a>, <a href='https://arxiv.org/abs/2205.08221'>SCRL</a> and <a href='https://arxiv.org/abs/2107.03444'>Keep it Simple</a>.

(ii) User-friendly interfaces for new compressors, datasets, and metrics. Facilitating portability and ease of adaptation to different environments, the interfaces within PCToolkit are designed to be easily customizable. This flexibility makes PCToolkit suitable for a wide range of environments and tasks (see the sketch after this list).

(iii) Modular design. Featuring a modular structure that simplifies the transition between different methods, datasets, and metrics, PCToolkit is organized into four distinct modules: Compressor, Dataset, Metric, and Runner.
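
To illustrate (ii), here is a minimal sketch of what a new compressor could look like. Only the `compressgo(prompt, ratio)` call mirrors the usage shown later in this README; the class itself and its truncation logic are illustrative assumptions, not PCToolkit's actual interface (see pctoolkit/compressors.py for the real one).

```python
# A hypothetical, self-contained compressor sketch. Only `compressgo` mirrors
# the call pattern used elsewhere in this README; everything else here is an
# assumption for illustration, not PCToolkit's real interface.
class TruncationCompressor:
    """Toy compressor: keep roughly the first `ratio` fraction of tokens."""

    def compressgo(self, prompt: str, ratio: float) -> str:
        tokens = prompt.split()
        keep = max(1, int(len(tokens) * ratio))
        return " ".join(tokens[:keep])


if __name__ == "__main__":
    demo = TruncationCompressor()
    print(demo.compressgo("an overly long prompt that should be shortened", 0.5))
```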

## Outline

The following table presents an overview of the supported tasks, compressors, and datasets within PCToolkit. Each component is described in detail in our technical report.

| Tasks | Supported Compressors | Supported Datasets |
| --- | --- | --- |
| Reconstruction | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBC, ShareGPT, Arxiv, GSM8K |
| Mathematical problems | SC, LLMLingua, LongLLMLingua, SCRL, KiS | GSM8K, BBH |
| Boolean expressions | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Multiple choice | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Lies recognition | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Summarization | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBC, Arxiv, Gigaword, DUC2004, BNC, Broadcast, Google |
| Summarization | LLMLingua, LongLLMLingua | LongBench |
| Question and Answer | SC, LLMLingua, LongLLMLingua, SCRL, KiS | BBH |
| Question and Answer | LLMLingua, LongLLMLingua | LongBench |
| Few-shot learning | LLMLingua, LongLLMLingua | LongBench |
| Synthetic tasks | LLMLingua, LongLLMLingua | LongBench |
| Code completion | LLMLingua, LongLLMLingua | LongBench |

(SC = Selective Context, KiS = Keep it Simple.)

## How to start

```bash
git clone https://github.com/3DAgentWorld/Toolkit-for-Prompt-Compression.git
cd Toolkit-for-Prompt-Compression
```

Then, from the repository root, install the dependencies:

```bash
pip install -r requirements.txt
```

## Downloading models

To keep this repository small, model weights are not bundled with it, so please download the models separately. Most of them are fetched automatically from the Hugging Face Hub; however, the models for the SCRL method must be downloaded manually. Follow the guidance inside the /models folder.
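
If you prefer to pre-fetch a Hub-hosted model rather than letting it download on first use, a snippet along these lines works; the repo id below is a placeholder, not a model PCToolkit requires. The SCRL checkpoints still need the manual steps described in /models.

```python
# Optional pre-download of a Hugging Face Hub model so the first run does not
# block on a network fetch. The repo id is a placeholder -- substitute the
# model your chosen compressor actually uses.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="gpt2")  # placeholder repo id
print("Model cached at:", local_dir)
```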

## How to use

For prompt compression tasks, see pctoolkit/compressors.py; you can modify both the compression methods and their parameters there. The file includes an example that is easy to adapt.

Or you can follow the code below:

```python
from pctoolkit.compressors import PromptCompressor

# Instantiate a compressor; 'SCCompressor' selects the Selective Context method.
compressor = PromptCompressor(type='SCCompressor', device='cuda')

test_prompt = "test prompt"
ratio = 0.3  # compression ratio passed to the compressor
result = compressor.compressgo(test_prompt, ratio)
print(result)
```
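
The same call pattern works for the other compressors by changing the `type` string. Only 'SCCompressor' appears in this README, so treat the other names below as assumptions and confirm the exact strings in pctoolkit/compressors.py.

```python
from pctoolkit.compressors import PromptCompressor

# 'SCCompressor' is confirmed above; the other type strings are assumed to
# follow the same naming pattern -- verify them in pctoolkit/compressors.py.
for name in ['SCCompressor', 'LLMLinguaCompressor', 'SCRLCompressor']:
    compressor = PromptCompressor(type=name, device='cuda')
    print(name, '->', compressor.compressgo("test prompt", 0.3))
```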

For evaluation, follow pctoolkit_demo.py. Note that if you want to change the metrics, you need to modify pctoolkit/metrics.py; this is especially relevant for the LongBench dataset.

```python
from pctoolkit.runners import run
from pctoolkit.datasets import load_dataset
from pctoolkit.metrics import load_metrics
from pctoolkit.compressors import PromptCompressor

# Set up a compressor and load the evaluation dataset.
compressor = PromptCompressor(type='SCCompressor', device='cuda')
dataset_name = 'arxiv'
dataset = load_dataset(dataset_name)

# Run the end-to-end evaluation at a 0.1 compression ratio.
run(compressor=compressor, dataset=dataset, metrics=load_metrics, ratio=0.1)
```
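
If you add a metric of your own to pctoolkit/metrics.py, a simple token-overlap score is one starting point. The function below is a self-contained sketch; the `(output, reference)` signature is an assumption, so match the calling convention that run() and pctoolkit/metrics.py actually use.

```python
from collections import Counter

# Hypothetical metric sketch: unigram F1 between a model output and a
# reference answer. The (output, reference) -> float shape is an assumption;
# adapt it to the convention used in pctoolkit/metrics.py when wiring it in.
def unigram_f1(output: str, reference: str) -> float:
    out, ref = Counter(output.split()), Counter(reference.split())
    overlap = sum((out & ref).values())  # shared token count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(out.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```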

Hint: Remember to fill in your Hugging Face token and OpenAI API key in pctoolkit/runners.py. (You can also change the URLs if you are using other OpenAI-compatible APIs.)

## References

  1. Li, Yucheng et al. "Compressing Context to Enhance Inference Efficiency of Large Language Models." Conference on Empirical Methods in Natural Language Processing (2023).

  2. Jiang, Huiqiang et al. "LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models." Conference on Empirical Methods in Natural Language Processing (2023).

  3. Jiang, Huiqiang et al. "LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression." arXiv:2310.06839 (2023).

  4. Ghalandari, Demian Gholipour et al. "Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning." arXiv:2205.08221 (2022).

  5. Laban, Philippe et al. "Keep It Simple: Unsupervised Simplification of Multi-Paragraph Text." ACL-IJCNLP (2021).

## Citation

If PCToolkit is used in your research or applications, please cite it using the following BibTeX:

```bibtex
@misc{li2024pctoolkit,
      title={PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models},
      author={Jinyi Li and Yihuai Lan and Lei Wang and Hao Wang},
      year={2024},
      eprint={2403.17411},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```