DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs

<h5 align="center">
<a href="https://arxiv.org/abs/2406.01721">arXiv</a> | <a href="https://duquant.github.io/">Website</a> | License <br>
</h5>

Welcome to the official code repository for "DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs (NeurIPS 2024, Oral)".

🔍 For more details, please refer to the project page: https://duquant.github.io/.

📰 News

👀 Introduction

DuQuant distributes activation outliers via a dual transformation, rotation followed by channel permutation, so that the smoothed activations can be quantized to low bit-widths with far less accuracy loss.

*(Figure: overview of the DuQuant pipeline.)*
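To illustrate the core idea, here is a minimal sketch (ours, not the repository's implementation): an orthogonal rotation applied to both the activations and the weights leaves the layer output mathematically unchanged while spreading outlier energy across channels.

```python
# Minimal sketch of the rotation idea (illustrative, not DuQuant's actual code):
# for orthogonal Q, Y = X @ W.T == (X @ Q) @ (W @ Q).T, since Q @ Q.T = I,
# yet the rotated activations X @ Q have their outliers spread across channels.
import torch

torch.manual_seed(0)
X = torch.randn(4, 8)
X[:, 3] *= 50.0                            # inject one outlier channel
W = torch.randn(16, 8)

Q, _ = torch.linalg.qr(torch.randn(8, 8))  # random orthogonal matrix

Y_ref = X @ W.T                            # original layer output
Y_rot = (X @ Q) @ (W @ Q).T                # output after rotating both sides

print(torch.allclose(Y_ref, Y_rot, atol=1e-3))           # True: output preserved
print(X.abs().max().item(), (X @ Q).abs().max().item())  # outlier peak typically shrinks
```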

🔧 Installation

```bash
conda create -n duquant python=3.10 -y
conda activate duquant
git clone https://github.com/Hsu1023/DuQuant.git
cd DuQuant
pip install --upgrade pip
pip install -r requirements.txt
```
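An optional sanity check that the environment is usable (PyTorch is a core dependency of the repo):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```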

⚙️ Usage

1. Preprocessing

```bash
python get_rot.py                                         # needs to be run only once, shared by all models
python generate_act_scale_shift.py --model PATH_OF_MODEL  # needs to be run once per model; the path can be a Hugging Face Hub id or a relative path
```
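For example, to preprocess LLaMA2-7B (the model id below is illustrative; any Hugging Face Hub id or local checkpoint path works):

```bash
python generate_act_scale_shift.py --model meta-llama/Llama-2-7b-hf
```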

2. Quantization

The bash script for DuQuant can be found in run.sh. Choose the model to be quantized by passing its path to the --model flag. To evaluate the DuQuant + LWC (learnable weight clipping) method, run the run_lwc.sh script instead. In addition, you can add --save_dir to save the quantized model and use --resume to reload a previously saved one; a sketch of a typical invocation follows the argument list below.

Explanation of arguments:

- `--model`: path of the model to quantize (a Hugging Face Hub id or a relative path).
- `--save_dir`: directory in which to save the quantized model.
- `--resume`: reload a previously saved quantized model instead of re-quantizing.
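A typical end-to-end run might then look like the sketch below. Two assumptions are baked in: that run.sh forwards these flags to the underlying Python entry point, and that the model id and output directory are merely examples; consult run.sh and run_lwc.sh for the authoritative flag set.

```bash
# Quantize LLaMA2-7B with DuQuant and save the result
# (sketch; see run.sh for the exact flags used in the paper).
bash run.sh --model meta-llama/Llama-2-7b-hf --save_dir ./quantized/llama2-7b-duquant

# Later, reload the saved quantized model instead of re-quantizing.
bash run.sh --model meta-llama/Llama-2-7b-hf --resume ./quantized/llama2-7b-duquant
```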

3. Model Zoo

Currently, we support the LLaMA series (LLaMA 1, 2, and 3), the Vicuna series, and Mistral models.

| Models      | 7B/8B | 13B | 30B | 65B/70B |
|-------------|:-----:|:---:|:---:|:-------:|
| LLaMA1      | ✅    | ✅  | ✅  | ✅      |
| LLaMA2      | ✅    | ✅  | --- | ✅      |
| LLaMA3      | ✅    | --- | --- | ✅      |
| Vicuna-v1.5 | ✅    | ✅  | --- | ---     |
| Mistral     | ✅    | --- | --- | ---     |

📜 Results

📂 Contact

For immediate queries or further information, please open an issue or contact xuhb20@mails.tsinghua.edu.cn or haokun.lin@cripac.ia.ac.cn.

🙏 Acknowledgement

This repo is built upon several open-source LLM quantization projects; we thank the authors for their code.

📝 Citation

We kindly request that you cite our work if you utilize the code or reference our findings in your research:

```bibtex
@article{lin2024duquant,
  title={DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs},
  author={Lin, Haokun and Xu, Haobo and Wu, Yichen and Cui, Jingzhi and Zhang, Yingtao and Mou, Linzhan and Song, Linqi and Sun, Zhenan and Wei, Ying},
  journal={arXiv preprint arXiv:2406.01721},
  year={2024}
}
```