Layerwise Quantization

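We show that quantizing each layer of an LLM separately, with different layers assigned different bit-widths, lets the model as a whole reach a variable overall bit-level that need not be an integer.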
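As a rough illustration of how per-layer bit-widths can produce a non-integer overall bit-level, the sketch below computes the parameter-weighted average bit-width of a model. The function name `average_bits`, the 32-layer model, and the 4-bit/2-bit split are hypothetical choices made for this example and are not taken from the paper; how layers are selected for lower precision is outside the scope of this sketch.

```python
def average_bits(layer_bits, layer_params=None):
    """Return the parameter-weighted mean bit-width across layers.

    layer_bits   -- bit-width assigned to each layer (hypothetical values)
    layer_params -- parameter count per layer; if omitted, layers are
                    assumed to be equally sized (an assumption of this sketch)
    """
    if layer_params is None:
        layer_params = [1] * len(layer_bits)
    if len(layer_bits) != len(layer_params):
        raise ValueError("layer_bits and layer_params must have equal length")
    total_bits = sum(b * n for b, n in zip(layer_bits, layer_params))
    return total_bits / sum(layer_params)


# Hypothetical 32-layer model: 24 layers kept at 4 bits, 8 layers at 2 bits.
layer_bits = [4] * 24 + [2] * 8
print(average_bits(layer_bits))  # (24*4 + 8*2) / 32 = 3.5
```

Changing how many layers sit at each bit-width moves the average in small steps, which is what makes intermediate values such as 3.5 bits reachable.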

The code will be available here in the near future.

The paper is available at https://arxiv.org/abs/2406.17415 and is under review at EMNLP 2024.

If you use this work, please consider citing it:

@misc{dumitru2024layerwisequantizationpragmaticeffective,
      title={Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels}, 
      author={Razvan-Gabriel Dumitru and Vikas Yadav and Rishabh Maheshwary and Paul-Ioan Clotan and Sathwik Tejaswi Madhusudhan and Mihai Surdeanu},
      year={2024},
      eprint={2406.17415},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.17415}, 
}