<div align="center">

Q-Ground: Image Quality Grounding with Large Multi-modality Models

<sup>1</sup>Chaofeng Chen, <sup>1</sup>Sensen Yang, <sup>1</sup>Haoning Wu, <sup>1</sup>Liang Liao, <sup>3</sup>Zicheng Zhang, <sup>1</sup>Annan Wang, <sup>2</sup>Wenxiu Sun, <sup>2</sup>Qiong Yan, <sup>1</sup>Weisi Lin
<sup>1</sup>S-Lab, Nanyang Technological University, <sup>2</sup>SenseTime Research, <sup>3</sup>Shanghai Jiao Tong University

<a href="https://huggingface.co/datasets/chaofengc/QGround-100K"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-yellow"></a>

</div>


TODO List

✅ Release the dataset on 🤗 Hugging Face: QGround-100K (see the download sketch below)
⬜ Release test code
⬜ Release training code
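
The released dataset can be fetched directly from the Hugging Face repo linked above. The snippet below is a minimal sketch: the repo id `chaofengc/QGround-100K` comes from the badge above, but the local directory name is illustrative, and the file layout inside the repo is not confirmed here.

```python
# Minimal sketch: mirror the QGround-100K dataset repo from Hugging Face.
# Assumes only the repo id from the badge above; the local path is illustrative.
from huggingface_hub import snapshot_download  # pip install huggingface_hub

local_dir = snapshot_download(
    repo_id="chaofengc/QGround-100K",
    repo_type="dataset",          # dataset repo, not a model repo
    local_dir="./QGround-100K",   # hypothetical target directory
)
print(f"Dataset files downloaded to: {local_dir}")
```

If the repo ships files in a loadable format, `datasets.load_dataset("chaofengc/QGround-100K")` may also work; `snapshot_download` avoids that assumption by simply mirroring the raw files.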

Citation

If you find this work useful, please consider citing our paper:

@inproceedings{chen2024qground,
      title={Q-Ground: Image Quality Grounding with Large Multi-modality Models},
      author={Chaofeng Chen and Sensen Yang and Haoning Wu and Liang Liao and Zicheng Zhang and Annan Wang and Wenxiu Sun and Qiong Yan and Weisi Lin},
      booktitle={ACM International Conference on Multimedia},
      year={2024},
}

Acknowledgement

This project is based on PixelLM, LISA, and LLaVA. Thanks to the authors for their great work!