# IP-IQA
[ICME2024, Official Code] for the paper "Bringing Textual Prompt to AI-Generated Image Quality Assessment".
<br> <details open><summary>💡 I also have other projects that may interest you ✨.</summary><p> TriVQA <br> CVPRW2024, the 3rd-place winner of the NTIRE 2024 Quality Assessment for AI-Generated Content - Track 2 Video. <br> <br> MPP-Qwen-Next <br> My personal project on training an 8B/14B MLLM on RTX 3090/4090 24GB GPUs with DeepSpeed pipeline parallelism. Supports {image/video/multi-image} input.<br>
</p></details>
## Installation
You can use conda to set up the virtual environment with just three commands:
```bash
conda create -n ipiqa python=3.9
conda activate ipiqa
pip install -e .
```
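To make sure the editable install worked, a minimal sanity check is the snippet below; it assumes the package becomes importable as `ipiqa`, so adjust the import if the package name differs:

```python
# Quick post-install sanity check (assumes `pip install -e .` exposes the
# package as `ipiqa`; adjust the import if the name differs).
import torch
import ipiqa

print("ipiqa imported from:", ipiqa.__file__)
print("CUDA available:", torch.cuda.is_available())
```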
## Weights & Data
### CLIP ResNet50 weights
Download Link: RN50.pt
After downloading, place it at `cache/ckpt/clip/openai/resnet/RN50.pt`, or modify `base_ckpt` in the yaml file (e.g. `ipiqa.yaml`).
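If you are unsure whether the checkpoint is in the right place, a minimal sketch like the following can verify the default cache path before you touch `base_ckpt` (the path below simply mirrors the one above):

```python
# Minimal sketch: verify the CLIP RN50 checkpoint sits at the default cache
# path; otherwise, point `base_ckpt` in ipiqa.yaml to the actual location.
from pathlib import Path

ckpt = Path("cache/ckpt/clip/openai/resnet/RN50.pt")
if ckpt.is_file():
    print(f"Found CLIP weights at {ckpt} ({ckpt.stat().st_size / 1e6:.1f} MB)")
else:
    print(f"{ckpt} not found -- download RN50.pt or edit `base_ckpt` in ipiqa.yaml")
```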
### AGIQA-1k Database
Please obtain the data by referring to its Official Repo.
After that, set the dataset path and images root in the yaml file (a small path-check sketch is given below).
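Before training, you may want to sanity-check those paths. The sketch below is only illustrative: the `datasets` section and the `ann_path`/`vis_root` key names are assumptions, so match them to the actual structure of your `ipiqa.yaml` (the same idea applies to AGIQA-3k below):

```python
# Rough sketch: load the project yaml and check that the dataset annotation
# path and images root exist on disk. Key names below are assumptions --
# substitute the actual keys from your ipiqa.yaml.
import os
import yaml  # requires pyyaml

with open("ipiqa/projects/agiqa1k/ipiqa.yaml") as f:
    cfg = yaml.safe_load(f)

datasets = cfg.get("datasets", {}) if isinstance(cfg, dict) else {}
for key in ("ann_path", "vis_root"):  # hypothetical key names
    path = str(datasets.get(key, ""))
    print(f"{key}: {path!r} -> {'OK' if path and os.path.exists(path) else 'MISSING'}")
```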
### AGIQA-3k Database
Please obtain the data by referring to its Official Repo.
After that, set the dataset path and images root in the yaml file.
Additionally, you need a mos_joint.xlsx file, which is placed as shown in the data organization below (a construction sketch follows the tree).
### Data Organization for Reference
```
├── cache
│   ├── data
│   │   ├── aigc_qa_3k                     # AGIQA-3k
│   │   │   ├── AGIQA-3k                   # the vis_root
│   │   │   │   ├── xxx.jpg
│   │   │   ├── mos_joint.xlsx
│   │   │   ├── data.csv
│   │   │
│   │   ├── aigc_QA_data1                  # AGIQA-1k
│   │   │   ├── AGIQA-1k-Database-main     # git clone their repo
│   │   │   ├── images                     # the vis_root
```
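The column layout of `mos_joint.xlsx` is not documented here, so the sketch below is only a hypothetical way to produce it from the official AGIQA-3k `data.csv`; the column names `name`, `mos_quality`, and `mos_align` are assumptions that must be matched to what the dataloader actually expects:

```python
# Hypothetical sketch: derive mos_joint.xlsx from the official AGIQA-3k data.csv.
# The selected columns are assumptions -- keep whatever the dataloader expects.
# Writing .xlsx with pandas requires openpyxl (pip install openpyxl).
import pandas as pd

df = pd.read_csv("cache/data/aigc_qa_3k/data.csv")
joint = df[["name", "mos_quality", "mos_align"]]  # assumed column names
joint.to_excel("cache/data/aigc_qa_3k/mos_joint.xlsx", index=False)
print(joint.head())
```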
## Train & K-fold Evaluation
### AGIQA-1k
Run:
```bash
python train_agiqa1k.py --cfg-path ipiqa/projects/agiqa1k/ipiqa.yaml --num_cv 10
```
DDP:
```bash
python -m torch.distributed.run --nproc_per_node 2 train_agiqa1k.py --cfg-path ipiqa/projects/agiqa1k/ipiqa.yaml --num_cv 10
```
### AGIQA-3k
Run:
```bash
python train_agiqa3k.py --cfg-path ipiqa/projects/agiqa3k/ipiqa.yaml --num_cv 10
```
DDP:
```bash
python -m torch.distributed.run --nproc_per_node 2 train_agiqa3k.py --cfg-path ipiqa/projects/agiqa3k/ipiqa.yaml --num_cv 10
```
## Acknowledgement
- MPP-Qwen: My personal MLLM project. The trainer and the prototype of this repo are based on it.
- LAVIS: An excellent repo for multimodal learning. We refer to its Trainer implementation.
- AGIQA-1k and AGIQA-3k: Thanks for their databases!
- OpenAI-CLIP: We use their pretrained weights.
## Citation
```bibtex
@misc{qu2024bringingtextualpromptaigenerated,
      title={Bringing Textual Prompt to AI-Generated Image Quality Assessment},
      author={Bowen Qu and Haohui Li and Wei Gao},
      year={2024},
      eprint={2403.18714},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2403.18714},
}
```