Qizhi Xie1,2 | Kun Yuan2 | Yunpeng Qu1,2 | Mingda Wu2 | Ming Sun2 | Chao Zhou2 | Jihong Zhu1
1Tsinghua University, 2Kuaishou Technology.
Quality assessment and aesthetics assessment aim to evaluate the perceived quality and aesthetics of visual content. Current learning-based methods suffer greatly from the scarcity of labeled data and usually generalize sub-optimally. Masked image modeling (MIM) has achieved noteworthy advances across various high-level tasks (e.g., classification and detection). In this work, we take a novel perspective and investigate its capabilities in terms of quality- and aesthetics-awareness. To this end, we propose Quality- and aesthetics-aware PreTraining (QPT V2), the first MIM-based pretraining framework that offers a unified solution to quality and aesthetics assessment. To perceive both high-level semantics and fine-grained details, we curate the pretraining data. To comprehensively cover quality- and aesthetics-related factors, we introduce degradation. To capture multi-scale quality and aesthetic information, we modify the model structure. Extensive experimental results on 11 downstream benchmarks show that QPT V2 outperforms current state-of-the-art approaches and other pretraining paradigms.
[2024/7/16] QPT V2 was accepted by ACM MM 2024!
- Checkpoints of QPT V2, including IQA & VQA & IAA.
- [ ] Inference code of QPT V2.
- [ ] Training code of QPT V2.
If you find our work helpful for your research, please consider giving a star ⭐ and a citation 📝
@inproceedings{qpt,
author = {Kai Zhao and
Kun Yuan and
Ming Sun and
Mading Li and
Xing Wen},
title = {Quality-aware Pretrained Models for Blind Image Quality Assessment},
booktitle = {{CVPR}},
pages = {22302--22313},
publisher = {{IEEE}},
year = {2023}
}
@inproceedings{qptv2,
author = {Qizhi Xie and
Kun Yuan and
Yunpeng Qu and
Mingda Wu and
Ming Sun and
Chao Zhou and
Jihong Zhu},
title = {{QPT-V2:} Masked Image Modeling Advances Visual Scoring},
booktitle = {{ACM} Multimedia},
pages = {2709--2718},
publisher = {{ACM}},
year = {2024}
}