[ACM MM 2024] QPT V2: Masked Image Modeling Advances Visual Scoring

¹Tsinghua University, ²Kuaishou Technology.

👁️ Overview

Quality assessment and aesthetics assessment aim to evaluate the perceived quality and aesthetics of visual content. Current learning-based methods suffer greatly from the scarcity of labeled data and usually generalize sub-optimally. Although masked image modeling (MIM) has achieved noteworthy advances across various high-level tasks (e.g., classification and detection), in this work we take a novel perspective and investigate its capabilities in terms of quality- and aesthetics-awareness. To this end, we propose Quality- and aesthetics-aware PreTraining (QPT V2), the first MIM-based pretraining framework that offers a unified solution to quality and aesthetics assessment. To perceive both high-level semantics and fine-grained details, we curate the pretraining data. To comprehensively cover quality- and aesthetics-related factors, we introduce degradation. To capture multi-scale quality and aesthetic information, we modify the model structure. Extensive experimental results on 11 downstream benchmarks show the superior performance of QPT V2 in comparison with current state-of-the-art approaches and other pretraining paradigms.
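
For readers unfamiliar with MIM, the random patch masking it builds on can be sketched as below. This is a generic illustration only, not the QPT V2 implementation: the function names, the 0.75 mask ratio, and the 14 x 14 patch grid are all assumptions chosen for the example, and the degradation and multi-scale components described above are not modeled.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Return a list of booleans; True marks a patch hidden from the encoder.

    Hypothetical helper: the mask ratio is an assumed value, not one
    taken from the QPT V2 paper or code.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    # Sample patch indices without replacement so the count is exact.
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_ids for i in range(num_patches)]


def split_patches(patches, mask):
    """Split patches into (visible, hidden) according to the mask."""
    visible = [p for p, m in zip(patches, mask) if not m]
    hidden = [p for p, m in zip(patches, mask) if m]
    return visible, hidden


# Assumed example: an image cut into a 14 x 14 grid gives 196 patches.
# Integers stand in for the actual patch pixel data.
patches = list(range(14 * 14))
mask = random_patch_mask(len(patches), mask_ratio=0.75, seed=0)
visible, hidden = split_patches(patches, mask)
print(len(visible), len(hidden))  # 49 147
```

In a typical MIM setup, an encoder sees only the visible patches and a decoder is trained to reconstruct the hidden ones; how QPT V2 configures these stages is left to the paper and the forthcoming training code.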

📜 Updates

[2024/7/16] QPT V2 was accepted by ACM MM 2024!

👨‍💻 Todo

[x] Checkpoints of QPT V2, including IQA & VQA & IAA.
[] Inference code of QPT V2.
[] Training code of QPT V2.

✒️ Citation

If you find our work helpful for your research, please consider giving a star ⭐ and a citation 📝:

@inproceedings{qpt,
  author       = {Kai Zhao and
                  Kun Yuan and
                  Ming Sun and
                  Mading Li and
                  Xing Wen},
  title        = {Quality-aware Pretrained Models for Blind Image Quality Assessment},
  booktitle    = {{CVPR}},
  pages        = {22302--22313},
  publisher    = {{IEEE}},
  year         = {2023}
}

@inproceedings{qptv2,
  author       = {Qizhi Xie and
                  Kun Yuan and
                  Yunpeng Qu and
                  Mingda Wu and
                  Ming Sun and
                  Chao Zhou and
                  Jihong Zhu},
  title        = {{QPT-V2:} Masked Image Modeling Advances Visual Scoring},
  booktitle    = {{ACM} Multimedia},
  pages        = {2709--2718},
  publisher    = {{ACM}},
  year         = {2024}
}

