💻 **Code**: [GitHub Repository](https://github.com/yahya-ben/mplug2-vp-for-nriqa)

## Abstract

In this paper, we propose a novel approach to No-Reference Image Quality Assessment (NR-IQA) that efficiently adapts a Multimodal Large Language Model (MLLM) through pixel-space visual prompts. Unlike full fine-tuning approaches that adapt MLLMs to specific tasks, our method trains at most ∼600K parameters (<0.01% of the base model) while keeping the underlying model fully frozen. During inference, these visual prompts are combined with images via addition and processed by mPLUG-Owl2 with the textual query "Rate the technical quality of the image." Evaluations across distortion types (synthetic, realistic, AI-generated) on KADID-10k, KonIQ-10k, and AGIQA-3k demonstrate competitive performance against fully fine-tuned methods and specialized NR-IQA models, achieving 0.93 SRCC on KADID-10k. The source code is publicly available at https://github.com/yahya-ben/mplug2-vp-for-nriqa.
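The core mechanism is simple enough to sketch: a learned pixel-space tensor is added to each input image before the frozen MLLM processes it. The snippet below is a minimal NumPy illustration, not the repository's implementation; the 448×448×3 prompt shape is an assumption (it yields 448·448·3 = 602,112 values, consistent with the ~600K trainable parameters stated above), and `apply_prompt` is a hypothetical helper name.

```python
import numpy as np

# Assumed resolution: 448x448 RGB, so the prompt has 448*448*3 = 602,112
# entries -- roughly the ~600K trainable parameters quoted in the abstract.
H, W, C = 448, 448, 3
rng = np.random.default_rng(0)

# The visual prompt is the ONLY trainable tensor; the base MLLM stays frozen.
# Here it is just randomly initialized for illustration.
visual_prompt = rng.normal(0.0, 0.02, size=(H, W, C)).astype(np.float32)

def apply_prompt(image: np.ndarray, prompt: np.ndarray) -> np.ndarray:
    """Combine image and prompt via elementwise addition, clipped to [0, 1]."""
    return np.clip(image + prompt, 0.0, 1.0)

# A dummy image in [0, 1]; the prompted result would be fed to mPLUG-Owl2
# together with the query "Rate the technical quality of the image."
image = rng.uniform(0.0, 1.0, size=(H, W, C)).astype(np.float32)
prompted = apply_prompt(image, visual_prompt)
print(prompted.shape, visual_prompt.size)  # (448, 448, 3) 602112
```

Because the combination is plain addition in pixel space, swapping in a different task only means loading a different ~600K-entry prompt tensor, with no change to the 7B-parameter backbone.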

## Overview

Pre-trained visual prompt checkpoints for **No-Reference Image Quality Assessment (NR-IQA)** using mPLUG-Owl2-7B. Achieves competitive performance with only **~600K parameters**, versus 7B+ for full fine-tuning.