Pre-converted ONNX models for No-Reference Image Quality Assessment (NR-IQA), exported from IQA-PyTorch and ready for deployment with ONNX Runtime (no PyTorch dependency needed).
| Model | Input Size | Files | Total Size | Score Range | Description |
|---|---|---|---|---|---|
| LIQE | 224×224 | clip_model.onnx + liqe_model.onnx + text_features.json | ~675 MB | 1–5 | CLIP-based learned quality & scene & distortion assessment |
| DBCNN | 224×224 | dbcnn_model.onnx | ~59 MB | 0–100 | Double-blind CNN with VGG16 + SCNN bilinear pooling |
| HyperIQA | 224×224 | hyperiqa_model.onnx | ~105 MB | 0–1 | Hyper-network based quality prediction on ResNet50 |
| MANIQA | 224×224 | maniqa_model.onnx | ~437 MB | continuous | Multi-dimension Attention Network with ViT backbone |
| MUSIQ | 224×224 | musiq_model.onnx | ~104 MB | 0–100 | Multi-scale Image Quality Transformer (single-scale export) |
| TReS | 224×224 | tres_model.onnx | ~575 MB | 0–100 | Transformer + ResNet ensemble with relative ranking |
| CLIPIQA | 224×224 | clipiqa_model.onnx | ~146 MB | 0–1 | CLIP-IQA+ with learned prompts and antialias RN50 |
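
Because the models report scores on different scales, comparing raw outputs across models can be misleading. The sketch below linearly rescales a raw score to [0, 1] using the ranges listed in the table. `SCORE_RANGES` and `rescale_score` are illustrative names, not part of this release, and the sketch assumes the listed ranges are nominal bounds (values outside them are clipped). MANIQA is left out because its range is listed only as continuous.

```python
# Nominal score ranges copied from the model table.
# MANIQA is omitted: the table lists its range only as "continuous".
SCORE_RANGES = {
    "LIQE": (1.0, 5.0),
    "DBCNN": (0.0, 100.0),
    "HyperIQA": (0.0, 1.0),
    "MUSIQ": (0.0, 100.0),
    "TReS": (0.0, 100.0),
    "CLIPIQA": (0.0, 1.0),
}


def rescale_score(model: str, score: float) -> float:
    """Map a raw score linearly onto [0, 1], clipping values outside the nominal range."""
    if model not in SCORE_RANGES:
        raise KeyError(f"No fixed score range listed for {model!r}")
    lo, hi = SCORE_RANGES[model]
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))
```

A rescaled score only puts the outputs on a common axis; it does not make two models' quality judgements equivalent.
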
Every model except MANIQA uses the ONNX external data format (a .onnx graph file plus a .onnx.data weights file). Both files must be in the same directory for the model to load.
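
A missing .onnx.data file typically surfaces only as a load error, so it can help to check for it up front. The following is a minimal sketch using only the standard library; `missing_external_data` is a hypothetical helper name, and the `single_file` flag is an assumption added here for models exported without external data.

```python
from pathlib import Path


def missing_external_data(model_path: str, single_file: bool = False) -> list[str]:
    """Return the expected files that are absent for a model stored as .onnx + .onnx.data."""
    model = Path(model_path)
    expected = [model]
    if not single_file:
        # The external weights file sits next to the graph file, with ".data" appended.
        expected.append(model.with_name(model.name + ".data"))
    return [str(p) for p in expected if not p.is_file()]
```

An empty list means every expected file is present; for a single-file export, pass `single_file=True`.
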
import onnxruntime as ort
import numpy as np
from PIL import Image
# Load model
sess = ort.InferenceSession("dbcnn_model.onnx")
# Preprocess: resize to 224x224, normalize with ImageNet mean/std
img = Image.open("test.jpg").convert("RGB").resize((224, 224))
x = np.array(img).astype(np.float32) / 255.0
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
x = (x - mean) / std
x = x.transpose(2, 0, 1)[np.newaxis] # (1, 3, 224, 224)
# Run inference
score = sess.run(None, {"input": x.astype(np.float32)})[0]
print(f"Quality score: {score.item():.4f}")
| Model | Normalization | Notes |
|---|---|---|
| DBCNN, HyperIQA, TReS | ImageNet: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] | Standard ImageNet |
| MANIQA | Inception: mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5] | |
| MUSIQ | [-1, 1]: (x - 0.5) / 0.5 | Equivalent to Inception |
| CLIPIQA | CLIP: mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711] | |
| LIQE | CLIP: same as CLIPIQA | Two-stage: CLIP encoder → LIQE scorer |
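
The normalization table can be turned into a single lookup so that one function serves every model. This is a sketch rather than the exporter's own preprocessing: `NORMALIZATION` and `normalize` are illustrative names, and the function assumes the image has already been resized or cropped to the model's input size and converted to an RGB uint8 array.

```python
import numpy as np

# Per-channel (mean, std) values taken from the normalization table.
_IMAGENET = ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
_HALF = ([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
_CLIP = ([0.48145466, 0.4578275, 0.40821073], [0.26862954, 0.26130258, 0.27577711])

NORMALIZATION = {
    "DBCNN": _IMAGENET, "HyperIQA": _IMAGENET, "TReS": _IMAGENET,
    "MANIQA": _HALF, "MUSIQ": _HALF,
    "CLIPIQA": _CLIP, "LIQE": _CLIP,
}


def normalize(rgb_uint8: np.ndarray, model: str) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB array into a (1, 3, H, W) float32 tensor."""
    mean, std = NORMALIZATION[model]
    x = rgb_uint8.astype(np.float32) / 255.0
    x = (x - np.asarray(mean, dtype=np.float32)) / np.asarray(std, dtype=np.float32)
    # HWC -> CHW, then add the batch dimension.
    return np.ascontiguousarray(x.transpose(2, 0, 1)[np.newaxis])
```
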
For multi-crop inference (recommended for HyperIQA, MANIQA, TReS), extract multiple 224×224 crops from the full-resolution image and average the scores.
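
One way to sketch multi-crop averaging is shown below. The crop positions are chosen uniformly at random, which is an assumption; nothing here specifies how crops should be placed or how many to use, so `n_crops=8` is arbitrary. `multi_crop_score` and `score_fn` are illustrative names: `score_fn` is any callable that maps a (1, 3, 224, 224) float32 tensor to a scalar score.

```python
import numpy as np


def multi_crop_score(x, score_fn, n_crops=8, size=224, seed=0):
    """Average score_fn over random size x size crops of a normalized (1, 3, H, W) tensor."""
    _, _, h, w = x.shape
    if h < size or w < size:
        raise ValueError(f"Image {h}x{w} is smaller than the {size}x{size} crop")
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_crops):
        # Upper bounds are exclusive, so the crop always fits inside the image.
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        crop = np.ascontiguousarray(x[:, :, top:top + size, left:left + size])
        scores.append(float(score_fn(crop)))
    return float(np.mean(scores))
```

With an ONNX Runtime session `sess` whose input is named `input`, `score_fn` could be `lambda crop: sess.run(None, {"input": crop})[0].item()`. Normalize the full-resolution image first, without resizing it to 224×224, so that the crops keep the original pixel scale.
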
LIQE requires a two-stage pipeline:
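import json
import numpy as np
import onnxruntime as ort
from PIL import Image

# CLIP normalization constants (same values as for CLIPIQA)
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def preprocess_clip(image):
    # Assumed helper: plain resize to 224x224 followed by CLIP normalization
    x = np.asarray(image.convert("RGB").resize((224, 224)), dtype=np.float32) / 255.0
    x = (x - CLIP_MEAN) / CLIP_STD
    return x.transpose(2, 0, 1)[np.newaxis]  # (1, 3, 224, 224)

# Load both models and text features
clip_sess = ort.InferenceSession("clip_model.onnx")
liqe_sess = ort.InferenceSession("liqe_model.onnx")
with open("text_features.json") as f:
    text_features = np.array(json.load(f), dtype=np.float32)

# Step 1: Extract image features with CLIP
image = Image.open("test.jpg")
img_input = preprocess_clip(image)
img_features = clip_sess.run(None, {"input": img_input})[0]

# Step 2: Score with LIQE
score = liqe_sess.run(None, {
    "image_features": img_features,
    "text_features": text_features,
})[0]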
All models verified against PyTorch source (absolute difference < 0.01):
| Model | PyTorch Score | ONNX Score | Abs Diff |
|---|---|---|---|
| LIQE | - | - | < 0.001 |
| DBCNN | - | - | 0.000011 |
| HyperIQA | - | - | 0.00000003 |
| MANIQA | - | - | 0.000019 |
| MUSIQ | - | - | 0.000025 |
| TReS | - | - | 0.000439 |
| CLIPIQA | - | - | 0.000002 |
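
The tolerance check itself is simple to reproduce. The sketch below encodes the reported differences from the table and flags any model at or above the 0.01 threshold; `REPORTED_ABS_DIFF` and `failing_models` are illustrative names, and the LIQE entry uses its reported upper bound (0.001) because no exact value is given. To re-run the comparison on your own images, compute `abs(pytorch_score - onnx_score)` per model and pass the results in the same form.

```python
# Absolute differences as reported in the verification table.
# LIQE is listed only as "< 0.001", so its upper bound is used here.
REPORTED_ABS_DIFF = {
    "LIQE": 0.001,
    "DBCNN": 0.000011,
    "HyperIQA": 0.00000003,
    "MANIQA": 0.000019,
    "MUSIQ": 0.000025,
    "TReS": 0.000439,
    "CLIPIQA": 0.000002,
}


def failing_models(abs_diffs: dict, tol: float = 0.01) -> list:
    """Return the models whose absolute difference is not strictly below tol."""
    return [name for name, diff in abs_diffs.items() if not diff < tol]
```
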
| Model | Paper | Conference | ONNX Files | Export Notes |
|---|---|---|---|---|
| LIQE | Blind Image Quality Assessment via Vision-Language Correspondence | CVPR 2023 | clip_model.onnx, liqe_model.onnx, text_features.json | Two-stage: CLIP ViT-B/32 image encoder + LIQE scoring head; text features pre-encoded to JSON |
| DBCNN | Blind Image Quality Assessment Using A Deep Bilinear CNN | IEEE TCSVT 2020 | dbcnn_model.onnx | VGG16 + SCNN with bilinear pooling; both sub-networks exported as single model |
| HyperIQA | Blindly Assess Image Quality in the Wild Guided by a Self-Adaptive Hyper Network | CVPR 2020 | hyperiqa_model.onnx | ResNet50 backbone with hyper-network; exported forward patch path only |
| MANIQA | MANIQA: Multi-dimension Attention Network for No-Reference Image Quality Assessment | CVPR 2022 NTIRE | maniqa_model.onnx | ViT-B/8 backbone; single-file export (no external data) |
| MUSIQ | MUSIQ: Multi-scale Image Quality Transformer | ICCV 2021 | musiq_model.onnx | Simplified to single-scale 224×224 input (original uses multi-scale patches) |
| TReS | No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency | WACV 2022 | tres_model.onnx | Eval-only path; dual-path consistency and flip branches removed |
| CLIPIQA | Exploring CLIP for Assessing the Look and Feel of Images | AAAI 2023 | clipiqa_model.onnx | CLIP-IQA+ variant with learned prompts baked into the model; antialias RN50 backbone |
clip_model.onnx # LIQE CLIP image encoder
clip_model.onnx.data
liqe_model.onnx # LIQE scoring head
liqe_model.onnx.data
text_features.json # LIQE pre-encoded text prompts
dbcnn_model.onnx # DBCNN
dbcnn_model.onnx.data
hyperiqa_model.onnx # HyperIQA
hyperiqa_model.onnx.data
maniqa_model.onnx # MANIQA (single file, no .data)
musiq_model.onnx # MUSIQ
musiq_model.onnx.data
tres_model.onnx # TReS
tres_model.onnx.data
clipiqa_model.onnx # CLIPIQA+
clipiqa_model.onnx.data
If you use these models, please cite the original papers and the IQA-PyTorch toolbox:
@article{chaofeng2022iqapytorch,
  title={IQA-PyTorch: PyTorch Toolbox for Image Quality Assessment},
  author={Chaofeng Chen and Jiadi Mo},
  year={2022},
  journal={arXiv preprint arXiv:2208.14818}
}
Apache 2.0. Original model weights are subject to their respective licenses.