# Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models
- 🤖 **Machine-Centric**: We bypass human perception to evaluate images from the perspective of the deep learning models that consume them.
- 🎯 **Task-Driven Metrics**: Directly measure how degradations such as blur, noise, or compression artifacts affect the performance of downstream vision tasks.
- 💡 **A New Paradigm**: MIQA offers a new lens for optimizing image processing pipelines in which machines make the final decision.
## ✨ Does MIQA Work?
*Performance improvement across tasks when filtering low-quality images using MIQA scores.*
## 🏆 Key Results
Our results provide clear evidence of MIQA's effectiveness across three representative computer vision tasks: classification, detection, and segmentation. The framework consistently identifies images that degrade model performance, and filtering these detrimental samples directly improves downstream outcomes, demonstrating the broad utility of a machine-centric approach. This transforms quality assessment from a passive metric into a proactive tool, safeguarding downstream models against the unpredictable image quality of real-world conditions and ensuring robust performance when it matters most.

## 🛠️ Installation Guide
### Step 1: Install Dependencies
To get started, you'll need to install two essential libraries: `mmcv` and `mmsegmentation`.

For the latest version of mmsegmentation, follow the installation guide here: MMSegmentation Installation Guide

Alternatively, you can install a specific version of mmsegmentation matched to your CUDA and PyTorch versions. Version compatibility details are available here: MMCV Installation Guide
### Step 2: Handle CUDA Version Compatibility
If your CUDA version is relatively recent (such as 12.x), you might encounter a version mismatch with `mmcv`. In that case, you may need to install a compatible version of `mmcv`.

For example, uninstall any existing versions and install a compatible one as follows:

```bash
pip uninstall mmcv mmcv-full -y
mim install "mmcv>=2.0.0rc4,<2.2.0"  # Example only: choose a version compatible with your CUDA and PyTorch setup.
```
### Step 3: Install Required Libraries

```bash
pip install -r requirements.txt
```
## 📦 Model Weights & Performance
### Where things live
| Role | Location |
|---|---|
| Application code (training, inference, evaluation) | This GitHub repository: github.com/XiaoqiWang/MIQA |
| Published RA-MIQA checkpoints (9 files) | Hugging Face model repo: xiaoqi-wang/miqa |
| MIQD-2.5M database | Hugging Face dataset: xiaoqi-wang/miqd-2.5m |
### Naming & cache

Checkpoint pattern on the Hub: `miqa_ra_miqa_{cls|det|ins}_{composite|consistency|accuracy}_metric.pth.tar`

Examples:

- `miqa_ra_miqa_cls_composite_metric.pth.tar`
- `miqa_ra_miqa_det_consistency_metric.pth.tar`
- `miqa_ra_miqa_ins_accuracy_metric.pth.tar`

On first run, `huggingface_hub` downloads checkpoints into `models/checkpoints/{composite|consistency|accuracy}_metric/`.
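As a quick sanity check, the nine checkpoint filenames and their expected cache locations follow mechanically from the pattern above; they can be enumerated in a few lines (the `cache_path` helper is illustrative, not part of the repository):

```python
from itertools import product
from pathlib import Path

TASKS = ("cls", "det", "ins")
METRICS = ("composite", "consistency", "accuracy")

# Enumerate all nine published RA-MIQA checkpoint filenames.
checkpoints = [
    f"miqa_ra_miqa_{task}_{metric}_metric.pth.tar"
    for task, metric in product(TASKS, METRICS)
]

def cache_path(filename: str) -> Path:
    """Expected local cache location after the first run (illustrative helper)."""
    metric = filename.rsplit("_metric", 1)[0].rsplit("_", 1)[-1]
    return Path("models/checkpoints") / f"{metric}_metric" / filename

print(len(checkpoints))  # → 9
print(cache_path("miqa_ra_miqa_cls_composite_metric.pth.tar"))
```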
## 🚀 Quick Start
### Assess a Single Image

Run MIQA inference on a single image using the command-line interface:

```bash
# Evaluate a single image for classification-oriented MIQA
python img_inference.py --input path/to/image.jpg --task cls --model ra_miqa
```
### Evaluate a Directory of Images

Process all images within a directory:

```bash
# Assess all images in a directory (e.g., detection-oriented MIQA)
python img_inference.py --input ./assets/demo_images/coco_demo --task det --model ra_miqa
```
### Save Results and Visualizations

To save outputs and generate visualized results:

```bash
# Save the predicted scores and visualization for a single image
python img_inference.py --input path/to/image.jpg --task cls --model ra_miqa --save-results --visualize

# Save batch results and generate visualizations for a directory
python img_inference.py --input ./assets/demo_images/imagenet_demo --task ins --save-results --visualize
```
## 🎬 Video Assessment
Video quality assessment offers two workflows:

1. **Frame-by-Frame Annotation**: Generates fully annotated videos for detailed visual inspection. Suitable for demos and qualitative analysis, but computationally intensive.
2. **Selective Sampling & Aggregation**: Samples frames to produce plots and structured data (`.json`) for efficient, quantitative analysis. Ideal for batch processing and reporting.
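The trade-off between the two workflows comes down to how many frames get scored. A minimal sketch of the evenly spaced sampling the second workflow relies on (the function is illustrative; the actual scripts may sample differently):

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick up to num_samples evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0:
        return []
    n = min(num_samples, total_frames)
    if n == 1:
        return [0]
    # Spread samples across the full duration, first and last frame included.
    step = (total_frames - 1) / (n - 1)
    return [round(i * step) for i in range(n)]

# e.g. sample 5 frames from a 600-frame (~20 s at 30 fps) clip
print(sample_frame_indices(600, 5))  # → [0, 150, 300, 449, 599]
```

Scoring 5 sampled frames instead of all 600 is what makes the aggregation workflow practical for batch reporting.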
### Analyze a Single Video (Frame-by-Frame Annotation)

Run MIQA video inference on one video and save the annotated output:

```bash
# Evaluate a single video using RA-MIQA (classification-oriented MIQA)
python video_annotator_inference.py --input assets/demo_video/brightness_distorted.mp4 --task cls --model ra_miqa
```
### Evaluate a Directory of Videos (Frame-by-Frame Annotation)

Process all videos within a given folder:

```bash
# Assess all videos in a directory for detection-oriented MIQA
python video_annotator_inference.py --input assets/demo_video/ --task det --model ra_miqa
```
The primary output is a new `.mp4` video file. This video shows the original footage playing alongside a dynamic side panel that displays the real-time quality score and a line chart that grows as the video progresses.
🎥 **Example: Frame-wise MIQA Predictions on Videos** (panels: Brightness Variation · Compression Artifacts · Minimal Perceptual Distortion)
### Analyze a Single Video (Selective Sampling & Aggregation)

For efficient, quantitative analysis, this script samples frames from the video instead of processing all of them. It is significantly faster and designed for generating analytical reports.

```bash
# Analyze a video, sample frames, and create a dual-granularity plot
python video_analytics_inference.py --input assets/demo_video/gaussian_distorted.mp4 --task ins --visualize --viz-granularity both
```
### Evaluate a Directory of Videos (Selective Sampling & Aggregation)

This workflow is highly optimized for batch processing.

```bash
# Analyze all videos in a directory, sampling 120 frames from each
python video_analytics_inference.py --input assets/demo_video/ --task det --video-frames 120 --visualize

# Generate a dual-granularity plot for a single video
python video_analytics_inference.py --input assets/demo_video/jpeg_distorted.mp4 --task det --visualize --viz-granularity both
```

`--viz-granularity` specifies the type of plot to generate: `both` creates a comprehensive, side-by-side comparison chart showing (1) the raw, frame-level quality scores and (2) the smoothed, per-second average quality scores.
This process does not create a new video. Instead, it generates two key outputs for each video analyzed:

- A `.png` image: a detailed time-series plot showing the quality score fluctuation over the video's duration.
- A `.json` file: a structured data file containing per-second aggregated scores, overall statistics (average, min, max, std. dev.), and video metadata.
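The per-second aggregation in the `.json` output can be sketched as follows (the function, field names, and rounding here are illustrative, not the script's exact schema):

```python
from statistics import mean, pstdev

def aggregate_scores(frame_scores: list[float], fps: int) -> dict:
    """Aggregate frame-level quality scores into per-second averages and overall stats."""
    per_second = [
        round(mean(frame_scores[i:i + fps]), 4)
        for i in range(0, len(frame_scores), fps)
    ]
    return {
        "per_second": per_second,
        "average": round(mean(frame_scores), 4),
        "min": min(frame_scores),
        "max": max(frame_scores),
        "std": round(pstdev(frame_scores), 4),
    }

# Hypothetical frame scores for a 2-second clip at 3 fps
scores = [0.8, 0.7, 0.9, 0.2, 0.3, 0.1]
print(aggregate_scores(scores, fps=3))
```

In the real script these aggregates are serialized to the `.json` report alongside the video metadata.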
## 📈 Training and Evaluation
### Training

```bash
CUDA_VISIBLE_DEVICES=0,1 python train.py \
    --dataset 'miqa_cls' \
    --path_miqa_cls 'path/to/datasets_miqa_cls' \
    --train_split_file '../data/dataset_splitting/miqa_cls_train.csv' \
    --val_split_file '../data/dataset_splitting/miqa_cls_val.csv' \
    --metric_type 'composite' --loss_name 'mse' --is_two_transform \
    -a 'RA-MIQA' --pretrained --transform_type 'simple_transform' \
    -b 256 --epochs 5 --warmup_epochs 1 --validate_num 2 --lr 1e-4 \
    --image_size 288 --crop_size 224 --workers 8 -p 100 \
    --multiprocessing-distributed --world-size 1 --rank 0
```
More training scripts are available in the `scripts/` directory.
### Evaluation on Standard Benchmarks

```bash
# Evaluate on the miqa_cls val set
python evaluate.py --model_name ra_miqa --train_dataset cls --test_dataset cls --metric_type composite

# Cross-dataset evaluation: RA-MIQA trained on miqa_cls, tested on miqa_det
python evaluate.py --model_name ra_miqa --train_dataset cls --test_dataset det --metric_type composite
```
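The benchmark tables below report SRCC, PLCC, and RMSE between predicted and ground-truth quality scores. For reference, these metrics can be computed without SciPy in a few lines (a sketch that assumes no tied ranks in the Spearman case; the scores are hypothetical):

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson on the ranks (no-ties case)."""
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0.0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

def rmse(x, y):
    """Root-mean-square error between predictions and ground truth."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

pred  = [0.81, 0.35, 0.62, 0.90, 0.12]   # hypothetical model scores
truth = [0.80, 0.40, 0.55, 0.95, 0.10]   # hypothetical ground-truth scores
print(round(spearman(pred, truth), 4))   # → 1.0 (identical ordering)
```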
## 📊 Benchmarks
Table 1: Performance Benchmark on Composite Score
| Category | Method | Image Classification | Object Detection | Instance Segmentation | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | SRCC ↑ | PLCC ↑ | KRCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | KRCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | KRCC ↑ | RMSE ↓ |
| HVS-based | PSNR | 0.2388 | 0.2292 | 0.1661 | 0.2928 | 0.3176 | 0.3456 | 0.2148 | 0.2660 | 0.3242 | 0.3530 | 0.2196 | 0.2553 |
| SSIM | 0.3027 | 0.2956 | 0.2119 | 0.2874 | 0.4390 | 0.4505 | 0.3011 | 0.2531 | 0.4391 | 0.4512 | 0.3011 | 0.2435 | |
| VSI | 0.3592 | 0.3520 | 0.2520 | 0.2816 | 0.4874 | 0.4940 | 0.3355 | 0.2465 | 0.4919 | 0.4985 | 0.3392 | 0.2365 | |
| LPIPS | 0.3214 | 0.3280 | 0.2258 | 0.2842 | 0.5264 | 0.5376 | 0.3697 | 0.2390 | 0.5342 | 0.5453 | 0.3754 | 0.2287 | |
| DISTS | 0.3878 | 0.3804 | 0.2724 | 0.2782 | 0.5266 | 0.5352 | 0.3659 | 0.2395 | 0.5363 | 0.5450 | 0.3738 | 0.2288 | |
| HyperIQA | 0.2496 | 0.2279 | 0.1741 | 0.2929 | 0.4462 | 0.4463 | 0.3031 | 0.2537 | 0.4456 | 0.4518 | 0.3031 | 0.2434 | |
| MANIQA | 0.3403 | 0.3255 | 0.2387 | 0.2844 | 0.4574 | 0.4617 | 0.3124 | 0.2515 | 0.4636 | 0.4680 | 0.3176 | 0.2411 | |
| Machine-based | ResNet-18 | 0.5131 | 0.5427 | 0.3715 | 0.2527 | 0.7541 | 0.7734 | 0.5625 | 0.1797 | 0.7582 | 0.7790 | 0.5674 | 0.1711 |
| ResNet-50 | 0.5581 | 0.5797 | 0.4062 | 0.2451 | 0.7743 | 0.7925 | 0.5824 | 0.1729 | 0.7729 | 0.7933 | 0.5826 | 0.1661 | |
| EfficientNet-b1 | 0.5901 | 0.6130 | 0.4320 | 0.2377 | 0.7766 | 0.7950 | 0.5859 | 0.1720 | 0.7808 | 0.7999 | 0.5918 | 0.1637 | |
| EfficientNet-b5 | 0.6330 | 0.6440 | 0.4680 | 0.2301 | 0.7866 | 0.8041 | 0.5971 | 0.1685 | 0.7899 | 0.8074 | 0.6013 | 0.1610 | |
| ViT-small | 0.5998 | 0.6161 | 0.4407 | 0.2370 | 0.7992 | 0.8142 | 0.6099 | 0.1646 | 0.7968 | 0.8139 | 0.6083 | 0.1585 | |
| RA-MIQA | **0.7003** | **0.6989** | **0.5255** | **0.2152** | **0.8125** | **0.8264** | **0.6263** | **0.1596** | **0.8188** | **0.8340** | **0.6333** | **0.1505** | |
Table 2: Consistency & Accuracy Score Benchmark
| Method | Image Classification | Object Detection | Instance Segmentation | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy Score | Consistency Score | Accuracy Score | Consistency Score | Accuracy Score | Consistency Score | |||||||||||||
| | SRCC ↑ | PLCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | RMSE ↓ | SRCC ↑ | PLCC ↑ | RMSE ↓ |
| HVS-based Methods | ||||||||||||||||||
| PSNR | 0.2034 | 0.1620 | 0.3541 | 0.2927 | 0.2812 | 0.2692 | 0.2234 | 0.2449 | 0.2747 | 0.3712 | 0.3933 | 0.2839 | 0.2182 | 0.2398 | 0.2616 | 0.3796 | 0.4061 | 0.2770 |
| SSIM | 0.2529 | 0.2101 | 0.3509 | 0.3740 | 0.3663 | 0.2610 | 0.3434 | 0.3419 | 0.2662 | 0.5128 | 0.5130 | 0.2651 | 0.3271 | 0.3284 | 0.2545 | 0.5174 | 0.5204 | 0.2589 |
| VSI | 0.3020 | 0.2515 | 0.3473 | 0.4392 | 0.4336 | 0.2528 | 0.3799 | 0.3685 | 0.2634 | 0.5700 | 0.5571 | 0.2565 | 0.3703 | 0.3645 | 0.2509 | 0.5757 | 0.5749 | 0.2481 |
| LPIPS | 0.2680 | 0.2355 | 0.3488 | 0.3927 | 0.4032 | 0.2567 | 0.4064 | 0.3987 | 0.2598 | 0.6196 | 0.6232 | 0.2415 | 0.3972 | 0.3941 | 0.2476 | 0.6300 | 0.6344 | 0.2344 |
| DISTS | 0.3291 | 0.2768 | 0.3448 | 0.4683 | 0.4628 | 0.2487 | 0.4089 | 0.3999 | 0.2597 | 0.6174 | 0.6178 | 0.2429 | 0.4069 | 0.4012 | 0.2468 | 0.6255 | 0.6270 | 0.2362 |
| HyperIQA | 0.2100 | 0.1649 | 0.3540 | 0.2966 | 0.2777 | 0.2695 | 0.3646 | 0.3545 | 0.2649 | 0.5009 | 0.4943 | 0.2684 | 0.3486 | 0.3442 | 0.2530 | 0.5056 | 0.4995 | 0.2626 |
| MANIQA | 0.2924 | 0.2435 | 0.3481 | 0.3963 | 0.3870 | 0.2587 | 0.3839 | 0.3823 | 0.2618 | 0.4991 | 0.4975 | 0.2679 | 0.3755 | 0.3749 | 0.2498 | 0.5096 | 0.5098 | 0.2608 |
| Machine-based Methods | ||||||||||||||||||
| ResNet-50 | 0.4734 | 0.4411 | 0.3221 | 0.5989 | 0.6551 | 0.2119 | 0.6955 | 0.6898 | 0.2051 | 0.8252 | 0.8457 | 0.1648 | 0.6863 | 0.6847 | 0.1964 | 0.8320 | 0.8480 | 0.1607 |
| EfficientNet-b5 | 0.5586 | 0.5149 | 0.3076 | 0.6774 | 0.7168 | 0.1956 | 0.7042 | 0.6991 | 0.2026 | 0.8353 | 0.8530 | 0.1612 | 0.6933 | 0.6949 | 0.1938 | 0.8419 | 0.8564 | 0.1565 |
| ViT-small | 0.5788 | 0.5197 | 0.3066 | 0.6798 | 0.7189 | 0.1950 | 0.7121 | 0.7052 | 0.2008 | 0.8459 | 0.8620 | 0.1566 | 0.7168 | 0.7146 | 0.1885 | 0.8487 | 0.8616 | 0.1539 |
| RA-MIQA | **0.6573** | **0.5823** | **0.2917** | **0.7707** | **0.7866** | **0.1732** | **0.7448** | **0.7370** | **0.1915** | **0.8526** | **0.8692** | **0.1527** | **0.7363** | **0.7327** | **0.1834** | **0.8632** | **0.8756** | **0.1464** |
## 📚 Citation
If you find this work useful in your research, please consider citing:
```bibtex
@article{wang2025miqa,
  title={Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models},
  author={Wang, Xiaoqi and Zhang, Yun and Lin, Weisi},
  journal={arXiv preprint arXiv:2508.19850},
  year={2025}
}
```
## 📧 Contact
- Project Maintainer: Xiaoqi Wang
- Issues: Please use GitHub Issues for bug reports and feature requests