SA-IQA Model

SA-IQA is a multimodal image quality assessment model released with “Beyond Pixels: Benchmarking and Reward-Based Assessing Framework for Visual Spatial Aesthetics.”

The released final checkpoint is sa-iqa-prompt4, a fine-tuned model based on Ovis2.5-9B for assessing interior-image spatial aesthetics.

Hugging Face Release Layout

This Hugging Face repository is released as a full model bundle. Download the whole repository to ./SA-IQA-model when using it with the SA-IQA codebase.
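
A minimal sketch of that download step, assuming the `huggingface_hub` package is installed and that the repository id is `AliHome3D/SA-IQA-model` (the id shown on this model page). The helper name `download_bundle` and its `downloader` parameter are illustrative and not part of the SA-IQA codebase.

```python
from pathlib import Path


def download_bundle(target="./SA-IQA-model",
                    repo_id="AliHome3D/SA-IQA-model",  # assumed repository id
                    downloader=None):
    """Download the full model bundle into `target` and return the local path."""
    if downloader is None:
        # Imported lazily so the helper can be defined without huggingface_hub installed.
        from huggingface_hub import snapshot_download
        downloader = snapshot_download
    # snapshot_download fetches every file in the repository, which matches the
    # "download the whole repository" instruction for this bundle.
    local_path = downloader(repo_id=repo_id, local_dir=str(target))
    return Path(local_path)
```

Calling `download_bundle()` from the SA-IQA codebase root should leave both model directories under ./SA-IQA-model. The bundle contains two full sets of weights, so expect a large download.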

The sa-iqa-prompt4/ directory holds the final fine-tuned checkpoint used for inference. The Ovis2.5-9B/ directory holds a bundled copy of the base model, which tools/train_sft.sh uses for training and reproducibility.

Because this repository contains two model directories, automatic loading from the repository root is not expected to work. Load the fine-tuned checkpoint from SA-IQA-model/sa-iqa-prompt4, or pass that path through the SA-IQA inference script with --model_path.
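
As a rough illustration of the point above, the following sketch resolves the nested checkpoint directory and fails early if it is missing. The function name `resolve_checkpoint` is hypothetical, and the check for config.json is an assumption based on the bundle layout listed in this card.

```python
from pathlib import Path


def resolve_checkpoint(bundle_root="SA-IQA-model", name="sa-iqa-prompt4"):
    """Return the nested fine-tuned checkpoint directory inside the bundle."""
    checkpoint = Path(bundle_root) / name
    # Per the bundle layout, config.json lives inside each model directory,
    # not at the bundle root, so loaders must point at the nested directory.
    if not (checkpoint / "config.json").is_file():
        raise FileNotFoundError(f"No config.json under {checkpoint}")
    return checkpoint
```

The returned path can then be converted with `str(...)` and passed to the inference script through --model_path.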

Model Details

Model Description

  • Model type: multimodal vision-language model for image quality assessment
  • Base model: Ovis2.5-9B
  • Fine-tuned checkpoint: sa-iqa-prompt4
  • Input: image plus a dimension-specific text prompt
  • Output: textual quality label and token log-probabilities used to compute a continuous score
  • Dimensions: distortion, harmony, layout, lighting
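
This card does not spell out how the token log-probabilities become a continuous score. One common approach, sketched here purely as an assumption, is to softmax the log-probabilities of the candidate label tokens and take the expected value over numeric levels. The label names, the level values, and the function `expected_score` are all hypothetical.

```python
import math

# Hypothetical label set and numeric levels; the labels SA-IQA actually emits
# are defined by its prompts and are not specified in this card.
LEVELS = {"bad": 1.0, "poor": 2.0, "fair": 3.0, "good": 4.0, "excellent": 5.0}


def expected_score(label_logprobs, levels=LEVELS):
    """Softmax the label log-probabilities and return the expected level."""
    labels = [label for label in levels if label in label_logprobs]
    if not labels:
        raise ValueError("no known labels in label_logprobs")
    # Subtract the largest log-probability before exponentiating for stability.
    peak = max(label_logprobs[label] for label in labels)
    weights = {label: math.exp(label_logprobs[label] - peak) for label in labels}
    total = sum(weights.values())
    return sum(levels[label] * weights[label] / total for label in labels)
```

With this scheme, a model that splits its probability evenly between two adjacent labels yields a score halfway between their levels, which is what makes the output continuous rather than categorical.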

Intended Use

SA-IQA is intended for research, evaluation, and application use, including:

  • spatial aesthetic assessment of interior images
  • image quality benchmarking on SA-BENCH
  • reward-model research for image generation and best-of-N selection
  • comparison of prompt variants for spatial aesthetic assessment
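
For the best-of-N use case listed above, selection reduces to scoring each candidate image and keeping the highest-scoring one. The sketch below assumes a caller-supplied `score_fn` that returns a SA-IQA score for one image; both `best_of_n` and `score_fn` are illustrative names, not part of the released code.

```python
def best_of_n(candidates, score_fn):
    """Return (best_candidate, best_score); ties keep the earliest candidate."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    # Score every candidate once, then pick the maximum by score only.
    scored = [(score_fn(candidate), candidate) for candidate in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```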

Out-of-Scope Use

The model is not intended for:

  • universal aesthetic judgment outside the interior-scene domain
  • safety-critical or legally binding decision making

Usage

Use the SA-IQA inference script from the code repository:

python tools/infer.py --prompt_version 4 --mode all --dimension lighting

When running from the release bundle root, the default model path is:

SA-IQA-model/sa-iqa-prompt4

If you downloaded this Hugging Face repository to another local path, pass the nested sa-iqa-prompt4 checkpoint path through --model_path.
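
A small sketch of how the command above could be assembled for each dimension with an explicit --model_path. It uses only the flags shown in this card and assumes that --dimension accepts each of the four dimension names listed under Model Description. The helper `build_infer_command` and the example path are hypothetical.

```python
import shlex

DIMENSIONS = ["distortion", "harmony", "layout", "lighting"]


def build_infer_command(dimension, model_path=None, prompt_version=4, mode="all"):
    """Build the argv list for tools/infer.py for one assessment dimension."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    argv = ["python", "tools/infer.py",
            "--prompt_version", str(prompt_version),
            "--mode", mode,
            "--dimension", dimension]
    if model_path is not None:
        # Only needed when the bundle is not at the default location.
        argv += ["--model_path", str(model_path)]
    return argv


# Hypothetical download location, used only for illustration.
example_path = "/data/models/SA-IQA-model/sa-iqa-prompt4"
for dim in DIMENSIONS:
    print(shlex.join(build_infer_command(dim, model_path=example_path)))
```

Each printed line can be run from the code repository root, or the argv list can be handed to `subprocess.run` directly.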

Release Bundle Structure

SA-IQA-model/
├── LICENSE
├── README.md
├── Ovis2.5-9B/                  # Base model used by training scripts
│   ├── LICENSE
│   ├── NOTICE
│   ├── config.json
│   ├── modeling_ovis2_5.py
│   ├── model-00001-of-00004.safetensors
│   ├── model-00002-of-00004.safetensors
│   ├── model-00003-of-00004.safetensors
│   ├── model-00004-of-00004.safetensors
│   └── ...
└── sa-iqa-prompt4/              # Fine-tuned checkpoint used for inference
    ├── config.json
    ├── modeling_ovis2_5.py
    ├── model-00001-of-00004.safetensors
    ├── model-00002-of-00004.safetensors
    ├── model-00003-of-00004.safetensors
    ├── model-00004-of-00004.safetensors
    └── ...

Training Data

The model is fine-tuned and evaluated on SA-BENCH, a 17,768-example benchmark for spatial aesthetics in interior scenes.

Limitations

  • The model is designed for interior images and may not generalize to other image domains.
  • Predictions reflect the SA-BENCH annotation protocol and the prompt design, so scores may shift under a different protocol or prompt.
  • The output should be treated as an assessment signal, not as a definitive human aesthetic judgment.

License

The released SA-IQA model weights are licensed under the Apache License 2.0. See LICENSE for the full license text.

This model is fine-tuned from Ovis2.5-9B, which is also released under the Apache License 2.0. When redistributing or modifying this model, retain attribution and relevant notices from the base model:

  • Ovis2.5-9B/LICENSE
  • Ovis2.5-9B/NOTICE

Citation

If you use this model, please cite:

@inproceedings{gao2025beyond,
  title={Beyond Pixels: Benchmarking and Reward-Based Assessing Framework for Visual Spatial Aesthetics},
  author={Gao, Yuan and Song, Jin and Fei, Yiyun and Li, Gongzhe and Yang, Ruigao},
  booktitle={CVPR 2025 Workshop},
  year={2025}
}