MIRO: ablations and single-reward specialists
This repository hosts the 15 ablation / baseline checkpoints that accompany the main MIRO release at nicolas-dufour/miro.
Dufour, Degeorge, Ghosh, Kalogeiton, Picard. MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency. ICML 2026.
Paper · Project page · Code
pip install miro-t2i
Layout
Every variant lives in its own subfolder and is loaded via the variant= argument:
from miro import MiroPipeline
import torch

pipe = MiroPipeline.from_pretrained(
    "nicolas-dufour/miro-ablations",
    variant="miro-no-clip",  # the subfolder name
).to("cuda", torch.float16)
Each MiroPipeline instance exposes pipe.coherence_keys, which lists the reward axes the loaded checkpoint was trained on. Passing a key that is not in this list through reward_targets={...} raises ValueError.
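The same check can be mirrored on the caller's side before starting a long sampling run. The sketch below is plain Python and does not import miro: `check_reward_targets` is a hypothetical helper, not part of the library, and the example key tuple assumes that a checkpoint trained without CLIP exposes the remaining six key names listed in the specialists table of this README. The target values are placeholders, since the valid range of a target is not specified here. In real use, `pipe.coherence_keys` would be passed as the first argument.

```python
def check_reward_targets(coherence_keys, reward_targets):
    """Return reward_targets unchanged, or raise ValueError naming unknown keys.

    Hypothetical helper (not part of miro). It mirrors the documented
    behaviour that a key outside pipe.coherence_keys is rejected.
    """
    unknown = sorted(set(reward_targets) - set(coherence_keys))
    if unknown:
        raise ValueError(
            f"unknown reward keys {unknown}; "
            f"expected a subset of {sorted(coherence_keys)}"
        )
    return reward_targets


# Assumed key tuple for miro-no-clip: every key from the specialists table
# except clip_score.
keys_no_clip = (
    "aesthetic_score",
    "hpsv2_score",
    "image_reward_score",
    "pick_a_score_score",
    "sciscore_score",
    "vqa_score",
)

# Accepted: both keys are in the tuple (1.0 is a placeholder value).
ok = check_reward_targets(keys_no_clip, {"aesthetic_score": 1.0, "vqa_score": 1.0})

# Rejected: clip_score is the ablated reward.
try:
    check_reward_targets(keys_no_clip, {"clip_score": 1.0})
    rejected = False
except ValueError:
    rejected = True
```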
Variants
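Reward ablations (8): full MIRO recipe minus one signal
Same architecture as the main MIRO, with one signal removed so you can isolate its contribution. Seven variants drop a single reward and keep the training data unchanged; miro-no-synthetic-captions keeps all seven rewards but trains on the original captions only.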
| Subfolder | What's ablated | coherence_keys size |
|---|---|---|
| miro-no-synthetic-captions | Trained on original captions only (no synthetic-caption augmentation) | 7 |
| miro-no-aesthetic | LAION aesthetic-quality reward | 6 |
| miro-no-clip | CLIP text-image alignment | 6 |
| miro-no-hpsv2 | HPSv2 human preference | 6 |
| miro-no-image-reward | ImageReward | 6 |
| miro-no-pickscore | PickScore human preference | 6 |
| miro-no-sciscore | SciScore | 6 |
| miro-no-vqa | VQAScore | 6 |
Single-reward specialists (7): paper baselines
Each is trained on only one reward signal; these are the controls the paper compares MIRO against. pipe.coherence_keys is a 1-tuple for these.
| Subfolder | The one reward it knows about |
|---|---|
| miro-only-aesthetic | aesthetic_score |
| miro-only-clip | clip_score |
| miro-only-hpsv2 | hpsv2_score |
| miro-only-image-reward | image_reward_score |
| miro-only-pickscore | pick_a_score_score |
| miro-only-sciscore | sciscore_score |
| miro-only-vqa | vqa_score |
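The two tables can be folded into a small lookup, which may help when looping over all 15 checkpoints. This is a sketch built only from the subfolder names and key names listed above. `REWARD_KEYS`, `expected_keys` and `VARIANTS` are illustrative names, and the assumption that an ablation's remaining keys reuse the specialist key names is not confirmed by the library.

```python
# Subfolder suffix -> reward key, copied from the specialists table.
REWARD_KEYS = {
    "aesthetic": "aesthetic_score",
    "clip": "clip_score",
    "hpsv2": "hpsv2_score",
    "image-reward": "image_reward_score",
    "pickscore": "pick_a_score_score",
    "sciscore": "sciscore_score",
    "vqa": "vqa_score",
}


def expected_keys(variant):
    """Return the set of reward keys a variant is expected to expose."""
    all_keys = set(REWARD_KEYS.values())
    if variant == "miro-no-synthetic-captions":
        return all_keys  # caption ablation: all seven rewards are kept
    if variant.startswith("miro-only-"):
        return {REWARD_KEYS[variant[len("miro-only-"):]]}
    if variant.startswith("miro-no-"):
        return all_keys - {REWARD_KEYS[variant[len("miro-no-"):]]}
    raise KeyError(f"unknown variant: {variant}")


# All subfolder names: 1 caption ablation + 7 reward ablations + 7 specialists.
VARIANTS = (
    ["miro-no-synthetic-captions"]
    + [f"miro-no-{slug}" for slug in REWARD_KEYS]
    + [f"miro-only-{slug}" for slug in REWARD_KEYS]
)
```

In real use, each name in `VARIANTS` would be passed as `variant=` to `MiroPipeline.from_pretrained`, and `set(pipe.coherence_keys)` compared against `expected_keys(name)`.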
What's in each subfolder
miro-<variant>/
├── model.safetensors     # fp32 EMA weights (~1.4 GB), ready for finetuning
├── config.json           # network kwargs + sampler defaults
├── uncond_embedding.npy  # precomputed FLAN-T5-XL unconditional embedding
├── teaser.jpg            # shared masonry gallery
└── README.md             # per-variant model card
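After downloading a subfolder, a quick completeness check against the layout above can catch a partial copy before loading. The sketch below uses only the standard library. `EXPECTED_FILES` and `missing_files` are hypothetical names introduced for illustration, and the file list is copied from the tree above.

```python
from pathlib import Path

# File names copied from the subfolder layout above.
EXPECTED_FILES = (
    "model.safetensors",
    "config.json",
    "uncond_embedding.npy",
    "teaser.jpg",
    "README.md",
)


def missing_files(variant_dir):
    """Return the expected file names that are absent from variant_dir."""
    root = Path(variant_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

One way to get a local copy is `huggingface_hub.snapshot_download` with an `allow_patterns` filter such as `"miro-no-clip/*"`; calling `missing_files` on the resulting `miro-no-clip` directory should then return an empty list.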
Citation
@inproceedings{dufour2026miro,
title = {{MIRO}: {M}ult{I}-{R}eward c{O}nditioned pretraining improves {T2I} quality and efficiency},
author = {Dufour, Nicolas and Degeorge, Lucas and Ghosh, Arijit and Kalogeiton, Vicky and Picard, David},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2026}
}
License
MIT; see LICENSE.
