license: mit
library_name: miro-t2i
tags:
  - text-to-image
  - diffusion
  - flow-matching
  - miro
  - reward-conditioning
  - ablations
pipeline_tag: text-to-image

MIRO: ablations and single-reward specialists

This repository hosts the 15 ablation / baseline checkpoints that accompany the main MIRO release at nicolas-dufour/miro.

Dufour, Degeorge, Ghosh, Kalogeiton, Picard. MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency. ICML 2026.

📄 Paper · 🌐 Project page · 💻 Code · 🐍 pip install miro-t2i

MIRO samples

Layout

Every variant lives in its own subfolder and is loaded via the variant= argument:

from miro import MiroPipeline
import torch

pipe = MiroPipeline.from_pretrained(
    "nicolas-dufour/miro-ablations",
    variant="miro-no-clip",            # ← the subfolder name
).to("cuda", torch.float16)

Each MiroPipeline instance exposes pipe.coherence_keys, which lists the reward axes the loaded checkpoint was trained on. reward_targets={...} will raise ValueError if you pass a key that's not in this list.
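
To catch a bad key before calling the pipeline, the check described above can be mirrored in plain Python. This is a minimal sketch: check_reward_targets is a hypothetical helper, not part of miro-t2i, and the target value 1.0 is only a placeholder since the valid range of reward targets is not stated here.

```python
from __future__ import annotations

from typing import Iterable, Mapping


def check_reward_targets(
    coherence_keys: Iterable[str],
    reward_targets: Mapping[str, float],
) -> dict[str, float]:
    """Return reward_targets as a dict, or raise ValueError on unknown keys.

    Hypothetical helper: it only mirrors the documented behaviour
    (unknown key -> ValueError) and does not call the library.
    """
    allowed = set(coherence_keys)
    unknown = sorted(key for key in reward_targets if key not in allowed)
    if unknown:
        raise ValueError(
            f"unknown reward keys {unknown}; expected a subset of {sorted(allowed)}"
        )
    return dict(reward_targets)


# A single-reward specialist exposes a 1-tuple of keys.
keys = ("clip_score",)
ok = check_reward_targets(keys, {"clip_score": 1.0})  # 1.0 is a placeholder value
```

In practice you would pass pipe.coherence_keys as the first argument and forward the returned dict as reward_targets.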

Variants

Reward ablations (8): full MIRO recipe minus one signal

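Same architecture and training data as the main MIRO, with one signal turned off so you can isolate its contribution. Seven variants each drop one reward; miro-no-synthetic-captions keeps all seven rewards and drops the synthetic-caption augmentation instead.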

| Subfolder | What's ablated | coherence_keys size |
| --- | --- | --- |
| miro-no-synthetic-captions | Trained on original captions only (no synthetic-caption augmentation) | 7 |
| miro-no-aesthetic | LAION aesthetic-quality reward | 6 |
| miro-no-clip | CLIP text-image alignment | 6 |
| miro-no-hpsv2 | HPSv2 human preference | 6 |
| miro-no-image-reward | ImageReward | 6 |
| miro-no-pickscore | PickScore human preference | 6 |
| miro-no-sciscore | SciScore | 6 |
| miro-no-vqa | VQAScore | 6 |

Single-reward specialists (7): paper baselines

Each is trained on only one reward signal; these are the controls the paper compares MIRO against. pipe.coherence_keys is a 1-tuple for these.

| Subfolder | The one reward it knows about |
| --- | --- |
| miro-only-aesthetic | aesthetic_score |
| miro-only-clip | clip_score |
| miro-only-hpsv2 | hpsv2_score |
| miro-only-image-reward | image_reward_score |
| miro-only-pickscore | pick_a_score_score |
| miro-only-sciscore | sciscore_score |
| miro-only-vqa | vqa_score |
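
The two tables can be folded into a small lookup that predicts which reward keys a variant should accept. This is a sketch under assumptions: it takes the seven key names from the specialists table and assumes each reward ablation exposes the same names minus the dropped one. The order of the real pipe.coherence_keys is not documented here, so compare as sets. expected_keys and the constants are hypothetical names, not part of the library.

```python
from __future__ import annotations

# Subfolder suffix -> reward key, copied from the specialists table.
SUFFIX_TO_KEY = {
    "aesthetic": "aesthetic_score",
    "clip": "clip_score",
    "hpsv2": "hpsv2_score",
    "image-reward": "image_reward_score",
    "pickscore": "pick_a_score_score",
    "sciscore": "sciscore_score",
    "vqa": "vqa_score",
}
ALL_REWARD_KEYS = tuple(SUFFIX_TO_KEY.values())

# The 15 subfolders: 1 caption ablation + 7 reward ablations + 7 specialists.
VARIANTS = (
    ["miro-no-synthetic-captions"]
    + [f"miro-no-{suffix}" for suffix in SUFFIX_TO_KEY]
    + [f"miro-only-{suffix}" for suffix in SUFFIX_TO_KEY]
)


def expected_keys(variant: str) -> tuple[str, ...]:
    """Reward keys a variant is expected to expose (assumed, not verified)."""
    if variant == "miro-no-synthetic-captions":
        return ALL_REWARD_KEYS  # all seven rewards, captions differ instead
    if variant.startswith("miro-only-"):
        return (SUFFIX_TO_KEY[variant[len("miro-only-"):]],)
    if variant.startswith("miro-no-"):
        dropped = SUFFIX_TO_KEY[variant[len("miro-no-"):]]
        return tuple(key for key in ALL_REWARD_KEYS if key != dropped)
    raise KeyError(f"unknown variant: {variant}")
```

After loading a checkpoint, set(pipe.coherence_keys) == set(expected_keys(variant)) would be a quick sanity check on the assumption.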

What's in each subfolder

miro-<variant>/
├── model.safetensors      # fp32 EMA weights (~1.4 GB), ready for finetuning
├── config.json            # network kwargs + sampler defaults
├── uncond_embedding.npy   # precomputed FLAN-T5-XL unconditional embedding
├── teaser.jpg             # shared masonry gallery
└── README.md              # per-variant model card
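
If you download a subfolder manually, a quick completeness check against the layout above can save a failed load later. A minimal sketch, assuming the subfolder already exists as a local directory; missing_files is a hypothetical helper that uses only the standard library.

```python
from __future__ import annotations

from pathlib import Path

# File names taken from the layout listed above.
EXPECTED_FILES = (
    "model.safetensors",
    "config.json",
    "uncond_embedding.npy",
    "teaser.jpg",
    "README.md",
)


def missing_files(variant_dir: str | Path) -> list[str]:
    """Return the expected file names that are absent from variant_dir."""
    root = Path(variant_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every file in the layout is present; the function does not validate the contents of any file.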

Citation

@inproceedings{dufour2026miro,
  title     = {{MIRO}: {M}ult{I}-{R}eward c{O}nditioned pretraining improves {T2I} quality and efficiency},
  author    = {Dufour, Nicolas and Degeorge, Lucas and Ghosh, Arijit and Kalogeiton, Vicky and Picard, David},
  booktitle = {International Conference on Machine Learning (ICML)},
  year      = {2026}
}

License

MIT; see LICENSE.