	---
	license: mit
	library_name: miro-t2i
	tags:
	- text-to-image
	- diffusion
	- flow-matching
	- miro
	- reward-conditioning
	- ablations
	pipeline_tag: text-to-image
	---

	# MIRO — ablations and single-reward specialists

	This repository hosts the 15 ablation / baseline checkpoints that accompany the main MIRO release at [`nicolas-dufour/miro`](https://huggingface.co/nicolas-dufour/miro).

	> Dufour, Degeorge, Ghosh, Kalogeiton, Picard. _MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency_. ICML 2026.
	>
	> 📄 [Paper](https://arxiv.org/abs/2510.25897) · 🌐 [Project page](https://nicolas-dufour.github.io/miro/) · 💻 [Code](https://github.com/nicolas-dufour/miro) · 🐍 `pip install miro-t2i`

	<table style="width:100%;border-collapse:separate;border-spacing:4px">
	<tr><td><img src="https://huggingface.co/nicolas-dufour/miro/resolve/main/teaser.jpg" alt="MIRO samples"></td></tr>
	</table>

	## Layout

	Every variant lives in its own subfolder and is loaded via the `variant=` argument:

	```python
	from miro import MiroPipeline
	import torch

	pipe = MiroPipeline.from_pretrained(
	    "nicolas-dufour/miro-ablations",
	    variant="miro-no-clip",  # ← the subfolder name
	).to("cuda", torch.float16)
	```

	Each `MiroPipeline` instance exposes `pipe.coherence_keys`, which lists the reward axes the loaded checkpoint was trained on. Passing `reward_targets={...}` with a key that is not in this list raises `ValueError`.
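Because the set of valid keys changes from one variant to the next, it can help to check a `reward_targets` dict against `pipe.coherence_keys` before sampling. The following is a minimal sketch in plain Python: `check_reward_targets` is a hypothetical helper (not part of `miro-t2i`), and the target values are placeholders, since this card does not specify their scale.

```python
def check_reward_targets(coherence_keys, reward_targets):
    """Return the keys in reward_targets that the checkpoint was not trained on."""
    known = set(coherence_keys)
    return sorted(key for key in reward_targets if key not in known)


# Stand-in for pipe.coherence_keys of a single-reward specialist
# (key names taken from the specialists table in this card).
specialist_keys = ("clip_score",)

# Placeholder values: the valid range of a reward target is an assumption here.
targets = {"clip_score": 1.0, "aesthetic_score": 1.0}

unknown = check_reward_targets(specialist_keys, targets)
print(unknown)  # ['aesthetic_score']
```

In practice you would pass `pipe.coherence_keys` as the first argument and drop or fix the reported keys before calling the pipeline.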

	## Variants

	### Reward ablations (8) — full MIRO recipe minus one signal

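	Same architecture and training data as the main MIRO, with one training signal removed so you can isolate its contribution: one of the seven rewards, or, for `miro-no-synthetic-captions`, the synthetic-caption augmentation (all seven rewards are kept in that case).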

	| Subfolder | What's ablated | `coherence_keys` size |
	|---|---|:-:|
	| `miro-no-synthetic-captions` | Trained on original captions only (no synthetic-caption augmentation) | 7 |
	| `miro-no-aesthetic` | LAION aesthetic-quality reward | 6 |
	| `miro-no-clip` | CLIP text-image alignment | 6 |
	| `miro-no-hpsv2` | HPSv2 human preference | 6 |
	| `miro-no-image-reward` | ImageReward | 6 |
	| `miro-no-pickscore` | PickScore human preference | 6 |
	| `miro-no-sciscore` | SciScore | 6 |
	| `miro-no-vqa` | VQAScore | 6 |

	### Single-reward specialists (7) — paper baselines

	Each is trained on a single reward signal; these are the controls the paper compares MIRO against. For these checkpoints, `pipe.coherence_keys` is a 1-tuple.

	| Subfolder | The one reward it knows about |
	|---|---|
	| `miro-only-aesthetic` | `aesthetic_score` |
	| `miro-only-clip` | `clip_score` |
	| `miro-only-hpsv2` | `hpsv2_score` |
	| `miro-only-image-reward` | `image_reward_score` |
	| `miro-only-pickscore` | `pick_a_score_score` |
	| `miro-only-sciscore` | `sciscore_score` |
	| `miro-only-vqa` | `vqa_score` |
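The two tables together suggest which reward keys each subfolder should report. The sketch below encodes that mapping; `REWARD_KEYS` and `expected_keys` are hypothetical names, and it assumes the ablation checkpoints use the same key names as the specialists, which this card implies but does not state. Treat `pipe.coherence_keys` as the source of truth.

```python
# Subfolder suffix -> reward key, copied from the specialists table.
REWARD_KEYS = {
    "aesthetic": "aesthetic_score",
    "clip": "clip_score",
    "hpsv2": "hpsv2_score",
    "image-reward": "image_reward_score",
    "pickscore": "pick_a_score_score",
    "sciscore": "sciscore_score",
    "vqa": "vqa_score",
}


def expected_keys(variant):
    """Guess the set of reward keys for a subfolder name (assumption-based)."""
    all_keys = set(REWARD_KEYS.values())
    if variant == "miro-no-synthetic-captions":
        return all_keys  # caption ablation keeps all seven rewards
    if variant.startswith("miro-only-"):
        return {REWARD_KEYS[variant[len("miro-only-"):]]}
    if variant.startswith("miro-no-"):
        return all_keys - {REWARD_KEYS[variant[len("miro-no-"):]]}
    raise KeyError(f"unknown variant: {variant}")


print(len(expected_keys("miro-no-clip")))  # 6
print(expected_keys("miro-only-vqa"))      # {'vqa_score'}
```

A quick sanity check after loading is to compare `set(pipe.coherence_keys)` with `expected_keys(variant)`.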

	## What's in each subfolder

	```
	miro-<variant>/
	├── model.safetensors # fp32 EMA weights (~1.4 GB) — ready for finetuning
	├── config.json # network kwargs + sampler defaults
	├── uncond_embedding.npy # precomputed FLAN-T5-XL unconditional embedding
	├── teaser.jpg # shared masonry gallery
	└── README.md # per-variant model card
	```
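Since every variant sits in its own subfolder, you can fetch just one of them instead of all 15 checkpoints. The sketch below uses `snapshot_download` from `huggingface_hub` with an `allow_patterns` filter; `variant_patterns` and `download_variant` are hypothetical helpers, and the sketch assumes the files are stored at `<variant>/...` in the repository, as the layout above suggests.

```python
from pathlib import Path

REPO_ID = "nicolas-dufour/miro-ablations"


def variant_patterns(variant):
    """Glob patterns that match only the files of one variant subfolder."""
    return [f"{variant}/*"]


def download_variant(variant, repo_id=REPO_ID):
    """Download a single variant subfolder and return its local path."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(repo_id=repo_id, allow_patterns=variant_patterns(variant))
    return Path(root) / variant


print(variant_patterns("miro-no-clip"))  # ['miro-no-clip/*']
```

Calling `download_variant("miro-no-clip")` requires network access and returns a folder that should contain the files listed above, for example `model.safetensors` and `config.json`.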

	## Citation

	```bibtex
	@inproceedings{dufour2026miro,
	  title     = {{MIRO}: {M}ult{I}-{R}eward c{O}nditioned pretraining improves {T2I} quality and efficiency},
	  author    = {Dufour, Nicolas and Degeorge, Lucas and Ghosh, Arijit and Kalogeiton, Vicky and Picard, David},
	  booktitle = {International Conference on Machine Learning (ICML)},
	  year      = {2026}
	}
	```

	## License

	MIT — see [LICENSE](https://github.com/nicolas-dufour/miro/blob/main/LICENSE).