	---
	base_model:
	- ByteDance-Seed/BAGEL-7B-MoT
	datasets:
	- jackyhate/text-to-image-2M
	language:
	- en
	- zh
	license: apache-2.0
	pipeline_tag: any-to-any
	library_name: transformers
	---

	# BAGEL-RecA

	🚀 Just 6 × 80GB A100s × 4.5 hours to boost BAGEL's generation and editing performance! Outperforms FLUX Kontext on GEdit-Bench-EN (overall) and ImgEdit!

	> A self-supervised training framework that aligns understanding and generation with modest compute, yielding large zero-shot gains in generation and editing capability.

	## Paper
	[Reconstruction Alignment Improves Unified Multimodal Models](https://huggingface.co/papers/2509.07295)

	## Project Page
	https://reconstruction-alignment.github.io/

	## Code

	https://github.com/HorizonWind2004/reconstruction-alignment

	This repository hosts the model weights (NF4, INT8, BF16) for BAGEL-RecA. We fine-tuned BAGEL on six 80GB NVIDIA A800 GPUs for only 27 GPU hours (6 GPUs × 4.5 hours). While understanding capability remains unchanged, our Reconstruction Alignment (RecA) method brings zero-shot improvements of +3.6 on GenEval, +1.26 on DPGBench, +0.37 on ImgEdit, and +0.33 on GEdit.

	For installation, usage instructions, and further documentation, please visit [our repo](https://github.com/HorizonWind2004/reconstruction-alignment) and BAGEL's original [GitHub repo](https://github.com/bytedance-seed/BAGEL).

	A [DF11 version of BAGEL-RecA](https://huggingface.co/theunlikely/BAGEL-RecA-DF11/tree/main) is also available. Many thanks to @theunlikely!

	## 🧠 Method

	[![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/pdf/2509.07295)
	[![ArXiv](https://img.shields.io/badge/arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white&color=blue)](https://arxiv.org/abs/2509.07295)
	[![Github](https://img.shields.io/badge/RecA-000000?style=for-the-badge&logo=github&logoColor=000&logoColor=white)](https://github.com/HorizonWind2004/reconstruction-alignment)
	[![Hugging Face Collection](https://img.shields.io/badge/HF_Models-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/collections/sanaka87/realign-68ad2176380355a3dcedc068)
	[![HF Demo](https://img.shields.io/badge/Demo_(BAGEL)-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/spaces/sanaka87/BAGEL-ReAlign)
	[![Project Page](https://img.shields.io/badge/Project_Page-00CED1?style=for-the-badge&logo=web&logoColor=white)](https://reconstruction-alignment.github.io/)

	## 📊 Benchmarks

	### 1. Text-to-Image Generation

	Both models are evaluated at 1024×1024 resolution.

	| Model      | GenEval ↑ | DPGBench ↑ | WISE ↑ |
	| ---------- | --------- | ---------- | ------ |
	| BAGEL      | 0.787     | 84.03      | 0.50   |
	| BAGEL-RecA | 0.824     | 85.29      | 0.52   |

	### 2. Image Editing

	| Model        | GEdit-Bench-EN (SC) ↑ | GEdit-Bench-EN (PQ) ↑ | GEdit-Bench-EN (O) ↑ | ImgEdit ↑ |
	| ------------ | --------------------- | --------------------- | -------------------- | --------- |
	| BAGEL        | 7.96                  | 6.64                  | 6.94                 | 3.38      |
	| BAGEL-NHR    | 8.04                  | 6.87                  | 7.08                 | 3.48      |
	| BAGEL-RecA   | 8.24                  | 6.87                  | 7.27                 | 3.75      |
	| FLUX Kontext | 6.95                  | 7.30                  | 6.27                 | 3.59      |
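
The gains quoted earlier in this README can be rechecked directly from the two benchmark tables. The snippet below is a minimal sketch of that arithmetic: `SCORES` and `score_deltas` are illustrative names that are not part of the RecA codebase, and the numbers are copied by hand from the tables above. GenEval is reported on a 0 to 1 scale, so the difference of 0.037 corresponds to roughly 3.7 points, close to the +3.6 quoted in the text; the small gap presumably comes from rounding of the table entries.

```python
# Scores copied by hand from the benchmark tables in this README.
# The structure and names here are illustrative, not from the RecA repo.
SCORES = {
    "GenEval": {"BAGEL": 0.787, "BAGEL-RecA": 0.824},
    "DPGBench": {"BAGEL": 84.03, "BAGEL-RecA": 85.29},
    "WISE": {"BAGEL": 0.50, "BAGEL-RecA": 0.52},
    "GEdit-Bench-EN (SC)": {"BAGEL": 7.96, "BAGEL-RecA": 8.24},
    "GEdit-Bench-EN (PQ)": {"BAGEL": 6.64, "BAGEL-RecA": 6.87},
    "GEdit-Bench-EN (O)": {"BAGEL": 6.94, "BAGEL-RecA": 7.27},
    "ImgEdit": {"BAGEL": 3.38, "BAGEL-RecA": 3.75},
}


def score_deltas(scores, baseline="BAGEL", model="BAGEL-RecA"):
    """Return {benchmark: model score - baseline score}, rounded to 3 decimals."""
    return {
        name: round(row[model] - row[baseline], 3)
        for name, row in scores.items()
    }


if __name__ == "__main__":
    # Print one signed delta per benchmark.
    for name, delta in score_deltas(SCORES).items():
        print(f"{name:<22} {delta:+.3f}")
```

Because every benchmark here is higher-is-better, a positive delta means BAGEL-RecA improves on the BAGEL baseline for that metric.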


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e99fc07e2ec711a7138262/lGur0scJWaCGkAwH2AHxy.png)

	## License

	BAGEL-RecA is licensed under the Apache 2.0 license.

	## ✍️ Citation

	If you find our work inspiring or use our codebase in your research, please consider giving us a star ⭐ and citing our paper.


	```
	@article{xie2025reconstruction,
	  title={Reconstruction Alignment Improves Unified Multimodal Models},
	  author={Xie, Ji and Darrell, Trevor and Zettlemoyer, Luke and Wang, XuDong},
	  journal={arXiv preprint arXiv:2509.07295},
	  year={2025}
	}
	```