---
license: apache-2.0
pipeline_tag: text-to-image
---

# Transition Models: Rethinking the Generative Learning Objective

This repository contains the Transition Models (TiM) presented in the paper [Transition Models: Rethinking the Generative Learning Objective](https://arxiv.org/abs/2509.04394).

TiM is a novel generative model designed for flexible photorealistic text-to-image generation. It achieves state-of-the-art performance with high efficiency by learning arbitrary state-to-state transitions, unifying few-step and many-step generation within a single model.

* **Paper**: [https://arxiv.org/abs/2509.04394](https://arxiv.org/abs/2509.04394)
* **Code**: [https://github.com/WZDTHU/TiM](https://github.com/WZDTHU/TiM)

## Highlights

* Our Transition Models (TiM) are trained to master arbitrary state-to-state transitions. This approach allows TiM to learn the entire solution manifold of the generative process, unifying the few-step and many-step regimes within a single model.

* Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts on the GenEval benchmark. Unlike previous few-step generators, TiM shows monotonic quality improvement as the sampling budget increases.

* With our native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to `4096x4096`.

## Quickstart

### 1. Setup

First, clone the repo:
```bash
git clone https://github.com/WZDTHU/TiM.git && cd TiM
```

#### 1.1 Environment Setup

```bash
conda create -n tim_env python=3.10
# Activate the new environment so the packages below install into it
conda activate tim_env
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
pip install flash-attn
pip install -r requirements.txt
pip install -e .
```

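After the installation steps above, a quick import check can confirm that the environment is usable. The following is a minimal sketch and not part of the TiM repository: the `check_python_modules` helper and the `PYTHON_BIN` variable are hypothetical names introduced here, and the check assumes that `python` on your `PATH` is the interpreter from `tim_env`.

```bash
# Hypothetical helper (not shipped with TiM): report whether each module imports.
# PYTHON_BIN is an assumed override; it defaults to the `python` found on PATH.
PYTHON_BIN="${PYTHON_BIN:-python}"

check_python_modules() {
    local status=0
    local module
    for module in "$@"; do
        # Try the import in a fresh interpreter and hide its output
        if "$PYTHON_BIN" -c "import ${module}" >/dev/null 2>&1; then
            echo "ok: ${module}"
        else
            echo "missing: ${module}"
            status=1
        fi
    done
    # Non-zero when at least one module failed to import
    return "$status"
}
```

For the packages installed above, run `check_python_modules torch torchvision flash_attn`. Note that the `flash-attn` package is imported as `flash_attn`.
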

#### 1.2 Model Download

Download the Text-to-Image model:
```bash
# -p avoids an error if the directory already exists
mkdir -p checkpoints
wget -c "https://huggingface.co/GoodEnough/TiM-T2I/resolve/main/t2i_model.bin" -O checkpoints/t2i_model.bin
```

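An interrupted download can leave a missing or empty file that only surfaces later, when the model is loaded. As a small sketch, the hypothetical `verify_checkpoint` helper below (not part of the TiM repository) checks that a file exists and is non-empty, then prints its size in bytes. It does not validate a checksum, since none is given here.

```bash
# Hypothetical helper (not shipped with TiM): basic sanity check for a download.
verify_checkpoint() {
    local path="$1"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$path" ]; then
        echo "missing or empty: ${path}" >&2
        return 1
    fi
    # Arithmetic expansion strips the padding some `wc` implementations add
    echo "${path}: $(( $(wc -c < "$path") )) bytes"
}
```

Run it as `verify_checkpoint checkpoints/t2i_model.bin`. If the check fails, rerun the download command above.
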

### 2. Sampling (Text-to-Image Generation)

We provide sampling scripts for three benchmarks. You can set the number of sampling steps, the resolution, and the CFG scale in the corresponding script.

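The paragraph above says the settings live in the scripts but does not name the variables or flags that hold them. One hedged way to find them is to search a script for likely keywords. The `show_sampling_settings` helper is a hypothetical name, and the keyword pattern is a guess rather than the repository's actual naming, so widen it if nothing matches.

```bash
# Hypothetical helper (not shipped with TiM): list lines that look like settings.
# The keyword pattern is an assumption; the real scripts may use other names.
show_sampling_settings() {
    local script="$1"
    # -n prints line numbers, -i ignores case, -E enables alternation
    grep -n -i -E 'step|resolution|cfg' "$script"
}
```

For example, `show_sampling_settings scripts/sample/t2i/sample_t2i_geneval.sh` prints each matching line with its line number, which you can then edit before running the script.
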

Sampling with the TiM-T2I model on the GenEval benchmark:
```bash
bash scripts/sample/t2i/sample_t2i_geneval.sh
```