Improve model card with metadata, links, and usage for Transition Models (TiM)
This PR enhances the model card for Transition Models (TiM) by:
- Adding the `license` to the metadata (Apache-2.0).
- Including the `pipeline_tag: text-to-image` to ensure discoverability on the Hugging Face Hub.
- Providing a concise description of the model based on the paper's abstract and highlights from the GitHub README.
- Including a "Quickstart" section with instructions for setting up the environment, downloading the model, and running text-to-image generation, directly extracted from the GitHub README to facilitate immediate usage.
README.md

---
license: apache-2.0
pipeline_tag: text-to-image
---

# Transition Models: Rethinking the Generative Learning Objective

This repository contains the Transition Models (TiM) presented in the paper [Transition Models: Rethinking the Generative Learning Objective](https://arxiv.org/abs/2509.04394).

TiM is a novel generative model designed for flexible photorealistic text-to-image generation. It achieves state-of-the-art performance with high efficiency by learning arbitrary state-to-state transitions, unifying few-step and many-step generation within a single model.

* **Paper**: [https://arxiv.org/abs/2509.04394](https://arxiv.org/abs/2509.04394)
* **Code**: [https://github.com/WZDTHU/TiM](https://github.com/WZDTHU/TiM)

## Highlights

* Our Transition Models (TiM) are trained to master arbitrary state-to-state transitions. This approach allows TiM to learn the entire solution manifold of the generative process, unifying the few-step and many-step regimes within a single, powerful model.

* Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts on the GenEval benchmark. Importantly, unlike previous few-step generators, TiM demonstrates monotonic quality improvement as the sampling budget increases.

* Additionally, when employing our native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to `4096x4096`.


## Quickstart

### 1. Setup

First, clone the repo:
```bash
git clone https://github.com/WZDTHU/TiM.git && cd TiM
```

#### 1.1 Environment Setup

```bash
conda create -n tim_env python=3.10
conda activate tim_env
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
pip install flash-attn
pip install -r requirements.txt
pip install -e .
```
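After installation, a quick sanity check (assuming the `tim_env` environment above is active) confirms that the pinned PyTorch build and `flash-attn` import cleanly before moving on:

```shell
# Confirm the pinned PyTorch wheel is importable and can see a GPU,
# and that flash-attn was built against it (env from the step above).
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn" && echo "flash-attn OK"
```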
#### 1.2 Model Download

Download the text-to-image model:
```bash
mkdir -p checkpoints
wget -c "https://huggingface.co/GoodEnough/TiM-T2I/resolve/main/t2i_model.bin" -O checkpoints/t2i_model.bin
```
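Alternatively, the same checkpoint can be fetched through `huggingface_hub` — a sketch using the standard `hf_hub_download` API, with the repo and filename taken from the `wget` command above:

```python
from pathlib import Path

from huggingface_hub import hf_hub_download

# Download t2i_model.bin from the GoodEnough/TiM-T2I repo into
# checkpoints/; interrupted downloads are resumed automatically.
Path("checkpoints").mkdir(exist_ok=True)
ckpt_path = hf_hub_download(
    repo_id="GoodEnough/TiM-T2I",
    filename="t2i_model.bin",
    local_dir="checkpoints",
)
print(ckpt_path)
```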
### 2. Sampling (Text-to-Image Generation)

We provide sampling scripts for three benchmarks. You can specify the sampling steps, resolution, and CFG scale in the corresponding scripts.

Sampling with the TiM-T2I model on the GenEval benchmark:
```bash
bash scripts/sample/t2i/sample_t2i_geneval.sh
```