Update model card with metadata, paper links and usage
Hi! I'm Niels from the Hugging Face community team. I've opened this PR to improve the documentation of your model.
Specifically, I have:
- Added `pipeline_tag: text-to-image` to the metadata to help users find your model.
- Added `library_name: diffusers` to enable automated code snippets on the Hub.
- Included links to the research paper, project page, and GitHub repository.
- Preserved the usage example from your README to ensure users can easily run the model.
These changes will make the model much more discoverable and easier to use within the Hugging Face ecosystem.
README.md (CHANGED):

````diff
@@ -1,11 +1,17 @@
+---
+library_name: diffusers
+pipeline_tag: text-to-image
+---
+
 # TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
-
+
+<div align="center">
 <a href="https://luo-yihong.github.io/TDM-R1-Page/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a>
 <a href="https://arxiv.org/abs/2603.07700"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv:TDM-R1&color=red&logo=arxiv"></a>
+<a href="https://github.com/Luo-Yihong/TDM-R1"><img src="https://img.shields.io/static/v1?label=Code&message=Github&color=green&logo=github"></a>
 </div>
 
-
-This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/2603.07700)", by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.
+This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/2603.07700)", by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.
 
 <div align="center">
 <img src="teaser_git.png" width="100%">
@@ -15,6 +21,8 @@ This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Mod
 Samples generated by <b>TDM-R1</b> using only <b>4 NFEs</b>, obtained by reinforcing the recent powerful Z-Image model.
 </p>
 
+## Description
+TDM-R1 is a reinforcement learning (RL) paradigm for few-step generative models. It decouples the learning process into surrogate reward learning and generator learning, allowing for the use of non-differentiable rewards (e.g., human preference, object counts). This repository contains the reinforced version of the [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) model.
 
 ## Pre-trained Model
 
@@ -22,12 +30,15 @@ This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Mod
 
 ## Usage
 
+You can use this model with `diffusers` and `peft`. Below is an example of how to load the weights as a LoRA adapter.
+
 ```python
 import os
 os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
 import torch
 from diffusers import ZImagePipeline
 from peft import LoraConfig, get_peft_model
+
 def load_ema(pipeline, lora_path, adapter_name='default'):
     """Load EMA weights into the pipeline's transformer adapter"""
     pipeline.transformer.set_adapter(adapter_name)
@@ -42,6 +53,7 @@ def load_ema(pipeline, lora_path, adapter_name='default'):
     for param, ema_param in zip(trainable_params, ema_params):
         param.data.copy_(ema_param.to(param.device))
     print(f"Loaded EMA weights for adapter '{adapter_name}' from {lora_path}")
+
 pipeline = ZImagePipeline.from_pretrained(
     "Tongyi-MAI/Z-Image-Turbo",
     torch_dtype=torch.bfloat16,
@@ -58,6 +70,7 @@ pipeline.transformer = get_peft_model(
     transformer_lora_config,
     adapter_name="tdmr1",
 )
+# Ensure the checkpoint file is downloaded locally
 load_ema(
     pipeline,
     lora_path="./tdmr1_zimage_ema.ckpt",
@@ -65,12 +78,12 @@ load_ema(
 )
 pipeline = pipeline.to("cuda")
 image = pipeline(
-    prompt=
+    prompt="A high quality photo of a cat",
     height=1024,
     width=1024,
     num_inference_steps=5, # This actually results in 4 DiT forwards
     guidance_scale=0.0,
-    generator=torch.Generator("cuda").manual_seed(
+    generator=torch.Generator("cuda").manual_seed(42),
 ).images[0]
 image
 ```
@@ -80,15 +93,13 @@ image
 Please contact Yihong Luo (yluocg@connect.ust.hk) if you have any questions about this work.
 
 ## Bibtex
-```
+```bibtex
 @misc{luo2025tdmr1,
       title={TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward},
       author={Yihong Luo and Tianyang Hu and Weijian Luo and Jing Tang},
       year={2025},
-      eprint={
+      eprint={2603.07700},
       archivePrefix={arXiv},
       primaryClass={cs.CV}
 }
-```
-
-
+```
````
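A note on the `num_inference_steps=5 # This actually results in 4 DiT forwards` comment in the usage snippet above: five boundary timesteps enclose four intervals, and the sampler evaluates the transformer once per interval, hence 4 NFEs. A minimal stdlib sketch of that counting, assuming an illustrative uniform timestep grid (not the actual Z-Image-Turbo scheduler):

```python
# Illustrative only: a uniform timestep grid standing in for the real
# scheduler. 5 boundary timesteps -> 4 intervals -> 4 model forwards.
num_inference_steps = 5
timesteps = [1.0 - i / (num_inference_steps - 1) for i in range(num_inference_steps)]

num_forwards = 0
for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
    # one DiT forward pass would happen here, per interval
    num_forwards += 1

print(timesteps)     # [1.0, 0.75, 0.5, 0.25, 0.0]
print(num_forwards)  # 4
```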