Update model card with metadata, paper links and usage

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +21 -10
---
library_name: diffusers
pipeline_tag: text-to-image
---

# TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward

<div align="center">
<a href="https://luo-yihong.github.io/TDM-R1-Page/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a> &ensp;
<a href="https://arxiv.org/abs/2603.07700"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv:TDM-R1&color=red&logo=arxiv"></a> &ensp;
<a href="https://github.com/Luo-Yihong/TDM-R1"><img src="https://img.shields.io/static/v1?label=Code&message=Github&color=green&logo=github"></a>
</div>

This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/2603.07700)", by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.

<div align="center">
<img src="teaser_git.png" width="100%">
<p>
Samples generated by <b>TDM-R1</b> using only <b>4 NFEs</b>, obtained by reinforcing the recent powerful Z-Image model.
</p>
</div>
## Description

TDM-R1 is a reinforcement learning (RL) paradigm for few-step generative models. It decouples the learning process into surrogate reward learning and generator learning, allowing for the use of non-differentiable rewards (e.g., human preference, object counts). This repository contains the reinforced version of the [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) model.
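The decoupling can be illustrated with a toy sketch. This is a hypothetical illustration, not TDM-R1's actual implementation: the generator here is a 1-step Gaussian sampler `x = mu + eps`, the black-box reward counts positive coordinates (an "object count" style signal with no useful gradient), a differentiable linear surrogate is fit to observed (sample, reward) pairs, and the generator parameter then ascends the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def true_reward(x):
    # Non-differentiable reward, e.g. an "object count" style signal.
    return (x > 0.0).sum(axis=-1).astype(float)

mu = np.full(dim, -1.0)  # generator starts in a low-reward region

for _ in range(200):
    x = mu + rng.normal(size=(256, dim))   # generator samples
    r = true_reward(x)                     # query the black-box reward

    # (1) Surrogate reward learning: least-squares fit r_hat = x @ w + b.
    X = np.hstack([x, np.ones((len(x), 1))])
    coef, *_ = np.linalg.lstsq(X, r, rcond=None)
    w = coef[:-1]

    # (2) Generator learning: gradient ascent on the differentiable
    # surrogate; grad_mu of E[(mu + eps) @ w + b] is simply w.
    mu = mu + 0.05 * w

final_reward = true_reward(mu + rng.normal(size=(1000, dim))).mean()
print(f"mean reward after reinforcement: {final_reward:.2f} (max {dim})")
```

The generator never differentiates through `true_reward`; only the surrogate's gradient reaches `mu`, which is what makes non-differentiable rewards usable in this scheme.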

## Pre-trained Model

## Usage

You can use this model with `diffusers` and `peft`. Below is an example of how to load the weights as a LoRA adapter.

```python
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
import torch
from diffusers import ZImagePipeline
from peft import LoraConfig, get_peft_model

def load_ema(pipeline, lora_path, adapter_name='default'):
    """Load EMA weights into the pipeline's transformer adapter"""
    pipeline.transformer.set_adapter(adapter_name)
    # ... (checkpoint loading into trainable_params / ema_params not shown in the diff)
    for param, ema_param in zip(trainable_params, ema_params):
        param.data.copy_(ema_param.to(param.device))
    print(f"Loaded EMA weights for adapter '{adapter_name}' from {lora_path}")

pipeline = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    # ...
)
# ... (LoRA configuration not shown in the diff)
pipeline.transformer = get_peft_model(
    # ...
    transformer_lora_config,
    adapter_name="tdmr1",
)
# Ensure the checkpoint file is downloaded locally
load_ema(
    pipeline,
    lora_path="./tdmr1_zimage_ema.ckpt",
    # ...
)
pipeline = pipeline.to("cuda")
image = pipeline(
    prompt="A high quality photo of a cat",
    height=1024,
    width=1024,
    num_inference_steps=5,  # This actually results in 4 DiT forwards
    guidance_scale=0.0,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]
image
```
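The `load_ema` helper above copies exponential-moving-average (EMA) weights into the LoRA adapter. As a generic refresher (the decay value below is made up for visibility, not taken from this repository), EMA weights are maintained during training as a running blend of the live parameters:

```python
# Generic EMA update sketch: after each training step, the EMA copy is
# blended toward the current parameters. decay=0.9 is a hypothetical value
# chosen so convergence is visible within a few iterations.
def ema_update(ema, params, decay=0.9):
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema, params)]

params = [1.0, 2.0]   # "trained" weights (scalar stand-ins for tensors)
ema = [0.0, 0.0]      # EMA copy, initialized to zero here for clarity

for _ in range(10):
    ema = ema_update(ema, params)

print(ema)  # drifts toward params: roughly [0.651, 1.303]
```

Inference then uses the EMA copy (as `load_ema` does) because averaged weights are typically more stable than those from the last optimizer step.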
 
Please contact Yihong Luo (yluocg@connect.ust.hk) if you have any questions about this work.

## Bibtex

```bibtex
@misc{luo2025tdmr1,
      title={TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward},
      author={Yihong Luo and Tianyang Hu and Weijian Luo and Jing Tang},
      year={2025},
      eprint={2603.07700},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```