Luo-Yihong/TDM-R1


Update model card with metadata, paper links and usage

by nielsr HF Staff - opened Mar 10

base: refs/heads/main ← from: refs/pr/1

Files changed (1): README.md (+21 -10)

@@ -1,11 +1,17 @@
 # TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
-<div align="center">
   <a href="https://luo-yihong.github.io/TDM-R1-Page/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a> &ensp;
   <a href="https://arxiv.org/abs/2603.07700"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv:TDM-R1&color=red&logo=arxiv"></a> &ensp;
 </div>
-This is the Official Repository of  "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/2603.07700)", by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.
 <div align="center">
   <img src="teaser_git.png" width="100%">
@@ -15,6 +21,8 @@ This is the Official Repository of  "[TDM-R1: Reinforcing Few-Step Diffusion Mod
   Samples generated by <b>TDM-R1</b> using only <b>4 NFEs</b>, obtained by reinforcing the recent powerful Z-Image model.
 </p>
 ## Pre-trained Model
@@ -22,12 +30,15 @@ This is the Official Repository of  "[TDM-R1: Reinforcing Few-Step Diffusion Mod
 ## Usage
 ```python
 import os
 os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
 import torch
 from diffusers import ZImagePipeline
 from peft import LoraConfig, get_peft_model
 def load_ema(pipeline, lora_path, adapter_name='default'):
     """Load EMA weights into the pipeline's transformer adapter"""
     pipeline.transformer.set_adapter(adapter_name)
@@ -42,6 +53,7 @@ def load_ema(pipeline, lora_path, adapter_name='default'):
     for param, ema_param in zip(trainable_params, ema_params):
         param.data.copy_(ema_param.to(param.device))
     print(f"Loaded EMA weights for adapter '{adapter_name}' from {lora_path}")
 pipeline = ZImagePipeline.from_pretrained(
     "Tongyi-MAI/Z-Image-Turbo",
     torch_dtype=torch.bfloat16,
@@ -58,6 +70,7 @@ pipeline.transformer = get_peft_model(
     transformer_lora_config,
     adapter_name="tdmr1",
 )
 load_ema(
     pipeline,
     lora_path="./tdmr1_zimage_ema.ckpt",
@@ -65,12 +78,12 @@ load_ema(
 )
 pipeline = pipeline.to("cuda")
 image = pipeline(
-      prompt=prompt,
       height=1024,
       width=1024,
       num_inference_steps=5,  # This actually results in 4 DiT forwards
       guidance_scale=0.0,
-      generator=torch.Generator("cuda").manual_seed(xxx),
   ).images[0]
 image
 ```
@@ -80,15 +93,13 @@ image
 Please contact Yihong Luo (yluocg@connect.ust.hk) if you have any questions about this work.
 ## Bibtex
-```
 @misc{luo2025tdmr1,
   title={TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward},
   author={Yihong Luo and Tianyang Hu and Weijian Luo and Jing Tang},
   year={2025},
-  eprint={TODO},
   archivePrefix={arXiv},
   primaryClass={cs.CV}
 }
-```

+---
+library_name: diffusers
+pipeline_tag: text-to-image
+---
 # TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
+<div align="center">
   <a href="https://luo-yihong.github.io/TDM-R1-Page/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a> &ensp;
   <a href="https://arxiv.org/abs/2603.07700"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv:TDM-R1&color=red&logo=arxiv"></a> &ensp;
+  <a href="https://github.com/Luo-Yihong/TDM-R1"><img src="https://img.shields.io/static/v1?label=Code&message=Github&color=green&logo=github"></a>
 </div>
+This is the Official Repository of "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/2603.07700)", by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.
 <div align="center">
   <img src="teaser_git.png" width="100%">
   Samples generated by <b>TDM-R1</b> using only <b>4 NFEs</b>, obtained by reinforcing the recent powerful Z-Image model.
 </p>
+## Description
+TDM-R1 is a reinforcement learning (RL) paradigm for few-step generative models. It decouples the learning process into surrogate reward learning and generator learning, allowing for the use of non-differentiable rewards (e.g., human preference, object counts). This repository contains the reinforced version of the [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) model.
 ## Pre-trained Model
 ## Usage
+You can use this model with `diffusers` and `peft`. Below is an example of how to load the weights as a LoRA adapter.
 ```python
 import os
 os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
 import torch
 from diffusers import ZImagePipeline
 from peft import LoraConfig, get_peft_model
 def load_ema(pipeline, lora_path, adapter_name='default'):
     """Load EMA weights into the pipeline's transformer adapter"""
     pipeline.transformer.set_adapter(adapter_name)
     for param, ema_param in zip(trainable_params, ema_params):
         param.data.copy_(ema_param.to(param.device))
     print(f"Loaded EMA weights for adapter '{adapter_name}' from {lora_path}")
 pipeline = ZImagePipeline.from_pretrained(
     "Tongyi-MAI/Z-Image-Turbo",
     torch_dtype=torch.bfloat16,
     transformer_lora_config,
     adapter_name="tdmr1",
 )
+# Ensure the checkpoint file is downloaded locally
 load_ema(
     pipeline,
     lora_path="./tdmr1_zimage_ema.ckpt",
 )
 pipeline = pipeline.to("cuda")
 image = pipeline(
+      prompt="A high quality photo of a cat",
       height=1024,
       width=1024,
       num_inference_steps=5,  # This actually results in 4 DiT forwards
       guidance_scale=0.0,
+      generator=torch.Generator("cuda").manual_seed(42),
   ).images[0]
 image
 ```
 Please contact Yihong Luo (yluocg@connect.ust.hk) if you have any questions about this work.
 ## Bibtex
+```bibtex
 @misc{luo2025tdmr1,
   title={TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward},
   author={Yihong Luo and Tianyang Hu and Weijian Luo and Jing Tang},
   year={2025},
+  eprint={2603.07700},
   archivePrefix={arXiv},
   primaryClass={cs.CV}
 }
+```
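
The copy loop in `load_ema` pairs `trainable_params` with `ema_params` by position using `zip`, which silently stops at the shorter sequence if the two lists differ in length. The diff context does not show how either list is built, so the following is only a minimal sketch of a stricter pairing step. `pair_params_strict` is a hypothetical helper, not part of the TDM-R1 repository, and the string values stand in for the tensors used in the README.

```python
def pair_params_strict(trainable_params, ema_params):
    """Pair parameters by position, raising if the two sequences differ in length.

    Hypothetical helper for illustration; not part of the TDM-R1 repository.
    """
    trainable_params = list(trainable_params)
    ema_params = list(ema_params)
    if len(trainable_params) != len(ema_params):
        raise ValueError(
            f"Parameter count mismatch: {len(trainable_params)} trainable "
            f"vs {len(ema_params)} EMA"
        )
    return list(zip(trainable_params, ema_params))


# Stand-in values; in the README loop these would be tensors.
pairs = pair_params_strict(["w1", "w2"], ["ema_w1", "ema_w2"])
print(pairs)

# A mismatched checkpoint is reported instead of being partially applied.
try:
    pair_params_strict(["w1", "w2"], ["ema_w1"])
except ValueError as err:
    print(err)
```

Under this assumption, the README loop would iterate over `pair_params_strict(trainable_params, ema_params)` in place of the bare `zip`, so a checkpoint that does not match the adapter fails loudly before any weights are copied.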