Luo-Yihong/TDM-R1

Luo-Yihong committed 4 days ago

Commit 3af899a (verified) · 1 parent: 6286765

Create README.md

Files changed (1)

README.md +94 -0

README.md ADDED

	@@ -0,0 +1,94 @@

+# TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
+<div align="center">
+  <a href="https://luo-yihong.github.io/TDM-R1-Page/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a> &ensp;
+  <a href="https://arxiv.org/abs/xxx"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv:TDM-R1&color=red&logo=arxiv"></a> &ensp;
+</div>
+This is the official repository of "[TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward](https://arxiv.org/abs/xxx)" by *Yihong Luo, Tianyang Hu, Weijian Luo, Jing Tang*.
+<div align="center">
+  <img src="teaser_git.png" width="100%">
+</div>
+<p align="center">
+  Samples generated by <b>TDM-R1</b> using only <b>4 NFEs</b>, obtained by reinforcing the recent powerful Z-Image model.
+</p>
+## Pre-trained Model
+- [TDM-R1-ZImage](https://huggingface.co/Luo-Yihong/TDM-R1)
+## Usage
+```python
+import os
+os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # optional: download via a Hugging Face mirror; remove to use the default hub
+import torch
+from diffusers import ZImagePipeline
+from peft import LoraConfig, get_peft_model
+def load_ema(pipeline, lora_path, adapter_name='default'):
+    """Load EMA weights into the pipeline's transformer adapter"""
+    pipeline.transformer.set_adapter(adapter_name)
+    trainable_params = [
+        p for n, p in pipeline.transformer.named_parameters()
+        if adapter_name in n and p.requires_grad
+    ]
+    state_dict = torch.load(lora_path, map_location=pipeline.transformer.device)
+    ema_params = state_dict["ema_parameters"]
+    assert len(trainable_params) == len(ema_params), \
+        f"Parameter count mismatch: {len(trainable_params)} vs {len(ema_params)}"
+    for param, ema_param in zip(trainable_params, ema_params):
+        param.data.copy_(ema_param.to(param.device))
+    print(f"Loaded EMA weights for adapter '{adapter_name}' from {lora_path}")
+pipeline = ZImagePipeline.from_pretrained(
+    "Tongyi-MAI/Z-Image-Turbo",
+    torch_dtype=torch.bfloat16,
+    low_cpu_mem_usage=False,
+)
+transformer_lora_config = LoraConfig(
+    r=32,
+    lora_alpha=64,
+    init_lora_weights="gaussian",
+    target_modules=["to_q", "to_k", "to_v", "to_out.0", "add_k_proj", "add_v_proj"],
+)
+pipeline.transformer = get_peft_model(
+    pipeline.transformer,
+    transformer_lora_config,
+    adapter_name="tdmr1",
+)
+load_ema(
+    pipeline,
+    lora_path="./tdmr1_zimage_ema.ckpt",
+    adapter_name="tdmr1",
+)
+pipeline = pipeline.to("cuda")
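+image = pipeline(
+    prompt="A cat holding a sign that says hello world",  # example prompt; replace with your own
+    height=1024,
+    width=1024,
+    num_inference_steps=5,  # This actually results in 4 DiT forwards
+    guidance_scale=0.0,
+    generator=torch.Generator("cuda").manual_seed(42),  # example seed; any integer works
+).images[0]
+image.save("tdmr1_sample.png")  # example output path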
+```
+## Contact
+Please contact Yihong Luo (yluocg@connect.ust.hk) if you have any questions about this work.
+## BibTeX
+```bibtex
+@misc{luo2025tdmr1,
+  title={TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward},
+  author={Yihong Luo and Tianyang Hu and Weijian Luo and Jing Tang},
+  year={2025},
+  eprint={TODO},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
+```
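
The `load_ema` helper in the Usage snippet appears to expect a checkpoint that is a dict with an `"ema_parameters"` entry: a list of tensors in the same order as the adapter's trainable parameters. The sketch below reproduces that copy logic on a toy module so the format can be checked on CPU without downloading Z-Image. `ToyAdapterModel`, its layer sizes, and the in-memory buffer are hypothetical stand-ins, not part of the released code.

```python
import io

import torch
from torch import nn

ADAPTER_NAME = "tdmr1"  # same adapter name as in the Usage snippet


class ToyAdapterModel(nn.Module):
    """Hypothetical stand-in for a transformer wrapped with a LoRA adapter."""

    def __init__(self):
        super().__init__()
        # Frozen base weights, like the pretrained transformer.
        self.base = nn.Linear(4, 4)
        self.base.requires_grad_(False)
        # Trainable low-rank factors; their parameter names contain the adapter name.
        self.lora_A = nn.ModuleDict({ADAPTER_NAME: nn.Linear(4, 2, bias=False)})
        self.lora_B = nn.ModuleDict({ADAPTER_NAME: nn.Linear(2, 4, bias=False)})


def adapter_params(model, adapter_name):
    """Select trainable parameters whose name mentions the adapter (as load_ema does)."""
    return [
        p for n, p in model.named_parameters()
        if adapter_name in n and p.requires_grad
    ]


model = ToyAdapterModel()
trainable = adapter_params(model, ADAPTER_NAME)

# Build a fake checkpoint in the assumed format: one EMA tensor per trainable parameter.
buffer = io.BytesIO()
torch.save({"ema_parameters": [torch.full_like(p, 0.5) for p in trainable]}, buffer)
buffer.seek(0)

# Load it back and copy the EMA values into the adapter, in order.
ema_params = torch.load(buffer, map_location="cpu")["ema_parameters"]
assert len(trainable) == len(ema_params), "Parameter count mismatch"
with torch.no_grad():
    for param, ema_param in zip(trainable, ema_params):
        param.copy_(ema_param.to(param.device))

print(f"Copied {len(ema_params)} EMA tensors into adapter '{ADAPTER_NAME}'")
```

Because the copy is purely positional, the LoRA configuration (rank, target modules, adapter name) used at load time has to match the one used in training; otherwise the parameter count or tensor shapes will not line up.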