cszthuang/Local_DPO

Improve model card

#1

by nielsr HF Staff - opened May 21

base: refs/heads/main ← from: refs/pr/1

README.md +28 -3

@@ -1,3 +1,28 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+pipeline_tag: text-to-video
+tags:
+- video-generation
+- dpo
+---
+
+# Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models
+
+This repository contains the weights for **LocalDPO**, a novel post-training framework that constructs localized preference pairs from real videos and optimizes alignment at the spatio-temporal region level for video diffusion models.
+
+LocalDPO addresses the efficiency and ambiguity limitations of existing DPO methods. It treats high-quality real videos as positive samples and generates corresponding negatives by locally corrupting them with random spatio-temporal masks. Experiments on Wan2.1 and CogVideoX demonstrate that LocalDPO consistently improves video fidelity and temporal coherence.
+
+- **Paper:** [Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models](https://huggingface.co/papers/2601.04068)
+- **Project Page:** [https://1170300714.github.io/LocalDPO/](https://1170300714.github.io/LocalDPO/)
+- **Code:** [https://github.com/1170300714/Local-DPO](https://github.com/1170300714/Local-DPO)
+
+## Citation
+
+```bibtex
+@article{huang2026mind,
+  title={Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models},
+  author={Huang, Zitong and Zhang, Kaidong and Ding, Yukang and Gao, Chao and Ding, Rui and Chen, Ying and Zuo, Wangmeng},
+  journal={arXiv preprint arXiv:2601.04068},
+  year={2026}
+}
+```
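
Reviewer note: the README text above says negatives are built by locally corrupting real videos with random spatio-temporal masks. The sketch below illustrates that idea only. It is not taken from the Local-DPO code. The function names (`random_box`, `random_spatiotemporal_mask`, `make_preference_pair`), the single-box mask shape, and the use of additive Gaussian noise as the corruption are all assumptions chosen for illustration. The actual method may corrupt the masked region differently.

```python
import numpy as np


def random_box(n, rng, max_frac):
    """Pick a random contiguous slice covering at most max_frac of an axis of length n."""
    size = int(rng.integers(1, max(1, int(n * max_frac)) + 1))
    start = int(rng.integers(0, n - size + 1))
    return slice(start, start + size)


def random_spatiotemporal_mask(shape, rng, max_frac=0.5):
    """Return a boolean (T, H, W) mask that is True inside one random box (hypothetical mask shape)."""
    t, h, w = shape
    mask = np.zeros((t, h, w), dtype=bool)
    mask[random_box(t, rng, max_frac),
         random_box(h, rng, max_frac),
         random_box(w, rng, max_frac)] = True
    return mask


def make_preference_pair(video, rng, noise_std=0.2):
    """Build a (positive, negative, mask) triple from a float video of shape (T, H, W, C) in [0, 1].

    The positive is the real video. The negative equals the positive outside
    the mask and is corrupted inside it. Gaussian noise is a stand-in
    corruption, not the method described in the paper.
    """
    mask = random_spatiotemporal_mask(video.shape[:3], rng)
    noise = rng.normal(0.0, noise_std, size=video.shape)
    corrupted = np.clip(video + noise, 0.0, 1.0)
    # Broadcast the (T, H, W) mask over the channel axis.
    negative = np.where(mask[..., None], corrupted, video)
    return video, negative, mask
```

Because the pair differs only inside the mask, a region-level preference loss could in principle be restricted to the masked voxels, which is the localization the README describes.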