Improve model card: Remove non-standard metadata, shorten title, add citation and tags

#2
opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +13 -9
README.md CHANGED
@@ -1,19 +1,20 @@
  ---
  language:
  - en
- license: apache-2.0
- inference: false
  library_name: transformers
+ license: apache-2.0
  pipeline_tag: text-generation
+ tags:
+ - text-to-video
+ - prompt-optimization
  ---

- <h1>VPO: Aligning Text-to-Video Generation Models with Prompt Optimization</h1>
+ <h1>VPO</h1>

  - **Repository:** https://github.com/thu-coai/VPO
  - **Paper:** [VPO: Aligning Text-to-Video Generation Models with Prompt Optimization](https://huggingface.co/papers/2503.20491)
  - **Data:** https://huggingface.co/datasets/CCCCCC/VPO

- # VPO
  VPO is a principled prompt optimization framework grounded in the principles of harmlessness, accuracy, and helpfulness.
  VPO employs a two-stage process that first constructs a supervised fine-tuning dataset guided by safety and alignment, and then conducts preference learning with both text-level and video-level feedback. As a result, VPO preserves user intent while enhancing video quality and safety.

@@ -82,9 +83,12 @@ print(resp)
  ```
  See our [Github Repo](https://github.com/thu-coai/VPO) for more detailed usage (e.g. Inference with Vllm).

-
- <!-- ## Citation
- If you find our model is useful in your work, please cite it with:
+ ## Citation
  ```
-
- ``` -->
+ @article{cheng2025vpo,
+ title={Vpo: Aligning text-to-video generation models with prompt optimization},
+ author={Cheng, Jiale and Lyu, Ruiliang and Gu, Xiaotao and Liu, Xiao and Xu, Jiazheng and Lu, Yida and Teng, Jiayan and Yang, Zhuoyi and Dong, Yuxiao and Tang, Jie and others},
+ journal={arXiv preprint arXiv:2503.20491},
+ year={2025}
+ }
+ ```
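For reference, applying both hunks leaves the README's YAML front matter as below (assembled from the added and unchanged metadata lines in the diff; the Hub reads these fields to index the model):

```yaml
---
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- text-to-video
- prompt-optimization
---
```

The `inference: false` key was dropped because it is not part of the standard model-card metadata, while `license` is kept as a recognized top-level field and the new `tags` make the model discoverable under text-to-video and prompt-optimization filters.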