Improve model card: Add pipeline tag, library name, and consistent paper/Github links
This PR improves the model card for HuMo by:
- Adding `pipeline_tag: any-to-any` to help users discover the model for its broad multimodal video generation capabilities.
- Adding `library_name: diffusers`, since the `_diffusers_version` field in `config.json` indicates compatibility with the Diffusers library; this enables automated code snippets on the Hub.
- Updating the arXiv badge and citation links to `https://arxiv.org/abs/2509.08519` for consistency with the paper's actual ID and the BibTeX.
- Adding a badge link to the GitHub repository (`https://github.com/Phantom-video/HuMo`) in the header for easy access to the code.
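The Hub reads `pipeline_tag` and `library_name` from the YAML front matter at the top of `README.md`. As a rough illustration of what this PR adds and how those key/value pairs are read, here is a simplified sketch (the Hub itself uses a full YAML parser; `parse_front_matter` is a hypothetical helper written for this example only):

```python
def parse_front_matter(readme_text: str) -> dict:
    """Extract simple `key: value` pairs from a model card's YAML header.

    The header is the block delimited by `---` lines at the very top of
    the README. This toy parser handles only flat string values.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


# The header this PR introduces:
readme = """---
pipeline_tag: any-to-any
library_name: diffusers
---

# HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
"""
meta = parse_front_matter(readme)
# meta == {"pipeline_tag": "any-to-any", "library_name": "diffusers"}
```

With these two fields present, the model shows up under the `any-to-any` pipeline filter and the Hub can attribute it to the Diffusers library.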
**README.md** (changed):

````diff
@@ -1,9 +1,15 @@
+---
+pipeline_tag: any-to-any
+library_name: diffusers
+---
+
 # HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
 
 <div align="center">
 
-[](https://arxiv.org/abs/
+[](https://arxiv.org/abs/2509.08519)
 [](https://phantom-video.github.io/HuMo/)
+[](https://github.com/Phantom-video/HuMo)
 <a href="https://huggingface.co/bytedance-research/HuMo"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=orange"></a>
 </div>
 
@@ -111,15 +117,18 @@ Our work builds upon and is greatly inspired by several outstanding open-source
 
 If HuMo is helpful, please help to ⭐ the repo.
 
-If you find this project useful for your research, please consider citing our [paper](https://arxiv.org/abs/
+If you find this project useful for your research, please consider citing our [paper](https://arxiv.org/abs/2509.08519).
 
 ### BibTeX
 ```bibtex
-@
+@misc{chen2025humo,
+      title={HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning},
+      author={Liyang Chen and Tianxiang Ma and Jiawei Liu and Bingchuan Li and Zhuowei Chen and Lijie Liu and Xu He and Gen Li and Qian He and Zhiyong Wu},
+      year={2025},
+      eprint={2509.08519},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2509.08519},
 }
 ```
 
````