| --- |
| datasets: |
| - ASLP-lab/LyricEditBench |
| language: |
| - zh |
| - en |
| license: cc-by-4.0 |
| pipeline_tag: text-to-audio |
| tags: |
| - model_hub_mixin |
| - pytorch_model_hub_mixin |
| --- |
| |
| <div align="center"> |
|
|
| <h1>🎤 YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance</h1> |
|
|
| <p> |
| <a href="">English</a> | <a href="README_ZH.md">中文</a> |
| </p> |
|
|
|
|
| [arXiv Paper](https://arxiv.org/abs/2603.24589) |
| [GitHub Repository](https://github.com/ASLP-lab/YingMusic-Singer) |
| [Demo Page](https://aslp-lab.github.io/YingMusic-Singer-Demo/) |
| [Hugging Face Space](https://huggingface.co/spaces/ASLP-lab/YingMusic-Singer) |
| [Hugging Face Model](https://huggingface.co/ASLP-lab/YingMusic-Singer) |
| [LyricEditBench Dataset](https://huggingface.co/datasets/ASLP-lab/LyricEditBench) |
| [Discord](https://discord.gg/RXghgWyvrn) |
| [WeChat](https://github.com/ASLP-lab/YingMusic-Singer/blob/main/assets/wechat_qr.png) |
| [ASLP Lab](http://www.npu-aslp.org/) |
|
|
| <p> |
| <a href="https://orcid.org/0009-0005-5957-8936"><b>Chunbo Hao</b></a>¹² · |
| <a href="https://orcid.org/0009-0003-2602-2910"><b>Junjie Zheng</b></a>² · |
| <a href="https://orcid.org/0009-0001-6706-0572"><b>Guobin Ma</b></a>¹ · |
| <b>Yuepeng Jiang</b>¹ · |
| <b>Huakang Chen</b>¹ · |
| <b>Wenjie Tian</b>¹ · |
| <a href="https://orcid.org/0009-0003-9258-4006"><b>Gongyu Chen</b></a>² · |
| <a href="https://orcid.org/0009-0005-5413-6725"><b>Zihao Chen</b></a>² · |
| <b>Lei Xie</b>¹ |
| </p> |
|
|
| <p> |
| <sup>1</sup> Northwestern Polytechnical University · <sup>2</sup> Giant Network |
| </p> |
|
|
| </div> |
|
|
| ----- |
|
|
| **YingMusic-Singer** is a fully diffusion-based model for melody-controllable singing voice synthesis with flexible lyric manipulation. It takes three inputs: an optional timbre reference, a singing clip that provides the melody, and the modified lyrics. It achieves strong melody preservation and lyric adherence without requiring manual alignment. |
|
|
| For more details, please refer to the paper: [YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance](https://arxiv.org/abs/2603.24589). |
|
|
| ## 🌟 About This Repository |
|
|
| The root directory contains the packaged model weights, saved via `ModelHubMixin` in safetensors format. The `ckpts/` folder holds the individual component checkpoints for downstream development and custom integration. |
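
Because the packaged weights and the component checkpoints live in different places, it can be convenient to download only one of the two. The following is a minimal sketch, not the project's official loading code: it assumes the `huggingface_hub` package is installed and relies only on the `ckpts/` folder name mentioned above. The helper names `download_kwargs` and `fetch` are hypothetical and introduced here for illustration; `snapshot_download` is the standard `huggingface_hub` function.

```python
REPO_ID = "ASLP-lab/YingMusic-Singer"  # repository named in this model card


def download_kwargs(components_only: bool) -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: it only selects which part of the repo to fetch.
    """
    kwargs = {"repo_id": REPO_ID}
    if components_only:
        # Fetch only the individual component checkpoints under ckpts/.
        kwargs["allow_patterns"] = ["ckpts/*"]
    else:
        # Fetch everything except ckpts/, i.e. the root-level packaged files.
        kwargs["ignore_patterns"] = ["ckpts/*"]
    return kwargs


def fetch(components_only: bool = False) -> str:
    """Download the selected files and return the local snapshot directory."""
    # Imported lazily so download_kwargs() works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(components_only))


# Example (requires network access):
#   local_dir = fetch(components_only=True)
```

Calling `fetch()` with the default argument would skip `ckpts/` and retrieve the root-level packaged weights instead. For actually instantiating the model from either set of files, the GitHub repository remains the reference.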
|
|
| ## 🚀 Getting Started |
|
|
| Full documentation and deployment guides are available at our GitHub repository: |
| 👉 [https://github.com/ASLP-lab/YingMusic-Singer](https://github.com/ASLP-lab/YingMusic-Singer) |
|
|
| We support multiple deployment options to fit your workflow. |
|
|
| ## 📜 Citation |
|
|
| If you find our work useful, please cite: |
|
|
| ```bibtex |
| @misc{hao2025yingmusicsinger, |
| title={YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance}, |
| author={Chunbo Hao and Junjie Zheng and Guobin Ma and Yuepeng Jiang and Huakang Chen and Wenjie Tian and Gongyu Chen and Zihao Chen and Lei Xie}, |
| year={2026}, |
| eprint={2603.24589}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.SD}, |
| url={https://arxiv.org/abs/2603.24589}, |
| } |
| ``` |