	---
	datasets:
	- ASLP-lab/LyricEditBench
	language:
	- zh
	- en
	license: cc-by-4.0
	pipeline_tag: text-to-audio
	tags:
	- model_hub_mixin
	- pytorch_model_hub_mixin
	---

	<div align="center">

	<h1>🎤 YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance</h1>

	<p>
	<a href="">English</a> ｜ <a href="README_ZH.md">中文</a>
	</p>


	![Python](https://img.shields.io/badge/Python-3.10-3776AB?logo=python&logoColor=white)
	![License](https://img.shields.io/badge/License-CC--BY--4.0-lightgrey)
	[![arXiv Paper](https://img.shields.io/badge/arXiv-2603.24589-b31b1b?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2603.24589)
	[![GitHub](https://img.shields.io/badge/GitHub-YingMusic--Singer-181717?logo=github&logoColor=white)](https://github.com/ASLP-lab/YingMusic-Singer)
	[![Demo Page](https://img.shields.io/badge/GitHub-Demo--Page-F0529C?logo=github&logoColor=white)](https://aslp-lab.github.io/YingMusic-Singer-Demo/)
	[![HuggingFace Space](https://img.shields.io/badge/🤗%20HuggingFace-Space-FFD21E)](https://huggingface.co/spaces/ASLP-lab/YingMusic-Singer)
	[![HuggingFace Model](https://img.shields.io/badge/🤗%20HuggingFace-Model-FF9D00)](https://huggingface.co/ASLP-lab/YingMusic-Singer)
	[![Dataset LyricEditBench](https://img.shields.io/badge/🤗%20HuggingFace-LyricEditBench-FF6F00)](https://huggingface.co/datasets/ASLP-lab/LyricEditBench)
	[![Discord](https://img.shields.io/badge/Discord-Join%20Us-5865F2?logo=discord&logoColor=white)](https://discord.gg/RXghgWyvrn)
	[![WeChat](https://img.shields.io/badge/WeChat-Group-07C160?logo=wechat&logoColor=white)](https://github.com/ASLP-lab/YingMusic-Singer/blob/main/assets/wechat_qr.png)
	[![Lab](https://img.shields.io/badge/🏫%20ASLP-Lab-4A90D9)](http://www.npu-aslp.org/)

	<p>
	<a href="https://orcid.org/0009-0005-5957-8936"><b>Chunbo Hao</b></a>¹² ·
	<a href="https://orcid.org/0009-0003-2602-2910"><b>Junjie Zheng</b></a>² ·
	<a href="https://orcid.org/0009-0001-6706-0572"><b>Guobin Ma</b></a>¹ ·
	<b>Yuepeng Jiang</b>¹ ·
	<b>Huakang Chen</b>¹ ·
	<b>Wenjie Tian</b>¹ ·
	<a href="https://orcid.org/0009-0003-9258-4006"><b>Gongyu Chen</b></a>² ·
	<a href="https://orcid.org/0009-0005-5413-6725"><b>Zihao Chen</b></a>² ·
	<b>Lei Xie</b>¹
	</p>

	<p>
	<sup>1</sup> Northwestern Polytechnical University · <sup>2</sup> Giant Network
	</p>

	</div>

	-----

	YingMusic-Singer is a fully diffusion-based model for melody-controllable singing voice synthesis with flexible lyric manipulation. It takes three inputs: an optional timbre reference, a singing clip that provides the melody, and the modified lyrics. It achieves strong melody preservation and lyric adherence without requiring manual alignment.

	For more details, please refer to the paper: [YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance](https://arxiv.org/abs/2603.24589).

	## 🌟 About This Repository

	The root directory contains the packaged model weights, saved via `ModelHubMixin` in safetensors format. The `ckpts/` folder holds individual component checkpoints for downstream development and custom integration.
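
As a rough sketch of how this layout might be used, the snippet below downloads only the packaged root weights and skips `ckpts/` unless the component checkpoints are requested. It assumes the `huggingface_hub` package is installed and relies on its `snapshot_download` function. The helper names `ignore_patterns_for` and `download_weights` are hypothetical and not part of this repository. The model-loading entry point itself is documented in the GitHub repository and is not shown here.

```python
from typing import List, Optional

# Repo id taken from the Hugging Face model link on this card.
REPO_ID = "ASLP-lab/YingMusic-Singer"


def ignore_patterns_for(include_components: bool) -> Optional[List[str]]:
    """Return glob patterns for files to skip (hypothetical helper).

    The component checkpoints live under `ckpts/`, so they are skipped
    unless explicitly requested.
    """
    return None if include_components else ["ckpts/*"]


def download_weights(include_components: bool = False,
                     local_dir: Optional[str] = None) -> str:
    """Download a snapshot of the repo and return its local folder path."""
    # Imported lazily so the helper above works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        ignore_patterns=ignore_patterns_for(include_components),
        local_dir=local_dir,
    )
```

Calling `download_weights()` needs network access and fetches everything except `ckpts/`. Passing `include_components=True` also pulls the component checkpoints for custom integration.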

	## 🚀 Getting Started

	Full documentation and deployment guides are available at our GitHub repository:
	👉 [https://github.com/ASLP-lab/YingMusic-Singer](https://github.com/ASLP-lab/YingMusic-Singer)

	We support multiple deployment options to fit your workflow.

	## 📜 Citation

	If you find our work useful, please cite:

	```bibtex
	@misc{hao2025yingmusicsinger,
	title={YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance},
	author={Chunbo Hao and Junjie Zheng and Guobin Ma and Yuepeng Jiang and Huakang Chen and Wenjie Tian and Gongyu Chen and Zihao Chen and Lei Xie},
	year={2025},
	eprint={2603.24589},
	archivePrefix={arXiv},
	primaryClass={cs.SD},
	url={https://arxiv.org/abs/2603.24589},
	}
	```