---
datasets:
- ASLP-lab/LyricEditBench
language:
- zh
- en
license: cc-by-4.0
pipeline_tag: text-to-audio
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---
# 🎤 YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance
Chunbo Hao¹² · Junjie Zheng² · Guobin Ma¹ · Yuepeng Jiang¹ · Huakang Chen¹ · Wenjie Tian¹ · Gongyu Chen² · Zihao Chen² · Lei Xie¹
1 Northwestern Polytechnical University · 2 Giant Network
YingMusic-Singer is a fully diffusion-based model for melody-controllable singing voice synthesis with flexible lyric manipulation. It takes three inputs: an optional timbre reference, a singing clip that provides the melody, and the modified lyrics. From these, it achieves strong melody preservation and lyric adherence without requiring manual alignment.
For more details, please refer to the paper: [YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance](https://arxiv.org/abs/2603.24589).
## 🌟 About This Repository
The root directory contains the packaged model weights saved via `ModelHubMixin` in safetensors format. The `ckpts/` folder holds the individual component checkpoints for downstream development and custom integration.
## 🚀 Getting Started
Full documentation and deployment guides are available at our GitHub repository: 👉 https://github.com/ASLP-lab/YingMusic-Singer

We support multiple deployment options to fit your workflow.
## 📜 Citation
If you find our work useful, please cite:
```bibtex
@misc{hao2025yingmusicsinger,
      title={YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance},
      author={Chunbo Hao and Junjie Zheng and Guobin Ma and Yuepeng Jiang and Huakang Chen and Wenjie Tian and Gongyu Chen and Zihao Chen and Lei Xie},
      year={2025},
      eprint={2603.24589},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2603.24589},
}
```