


metadata

title: SoulX-Singer
emoji: 🎤
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
python_version: '3.10'
suggested_hardware: zero-a10g

🎤 SoulX-Singer

Official inference code for
SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

🎵 Overview

SoulX-Singer is a high-fidelity, zero-shot singing voice synthesis model that enables users to generate realistic singing voices for unseen singers.
It supports melody-conditioned (F0 contour) and score-conditioned (MIDI notes) control for precise pitch, rhythm, and expression.
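The two control modes describe pitch in different units: a score gives discrete MIDI note numbers, while a melody contour gives continuous F0 values in Hz. The standard equal-temperament conversion (A4 = MIDI note 69 = 440 Hz) relates the two. The sketch below is illustrative only: `midi_to_hz` and `hz_to_midi` are hypothetical helper names, not part of the SoulX-Singer codebase, and how the model encodes pitch internally is not specified here.

```python
import math

A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # standard concert pitch


def midi_to_hz(note: float) -> float:
    """Convert a MIDI note number to frequency in Hz (12-tone equal temperament)."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a (possibly fractional) MIDI note number."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


if __name__ == "__main__":
    # Print a few score pitches alongside their F0 values.
    for note in (60, 69, 72):
        print(note, round(midi_to_hz(note), 2))
```

A fractional result from `hz_to_midi` indicates how far a sung pitch deviates from the nearest score note, which is one way to compare an F0 contour against a MIDI score.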

✨ Key Features

🎤 Zero-Shot Singing – Generate high-fidelity voices for unseen singers, no fine-tuning needed.
🎵 Flexible Control Modes – Melody (F0) and Score (MIDI) conditioning.
📚 Large-Scale Dataset – 42,000+ hours of aligned vocals, lyrics, and notes across Mandarin, English, and Cantonese.
🧑‍🎤 Timbre Cloning – Preserve singer identity across languages, styles, and edited lyrics.
✏️ Singing Voice Editing – Modify lyrics while keeping natural prosody.
🌐 Cross-Lingual Synthesis – High-fidelity synthesis by disentangling timbre from content.

Performance Radar

🎬 Demo Examples

https://github.com/user-attachments/assets/13306f10-3a29-46ba-bcef-d6308d05cbcc

https://github.com/user-attachments/assets/2eb260fe-6f0b-408c-aab8-5b81ddddb284

📰 News

[2026-02-06] SoulX-Singer inference code and models released.

🚀 Quick Start

Note: This repo does not ship pretrained weights. SVS and preprocessing models must be downloaded from Hugging Face (see step 3).

1. Clone Repository

git clone https://github.com/Soul-AILab/SoulX-Singer.git
cd SoulX-Singer

2. Set Up Environment

1. Install Conda (if not already installed): https://docs.conda.io/en/latest/miniconda.html

2. Create and activate a Conda environment:

conda create -n soulxsinger -y python=3.10
conda activate soulxsinger

3. Install dependencies:

pip install -r requirements.txt

⚠️ If you are in mainland China, use a PyPI mirror:

pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com

3. Download Pretrained Models

This repository does not include pretrained models. You must download them from Hugging Face:

Soul-AILab/SoulX-Singer (SVS model)
Soul-AILab/SoulX-Singer-Preprocess (preprocessing models)

Install Hugging Face Hub and download:

pip install -U huggingface_hub

# SoulX-Singer SVS model
huggingface-cli download Soul-AILab/SoulX-Singer --local-dir pretrained_models/SoulX-Singer

# Preprocessing models (vocal separation, F0, ASR, etc.)
huggingface-cli download Soul-AILab/SoulX-Singer-Preprocess --local-dir pretrained_models/SoulX-Singer-Preprocess
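The same download can be scripted from Python with `huggingface_hub.snapshot_download`, which fetches a whole repository into a local directory. This is a minimal sketch, assuming the same target directories as the CLI commands above; `download_models` and its `download` parameter are hypothetical names introduced for illustration and are not part of this repository.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Repo IDs and target directories taken from the CLI commands above.
MODEL_REPOS: Dict[str, str] = {
    "Soul-AILab/SoulX-Singer": "pretrained_models/SoulX-Singer",
    "Soul-AILab/SoulX-Singer-Preprocess": "pretrained_models/SoulX-Singer-Preprocess",
}


def download_models(download: Optional[Callable[..., object]] = None) -> Dict[str, Path]:
    """Download every repo in MODEL_REPOS and return {repo_id: local_dir}."""
    if download is None:
        # Imported lazily so this module loads even if huggingface_hub is missing.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    result: Dict[str, Path] = {}
    for repo_id, local_dir in MODEL_REPOS.items():
        download(repo_id=repo_id, local_dir=local_dir)
        result[repo_id] = Path(local_dir)
    return result


if __name__ == "__main__":
    for repo_id, path in download_models().items():
        print(f"{repo_id} -> {path}")
```

The `download` parameter exists only so the function can be exercised without network access; in normal use it is left at its default.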

4. Run the Demo

Run the inference demo:

bash example/infer.sh

This script relies on metadata generated by the preprocessing pipeline, including vocal separation and transcription. To run the demo on your own data, first follow the steps in preprocess to prepare the necessary metadata.

⚠️ Important Note: The metadata produced by the automatic preprocessing pipeline may not perfectly align the singing audio with the corresponding lyrics and musical notes. For best synthesis quality, we strongly recommend manually correcting the alignment using the 🎼 Midi-Editor.
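Before hand-editing, a quick automated pass can flag the most obvious alignment problems, such as notes with non-positive duration or notes that overlap in a monophonic vocal line. The sketch below assumes a hypothetical note representation (start time, end time, lyric, MIDI pitch). The actual metadata schema produced by the preprocessing pipeline is not described here and may differ, so treat `Note` and `find_alignment_issues` as illustrative names only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    # Hypothetical fields; NOT the real SoulX-Singer metadata schema.
    start: float  # onset time in seconds
    end: float    # offset time in seconds
    lyric: str    # syllable or word sung on this note
    pitch: int    # MIDI note number


def find_alignment_issues(notes: List[Note]) -> List[str]:
    """Return human-readable descriptions of suspicious notes."""
    issues: List[str] = []
    for i, note in enumerate(notes):
        if note.end <= note.start:
            issues.append(f"note {i} ({note.lyric!r}): non-positive duration")
        if not 0 <= note.pitch <= 127:
            issues.append(f"note {i} ({note.lyric!r}): pitch outside MIDI range")
        if i > 0 and note.start < notes[i - 1].end:
            issues.append(f"note {i} ({note.lyric!r}): overlaps previous note")
    return issues


if __name__ == "__main__":
    demo = [Note(0.0, 0.5, "la", 60), Note(0.4, 0.9, "la", 62)]
    for issue in find_alignment_issues(demo):
        print(issue)
```

A check like this only catches structural errors; whether a lyric actually lines up with the sung audio still needs to be verified by ear in the editor.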

How to use the Midi-Editor:

Editing Metadata with Midi-Editor

🌐 WebUI

You can launch the interactive interface with:

python webui.py

🚀 Deploy as Hugging Face Space

This repo is ready to deploy as a Hugging Face Space. Pretrained models are not included; app.py downloads them from the Hub on first run.

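📖 For a detailed deployment guide, see DEPLOY.md.

Quick steps:

Create a Space: go to huggingface.co/spaces, click "Create new Space", and select the Gradio SDK.
Upload the code: push with Git or upload the code files through the web interface.
Configure hardware: in Space Settings, select GPU T4 Small (recommended) to speed up inference.
Wait for startup: the Space automatically installs dependencies, downloads the models, and launches the app (the first run may take 5-15 minutes).

The models are downloaded automatically from the following repositories: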

Soul-AILab/SoulX-Singer (SVS model)
Soul-AILab/SoulX-Singer-Preprocess (preprocessing models)
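The download-on-first-run behaviour can be sketched as a guard that fetches a repository only when its local directory is missing or empty. This is an illustrative sketch, not the actual contents of app.py: `ensure_models` and its parameters are hypothetical names, and the `pretrained_models/<repo name>` layout is assumed to match the Quick Start.

```python
from pathlib import Path
from typing import Callable, List, Optional

REPOS = ("Soul-AILab/SoulX-Singer", "Soul-AILab/SoulX-Singer-Preprocess")


def ensure_models(root: str = "pretrained_models",
                  download: Optional[Callable[..., object]] = None) -> List[str]:
    """Download each repo whose local directory is missing or empty.

    Returns the repo IDs that were actually downloaded on this call.
    """
    fetched: List[str] = []
    for repo_id in REPOS:
        target = Path(root) / repo_id.split("/")[-1]
        if target.is_dir() and any(target.iterdir()):
            continue  # already populated by a previous run
        if download is None:
            # Lazy import: only needed when something must be fetched.
            from huggingface_hub import snapshot_download
            download = snapshot_download
        download(repo_id=repo_id, local_dir=str(target))
        fetched.append(repo_id)
    return fetched


if __name__ == "__main__":
    print("downloaded:", ensure_models())
```

Calling a guard like this at application startup keeps restarts fast, since a Space with populated model directories skips the download entirely. Note that a non-empty directory is only a rough proxy for a complete download.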

🚧 Roadmap

🖥️ Web-based UI for easy and interactive inference
🌐 Online demo deployment on Hugging Face Spaces
📊 Release the SoulX-Singer-Eval benchmark
📚 Comprehensive tutorials and usage documentation

🙏 Acknowledgements

Special thanks to the following open-source projects:

📄 License

We use the Apache 2.0 license. Researchers and developers are free to use the code and model weights of SoulX-Singer. See LICENSE for details.

⚠️ Usage Disclaimer

SoulX-Singer is intended for academic research, educational purposes, and legitimate applications such as personalized singing synthesis and assistive technologies.

Please note:

🎤 Respect intellectual property, privacy, and personal consent when generating singing content.
🚫 Do not use the model to impersonate individuals without authorization or to create deceptive audio.
⚠️ The developers assume no liability for any misuse of this model.

We advocate for the responsible development and use of AI and encourage the community to uphold safety and ethical principles. For ethics or misuse concerns, please contact us.

📬 Contact Us

We welcome your feedback, questions, and collaboration:

Email: qianjiale@soulapp.cn | menghao@soulapp.cn | wangxinsheng@soulapp.cn
Join discussions: WeChat or Soul APP groups for technical questions and updates:

WeChat Group QR Code