---
## 🎵 Overview
**SoulX-Singer** is a high-fidelity, zero-shot singing voice synthesis model that generates realistic singing voices for unseen singers. It supports both **melody-conditioned (F0 contour)** and **score-conditioned (MIDI notes)** control, giving precise control over pitch, rhythm, and expression.
**SoulX-Singer-SVC** is a singing voice conversion (SVC) model fine-tuned from **SoulX-Singer**. SVC transforms a source singing recording into the target singer's voice while preserving the original melody, rhythm, and lyrical content. Building on the strong generative capability of SoulX-Singer, SoulX-Singer-SVC performs high-quality singing voice conversion directly from raw singing audio, without requiring lyric or MIDI transcriptions.
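
To make the two SoulX-Singer control modes more concrete, the sketch below shows how a note-level score relates to a frame-level F0 contour, using the standard equal-temperament mapping from MIDI note number to frequency (A4 = MIDI 69 = 440 Hz). It is only an illustration of the two representations: the function names `midi_to_hz` and `score_to_f0`, the `(pitch, duration)` note format, and the 100 frames-per-second rate are assumptions made for this example and are not part of the SoulX-Singer API.

```python
from typing import List, Optional, Tuple

# Hypothetical note format for this example: (MIDI pitch or None for a rest, duration in seconds)
Note = Tuple[Optional[float], float]


def midi_to_hz(note: float) -> float:
    """Convert a MIDI note number to Hz (equal temperament, A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def score_to_f0(notes: List[Note], frame_rate: int = 100) -> List[float]:
    """Expand a note-level score into a flat, frame-level F0 contour.

    Rests (pitch None) are rendered as 0.0 Hz, a common convention for unvoiced frames.
    The frame rate is an assumed value, not one taken from the model.
    """
    contour: List[float] = []
    for pitch, duration in notes:
        n_frames = int(round(duration * frame_rate))
        hz = 0.0 if pitch is None else midi_to_hz(pitch)
        contour.extend([hz] * n_frames)
    return contour


# A4 for 0.5 s, a 0.25 s rest, then A5 for 0.5 s
score = [(69, 0.5), (None, 0.25), (81, 0.5)]
f0 = score_to_f0(score)
print(len(f0), f0[0], f0[60], f0[-1])
```

A contour derived from a score this way is piecewise constant, whereas an F0 contour extracted from a real recording also carries vibrato, slides, and other expressive detail. That difference is one way to think about why melody conditioning and score conditioning are offered as separate modes.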
---
## ✨ Key Features
#### SoulX-Singer
- **🎤 Zero-Shot Singing**: Generate high-fidelity voices for unseen singers, with no fine-tuning needed.
- **🎵 Flexible Control Modes**: Melody (F0) and score (MIDI) conditioning.
- **Large-Scale Dataset**: More than 42,000 hours of aligned vocals, lyrics, and notes across Mandarin, English, and Cantonese.
- **🧑‍🎤 Timbre Cloning**: Preserve singer identity across languages, styles, and edited lyrics.
- **Singing Voice Editing**: Modify lyrics while keeping natural prosody.
- **Cross-Lingual Synthesis**: High-fidelity synthesis by disentangling timbre from content.
#### SoulX-Singer-SVC
- **Zero-Shot Timbre and Style Transfer**: Transfer singer identity and style to unseen voices without per-speaker fine-tuning.
- **Language-Agnostic Conversion**: Works across multilingual singing content.
- **Transcription-Free Audio-to-Audio Conversion**: Convert source singing audio directly, without lyric transcriptions or MIDI inputs.
---