@@ -15,27 +15,92 @@ tags:
 ---
-# converted main .pt model to .safetensors
-# bf16 + fp32
 ---
 <div align="center">
-    <b><em> Towards High-Quality Zero-Shot Singing Voice Synthesis</em></b>
-    </p>
     <p>
     <img src="assets/soulx-logo.png" alt="SoulX-Singer_Logo" style="height: 80px;">
     </p>
     <p>
-    </p>
     <a href="https://soul-ailab.github.io/soulx-singer/"><img src="https://img.shields.io/badge/Demo-Page-lightgrey" alt="version"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer"><img src='https://img.shields.io/badge/Github-Page-green' alt="Github"></a>
     <a href="https://arxiv.org/abs/2602.07803"><img src="https://img.shields.io/badge/arXiv-2602.07803-b31b1b" alt="arXiv"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer/blob/main/assets/technical-report.pdf"><img src='https://img.shields.io/badge/Report-Github?label=Technical&color=red' alt="technical report"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="Apache-2.0"></a>
 </div>
 **SoulX-Singer** is a high-fidelity, zero-shot singing voice synthesis model that enables users to generate realistic singing voices for unseen singers. It supports melody-conditioned (F0 contour) and score-conditioned (MIDI notes) control for precise pitch, rhythm, and expression.
-For more details, please refer to the paper: [SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis](https://arxiv.org/abs/2602.07803).

 ---
+## ComfyUI Custom Node
+A custom node for ComfyUI integration is available in a separate repository:
+🔗 **[ComfyUI-SoulX-Singer](https://github.com/Saganaki22/ComfyUI-SoulX-Singer)**
+![Screenshot 2026-02-11 160905](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/FqxVnkFDrVt287ppwQj90.png)
+Use it to run SoulX-Singer singing voice synthesis inside your ComfyUI workflows.
+# SoulX-Singer: Converted .pt model to .safetensors
+**bf16 + fp32**
+## Audio Samples
+### Original Audio
+<audio controls>
+  <source src="samples/song.mp3" type="audio/mpeg">
+  Your browser does not support the audio element.
+</audio>
+### SpongeBob Voice
+<audio controls>
+  <source src="samples/generated/sample-1.mp3" type="audio/mpeg">
+  Your browser does not support the audio element.
+</audio>
+### Male Voice
+<audio controls>
+  <source src="samples/generated/sample-2.mp3" type="audio/mpeg">
+  Your browser does not support the audio element.
+</audio>
 ---
 <div align="center">
+    <b><em>Towards High-Quality Zero-Shot Singing Voice Synthesis</em></b>
     <p>
     <img src="assets/soulx-logo.png" alt="SoulX-Singer_Logo" style="height: 80px;">
     </p>
     <p>
     <a href="https://soul-ailab.github.io/soulx-singer/"><img src="https://img.shields.io/badge/Demo-Page-lightgrey" alt="version"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer"><img src='https://img.shields.io/badge/Github-Page-green' alt="Github"></a>
     <a href="https://arxiv.org/abs/2602.07803"><img src="https://img.shields.io/badge/arXiv-2602.07803-b31b1b" alt="arXiv"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer/blob/main/assets/technical-report.pdf"><img src='https://img.shields.io/badge/Report-Github?label=Technical&color=red' alt="technical report"></a>
     <a href="https://github.com/Soul-AILab/SoulX-Singer"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="Apache-2.0"></a>
+    </p>
 </div>
+---
+## Overview
 **SoulX-Singer** is a high-fidelity, zero-shot singing voice synthesis model that enables users to generate realistic singing voices for unseen singers. It supports melody-conditioned (F0 contour) and score-conditioned (MIDI notes) control for precise pitch, rhythm, and expression.
+For more details, please refer to the paper: [SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis](https://arxiv.org/abs/2602.07803).
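The two conditioning modes describe pitch in different units: an F0 contour is a sequence of frequencies in Hz, while a score uses MIDI note numbers. The sketch below shows only the standard equal-temperament relationship between the two (A4 = MIDI note 69 = 440 Hz). It is not SoulX-Singer's API or input format; the function names and the `frame_rate` of 100 frames per second are assumptions made for illustration.

```python
import math

A4_MIDI = 69      # MIDI note number of A4
A4_HZ = 440.0     # standard concert pitch for A4


def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (12-tone equal temperament)."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (possibly fractional) MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def notes_to_f0(notes, frame_rate=100):
    """Expand (midi_note, duration_seconds) pairs into a flat per-frame F0 contour.

    frame_rate is a hypothetical value chosen for this example, not a model setting.
    """
    contour = []
    for note, duration in notes:
        n_frames = round(duration * frame_rate)
        contour.extend([midi_to_hz(note)] * n_frames)
    return contour
```

A real sung F0 contour also carries vibrato and note transitions, which a flat per-note expansion like this does not capture.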
+---
+## Features
+- **Zero-shot synthesis**: Generate singing voices for unseen singers without fine-tuning
+- **Melody-conditioned control**: Use F0 contour for pitch guidance
+- **Score-conditioned control**: Use MIDI notes for note-level control of pitch and rhythm
+- **High-fidelity output**: Realistic vocal synthesis with natural expression
+- **Safetensors format**: Model weights converted from `.pt` to `.safetensors`, provided in bf16 and fp32 precision
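To check which precision a given weight file holds, the header of a `.safetensors` file can be read without loading any tensor data. The following is a minimal sketch using only the Python standard library, based on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names are hypothetical and are not part of this repository or of the `safetensors` package.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def dtype_counts(header):
    """Count tensors per dtype string, e.g. "BF16" or "F32"."""
    return Counter(entry["dtype"] for entry in header.values())
```

Calling `dtype_counts(read_safetensors_header(path))` on a downloaded weight file (the path depends on where you saved it) should show mostly `BF16` entries for a bf16 file and `F32` entries for an fp32 one.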
+---
+## Citation
+If you use SoulX-Singer in your research, please cite:
+```bibtex
+@article{soulxsinger2026,
+  title={SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis},
+  author={Soul-AILab},
+  journal={arXiv preprint arXiv:2602.07803},
+  year={2026}
+}
+```
+---
+## License
+This project is licensed under the Apache License 2.0.