TieIncred/ParlerVoice

text-generation

voice-synthesis


TieIncred committed on Oct 13, 2025

Commit 5a314e8 · verified · 1 Parent(s): 568b42b

Upload README.md with huggingface_hub

Files changed (1)

README.md +3 -3

@@ -35,7 +35,7 @@ base_model: parler-tts/parler-tts-mini-v1.1
 ## ✨ **Key Features**
-- **🏆 Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality audio data
+- **🏆 Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality proprietary audio data (dataset release coming soon!)
 - **👥 Comprehensive Speaker Library**: 85 distinct speaker identities with consistent, recognizable voices across different accents and demographics
 - **🎭 Advanced Expressiveness**: Precise control over tone, emotion, pitch, pace, style, reverb, and background noise through natural language descriptions
 - **🔬 Technical Architecture**: Advanced two-tokenizer system enabling both prompt-based and description-based generation
@@ -43,7 +43,7 @@ base_model: parler-tts/parler-tts-mini-v1.1
 ### **Technical Specifications**
 - **Base Model**: `parler-tts/parler-tts-mini-v1.1`
-- **Training Data**: 650+ hours of curated audio (Emilia YODAS subset + Expresso)
+- **Training Data**: 650+ hours of curated proprietary audio (dataset release coming soon - stay tuned!)
 - **Architecture**: Two-tokenizer flow for enhanced control and consistency
 - **Output Quality**: 24kHz high-fidelity audio generation
@@ -59,7 +59,7 @@ Our technical evaluation demonstrates strong performance across key metrics:
 3. **⚖️ Comparative Analysis**: Offers competitive inference speed while maintaining high audio quality at 24kHz resolution
-4. **🌍 Dataset Quality**: The 650+ hour curated dataset supports 85 distinct voice identities across 7 accent categories
+4. **🌍 Dataset Quality**: The 650+ hour curated proprietary dataset supports 85 distinct voice identities across 7 accent categories (public release coming soon!)
 📊 **[View Full Technical Report & Audio Samples](https://quilt-growth-39a.notion.site/ParlerVoice-28a776bb53f280949beef800875eb0f7?source=copy_link)**