TieIncred committed on
Commit 5a314e8 · verified · 1 Parent(s): 568b42b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -35,7 +35,7 @@ base_model: parler-tts/parler-tts-mini-v1.1

  ## ✨ **Key Features**

- - **🏆 Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality audio data
+ - **🏆 Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality proprietary audio data (dataset release coming soon!)
  - **👥 Comprehensive Speaker Library**: 85 distinct speaker identities with consistent, recognizable voices across different accents and demographics
  - **🎭 Advanced Expressiveness**: Precise control over tone, emotion, pitch, pace, style, reverb, and background noise through natural language descriptions
  - **🔬 Technical Architecture**: Advanced two-tokenizer system enabling both prompt-based and description-based generation
@@ -43,7 +43,7 @@ base_model: parler-tts/parler-tts-mini-v1.1

  ### **Technical Specifications**
  - **Base Model**: `parler-tts/parler-tts-mini-v1.1`
- - **Training Data**: 650+ hours of curated audio (Emilia YODAS subset + Expresso)
+ - **Training Data**: 650+ hours of curated proprietary audio (dataset release coming soon - stay tuned!)
  - **Architecture**: Two-tokenizer flow for enhanced control and consistency
  - **Output Quality**: 24kHz high-fidelity audio generation

@@ -59,7 +59,7 @@ Our technical evaluation demonstrates strong performance across key metrics:

  3. **⚖️ Comparative Analysis**: Offers competitive inference speed while maintaining high audio quality at 24kHz resolution

- 4. **🌍 Dataset Quality**: The 650+ hour curated dataset supports 85 distinct voice identities across 7 accent categories
+ 4. **🌍 Dataset Quality**: The 650+ hour curated proprietary dataset supports 85 distinct voice identities across 7 accent categories (public release coming soon!)

  📊 **[View Full Technical Report & Audio Samples](https://quilt-growth-39a.notion.site/ParlerVoice-28a776bb53f280949beef800875eb0f7?source=copy_link)**
