Upload README.md with huggingface_hub
README.md
CHANGED
@@ -35,7 +35,7 @@ base_model: parler-tts/parler-tts-mini-v1.1
 
 ## ✨ **Key Features**
 
-- **Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality audio data
+- **Extensive Training Data**: Fine-tuned on 650+ hours of carefully curated, high-quality proprietary audio data (dataset release coming soon!)
 - **👥 Comprehensive Speaker Library**: 85 distinct speaker identities with consistent, recognizable voices across different accents and demographics
 - **Advanced Expressiveness**: Precise control over tone, emotion, pitch, pace, style, reverb, and background noise through natural language descriptions
 - **🔬 Technical Architecture**: Advanced two-tokenizer system enabling both prompt-based and description-based generation
@@ -43,7 +43,7 @@ base_model: parler-tts/parler-tts-mini-v1.1
 
 ### **Technical Specifications**
 - **Base Model**: `parler-tts/parler-tts-mini-v1.1`
-- **Training Data**: 650+ hours of curated audio (
+- **Training Data**: 650+ hours of curated proprietary audio (dataset release coming soon - stay tuned!)
 - **Architecture**: Two-tokenizer flow for enhanced control and consistency
 - **Output Quality**: 24kHz high-fidelity audio generation
 
@@ -59,7 +59,7 @@ Our technical evaluation demonstrates strong performance across key metrics:
 
 3. **Comparative Analysis**: Offers competitive inference speed while maintaining high audio quality at 24kHz resolution
 
-4. **Dataset Quality**: The 650+ hour curated dataset supports 85 distinct voice identities across 7 accent categories
+4. **Dataset Quality**: The 650+ hour curated proprietary dataset supports 85 distinct voice identities across 7 accent categories (public release coming soon!)
 
 **[View Full Technical Report & Audio Samples](https://quilt-growth-39a.notion.site/ParlerVoice-28a776bb53f280949beef800875eb0f7?source=copy_link)**
 