Update README.md
README.md
CHANGED
|
@@ -16,45 +16,66 @@ model-index:
|
| 16 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
| 17 |
should probably proofread and complete it, then remove this comment. -->
|
| 18 |
|
| 19 |
-
# SpeechT5 TTS
|
| 20 |
|
| 21 |
-
This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts)
|
| 22 |
-
|
| 23 |
-
|
| 24 |
|
| 25 |
-
SAMPLE TEXT : "hello ,few technical terms i used while fine tuning are API and REST and CUDA and TTS."
|
| 26 |
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/66f64964584cae45b5494560/JYJmDNPHnBRLuvqGTJQSu.wav"></audio>
|
| 27 |
|
| 28 |
-
|
| 29 |
|
| 30 |
-
|
| 31 |
|
| 32 |
-
|
| 33 |
|
| 34 |
-
|
| 35 |
|
| 36 |
-
|
| 37 |
|
| 38 |
-
|
| 39 |
|
| 40 |
-
|
| 41 |
|
| 42 |
-
|
| 43 |
|
| 44 |
-
|
| 45 |
|
| 46 |
-
|
| 47 |
-
|
| 48 |
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
|
| 57 |
-
|
| 58 |
|
| 59 |
### Training results
|
| 60 |
|
| 16 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
| 17 |
should probably proofread and complete it, then remove this comment. -->
|
| 18 |
|
| 19 |
+
# 🎤 SpeechT5 TTS Technical Train2
|
| 20 |
|
| 21 |
+
This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts), trained on a custom dataset for *Text-to-Speech (TTS)*.
|
| 22 |
+
|
| 23 |
+
🎯 *Key Metric:*
|
| 24 |
+
- *Loss* on the evaluation set: 0.3763
|
| 25 |
+
|
| 26 |
+
📢 *Listen to the generated sample:*
|
| 27 |
+
|
| 28 |
+
The input text is: "Hello, few technical terms I used while fine-tuning are API and REST and CUDA and TTS."
|
| 29 |
|
| 30 |
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/66f64964584cae45b5494560/JYJmDNPHnBRLuvqGTJQSu.wav"></audio>
|
| 31 |
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## 📝 Model Description
|
| 35 |
|
| 36 |
+
*SpeechT5 TTS Technical Train2* is built on the *SpeechT5* architecture and fine-tuned for speech synthesis (TTS). Fine-tuning focused on improving the naturalness and clarity of the audio generated from text.
|
| 37 |
|
| 38 |
+
🛠 *Base Model*: [Microsoft SpeechT5](https://huggingface.co/microsoft/speecht5_tts)
|
| 39 |
+
📚 *Dataset*: Custom (specific details to be provided)
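
For readers who want to try a checkpoint like this one, here is a minimal inference sketch using the SpeechT5 classes from `transformers`. It is not taken from this commit and rests on several assumptions: the fine-tuned repository id is not stated in the card, so `model_id` is a caller-supplied placeholder; the processor is assumed to load from the base model; the vocoder is assumed to be `microsoft/speecht5_hifigan`; and the 512-dimensional speaker embedding (x-vector) has to be provided by the caller.

```python
SAMPLE_RATE_HZ = 16_000        # SpeechT5 TTS produces 16 kHz audio
SPEAKER_EMBEDDING_DIM = 512    # x-vector size expected by SpeechT5


def synthesize(text, model_id, speaker_embedding,
               processor_id="microsoft/speecht5_tts",
               vocoder_id="microsoft/speecht5_hifigan"):
    """Return a 1-D waveform tensor for `text`.

    model_id: placeholder for the fine-tuned checkpoint id (not given in the card).
    speaker_embedding: torch.Tensor of shape (1, 512), supplied by the caller.
    processor_id / vocoder_id: assumed defaults, not confirmed by the card.
    """
    # Validate the embedding shape first so a bad input fails before any download.
    if tuple(speaker_embedding.shape) != (1, SPEAKER_EMBEDDING_DIM):
        raise ValueError(
            f"expected speaker embedding of shape (1, {SPEAKER_EMBEDDING_DIM}), "
            f"got {tuple(speaker_embedding.shape)}"
        )

    # Heavy dependencies are imported lazily, only when synthesis is requested.
    import torch
    from transformers import (SpeechT5ForTextToSpeech, SpeechT5HifiGan,
                              SpeechT5Processor)

    processor = SpeechT5Processor.from_pretrained(processor_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the text, then generate a spectrogram and vocode it to audio.
    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"],
            speaker_embeddings=speaker_embedding,
            vocoder=vocoder,
        )
    return waveform
```

The returned tensor can be saved with, for example, `soundfile.write("out.wav", waveform.numpy(), samplerate=SAMPLE_RATE_HZ)`. Any x-vector extractor that yields 512-dimensional vectors should work as the source of the speaker embedding; which one was used for the sample above is not stated.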
|
| 40 |
|
| 41 |
+
---
|
| 42 |
+
|
| 43 |
+
## 🔧 Intended Uses & Limitations
|
| 44 |
+
|
| 45 |
+
### ✅ *Primary Use Cases:*
|
| 46 |
+
- *Text-to-Speech (TTS)* for technical interview texts.
|
| 47 |
+
- *Virtual Assistants*
|
| 48 |
+
|
| 49 |
|
| 50 |
+
### ⚠ *Limitations:*
|
| 51 |
+
- Best suited for English TTS tasks.
|
| 52 |
+
- Requires further fine-tuning on a larger dataset.
|
| 53 |
|
| 54 |
+
---
|
| 55 |
|
| 56 |
+
## 📅 Training Data
|
| 57 |
|
| 58 |
+
The model was fine-tuned on a *custom dataset* curated to improve TTS output. It contains varied text intended to help the model produce more natural-sounding speech.
|
| 59 |
|
| 60 |
+
---
|
| 61 |
|
| 62 |
+
## ⚙ Training Procedure
|
| 63 |
+
|
| 64 |
+
### ⚙ *Hyperparameters*:
|
| 65 |
+
|
| 66 |
+
The model was trained with the following hyperparameters:
|
| 67 |
+
```yaml
|
| 68 |
+
learning_rate: 1e-05
|
| 69 |
+
train_batch_size: 16
|
| 70 |
+
eval_batch_size: 8
|
| 71 |
+
seed: 42
|
| 72 |
+
gradient_accumulation_steps: 2
|
| 73 |
+
total_train_batch_size: 32
|
| 74 |
+
optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08)
|
| 75 |
+
lr_scheduler_type: linear
|
| 76 |
+
lr_scheduler_warmup_steps: 50
|
| 77 |
+
training_steps: 500
|
| 78 |
+
mixed_precision_training: Native AMP
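# --- Illustrative note, not part of the logged Trainer config ---
# Assuming a single training device, the values above are related as follows:
#   total_train_batch_size = train_batch_size * gradient_accumulation_steps = 16 * 2 = 32
#   sequences processed over the run <= training_steps * total_train_batch_size = 500 * 32 = 16000
#   warmup share = lr_scheduler_warmup_steps / training_steps = 50 / 500 = 10% of the run
# With a linear schedule, the learning rate would rise from 0 to 1e-05 over the
# first 50 steps and then decay linearly toward 0 at step 500.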
|
| 79 |
|
| 80 |
### Training results
|
| 81 |
|