---
tags:
- text-to-speech
- tts
- pytorch
language:
- ar
license: mit
---
# Multi-Speaker TTS Model

Trained Arabic text-to-speech (TTS) model. This checkpoint uses the `MultiSpeakerTTS` architecture with a single speaker and has the following specifications:
## Model Details

- **Speakers:** 1
- **Mel Channels:** 80
- **Sample Rate:** 22050 Hz
- **Vocab Size:** 50000
## Training Stats

- **Final Mel Loss:** 5.3586
- **Validation Mel Loss:** 5.4784
- **Training Epochs:** 105
## Mel Normalization

- **Mean:** -4.8906
- **Std:** 4.9261
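
The statistics above can be applied as in the following minimal sketch. It assumes the usual convention of subtracting the mean and dividing by the standard deviation; the card does not state the exact formula used in training, and the helper names `normalize_mel` and `denormalize_mel` are hypothetical.

```python
# Statistics copied from the "Mel Normalization" section of this card.
MEL_MEAN = -4.8906
MEL_STD = 4.9261


def normalize_mel(mel):
    """Scale raw mel values with the card's statistics (assumed convention)."""
    return (mel - MEL_MEAN) / MEL_STD


def denormalize_mel(mel_norm):
    """Invert normalize_mel, e.g. on model output before vocoding."""
    return mel_norm * MEL_STD + MEL_MEAN
```

Because both helpers use only arithmetic operators, they should work unchanged on Python floats, NumPy arrays, and PyTorch tensors.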
## Usage

```python
import torch
from transformers import AutoTokenizer

# MultiSpeakerTTS is the model class from the training code. It is not part of
# torch or transformers, so import or define it before running this snippet.

# Load the checkpoint on CPU and rebuild the model from its saved config
checkpoint = torch.load('checkpoint_best.pt', map_location='cpu')
model = MultiSpeakerTTS(checkpoint['config'])
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # switch to inference mode

# Tokenize the input text
tokenizer = AutoTokenizer.from_pretrained("miscovery/tokenizer_v3")
text = "Your text here"
tokens = tokenizer.encode(text)

# ... inference code ...
```
## Speaker Mapping

{
  "mahmoud": 0
}
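
As a rough illustration of how the mapping above might be used, the sketch below resolves a speaker name to its integer ID and fails clearly on unknown names. The helper `speaker_id` is hypothetical, and how the ID is passed to the model is not documented in this card.

```python
import json

# Speaker mapping copied from this model card.
SPEAKER_MAP = json.loads('{"mahmoud": 0}')


def speaker_id(name: str) -> int:
    """Return the integer ID for a speaker name, or raise ValueError if unknown."""
    try:
        return SPEAKER_MAP[name]
    except KeyError:
        known = ", ".join(sorted(SPEAKER_MAP))
        raise ValueError(f"Unknown speaker {name!r}; known speakers: {known}") from None
```

Calling `speaker_id("mahmoud")` gives the only ID defined for this checkpoint; any other name raises a `ValueError` that lists the known speakers.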