---
tags:
- text-to-speech
- tts
- pytorch
language:
- ar
license: mit
---

# Multi-Speaker TTS Model

A trained text-to-speech (TTS) model for Arabic with the following specifications:

## Model Details

- **Speakers:** 1
- **Mel Channels:** 80
- **Sample Rate:** 22050 Hz
- **Vocab Size:** 50000

## Training Stats

- **Final Mel Loss:** 5.3586
- **Validation Mel Loss:** 5.4784
- **Training Epochs:** 105

## Mel Normalization

- **Mean:** -4.8906
- **Std:** 4.9261

## Usage

```python
import torch
from transformers import AutoTokenizer

# NOTE: MultiSpeakerTTS is the model class from the training code. It is not
# defined in this snippet and must be imported or defined before this runs.

# Load the checkpoint on CPU so this also works on machines without a GPU
checkpoint = torch.load('checkpoint_best.pt', map_location='cpu')
model = MultiSpeakerTTS(checkpoint['config'])
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # switch to inference mode (disables dropout etc.)

# Tokenize the input text
tokenizer = AutoTokenizer.from_pretrained("miscovery/tokenizer_v3")
text = "Your text here"
tokens = tokenizer.encode(text)
# ... inference code ...
```

## Speaker Mapping

Speaker names map to integer speaker IDs:

`{ "mahmoud": 0 }`
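
## Example: Applying the Normalization Stats and Speaker Mapping

The mel normalization statistics and the speaker mapping listed in this card can be applied as in the minimal sketch below. It assumes that mel values were standardized as `(mel - mean) / std`, which this card does not state explicitly. The helper names `normalize_mel`, `denormalize_mel`, and `speaker_id` are hypothetical and are not part of the released code.

```python
# Values copied from the "Mel Normalization" and "Speaker Mapping" sections.
MEL_MEAN = -4.8906
MEL_STD = 4.9261
SPEAKER_MAP = {"mahmoud": 0}


def normalize_mel(mel):
    """Standardize mel values (ASSUMED convention: (mel - mean) / std).

    Works on a plain float; a torch.Tensor or NumPy array is handled
    elementwise because only scalar arithmetic is used.
    """
    return (mel - MEL_MEAN) / MEL_STD


def denormalize_mel(mel_norm):
    """Invert normalize_mel, e.g. before passing model output to a vocoder."""
    return mel_norm * MEL_STD + MEL_MEAN


def speaker_id(name):
    """Return the integer ID for a speaker name, or raise KeyError."""
    if name not in SPEAKER_MAP:
        raise KeyError(f"Unknown speaker {name!r}; known: {sorted(SPEAKER_MAP)}")
    return SPEAKER_MAP[name]


# Round trip on a single value: normalizing and then denormalizing
# should give back (approximately) the original value.
value = -2.5
restored = denormalize_mel(normalize_mel(value))
sid = speaker_id("mahmoud")
```

With a `torch.Tensor` of shape `(80, T)` (80 mel channels, `T` frames), the same two functions apply unchanged. If the training code normalized differently, for example per channel, these helpers would need to be adapted.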