# Moroccan Darija Text-to-Speech Model

This model is a fine-tuned version of SpeechT5 for Moroccan Darija Text-to-Speech synthesis.

## Model Details

- **Base Model**: Microsoft SpeechT5
- **Fine-tuned on**: DODa audio dataset
- **Languages**: Moroccan Darija (Latin script)
- **Features**: Multiple voice support (male/female)
- **Release Date**: April 2025

## Usage

```python
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
import torch
import soundfile as sf

# Load the processor, the fine-tuned model, and the HiFi-GAN vocoder
processor = SpeechT5Processor.from_pretrained("HAMMALE/speecht5-darija")
model = SpeechT5ForTextToSpeech.from_pretrained("HAMMALE/speecht5-darija")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# Speaker embedding of shape (1, 512). The random vector below is only a
# placeholder; replace it with your own speaker embedding.
speaker_embedding = torch.randn(1, 512)

# Tokenize the input text (Darija in Latin script)
text = "Salam, kifach nta lyoum?"
inputs = processor(text=text, return_tensors="pt")

# Generate the waveform; the vocoder converts the spectrogram to audio
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)

# Save the audio at 16 kHz
sf.write("output.wav", speech.numpy(), 16000)
```
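
The snippet above synthesizes a single short sentence. For longer passages, one option is to split the text into sentence-sized pieces and synthesize each piece separately. The helper below is a minimal sketch of that idea: `split_into_chunks` and its `max_chars` default are hypothetical names and values chosen for illustration, not part of this model or of the `transformers` API, and the 200-character limit is an assumption rather than a documented constraint.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    The max_chars default is an illustrative assumption, not a model limit.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # The sentence still fits, or it is the first one in this chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed through `processor` and `model.generate_speech` as in the usage example, with the resulting waveforms concatenated before saving.
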

## Demo

A live demo is available at [Hugging Face Spaces](https://huggingface.co/spaces/HAMMALE/darija-tts-demo).

## License

This model is available under the MIT License.

## Acknowledgments

- The [DODa audio dataset](https://huggingface.co/datasets/atlasia/DODa-audio-dataset) creators
- Microsoft Research for the SpeechT5 model architecture