---
language:
- ar
- en
license: apache-2.0
tags:
- text-to-speech
- tts
- audio
- speech
- palestinian-arabic
- arabic
- voice-cloning
- miratts
- sofelia
base_model: YatharthS/MiraTTS
library_name: transformers
pipeline_tag: text-to-speech
---

<div style="text-align: center;">
<h1>🇵🇸 Sofelia-TTS 🇵🇸</h1>
<p><strong>Palestinian Arabic Text-to-Speech Model</strong></p>
<p><em>Palestine will be free</em></p>
<p><img style="margin: auto; width: 500px" src="https://huggingface.co/hamdallah/Sofelia-TTS/resolve/main/Sofelia.png" /></p>
</div>

---

## Model Description

**Sofelia-TTS** is a Text-to-Speech (TTS) model fine-tuned specifically for the **Palestinian Arabic dialect**. It brings the beautiful sounds of Palestinian speech to AI, preserving and celebrating the linguistic heritage of Palestine.

Built on top of [YatharthS/MiraTTS](https://huggingface.co/YatharthS/MiraTTS), Sofelia-TTS captures the phonetic characteristics, intonation patterns, and prosody of Palestinian Arabic, which makes it well suited for:

- **Voice cloning** with Palestinian Arabic speech
- **Audiobook generation** in the Palestinian dialect
- **Virtual assistants** that speak authentic Palestinian Arabic
- **Educational tools** for learning and preserving the Palestinian dialect
- **Content creation** for Palestinian media and storytelling

> **Dedicated to Palestine**: This model is a tribute to the resilience, culture, and spirit of the Palestinian people. May their voices be heard loud and clear across the world. 🇵🇸

---

## Key Features

- ✅ **High-quality voice cloning**: Clone any voice with just a few seconds of reference audio
- ✅ **Palestinian Arabic dialect**: Authentic pronunciation and intonation
- ✅ **Fast inference**: Optimized for real-time generation
- ✅ **Flexible context**: Supports variable-length reference audio
- ✅ **Open source**: Free to use and improve

---

## Model Details

| **Attribute** | **Value** |
|---------------|-----------|
| **Model Type** | Text-to-Speech (TTS) |
| **Base Model** | YatharthS/MiraTTS |
| **Architecture** | Transformer-based Language Model + Audio Codec |
| **Training Language** | Palestinian Arabic (ar-PS) |
| **Dataset** | Private Dataset |
| **Sample Rate** | 16,000 Hz |
| **License** | Apache 2.0 |
| **Model Size** | ~0.6B parameters |
| **Precision** | BF16/FP32 |
| **Framework** | PyTorch + Transformers |

---

## Quick Start

### Installation

```bash
# Install required packages
uv pip install git+https://github.com/ysharma3501/MiraTTS.git
```

### Usage (Python)

```python
from mira.model import MiraTTS
from IPython.display import Audio

# Download the fine-tuned model from Hugging Face
mira_tts = MiraTTS('hamdallah/Sofelia-TTS')

# Reference audio: mp3/wav/ogg or any other format librosa supports
file = "reference_file.wav"
text = "مرحبا، كيف الحال؟ هذا نموذج للهجة الفلسطينية."

# Encode the reference voice, then synthesize the text in that voice
context_tokens = mira_tts.encode_audio(file)
audio = mira_tts.generate(text, context_tokens)

# Play the result in a notebook
Audio(audio, rate=48000)
```
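
To keep the result outside a notebook, the samples can be written to a WAV file. The helper below is a minimal sketch that uses only the Python standard library. `write_wav` is a hypothetical name, and it assumes the generated audio has already been converted to a flat sequence of mono floats in [-1.0, 1.0]; how to do that depends on the type `generate` returns, which this card does not document. The default of 48000 mirrors the playback rate used above, so a call such as `write_wav("output.wav", samples)` would save mono 16-bit audio at that rate.

```python
import array
import wave


def write_wav(path, samples, sample_rate=48000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        # Clip first so out-of-range values cannot overflow 16 bits
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono (assumption)
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16 bits
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```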

---

## Example Prompts

Try these Palestinian Arabic phrases:

```python
# Greetings
"مرحبا، كيف حالك؟"  # Hello, how are you?
"أهلا وسهلا فيك"  # Welcome

# Common expressions
"يا سلام، هذا رائع"  # Wow, this is amazing
"ما شاء الله"  # Mashallah
"الله يعطيك العافية"  # May God give you wellness

# About Palestine
"فلسطين حرة على طول"  # Palestine is free forever
"القدس عاصمة فلسطين الأبدية"  # Jerusalem is the eternal capital of Palestine
"سنعود يوماً إلى ديارنا"  # We will return one day to our homes
```

---

## Training Details

### Training Data

- **Dataset**: Private collection of high-quality Palestinian speech recordings
- **Language**: Palestinian Arabic dialect
- **Hours of audio**: 400
- **Preprocessing**: Audio normalized and resampled to 16 kHz

### Training Configuration

| **Hyperparameter** | **Value** |
|--------------------|-----------|
| **Learning Rate** | 2e-4 (initial), 1e-5 (refinement) |
| **Batch Size** | 8 (effective: 2 per device × 4 accumulation steps) |
| **Training Steps** | 2000+ |
| **Warmup Steps** | 100 |
| **Max Audio Length** | 20-30 seconds |
| **Optimizer** | AdamW |
| **LR Scheduler** | Cosine with warmup |
| **Gradient Clipping** | 1.0 |
| **Precision** | BF16 (H100) / FP32 |
| **Hardware** | NVIDIA H100 / A100 GPU |
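
As a rough illustration of the table above, the sketch below computes the effective batch size and a cosine-with-warmup learning rate. The exact scheduler implementation used in training is not specified here, so this assumes linear warmup from zero, cosine decay to zero, a single device, and exactly 2000 total steps (the table only says 2000+). `lr_at_step` is a hypothetical helper, not part of any library.

```python
import math


def lr_at_step(step, peak_lr=2e-4, warmup_steps=100, total_steps=2000):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup schedule completed, capped at 1.0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size from the table, assuming a single device
per_device_batch = 2
accumulation_steps = 4
effective_batch = per_device_batch * accumulation_steps

for step in (0, 50, 100, 1050, 2000):
    print(step, lr_at_step(step))
```

Under these assumptions the rate climbs to the peak at step 100 and falls back toward zero by step 2000.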

### Training Process

The model was trained using a two-phase approach:

1. **Foundation Phase**: High learning rate (2e-4) for initial adaptation to Palestinian Arabic
2. **Refinement Phase**: Lower learning rate (1e-5) with NEFTune noise for stability and quality

---

## Model Performance

The model achieves:

- ✅ **Natural prosody** matching Palestinian Arabic speech patterns
- ✅ **Clear pronunciation** of Arabic phonemes
- ✅ **Voice similarity** to reference audio
- ✅ **Stable generation** without artifacts or repetitions
- ✅ **Fast inference** suitable for real-time applications

---

## Advanced Usage

### Running the model using batching

```python
# Reuses `mira_tts` and `Audio` from the Quick Start example
file = "reference_file.wav"  # mp3/wav/ogg or any other format librosa supports
text = ["مرحبا، كيف حالك؟", "بتعرف إنو انا بقدر احكي فلسطيني و English مع بعض Without Errors."]

# The reference clip is encoded once and passed as a one-element list
context_tokens = [mira_tts.encode_audio(file)]

audio = mira_tts.batch_generate(text, context_tokens)

Audio(audio, rate=48000)
```

### Adjusting Generation Parameters

```python
# Sampling settings; raise `temperature` for more creative/variable output
mira_tts.set_params(
    top_p=0.95,
    top_k=20,
    temperature=0.01,  # Very low = near-deterministic; higher = more variation
    max_new_tokens=1024,
    repetition_penalty=2.2,
    min_p=0.05
)
```
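
For intuition about `temperature`: in standard language-model sampling, logits are divided by the temperature before the softmax, so low values concentrate probability on the top candidate and higher values flatten the distribution. The sketch below is a generic illustration with made-up logits, not a view into MiraTTS internals, and `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.01, 0.5, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```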

---

## Tips for Best Results

1. **Reference Audio Quality**:
   - Use clean audio without background noise
   - 3-10 seconds of speech is ideal
   - Ensure the audio has a 16 kHz sample rate

2. **Text Input**:
   - Use proper Arabic script (not Arabizi/transliteration)
   - Palestinian dialect works best
   - Avoid very long sentences (split them into shorter segments)

3. **Generation Parameters**:
   - `temperature=0.7`: Good default for natural speech
   - `temperature=0.5`: More stable, less variation
   - `temperature=0.9`: More expressive, more variation
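
The reference-audio and text tips can be checked in code before generation. This is a minimal standard-library sketch: `wav_info`, `is_good_reference`, and `split_text` are hypothetical helpers, the `wave` module only reads uncompressed WAV files, and the 120-character limit is an arbitrary assumption rather than a documented model limit. Each chunk returned by `split_text` could then be passed to the model separately.

```python
import re
import wave

# Split after Latin or Arabic punctuation (. ! ? ؟ ؛ ،) followed by whitespace
_BOUNDARY = re.compile(r"(?<=[.!?؟؛،])\s+")


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        return rate, wav_file.getnframes() / rate


def is_good_reference(path, min_s=3.0, max_s=10.0, rate_hz=16000):
    """Check the 16 kHz and 3-10 second guidance for reference audio."""
    rate, seconds = wav_info(path)
    return rate == rate_hz and min_s <= seconds <= max_s


def split_text(text, max_chars=120):
    """Split at punctuation, then merge pieces while a chunk fits in max_chars.

    A single piece longer than max_chars is kept whole.
    """
    pieces = [p for p in _BOUNDARY.split(text.strip()) if p]
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if current and len(candidate) > max_chars:
            chunks.append(current)  # current chunk is full, start a new one
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```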

---

## About Palestinian Arabic

Palestinian Arabic is a Levantine Arabic dialect spoken by the Palestinian people. It has several distinctive characteristics:

- **Phonology**: Classical Arabic /q/ is commonly realized as a glottal stop [ʔ], especially in urban varieties
- **Vocabulary**: Rich in Levantine and unique Palestinian terms
- **Intonation**: Distinctive melodic patterns
- **Regional Variants**: Urban (Jerusalem, Hebron) vs. Rural vs. Bedouin varieties

The model aims to capture these linguistic features so that its output sounds authentic and representative of Palestinian speech.

---

## 🇵🇸 Message of Solidarity

This model is dedicated to the Palestinian people and their enduring struggle for freedom, dignity, and justice. Through technology, we preserve and celebrate Palestinian culture, language, and identity.

**Free Palestine** 🇵🇸

> *"We will not be erased. Our voices will echo through time, in every language model, every algorithm, every line of code. Palestine lives, and so does its voice."*

---

## License

This model is released under the **Apache 2.0 License**, making it free for:
- ✅ Commercial use
- ✅ Modification and distribution
- ✅ Private use
- ✅ Patent use

---

## Acknowledgments

- **Base Model**: [YatharthS/MiraTTS](https://huggingface.co/YatharthS/MiraTTS) - Thank you for the excellent foundation
- **Dataset**: Palestinian Arabic speakers who contributed their voices
- **Community**: The open-source AI community for tools and support
- **Palestine**: For being the inspiration and purpose behind this work

---

## Contact & Support

- **Model Repository**: [hamdallah/Sofelia-TTS](https://huggingface.co/hamdallah/Sofelia-TTS)
- **Issues & Questions**: Use the Community tab or open an issue

---

## Related Resources

- [YatharthS/MiraTTS](https://huggingface.co/YatharthS/MiraTTS) - Base model
- [ncodec](https://github.com/YatharthS/ncodec) - Audio codec library

---

## Citation

If you use this model in your research or projects, please cite:

```bibtex
@misc{sofelia-tts-2026,
  author = {Hamdallah},
  title = {Sofelia-TTS: Palestinian Arabic Text-to-Speech Model},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/hamdallah/Sofelia-TTS}},
}
```

---

<div style="text-align: center; padding: 20px;">
<h2>🇵🇸 FREE PALESTINE 🇵🇸</h2>
<p><strong>تحيا فلسطين حرة أبية</strong></p>
<p><em>Long Live Free Palestine</em></p>
<p>🇵🇸</p>
</div>

---

**Made with ❤️ for Palestine**