Professor committed on

Commit 4ce06d7 · verified · 1 Parent(s): bc7a10f

Update README.md

Files changed (1): README.md (+65 −6)

README.md CHANGED
@@ -1,7 +1,9 @@
 ---
 base_model: YatharthS/MiraTTS
 tags:
-- text-generation-inference
 - transformers
 - unsloth
 - qwen2
@@ -9,15 +11,72 @@ tags:
 - sft
 license: apache-2.0
 language:
 - en
 ---

-# Uploaded model

 - **Developed by:** Professor
-- **License:** apache-2.0
-- **Finetuned from model :** YatharthS/MiraTTS

-This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)

-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---
base_model: YatharthS/MiraTTS
tags:
- text-to-speech
- tts
- audio-generation
- transformers
- unsloth
- qwen2
- sft
license: apache-2.0
language:
- rw
- en
---

# MiraTTS Kinyarwanda (Phase 1 - Language Acquisition)

- **Developed by:** Professor
- **License:** Apache 2.0
- **Finetuned from model:** YatharthS/MiraTTS
- **Language:** Kinyarwanda (`rw`), English (`en`)

## Model Overview

This is a foundational Text-to-Speech (TTS) model for the Kinyarwanda language. It is built on the MiraTTS architecture, which uses a 0.5B-parameter Qwen2.5 LLM backbone, and was fine-tuned to map Kinyarwanda text to its correct phonetic and acoustic representations.

**Note:** This is a "Phase 1" checkpoint. It was trained on a combined dataset of high-fidelity human speech and synthetic speech to teach the model the core phonetic rules, prefixes, and rhythm of Kinyarwanda. It can generate intelligible Kinyarwanda speech but may exhibit occasional synthetic artifacts or hallucinated padding. A Phase 2 model, refined strictly on human data, is recommended for production use.

## Training Details

The model was trained using the `Unsloth` framework for optimized hardware utilization. Training was intentionally halted early (around epoch 10) to prevent the LLM backbone from memorizing the dataset and losing natural prosody.

* **Dataset Size:** 28,629 audio-text pairs
* **Effective Batch Size:** 256 (64 per device × 4 gradient accumulation steps)
* **Total Steps Trained:** 1,189
* **Starting Loss:** 10.84
* **Final Loss:** 5.76
* **Hardware:** Trained on a single NVIDIA GPU in `bfloat16` precision (where supported)
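
The reported schedule is internally consistent, which is worth sanity-checking when reproducing the run. A minimal sketch using only the numbers listed above:

```python
# Sanity-check the training schedule from the reported hyper-parameters.
dataset_size = 28_629          # audio-text pairs
per_device_batch = 64
grad_accum_steps = 4
total_steps = 1_189

effective_batch = per_device_batch * grad_accum_steps   # 256
steps_per_epoch = dataset_size / effective_batch        # ~111.8
epochs_trained = total_steps / steps_per_epoch          # ~10.6

print(effective_batch, round(steps_per_epoch, 1), round(epochs_trained, 1))
```

Roughly 10.6 epochs at 1,189 steps, matching the "halted around epoch 10" note.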
## How to Use (Inference)

Because this model uses the highly optimized LMDeploy backend for rapid audio generation, it requires a modern NVIDIA GPU (such as an L4 or A100) to run at full speed.

Below is the standard inference script for generating Kinyarwanda audio from a reference voice clip.

### 1. Installation

Ensure you install the optimized `MiraTTS` library and align your PyTorch audio dependencies:

```bash
pip install git+https://github.com/ysharma3501/MiraTTS.git
# Ensure torchaudio and torchvision match your active PyTorch version
```

### 2. Python Inference Code

```python
from mira.model import MiraTTS
from IPython.display import Audio, display

print("Loading Kinyarwanda Phase 1 Model...")
# Initialize the model directly from the Hub
mira_tts = MiraTTS("Professor/MiraTTS-Kinyarwanda-Phase1")

# Provide a path to a real, high-quality audio file to use as the voice print
reference_audio_path = "/path/to/your/reference_audio.wav"

test_text = "Muraho neza! Uyu munsi turimo kugerageza porogaramu nshya y'ikinyarwanda."

# Extract voice context and synthesize
print("Synthesizing audio...")
context_tokens = mira_tts.encode_audio(reference_audio_path)
audio = mira_tts.generate(test_text, context_tokens)

# Play the audio (if running in a Jupyter/Colab notebook)
display(Audio(audio, rate=48000))
```
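
### 3. Saving the Audio

Outside a notebook, you can write the result to disk with Python's built-in `wave` module instead of playing it. This is a minimal sketch assuming the generated audio is mono float samples in [-1, 1] at 48 kHz (as the playback call above implies); `save_wav` is a hypothetical helper, not part of the MiraTTS API:

```python
import math
import wave

def save_wav(samples, path, rate=48_000):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file.

    Hypothetical helper for persisting generated audio; not part of MiraTTS.
    """
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit PCM
        f.setframerate(rate)
        pcm = b"".join(
            int(max(-1.0, min(1.0, s)) * 32767).to_bytes(2, "little", signed=True)
            for s in samples
        )
        f.writeframes(pcm)

# Demo with a synthetic 440 Hz tone standing in for real model output
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 48_000) for t in range(48_000)]
save_wav(tone, "output.wav")
```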
## Limitations

* **Hardware Constraints:** Requires a CUDA-enabled NVIDIA GPU. Running on older architectures (like the T4) requires bypassing the optimized pipeline and forcing float32 precision, which is significantly slower.
* **End-of-Sequence Hallucinations:** Because this is an LLM-based generative model, it may occasionally continue generating extra Kinyarwanda syllables after the input text is finished.
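
For the hallucinated-padding case specifically, one pragmatic workaround is to trim trailing low-energy samples from the waveform before saving it (this does not help with extra voiced syllables). A minimal pure-Python sketch; the threshold and window size are arbitrary choices, not values from the model:

```python
def trim_trailing_silence(samples, threshold=0.01, window=480):
    """Trim trailing near-silence (e.g. hallucinated padding) from a waveform.

    Scans backwards in `window`-sample chunks (480 samples = 10 ms at 48 kHz)
    and cuts everything after the last chunk whose peak exceeds `threshold`.
    """
    end = len(samples)
    while end > 0:
        chunk = samples[max(0, end - window):end]
        if max(abs(s) for s in chunk) > threshold:
            break
        end -= window
    return samples[:max(0, end)]

# Example: one second of "speech" followed by half a second of near-silence
speech = [0.3] * 48_000 + [0.001] * 24_000
trimmed = trim_trailing_silence(speech)
print(len(trimmed))  # 48000
```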
---

*This model was trained 2x faster with Unsloth.*

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200" alt="Unsloth Made With Love"/>