---
license: apache-2.0
language:
- en
tags:
- text-to-speech
- voice-cloning
- f5-tts
- regional-accents
- uk
---

# VideoAvatar.ai: UK Regional Voice Engine (v2 - 420k)

**Version 2.0 (Step 420,000)** - Released Feb 17, 2026.

Developed by **Shravani Limited**, this model is a zero-shot voice cloning engine fine-tuned specifically for the diverse regional accents of the United Kingdom.

## CRITICAL USAGE INSTRUCTION
**You MUST set `text_mask_padding=False` in your inference configuration.**
Otherwise the generated speech comes out too fast or scrambled (a "chipmunk" effect).

Use the included `config_v2.yaml`, which already has this setting applied.

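If you maintain your own configuration instead of `config_v2.yaml`, a quick check of the setting before inference can catch the problem early. The following is a minimal sketch, not part of this repository: it assumes the configuration has already been loaded into a nested dictionary, and the `model`/`arch` layout in the example is hypothetical. The helper names `find_key` and `padding_is_disabled` are illustrative.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested config."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def padding_is_disabled(config: dict) -> bool:
    """True only if the key is present and every occurrence is False."""
    values = list(find_key(config, "text_mask_padding"))
    return bool(values) and all(v is False for v in values)


# Hypothetical config layout, for illustration only.
example = {"model": {"arch": {"dim": 1024, "text_mask_padding": False}}}
print(padding_is_disabled(example))
```

With PyYAML installed, the dictionary could come from `yaml.safe_load` applied to the opened `config_v2.yaml`. The search is recursive because the exact nesting of the key in that file is not documented here.
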
## Capabilities
- **Zero-Shot Cloning**: Clone a voice from just 3-10 seconds of reference audio.
- **UK Regional Mastery**: Optimized for heavy dialects:
  - **Northern**: Manchester, Scouse, Geordie.
  - **Southern**: London (Estuary and Multicultural London English, MLE).
  - **Celtic**: Scottish (Fife/Edinburgh), Irish (Dublin/Belfast), Welsh (Cardiff).
- **Improved Prosody**: 420,000 steps of fine-tuning, aimed at natural pacing and intonation.

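Since cloning expects 3-10 seconds of reference audio, it can help to check a clip's length before using it. The sketch below is an illustration using only the Python standard library and assumes the reference is an uncompressed PCM WAV file. The function names and the 24 kHz rate of the demo clip are assumptions, not requirements stated by this model.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(duration_s: float, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check the clip length against the 3-10 second range given above."""
    return min_s <= duration_s <= max_s


def write_silence(path: str, seconds: float, rate: int = 24000) -> None:
    """Write a mono 16-bit silent clip; used here only to demo the check."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


# Demo on a generated 5-second clip in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "ref.wav")
write_silence(demo_path, seconds=5)
print(is_usable_reference(wav_duration_seconds(demo_path)))
```

For a real clip, pass its path to `wav_duration_seconds` and trim or re-record if the result falls outside the range.
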
## Technical Details
- **Architecture**: F5-TTS (Diffusion-based Transformer).
- **Checkpoint**: `uk_regional_v2_420k.pt`
- **Steps**: 420,000
- **Base Model**: F5-TTS (Fine-tuned for UK Regional Mastery).

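One way to run the checkpoint is through the upstream F5-TTS command-line tool. The sketch below only assembles the command. It assumes the `f5-tts_infer-cli` entry point and its `--model_cfg`, `--ckpt_file`, `--ref_audio`, `--ref_text`, `--gen_text` and `--output_dir` options as found in recent upstream F5-TTS releases; confirm them with `f5-tts_infer-cli --help` for your installed version. The audio path and both texts are placeholders, and a vocabulary file option may also be needed depending on your setup.

```python
import shlex


def build_infer_command(ckpt_file, model_cfg, ref_audio, ref_text, gen_text,
                        output_dir="outputs"):
    """Assemble an argument list for the (assumed) F5-TTS inference CLI."""
    return [
        "f5-tts_infer-cli",
        "--model_cfg", model_cfg,    # config with text_mask_padding disabled
        "--ckpt_file", ckpt_file,    # fine-tuned checkpoint from this repo
        "--ref_audio", ref_audio,    # 3-10 s clip of the voice to clone
        "--ref_text", ref_text,      # transcript of the reference clip
        "--gen_text", gen_text,      # text to synthesise
        "--output_dir", output_dir,
    ]


cmd = build_infer_command(
    ckpt_file="uk_regional_v2_420k.pt",
    model_cfg="config_v2.yaml",
    ref_audio="reference.wav",  # placeholder path
    ref_text="Transcript of the reference clip.",
    gen_text="Text to synthesise in the cloned voice.",
)
print(shlex.join(cmd))  # shell-quoted form, handy for copy and paste
```

The list can be passed to `subprocess.run(cmd, check=True)` once the tool is installed. Building it as a list rather than a single string avoids quoting problems with the reference and generation texts.
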
## Validated Samples (v2)
The `samples_v2/` folder contains samples you can use to verify:
- Business vs. casual tones.
- Male vs. female voices.
- Northern, Southern, Scottish, Irish, and Welsh accents.

## License & Ethics
This model is released under the **Apache 2.0** license. Shravani Limited is committed to ethical AI; please ensure you have the rights to any voice you attempt to clone.
