metadata
license: apache-2.0
language:
- en
tags:
- text-to-speech
- voice-cloning
- f5-tts
- regional-accents
- uk
VideoAvatar.ai: UK Regional Voice Engine (v2 - 420k)
Version 2.0 (Step 420,000) - Released Feb 17, 2026.
Developed by Shravani Limited, this model is a zero-shot voice cloning engine fine-tuned specifically on the diverse regional accents of the United Kingdom.
CRITICAL USAGE INSTRUCTION
You MUST set text_mask_padding=False in your inference configuration.
Failure to do so will result in "fast" or "scrambled" speech (chipmunk effect).
Use the included config_v2.yaml which has this setting correctly applied.
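As a quick sanity check before inference, the sketch below walks an already-parsed config and confirms that text_mask_padding is False wherever it appears. It assumes the YAML has been loaded into nested dicts and lists (for example with PyYAML's yaml.safe_load). The helper names find_key and check_text_mask_padding are hypothetical, and the nesting in the example dict is an assumption for illustration, not the confirmed layout of config_v2.yaml.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested dict/list."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending so deeper occurrences are found too.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def check_text_mask_padding(cfg: dict) -> bool:
    """Return True only if text_mask_padding is present and False everywhere."""
    values = list(find_key(cfg, "text_mask_padding"))
    return bool(values) and all(v is False for v in values)


# Example layout (assumed; the real config_v2.yaml may nest this key differently).
example_cfg = {"model": {"arch": {"text_mask_padding": False}}}
print(check_text_mask_padding(example_cfg))
```

Because the search is recursive, the check does not depend on where the key sits in the file. A missing key is treated as a failure, so a config that silently falls back to a default is flagged as well.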
Capabilities
- Zero-Shot Cloning: Clone a voice from just 3-10 seconds of reference audio, with no per-speaker training.
- UK Regional Mastery: Optimized for heavy dialects:
  - Northern: Manchester, Scouse, Geordie.
  - Southern: London (Estuary & MLE).
  - Celtic: Scottish (Fife/Edinburgh), Irish (Dublin/Belfast), Welsh (Cardiff).
- Improved Prosody: 420,000 steps of fine-tuning ensure natural pacing and intonation.
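The 3-10 second reference window mentioned above can be checked before a clip is handed to the model. The following is a minimal sketch using only Python's standard wave module, so it handles uncompressed PCM WAV files only. The function names wav_duration_seconds and is_valid_reference are hypothetical helpers, not part of this model's tooling.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        # Duration is the frame count divided by the sample rate.
        return wf.getnframes() / wf.getframerate()


def is_valid_reference(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check that a reference clip falls inside the 3-10 second window."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

Calling is_valid_reference("my_voice.wav") on a candidate clip (the file name is a placeholder) returns False for clips that are too short or too long, which is a cheap guard to run before inference.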
Technical Details
- Architecture: F5-TTS (Diffusion-based Transformer).
- Checkpoint: uk_regional_v2_420k.pt
- Steps: 420,000
- Base Model: F5-TTS (Fine-tuned for UK Regional Mastery).
Validated Samples (v2)
Check the samples_v2/ folder for verification of:
- Business vs. Casual tones.
- Male vs. Female voices.
- Northern, Southern, Scottish, Irish, Welsh accents.
License & Ethics
This model is released under the Apache 2.0 license. Shravani Limited is committed to ethical AI. Please ensure you have the rights to any voice you attempt to clone.