metadata
license: apache-2.0
language:
  - en
tags:
  - text-to-speech
  - voice-cloning
  - f5-tts
  - regional-accents
  - uk

VideoAvatar.ai: UK Regional Voice Engine (v2 - 420k) 🎙️🇬🇧

Version 2.0 (Step 420,000) - Released Feb 17, 2026.

Developed by Shravani Limited, this model is a zero-shot voice cloning engine fine-tuned for the diverse regional accents of the United Kingdom.

🚨 CRITICAL USAGE INSTRUCTION

You MUST set text_mask_padding=False in your inference configuration. Failure to do so will result in "fast" or "scrambled" speech (chipmunk effect).

Use the included config_v2.yaml which has this setting correctly applied.
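As a quick sanity check before inference, the setting can be verified programmatically. The following is a minimal sketch, not part of the released tooling: `find_key` and `assert_padding_disabled` are hypothetical helper names, and the nested layout of `example_config` is an assumption made for illustration (the real config_v2.yaml may place the key elsewhere, which is why the search is recursive).

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested dict/list."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def assert_padding_disabled(config: dict) -> None:
    """Raise ValueError unless text_mask_padding is present and False everywhere."""
    values = list(find_key(config, "text_mask_padding"))
    if not values:
        raise ValueError("text_mask_padding is not set anywhere in the config")
    if any(value is not False for value in values):
        raise ValueError(f"text_mask_padding must be False, found {values}")


# Assumed layout, for illustration only; the shipped file may nest the key differently.
example_config = {"model": {"arch": {"text_mask_padding": False}}}
assert_padding_disabled(example_config)  # passes silently
```

To check the shipped file, parse it first (for example with `yaml.safe_load` from PyYAML) and pass the resulting dict to `assert_padding_disabled`.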

🌟 Capabilities

  • Zero-Shot Cloning: Clone a voice from just 3-10 seconds of reference audio, with no per-speaker fine-tuning.
  • UK Regional Mastery: Optimized for heavy dialects:
    • Northern: Manchester, Scouse, Geordie.
    • Southern: London (Estuary & MLE).
    • Celtic: Scottish (Fife/Edinburgh), Irish (Dublin/Belfast), Welsh (Cardiff).
  • Improved Prosody: 420,000 steps of fine-tuning ensure natural pacing and intonation.
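
Since the reference clip is expected to be 3-10 seconds long, it can help to check a clip's duration before cloning. This is a minimal sketch using only Python's standard `wave` module. It assumes the reference is an uncompressed PCM WAV file, and the function names are hypothetical rather than part of this model's tooling.

```python
import wave

MIN_SECONDS = 3.0   # lower bound stated in this model card
MAX_SECONDS = 10.0  # upper bound stated in this model card


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str) -> bool:
    """True if the clip length falls inside the 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other formats (MP3, FLAC) would need a different reader; the duration check itself stays the same.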

🛠️ Technical Details

  • Architecture: F5-TTS (flow-matching model with a Diffusion Transformer backbone).
  • Checkpoint: uk_regional_v2_420k.pt
  • Steps: 420,000
  • Base Model: F5-TTS (Fine-tuned for UK Regional Mastery).

🚀 Validated Samples (v2)

Check the samples_v2/ folder for verification of:

  • Business vs. Casual tones.
  • Male vs. Female voices.
  • Northern, Southern, Scottish, Irish, Welsh accents.

⚖️ License & Ethics

This model is released under the Apache 2.0 license. Shravani Limited is committed to ethical AI. Please ensure you have the rights to any voice you attempt to clone.