Chatterbox: Gujarati fine-tune

A fine-tuned Chatterbox TTS model for Gujarati text-to-speech with voice cloning and emotion control.

Model details

| Attribute | Value |
| --- | --- |
| Base model | ResembleAI/chatterbox (Multilingual) |
| Architecture | CosyVoice 2.0 - LLaMA-based T3 (0.5B params) |
| Language | Gujarati (gu) |
| Training hardware | NVIDIA L4 (24 GB VRAM) |
| Fine-tuning repo | gokhaneraslan/chatterbox-finetuning |
| Vocab extension | 2454 → 2514 tokens (+60 Gujarati characters) |
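
The vocabulary extension in the table can be pictured with a minimal sketch. It assumes a character-level vocabulary stored as a plain `dict` mapping token to ID and appends unseen characters from the Gujarati Unicode block (U+0A80 to U+0AFF). The function name `extend_vocab` and the dict layout are hypothetical and are not taken from the fine-tuning repo; which 60 characters were actually added is not stated in this card.

```python
BASE_VOCAB_SIZE = 2454  # base vocabulary size reported in the table above


def extend_vocab(base_vocab: dict[str, int], corpus: list[str]) -> dict[str, int]:
    """Return a copy of base_vocab with unseen Gujarati characters appended.

    Hypothetical helper: the real tokenizer format may differ.
    """
    extended = dict(base_vocab)
    next_id = max(extended.values(), default=-1) + 1

    # Collect characters from the Gujarati Unicode block that the base
    # vocabulary does not cover. Sorting keeps ID assignment deterministic.
    new_chars = sorted(
        {
            ch
            for line in corpus
            for ch in line
            if "\u0a80" <= ch <= "\u0aff" and ch not in extended
        }
    )

    # Append each new character with the next free ID, leaving existing IDs untouched.
    for ch in new_chars:
        extended[ch] = next_id
        next_id += 1
    return extended


if __name__ == "__main__":
    # Placeholder base vocabulary with the reported size (contents are made up).
    base = {f"<tok{i}>": i for i in range(BASE_VOCAB_SIZE)}
    extended = extend_vocab(base, ["\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4"])
    print(len(base), "->", len(extended))
```

Appending new IDs after the existing ones keeps every base token ID stable, so pretrained embedding rows for the original tokens can be reused and only the new rows need to be initialised and trained.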

Training data

Fine-tuned on Gujarati clips from Arjun4707/gu-hi-tts (about 33,851 clips remain after characters-per-second (CPS) and diarization filtering).

Data source: audio clips scraped from publicly available YouTube videos. Preprocessing applied three filters: speaker diarization (pyannote-audio) to keep only single-speaker clips, a CPS filter (4-25 characters per second), and a duration filter (2-20 s).
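
The CPS and duration filters described above can be sketched as follows. This is an illustration under stated assumptions, not the actual preprocessing code: the `Clip` dataclass, the `keep_clip` function, and the inclusive bounds are hypothetical, the speaker count is assumed to have been computed beforehand by a diarization step, and characters are counted as Unicode code points (so Gujarati vowel signs count individually).

```python
from dataclasses import dataclass


@dataclass
class Clip:
    text: str          # transcript of the clip
    duration_s: float  # audio length in seconds
    num_speakers: int  # assumed to come from a prior diarization pass


def chars_per_second(clip: Clip) -> float:
    """Characters (Unicode code points) per second of audio."""
    return len(clip.text.strip()) / clip.duration_s


def keep_clip(
    clip: Clip,
    min_cps: float = 4.0,
    max_cps: float = 25.0,
    min_dur: float = 2.0,
    max_dur: float = 20.0,
) -> bool:
    """Apply the single-speaker, duration, and CPS filters in that order."""
    if clip.num_speakers != 1:
        return False
    # Check duration first; with a positive min_dur this also guards
    # against division by zero in the CPS computation.
    if not (min_dur <= clip.duration_s <= max_dur):
        return False
    return min_cps <= chars_per_second(clip) <= max_cps


if __name__ == "__main__":
    clips = [
        Clip("a" * 40, 4.0, 1),   # 10 chars/sec, single speaker
        Clip("a" * 40, 1.0, 1),   # shorter than 2 s
        Clip("a" * 10, 10.0, 1),  # 1 char/sec, likely silence or a bad transcript
    ]
    print([keep_clip(c) for c in clips])
```

A CPS outside the 4-25 range usually signals a transcript that does not match the audio (long silences or truncated text), which is why it is a useful sanity filter for scraped data.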

Known limitations

  • Very short utterances (1-3 words) produce poor quality; the architecture needs a minimum of about 5 words
  • Medium to long sentences (5-30 words) produce good quality with clear Gujarati pronunciation
  • Not suitable for sub-2s audio generation
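
One way to respect these limits in an application is to check the input length before synthesis. The sketch below is a hypothetical guard, not part of the model or its library: the function name `length_bucket`, the whitespace-based word count, and the bucket labels are assumptions, and the card makes no claim about quality beyond 30 words.

```python
MIN_WORDS = 5   # approximate minimum stated in the limitations above
MAX_WORDS = 30  # upper end of the range reported to work well


def length_bucket(text: str) -> str:
    """Classify input text by whitespace-separated word count."""
    n_words = len(text.split())
    if n_words < MIN_WORDS:
        return "too_short"  # likely poor quality according to the card
    if n_words <= MAX_WORDS:
        return "ok"  # within the range reported to sound good
    return "beyond_stated_range"  # the card does not describe this case


if __name__ == "__main__":
    for sample in ["one two three", "one two three four five six"]:
        print(repr(sample), "->", length_bucket(sample))
```

A caller could pad or merge inputs flagged as `too_short` with neighbouring text before sending them to the model.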

Training code

Full training pipeline and troubleshooting: BhammarArjun/TTS_4_training

License

CC-BY-NC-4.0: non-commercial use only.

The base Chatterbox model is MIT-licensed, but this fine-tuned version uses YouTube-sourced training data. To be transparent and responsible about data provenance, we apply CC-BY-NC-4.0 to this fine-tuned version.

Citation

@misc{arjun2026chatterboxgu,
  title   = {Chatterbox fine-tuned for Gujarati},
  author  = {Arjun Bhammar},
  year    = {2026},
  url     = {https://huggingface.co/Arjun4707/chatterbox-gujarati}
}

Acknowledgements
