
🇷🇸 sr-cv-vits

A Serbian text-to-speech (TTS) model based on the VITS architecture.


📌 Model Overview

sr-cv-vits is a high-quality single-speaker Serbian TTS model trained using the VITS architecture and the Coqui TTS framework.

The model generates natural-sounding Serbian speech from text written in Cyrillic script.

  • 🗣️ Language: Serbian (Cyrillic script)
  • 👩 Speaker: Female (single-speaker)
  • 🧠 Architecture: VITS
  • ⚙️ Framework: Coqui TTS
  • 🔤 Text processing: multilingual_cleaners
  • 📚 Dataset formatter: common_voice
  • 🔊 Sampling rate: 22050 Hz
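
To confirm that a generated file matches the 22050 Hz sampling rate listed above, a small check with Python's standard `wave` module can help. This is a minimal sketch that assumes the output is an uncompressed PCM WAV file; the helper name `wav_info` is illustrative and not part of Coqui TTS.

```python
import wave

EXPECTED_RATE = 22050  # Hz, as listed in the model overview


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / rate


def has_expected_rate(path):
    """Return True if the file's sample rate equals EXPECTED_RATE."""
    return wav_info(path)[0] == EXPECTED_RATE
```

Calling `wav_info("output.wav")` after synthesis returns the sample rate and the clip length in seconds.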

🎧 Audio Samples

Below are example outputs generated by the model:

▶️ Sample 1

▶️ Sample 2

📌 Place test1.wav and test2.wav in the root of the repository.


📊 Training Details

  • Epochs: 1000
  • Dataset size: ~8 hours of speech
  • Speaker: Female (volunteer)
  • Script: Cyrillic only
  • Numbers: Not supported (removed from dataset)

📚 Dataset

The model was trained on approximately 8 hours of curated Serbian speech, loaded through the Coqui TTS common_voice dataset formatter.

⚠️ Note: Numerical expressions were removed from the training data, so the model has not learned to read digits.
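
Because digits were not part of the training data, it can be useful to catch them before synthesis. The following is a minimal sketch of such a guard; the function names are illustrative and not part of Coqui TTS.

```python
import re

# \d matches any Unicode decimal digit, including ASCII 0-9.
_DIGIT_RE = re.compile(r"\d")


def contains_digits(text):
    """Return True if the text contains at least one digit character."""
    return _DIGIT_RE.search(text) is not None


def require_no_digits(text):
    """Return the text unchanged, or raise ValueError if it contains digits."""
    if contains_digits(text):
        raise ValueError("Input contains digits; rewrite or remove them before synthesis.")
    return text
```

Passing input through `require_no_digits` before calling the model makes unsupported text fail early instead of producing unpredictable audio.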


⚙️ Usage

This model can be used via Coqui TTS.

🐍 Python Example

from TTS.api import TTS

# Load model from Hugging Face
tts = TTS(model_name="daremc86/sr-cv-vits")

# Generate speech
tts.tts_to_file(
    text="Здраво, како си?",
    file_path="output.wav"
)
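
If loading by repository name does not work with your installed Coqui TTS version, one possible alternative is to download the repository files and pass explicit paths. This is a sketch under assumptions: it assumes the checkpoint is a `.pth` file in the repository root and that `config.json` sits next to it; neither file layout is confirmed by this model card. The helper names are illustrative, and `huggingface_hub` must be installed separately.

```python
from pathlib import Path


def find_checkpoint(model_dir):
    """Return the first *.pth file in model_dir (sorted by name), or raise."""
    candidates = sorted(Path(model_dir).glob("*.pth"))
    if not candidates:
        raise FileNotFoundError(f"No .pth checkpoint found in {model_dir}")
    return candidates[0]


def load_from_hub(repo_id="daremc86/sr-cv-vits"):
    """Download the repository and load it with explicit model/config paths."""
    # Imported lazily so find_checkpoint() works without these packages.
    from huggingface_hub import snapshot_download
    from TTS.api import TTS

    model_dir = Path(snapshot_download(repo_id=repo_id))
    return TTS(
        model_path=str(find_checkpoint(model_dir)),  # assumed: *.pth in repo root
        config_path=str(model_dir / "config.json"),  # assumed: config in repo root
    )
```

The returned object can then be used exactly like `tts` in the example above, for instance with `tts_to_file`.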

📦 Installation

Install Coqui TTS:

pip install coqui-tts

🔊 Example Text

Здраво, како си?
Ово је пример синтезе говора.

⚠️ Limitations

  • ❌ Does not support numbers (e.g. "123")

  • 🔤 Trained on Cyrillic script only; Latin-script input is not supported

  • 📉 May struggle with:

    • long or complex sentences
    • uncommon words
    • punctuation-heavy text
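
Since the model expects Cyrillic input, Serbian text written in Latin script can be transliterated first. The sketch below applies the standard letter-for-letter mapping, including the digraphs lj, nj and dž. It is deliberately naive: it does not handle the rare words where n+j or d+ž are separate letters (for example "injekcija"), and it is not part of this model or of Coqui TTS.

```python
import unicodedata

# Two-letter Latin sequences that map to a single Cyrillic letter.
_DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}

# Single-letter mapping; both strings are aligned position by position.
_SINGLE = dict(zip("abcčćdđefghijklmnoprsštuvzž",
                   "абцчћдђефгхијклмнопрсштувзж"))


def latin_to_cyrillic(text):
    """Transliterate Serbian Latin text to Cyrillic; other characters pass through."""
    text = unicodedata.normalize("NFC", text)  # make č, ć, š, ž, đ single code points
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair.lower() in _DIGRAPHS:
            cyr = _DIGRAPHS[pair.lower()]
            out.append(cyr.upper() if pair[0].isupper() else cyr)
            i += 2
            continue
        ch = text[i]
        cyr = _SINGLE.get(ch.lower())
        if cyr is None:
            out.append(ch)  # punctuation, spaces, existing Cyrillic, etc.
        else:
            out.append(cyr.upper() if ch.isupper() else cyr)
        i += 1
    return "".join(out)
```

For example, `latin_to_cyrillic("Zdravo, kako si?")` yields the Cyrillic sentence used in the Python example above. Text that is already Cyrillic is returned unchanged.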

📜 License

This model is released under the CC-BY-4.0 License.


👤 Author

Darko Milošević 📧 daremc86@gmail.com


🤝 Acknowledgements

This model is based on the VITS architecture and trained using the Coqui TTS framework.


🚀 Future Improvements

  • Support for numbers
  • Latin script support
  • Larger dataset
  • Multi-speaker version

🧩 Notes

For best results:

  • Use clean, well-structured sentences
  • Avoid numbers and special symbols
  • Prefer standard Serbian grammar and punctuation
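
One way to follow these recommendations for longer passages is to split the text into sentences and inspect or synthesize them one at a time. This is a minimal sketch using a simple punctuation rule; Coqui TTS can also split input into sentences on its own, so this only makes the chunking explicit. The function name is illustrative.

```python
import re


def split_sentences(text):
    """Split text after '.', '!', '?' or '…' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [part for part in parts if part]
```

Each returned sentence can be passed to `tts.tts_to_file` with its own output path, which also makes it easier to spot which sentence causes trouble.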