| --- |
| license: gpl-3.0 |
| language: |
| - uz |
| pipeline_tag: text-to-speech |
| tags: |
| - tts |
| - text-to-speech |
| - uzbek |
| - coqui |
| - neural-tts |
| - voice-synthesis |
| --- |
| |
| # XurmoTTS |
|
|
| XurmoTTS is a neural text-to-speech (TTS) model for Uzbek, built to synthesize natural-sounding speech from text. |
|
|
| The model was trained in Google Colab with the Coqui TTS framework, using recordings of real Uzbek speech. |
|
|
| --- |
|
|
| # About the project |
|
|
| Building this model took about one week. Most of the dataset preparation and cleaning was done by hand and with custom scripts. No money was spent at any point: the project was built entirely for free. |
|
|
| The dataset was sourced from publicly available videos on the Xurmo Media YouTube channel: |
|
|
| https://www.youtube.com/@Xurmomedia |
|
|
| --- |
|
|
| # Dataset preparation |
|
|
| The dataset was not assembled by simply collecting audio; it was prepared with a multi-stage pipeline. |
|
|
| The process includes the following steps: |
|
|
| - Extracting audio from the videos |
| - Splitting the audio into short segments with custom algorithms |
| - Generating transcripts with AI speech recognition systems |
| - Checking that audio and text match |
| - Filtering out noisy and low-quality segments |
| - Cleaning the dataset with custom Python scripts |
| - Normalizing Uzbek Cyrillic and Latin scripts |
| - Training the model with Coqui TTS |
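
The script-normalization step in the list above can be illustrated with a small sketch. The mapping table and the `cyrillic_to_latin` function are assumptions made for illustration, not the project's actual script. The rule for `е` (rendered as `ye` at the start of a word or after a vowel) is simplified, and all-caps words are not handled specially.

```python
# Illustrative Uzbek Cyrillic -> Latin mapping (an assumption, not the project's script).
# "\u2018" is the turned comma used in o‘ / g‘; "\u2019" is the tutuq belgisi.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "j", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "ts", "ч": "ch", "ш": "sh", "ъ": "\u2019",
    "ь": "", "э": "e", "ю": "yu", "я": "ya", "ў": "o\u2018", "қ": "q",
    "ғ": "g\u2018", "ҳ": "h",
}
CYR_VOWELS = set("аеёиоуэюяў")


def cyrillic_to_latin(text: str) -> str:
    """Convert Uzbek Cyrillic text to Latin; other characters pass through."""
    out = []
    prev = ""  # previous character, lowercased
    for ch in text:
        low = ch.lower()
        if low == "е" and (not prev.isalpha() or prev in CYR_VOWELS):
            lat = "ye"  # word-initial or post-vowel "е"
        else:
            lat = CYR_TO_LAT.get(low)
        if lat is None:
            out.append(ch)  # Latin letters, digits, punctuation
        elif ch.isupper():
            out.append(lat.capitalize())
        else:
            out.append(lat)
        prev = low
    return "".join(out)


print(cyrillic_to_latin("Салом, дунё!"))  # prints "Salom, dunyo!"
```

Normalizing every transcript to one script before training means the model sees a single, consistent character set.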
|
|
| To make the dataset as clean as possible, many segments were checked by hand. |
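
Before manual review, an automatic pre-filter can discard obviously unusable clips. The sketch below is a hypothetical example: the `Segment` fields and all thresholds (`min_s`, `max_s`, `max_chars_per_s`) are illustrative assumptions, not values used for XurmoTTS.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    wav_name: str      # hypothetical file name of the audio clip
    duration_s: float  # clip length in seconds
    text: str          # speech-recognition transcript for the clip


def keep_segment(seg: Segment, min_s: float = 1.0, max_s: float = 12.0,
                 max_chars_per_s: float = 25.0) -> bool:
    """Return True if the segment passes simple sanity checks."""
    text = seg.text.strip()
    if not text:
        return False  # empty transcript
    if seg.duration_s <= 0 or not (min_s <= seg.duration_s <= max_s):
        return False  # too short or too long for training
    # Far too many characters per second suggests an audio/text mismatch.
    return len(text) / seg.duration_s <= max_chars_per_s


segments = [
    Segment("clip_0001.wav", 3.0, "Salom dunyo"),
    Segment("clip_0002.wav", 0.4, "Ha"),
]
kept = [s for s in segments if keep_segment(s)]
print([s.wav_name for s in kept])  # prints ['clip_0001.wav']
```

Checks like these only remove clear outliers; borderline clips still need a listen.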
|
|
| --- |
|
|
| # Model capabilities |
|
|
| - Speech synthesis in Uzbek |
| - Neural voice synthesis |
| - Support for both Latin and Cyrillic scripts |
| - More natural intonation |
| - Compatible with Coqui TTS |
|
|
| --- |
|
|
| # Usage example |
|
|
| ```bash |
| pip install --upgrade pip |
| pip install cython |
| pip install llvmlite |
| pip install coqui-tts |
| ``` |
|
|
| ```python |
| from TTS.api import TTS |
| from huggingface_hub import snapshot_download |
| import os |
| |
| # Download the model folder from Hugging Face into the local cache |
| model_dir = snapshot_download(repo_id="jahongirtech/XurmoTTS") |
| |
| model_path = os.path.join(model_dir, "model.pth") # Name of the .pth file in the repo |
| config_path = os.path.join(model_dir, "config.json") |
| |
| # Initialize the model |
| tts = TTS(model_path=model_path, config_path=config_path, gpu=False) |
| |
| # Synthesize an Uzbek sample sentence and save it as a WAV file |
| tts.tts_to_file( |
|     text="Xurmo media loyihasi muvaffaqiyatli ishlayapti", |
|     file_path="output.wav" |
| ) |
| ``` |
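
After synthesis, it can be useful to sanity-check the generated file. The following sketch uses only Python's standard `wave` module and assumes `output.wav` is an uncompressed PCM WAV file; `wav_info` is a hypothetical helper name, not part of Coqui TTS.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        frames = wf.getnframes()
        return rate, wf.getnchannels(), frames / rate
```

Calling `wav_info("output.wav")` after the example above should report the sample rate, channel count, and duration of the synthesized audio. A duration near zero usually means synthesis did not produce usable speech.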
|
|
|
|
| --- |
|
|
| # Important note |
|
|
| This project was created for research and educational purposes. |
|
|
| The voices in the model may belong to real people. Every person's voice is their own, and it must be used responsibly. |
|
|
| The following uses are not recommended: |
|
|
| * Presenting another person's voice as your own |
| * Illegally copying the voices of people on YouTube or other platforms |
| * Creating deceptive or fake content, or impersonating someone |
| * Commercial use |
| * Misleading people through voice cloning |
|
|
| The author is not responsible for misuse of the model. |
|
|
| --- |
|
|
| # License |
|
|
| GPL-3.0 License |
|
|