MultilanguageCloner

Build error

tahirturk committed on Oct 26, 2025

Commit

243278a

1 Parent(s): fddba3c

Files changed (1)

requirements.txt CHANGED

@@ -1,23 +1,45 @@
-gradio==4.44.0
-torch>=2.2.0
-numpy
-soundfile
-spaces
-resampy==0.4.3
-librosa==0.10.0
-s3tokenizer
-transformers==4.46.3
-diffusers==0.29.0
-omegaconf==2.3.0
-resemble-perth==1.0.1
-silero-vad==5.1.2
-conformer==0.3.2
-safetensors
-# Optional language-specific dependencies
-# Uncomment the ones you need for specific languages:
- spacy_pkuseg          # For Chinese text segmentation
- pykakasi>=2.2.0       # For Japanese text processing (Kanji to Hiragana)
- russian-text-stresser @ git+https://github.com/Vuizur/add-stress-to-epub
-# dicta-onnx>=0.1.0     # For Hebrew diacritization

+---
+title: Realistic Voice Cloner 🎙️
+emoji: 🧠
+colorFrom: blue
+colorTo: indigo
+sdk: gradio
+sdk_version: 4.44.0
+python_version: 3.10
+app_file: app.py
+hardware:
+  - gpu
+tags:
+  - text-to-speech
+  - voice-cloning
+  - huggingface
+  - gradio
+  - audio
+license: mit
+short_description: A neural voice cloning demo built with Gradio and Hugging Face Inference API.
+---
+# 🎧 Realistic Voice Cloner
+This Hugging Face Space demonstrates a **neural voice cloning** pipeline built with:
+- **Gradio 4.44.0**
+- **Torch 2.2+**
+- **Transformers 4.46.3**
+- **Diffusers 0.29.0**
+- **Resemble-Perth**, **Silero-VAD**, and **Conformer**
+## 🚀 Features
+- Upload a short audio sample of a speaker
+- Enter any text to synthesize speech in that voice
+- Fast inference powered by **CUDA (GPU)**
+- Optional language segmentation (Chinese, Japanese, Russian, etc.)
+## 🧠 Tech Stack
+- **Backend:** PyTorch, Transformers, Diffusers
+- **Frontend:** Gradio
+- **Audio:** Librosa, SoundFile, Resampy
+## ⚙️ Requirements
+See `requirements.txt` for all dependencies:
+```bash
+pip install -r requirements.txt
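The added lines in this diff are YAML front matter and Markdown, not pip requirement entries, even though the file is `requirements.txt`. As a rough illustration, the sketch below flags lines in a requirements file that do not look like a package specifier. The function name `suspicious_lines` and the regular expressions are assumptions made for this example; they are a simplified heuristic, not pip's actual parser, and they ignore option lines such as `-r` or `--index-url`.

```python
import re

# Heuristic patterns (assumed for illustration, not pip's real grammar).
_NAME = r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"
_VERSION_SPEC = r"(?:==|>=|<=|~=|!=|<|>)\s*[A-Za-z0-9.*+!_-]+"
_REQ = re.compile(
    rf"^{_NAME}(?:\[[A-Za-z0-9._,\s-]+\])?\s*"
    rf"(?:{_VERSION_SPEC}(?:\s*,\s*{_VERSION_SPEC})*|@\s*\S+)?$"
)


def suspicious_lines(text):
    """Return (line_number, raw_line) pairs that do not resemble a requirement."""
    flagged = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop inline comments, which start with whitespace followed by '#'.
        line = raw.split(" #", 1)[0].strip()
        if not line or line.startswith("#"):
            continue  # blank lines and full-line comments are fine
        if not _REQ.match(line):
            flagged.append((number, raw))
    return flagged


if __name__ == "__main__":
    old = "gradio==4.44.0\ntorch>=2.2.0\nnumpy\n"
    new = "---\ntitle: Realistic Voice Cloner\nsdk: gradio\n"
    print(suspicious_lines(old))
    print(suspicious_lines(new))
```

Run against the removed lines, a check like this reports nothing; run against the added lines, it flags the front-matter keys and Markdown bullets. Note that Markdown headings starting with `#` pass silently, because a requirements file treats them as comments.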