tahirturk committed on
Commit 5d5051c · 1 Parent(s): 243278a
Files changed (3)
  1. .node-version +1 -0
  2. README.md +8 -8
  3. requirements.txt +21 -43
.node-version ADDED
@@ -0,0 +1 @@
+ 20
README.md CHANGED
@@ -1,12 +1,12 @@
  ---
- title: Chatterbox Multilingual TTS
- emoji: 🗣️
- colorFrom: blue
- colorTo: pink
+ title: Multi Language Cloner
+ emoji: 🌎
+ colorFrom: indigo
+ colorTo: blue
  sdk: gradio
- sdk_version: 4.44.0
+ sdk_version: 4.43.0
  python_version: 3.10
  app_file: app.py
- enable_gpu: true
- preload_from_hub: true
- ---
+ pinned: false
+ short_description: Chatterbox TTS supporting 23 languages
+ ---
requirements.txt CHANGED
@@ -1,45 +1,23 @@
- ---
- title: Realistic Voice Cloner 🎙️
- emoji: 🧠
- colorFrom: blue
- colorTo: indigo
- sdk: gradio
- sdk_version: 4.44.0
- python_version: 3.10
- app_file: app.py
- hardware:
- - gpu
- tags:
- - text-to-speech
- - voice-cloning
- - huggingface
- - gradio
- - audio
- license: mit
- short_description: A neural voice cloning demo built with Gradio and Hugging Face Inference API.
- ---
+ gradio==4.44.0
+ torch>=2.2.0
+ numpy
+ soundfile
+ spaces
+ resampy==0.4.3
+ librosa==0.10.0
+ s3tokenizer
+ transformers==4.46.3
+ diffusers==0.29.0
+ omegaconf==2.3.0
+ resemble-perth==1.0.1
+ silero-vad==5.1.2
+ conformer==0.3.2
+ safetensors

- # 🎧 Realistic Voice Cloner

- This Hugging Face Space demonstrates a **neural voice cloning** pipeline built with:
- - **Gradio 4.44.0**
- - **Torch 2.2+**
- - **Transformers 4.46.3**
- - **Diffusers 0.29.0**
- - **Resemble-Perth**, **Silero-VAD**, and **Conformer**
-
- ## 🚀 Features
- - Upload a short audio sample of a speaker
- - Enter any text to synthesize speech in that voice
- - Fast inference powered by **CUDA (GPU)**
- - Optional language segmentation (Chinese, Japanese, Russian, etc.)
-
- ## 🧠 Tech Stack
- - **Backend:** PyTorch, Transformers, Diffusers
- - **Frontend:** Gradio
- - **Audio:** Librosa, SoundFile, Resampy
-
- ## ⚙️ Requirements
- See `requirements.txt` for all dependencies:
- ```bash
- pip install -r requirements.txt
+ # Optional language-specific dependencies
+ # Uncomment the ones you need for specific languages:
+ spacy_pkuseg # For Chinese text segmentation
+ pykakasi>=2.2.0 # For Japanese text processing (Kanji to Hiragana)
+ russian-text-stresser @ git+https://github.com/Vuizur/add-stress-to-epub
+ # dicta-onnx>=0.1.0 # For Hebrew diacritization
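
Note on the optional dependencies: since some language helpers in requirements.txt may be commented out, application code probably should not assume they are installed. The following is a minimal sketch, not taken from this commit, of how a startup check might report which helpers are importable. The function name, the language-code keys, and the import names (guessed from the package names in requirements.txt) are all assumptions and should be verified against each package.

```python
from importlib.util import find_spec

# Hypothetical mapping from a language code to the import name of its optional
# helper package. The import names are assumptions inferred from the package
# names in requirements.txt, not confirmed by this commit.
OPTIONAL_LANGUAGE_MODULES = {
    "zh": "spacy_pkuseg",           # Chinese text segmentation
    "ja": "pykakasi",               # Japanese Kanji to Hiragana
    "ru": "russian_text_stresser",  # Russian stress marks
    "he": "dicta_onnx",             # Hebrew diacritization
}


def available_language_helpers(modules=OPTIONAL_LANGUAGE_MODULES):
    """Return {language: bool}, True when the helper module can be imported."""
    # find_spec returns None for a top-level module that is not installed,
    # so nothing is actually imported by this check.
    return {lang: find_spec(name) is not None for lang, name in modules.items()}


if __name__ == "__main__":
    for lang, ok in available_language_helpers().items():
        print(f"{lang}: {'available' if ok else 'missing'}")
```

A check like this lets the app skip or warn for a language whose helper is missing instead of failing at import time.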