garyuzair committed
Commit a488285 · verified · 1 Parent(s): 9290799

Upload 4 files

Files changed (4)
  1. Dockerfile +35 -0
  2. README_detailed.md +63 -0
  3. app.py +9 -0
  4. requirements.txt +5 -0
Dockerfile ADDED
@@ -0,0 +1,35 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+    espeak-ng \
+    build-essential \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy requirements first for better caching
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy application code
+COPY app/ ./app/
+COPY app.py .
+
+# Create output directory
+RUN mkdir -p outputs
+
+# Set environment variables for CPU optimization
+ENV PYTHONUNBUFFERED=1
+ENV OMP_NUM_THREADS=1
+ENV MKL_NUM_THREADS=1
+ENV NUMEXPR_NUM_THREADS=1
+ENV TOKENIZERS_PARALLELISM=false
+
+# Expose port for the Gradio app
+EXPOSE 7860
+
+# Command to run the application
+CMD ["python", "app.py"]
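The `ENV` lines in the Dockerfile pin the common numeric thread pools to one thread each. As a rough illustration of the same idea from the Python side, the sketch below applies identical defaults to an environment mapping. The helper name `apply_cpu_thread_limits` is hypothetical and not part of this commit; only the variable names and values are taken from the Dockerfile.

```python
import os
from typing import MutableMapping

# Values copied from the Dockerfile's ENV lines.
CPU_THREAD_DEFAULTS = {
    "OMP_NUM_THREADS": "1",
    "MKL_NUM_THREADS": "1",
    "NUMEXPR_NUM_THREADS": "1",
    "TOKENIZERS_PARALLELISM": "false",
}


def apply_cpu_thread_limits(env: MutableMapping[str, str]) -> dict:
    """Fill in any missing thread-limit variables and return the effective values.

    Hypothetical helper: values that are already set are left alone, so a
    deployment can still override them.
    """
    for name, value in CPU_THREAD_DEFAULTS.items():
        env.setdefault(name, value)
    return {name: env[name] for name in CPU_THREAD_DEFAULTS}


if __name__ == "__main__":
    # Intended to run before torch or numpy are imported, since such
    # libraries typically read these variables when they set up threads.
    print(apply_cpu_thread_limits(os.environ))
```

Using `setdefault` rather than plain assignment is a design choice in this sketch: the Dockerfile values act as defaults, and anything passed at `docker run` time still wins.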
README_detailed.md ADDED
@@ -0,0 +1,63 @@
+# Kokoro82m TTS - CPU Optimized
+
+A fully functional Text-to-Speech application using the Kokoro82m TTS model, optimized for CPU usage on Hugging Face Spaces.
+
+## Features
+
+- All US English voices from Kokoro82m (11 female, 9 male voices)
+- CPU-optimized for efficient performance on limited resources
+- Advanced Gradio interface with intuitive controls
+- Speech speed adjustment
+- Pronunciation guide support
+- Real-time audio preview and download
+- Example presets for quick testing
+
+## Technical Details
+
+This application uses:
+- Kokoro82m TTS model (82 million parameters)
+- CPU-optimized PyTorch for inference
+- Gradio for the user interface
+- Docker for containerization
+
+## Usage
+
+1. Enter text in the input area
+2. Select a voice from the dropdown menu
+3. Adjust speech speed if desired
+4. Click "Generate Speech" to create audio
+5. Listen to the generated speech and download if needed
+
+## Voice Options
+
+### Female Voices
+- Heart (Best Quality)
+- Bella (High Quality)
+- Nicole
+- Kore
+- Sarah
+- Aoede
+- Alloy
+- Nova
+- Sky
+- River
+- Jessica
+
+### Male Voices
+- Fenrir
+- Michael
+- Puck
+- Echo
+- Eric
+- Liam
+- Onyx
+- Adam
+- Santa
+
+## Development
+
+This application is designed to run efficiently on CPU-only environments, making it suitable for deployment on Hugging Face's free tier Spaces.
+
+## License
+
+This project uses the Kokoro82m model which is available under the Apache 2.0 license.
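The voice lists in the README could back the dropdown roughly as follows. This is only a sketch: the real mapping lives in `app/app.py`, which is not part of this commit. The function name `build_voice_choices`, the label format, and the `af_`/`am_` identifier prefixes (US English female/male) are assumptions made for illustration.

```python
# Voice names copied from the README's "Voice Options" section.
FEMALE_VOICES = [
    "Heart", "Bella", "Nicole", "Kore", "Sarah", "Aoede",
    "Alloy", "Nova", "Sky", "River", "Jessica",
]
MALE_VOICES = [
    "Fenrir", "Michael", "Puck", "Echo", "Eric",
    "Liam", "Onyx", "Adam", "Santa",
]


def build_voice_choices() -> dict:
    """Map a dropdown label to an assumed voice identifier.

    Assumption: identifiers are the lowercased voice name with an "af_"
    (female) or "am_" (male) prefix.
    """
    choices = {}
    for name in FEMALE_VOICES:
        choices[f"{name} (Female)"] = f"af_{name.lower()}"
    for name in MALE_VOICES:
        choices[f"{name} (Male)"] = f"am_{name.lower()}"
    return choices


if __name__ == "__main__":
    for label, voice_id in build_voice_choices().items():
        print(f"{label:20s} -> {voice_id}")
```

Keeping the display label separate from the identifier lets the interface show friendly names while the synthesis call receives the identifier unchanged.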
app.py ADDED
@@ -0,0 +1,9 @@
+import os
+import sys
+
+# Import directly from the app directory
+from app.app import demo
+
+# Launch the app
+if __name__ == "__main__":
+    demo.launch()
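One thing to check when this launcher runs inside the Docker image: Gradio's `launch()` accepts `server_name` and `server_port`, and when they are omitted it falls back to the `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` environment variables, otherwise binding to `127.0.0.1`. A loopback bind inside a container is usually not reachable through the exposed port 7860. The sketch below shows one possible way to resolve those settings; `resolve_launch_kwargs` is a hypothetical helper, not part of this commit, and whether it is needed depends on how the Space is configured.

```python
import os
from typing import Mapping


def resolve_launch_kwargs(env: Mapping[str, str]) -> dict:
    """Pick the host and port for the Gradio server.

    Hypothetical helper: honours GRADIO_SERVER_NAME / GRADIO_SERVER_PORT when
    set, and otherwise binds to all interfaces on the Dockerfile's port 7860.
    """
    return {
        "server_name": env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        "server_port": int(env.get("GRADIO_SERVER_PORT", "7860")),
    }


if __name__ == "__main__":
    # In app.py this could be used as:
    #     demo.launch(**resolve_launch_kwargs(os.environ))
    print(resolve_launch_kwargs(os.environ))
```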
requirements.txt ADDED
@@ -0,0 +1,5 @@
+kokoro>=0.9.4
+soundfile
+gradio
+torch --index-url https://download.pytorch.org/whl/cpu
+numpy