viviztech committed
Commit af41bb2 · verified · 1 Parent(s): fa83b31

Upload 3 files

Files changed (3)
  1. README.md +66 -6
  2. app.py +57 -0
  3. requirements.txt +6 -0
README.md CHANGED
@@ -1,12 +1,72 @@
 ---
-title: Kani Tts Web App
-emoji: 🦀
-colorFrom: purple
-colorTo: green
+title: KaniTTS Web App
+emoji: 🎤
+colorFrom: blue
+colorTo: purple
 sdk: gradio
-sdk_version: 6.0.2
+sdk_version: 4.44.0
 app_file: app.py
 pinned: false
+license: mit
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# KaniTTS Web App
+
+A text-to-speech web application using the KaniTTS model with a Gradio interface.
+
+## Prerequisites
+
+- Python 3.10 or higher
+- PyTorch 2.8.0 or higher (requires Linux or Apple Silicon Mac)
+
+> **Note**: This app requires PyTorch 2.8+. macOS x86_64 (Intel) only supports up to PyTorch 2.2.2. For Intel Macs, please use a cloud platform.
+
+## Cloud Deployment
+
+### Google Colab
+1. Upload the files to Google Drive or clone from GitHub
+2. Run: `!pip install -r requirements.txt`
+3. Run: `!python app.py`
+
+### Hugging Face Spaces
+1. Create a new Space with the Gradio SDK
+2. Upload `app.py` and `requirements.txt`
+3. The app will deploy automatically
+
+### Replit
+1. Create a new Python Repl
+2. Upload the files
+3. Install dependencies: `pip install -r requirements.txt`
+4. Run: `python app.py`
+
+## Local Installation (Linux/Apple Silicon)
+
+```bash
+# Clone the repository
+git clone <your-repo-url>
+cd tts-app
+
+# Create virtual environment
+python3 -m venv venv
+source venv/bin/activate
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Run the application
+python app.py
+```
+
+## Usage
+
+1. Open your browser and go to: `http://127.0.0.1:7860`
+2. Enter the text you want to convert to speech
+3. Click the "Generate" button
+4. Listen to the generated audio
+
+## Model Information
+
+- **Model**: `nineninesix/kani-tts-400m-en`
+- **Sample Rate**: 22050 Hz (model default)
+- **Language**: English
+
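The Prerequisites section of the README states two constraints: Python 3.10 or higher, and no PyTorch 2.8+ on Intel Macs. A minimal sketch of checking both before installing might look like the following. The function name `prerequisite_problems` is hypothetical and not part of this commit, and the check covers only the two conditions the README names explicitly.

```python
import platform
import sys


def prerequisite_problems(system, machine, python_version):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # README: Python 3.10 or higher.
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10 or higher is required")
    # README: macOS x86_64 (Intel) only supports up to PyTorch 2.2.2.
    if system == "Darwin" and machine == "x86_64":
        problems.append("Intel Mac detected: PyTorch 2.8+ is unavailable, use a cloud platform")
    return problems


if __name__ == "__main__":
    for problem in prerequisite_problems(platform.system(), platform.machine(), sys.version_info):
        print(problem)
```

Passing the platform values as arguments, rather than reading them inside the function, keeps the check easy to exercise for platforms other than the one it runs on.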
app.py ADDED
@@ -0,0 +1,57 @@
+import gradio as gr
+from kani_tts import KaniTTS
+import soundfile as sf
+import numpy as np
+import tempfile
+import os
+
+# Initialize the KaniTTS model
+print("Loading KaniTTS model...")
+model = KaniTTS('nineninesix/kani-tts-400m-en')
+print("Model loaded successfully!")
+
+def generate_speech(text):
+    """Generate speech from text using KaniTTS model."""
+    if not text or not text.strip():
+        return None
+
+    try:
+        # Generate audio from text
+        audio, _ = model(text)
+
+        # Get sample rate from model (typically 22050Hz or 24000Hz)
+        sample_rate = model.sample_rate
+
+        # Save to temporary file
+        temp_file = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
+        sf.write(temp_file.name, audio, sample_rate)
+
+        return temp_file.name
+    except Exception as e:
+        print(f"Error generating speech: {e}")
+        return None
+
+# Create Gradio interface
+with gr.Blocks(title="KaniTTS Web App") as demo:
+    gr.Markdown("# KaniTTS Web App")
+    gr.Markdown("Enter text below and click Generate to convert it to speech.")
+
+    with gr.Row():
+        text_input = gr.Textbox(
+            label="Text Input",
+            placeholder="Enter text to convert to speech...",
+            lines=3
+        )
+
+    generate_btn = gr.Button("Generate", variant="primary")
+
+    audio_output = gr.Audio(label="Generated Audio", type="filepath")
+
+    generate_btn.click(
+        fn=generate_speech,
+        inputs=text_input,
+        outputs=audio_output
+    )
+
+if __name__ == "__main__":
+    demo.launch(server_port=7860)
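In `generate_speech`, the `NamedTemporaryFile` handle is never closed, so each request leaves an open file descriptor behind, and on some platforms a second writer cannot open the path while the handle is held. One possible alternative, sketched below, reserves the path with `tempfile.mkstemp` and closes the descriptor before writing. To keep the sketch self-contained it makes two assumptions that differ from the commit: it writes with the standard-library `wave` module instead of `soundfile`, and it uses a synthetic tone instead of model output. The helper name `write_temp_wav` is hypothetical.

```python
import math
import os
import struct
import tempfile
import wave


def write_temp_wav(samples, sample_rate):
    """Write float samples in [-1, 1] to a temporary 16-bit mono WAV and return its path."""
    # Reserve a unique path, then close the descriptor so no handle stays open.
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    # Clamp each sample and convert it to little-endian signed 16-bit PCM.
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return path


if __name__ == "__main__":
    rate = 22050  # the sample rate the README lists as the model default
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
    path = write_temp_wav(tone, rate)
    print(path)
    os.remove(path)  # the caller owns cleanup because the file is not auto-deleted
```

The returned path can be handed to a `gr.Audio(type="filepath")` output in the same way as `temp_file.name` is in the commit. Deleting old files is left to the caller.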
requirements.txt ADDED
@@ -0,0 +1,6 @@
+kani-tts>=0.0.4
+gradio>=4.0.0
+torch>=2.8.0
+soundfile>=0.12.0
+numpy>=1.24.0
+transformers>=4.57.0
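The minimum versions in `requirements.txt` can be compared against an existing environment before launching the app. The sketch below is one way to do that with only the standard library. The names `MINIMUMS`, `release_tuple`, `meets_minimum`, and `unmet` are hypothetical. The comparison is deliberately simplified: it looks only at the leading numeric release segment, so a pre-release such as `2.8.0rc1` is treated as `2.8.0`. The third-party `packaging` library handles such cases properly.

```python
import re
from importlib import metadata

# Minimum versions copied from requirements.txt in this commit.
MINIMUMS = {
    "kani-tts": "0.0.4",
    "gradio": "4.0.0",
    "torch": "2.8.0",
    "soundfile": "0.12.0",
    "numpy": "1.24.0",
    "transformers": "4.57.0",
}


def release_tuple(version):
    """Parse the leading dot-separated integers of a version string into a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare release tuples after zero-padding them to the same length."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet(minimums):
    """Map each package that is missing or too old to a short explanation."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems


if __name__ == "__main__":
    for name, reason in unmet(MINIMUMS).items():
        print(f"{name}: {reason}")
```

Local build suffixes such as `+cu128` on PyTorch wheels are ignored by `release_tuple`, which is why the regular expression stops at the first non-numeric segment.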