---
title: TTS Gallery
emoji: 📣
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.44.1
app_file: app.py
pinned: true
---

# TTS Gallery

This demo showcases several text-to-speech (TTS) models with support for English and Chinese.

## Features

- Text-to-speech generation for English and Chinese
- Gradio web interface for easy interaction
- Real-time audio generation and playback
- Example texts for quick testing
- Support for multiple TTS architectures including seq2seq models

## Requirements

- Python 3.8 or higher
- Required Python packages (automatically installed by Hugging Face):
  - chatterbox-tts
  - gradio
  - torchaudio
  - torch

## Usage

1. Enter text in the input box
2. Select the language (English or Chinese)
3. Click "Generate Speech"
4. Listen to the generated audio
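
The steps above could be wired up in Gradio roughly as follows. This is a minimal sketch, not the code in `app.py`: `validate_request`, `build_demo`, the `synthesize` callable, the `en`/`zh` language codes, and the widget labels are all assumptions made for illustration.

```python
# Hypothetical mapping from the UI language choice to a language code.
SUPPORTED_LANGUAGES = {"English": "en", "Chinese": "zh"}


def validate_request(text, language):
    """Check the user input and return (clean_text, language_code)."""
    clean_text = text.strip()
    if not clean_text:
        raise ValueError("Please enter some text.")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language}")
    return clean_text, SUPPORTED_LANGUAGES[language]


def build_demo(synthesize):
    """Build the UI. `synthesize(text, lang_code)` is an assumed callable
    that returns (sample_rate, numpy_audio_array) for gr.Audio."""
    import gradio as gr  # imported lazily so validate_request works without Gradio

    def on_click(text, language):
        clean_text, lang_code = validate_request(text, language)
        return synthesize(clean_text, lang_code)

    with gr.Blocks() as demo:
        text = gr.Textbox(label="Text")                       # step 1
        language = gr.Radio(choices=list(SUPPORTED_LANGUAGES),
                            value="English", label="Language")  # step 2
        button = gr.Button("Generate Speech")                 # step 3
        audio = gr.Audio(label="Generated audio")             # step 4
        button.click(fn=on_click, inputs=[text, language], outputs=audio)
    return demo
```

Calling `build_demo(my_synthesize).launch()` would start the interface, where `my_synthesize` stands in for whichever model backend is selected.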

## Supported Languages

- English
- Chinese

## Supported Models

- **Chatterbox**: Industrial-grade multilingual TTS solution
- **KittenTTS**: High-quality TTS with voice cloning capabilities
- **Piper**: Local on-device TTS with multiple voice options
- **Faster Whisper**: High-performance speech recognition model for audio transcription
- **Kokoro**: Lightweight TTS model with 82M parameters, Apache-licensed for production and personal use

## Examples

The interface includes example texts for both languages to help you get started quickly.

## Notes

- The first generation may take a moment as the model loads
- Subsequent generations will be faster
- For best results, use clear and properly punctuated text
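
The first two notes describe what you would expect if each model is loaded on first use and then kept in memory. One common way to get that behavior is a cached loader. The sketch below is an assumption about the approach, not the actual implementation; `load_model` and `get_model` are hypothetical names, and the placeholder return value stands in for a real model object.

```python
from functools import lru_cache

load_calls = []  # records each real load, for illustration only


def load_model(name):
    """Hypothetical stand-in for an expensive model load."""
    load_calls.append(name)
    return {"name": name}  # placeholder; a real loader would return the model


@lru_cache(maxsize=None)
def get_model(name):
    """Return the model for `name`, loading it only on the first request."""
    return load_model(name)


first = get_model("kokoro")   # triggers the load
second = get_model("kokoro")  # served from the cache, no second load
```

With this pattern, only the first request for a given model pays the loading cost; later requests reuse the same object.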