---
title: ZeroGPU TTS Service
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: mit
suggested_hardware: zero-a10g
---
# ZeroGPU Text-to-Speech Service

A high-performance text-to-speech service that runs on **Hugging Face ZeroGPU**, which allocates **Nvidia H200** GPU resources dynamically.
## Features

- **ZeroGPU Acceleration**: Dynamic H200 GPU allocation
- **Ultra-Fast Generation**: Optimized for the latest GPU architecture
- **Multiple Voice Presets**: 10 different voice characteristics
- **High-Quality Audio**: Professional-grade speech synthesis
- **Batch Processing**: Synthesize multiple texts in parallel
- **Dual Protocol Support**: Gradio web UI and MCP
- **MCP Integration**: Compatible with AI assistants (Claude Code, etc.)
- **Cost Efficient**: No idle costs with a Pro subscription
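
The feature list mentions ten voice presets without naming them. As a rough sketch, assuming they correspond to Bark's ten English `v2` speaker presets (an assumption; this README only shows `v2/en_speaker_6`), a client could enumerate and validate preset names as below. `VOICE_PRESETS`, `DEFAULT_PRESET`, and `validate_preset` are hypothetical names, not part of the service.

```python
# Assumption: the service exposes Bark's ten English v2 speaker presets.
VOICE_PRESETS = [f"v2/en_speaker_{i}" for i in range(10)]
DEFAULT_PRESET = "v2/en_speaker_6"  # the preset used in this README's examples


def validate_preset(preset: str) -> str:
    """Return the preset if it is known, otherwise fall back to the default."""
    return preset if preset in VOICE_PRESETS else DEFAULT_PRESET
```

Falling back to a default rather than raising keeps a typo in the preset name from failing a whole request; a stricter client might prefer to raise instead.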
## Architecture

- **Backend**: Transformers + PyTorch with CUDA optimization
- **Frontend**: Gradio with enhanced UI
- **GPU**: ZeroGPU with H200 dynamic scaling
- **Model**: Bark (`suno/bark-small`) with mixed precision
## Performance

- **Single synthesis**: 0.5-2 seconds, depending on text length
- **Batch processing**: Parallel execution on H200
- **Memory efficient**: GPU memory is cleaned up automatically
- **Scaling**: Dynamic resource allocation
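
The automatic GPU cleanup listed above is not shown in this README. One common way to get that behavior is a context manager that always runs a cleanup step after the work finishes. The sketch below is an assumption about how such cleanup could look, not the service's actual code; `release_gpu_memory` and `gpu_cleanup` are hypothetical names, and PyTorch is treated as optional so the example also runs on a machine without it.

```python
import gc
from contextlib import contextmanager

try:
    import torch  # optional: the sketch still runs without PyTorch
except ImportError:
    torch = None


def release_gpu_memory() -> None:
    """Collect unreferenced Python objects, then free cached CUDA blocks if any."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


@contextmanager
def gpu_cleanup(cleanup=release_gpu_memory):
    """Run the wrapped block, then always call cleanup, even on errors."""
    try:
        yield
    finally:
        cleanup()
```

A synthesis function would wrap its model call in `with gpu_cleanup(): ...` so that cached memory is released whether generation succeeds or raises.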
## API Usage

### Gradio Client API

```python
from gradio_client import Client

# Replace YOUR_USERNAME with the account that hosts the Space.
client = Client("YOUR_USERNAME/tts-gpu-service")

# Positional arguments: the text to speak, then the voice preset.
result = client.predict(
    "Hello from ZeroGPU!",
    "v2/en_speaker_6",
    api_name="/predict",
)

# The endpoint returns the generated audio file and a status message.
audio_file, status = result
```
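
To synthesize several texts from a client, the single call above can be wrapped in a loop. The helper below is a minimal sketch: `synthesize_texts` is a hypothetical name, and it takes the `predict` callable as a parameter (in practice `client.predict`) so it can be tried without a running Space. It assumes the endpoint keeps returning an `(audio_file, status)` pair as in the example above.

```python
from typing import Callable, List, Tuple


def synthesize_texts(
    predict: Callable[..., Tuple[str, str]],
    texts: List[str],
    voice_preset: str = "v2/en_speaker_6",
) -> List[Tuple[str, str, str]]:
    """Call the /predict endpoint once per text and collect the results."""
    results = []
    for text in texts:
        # Same call shape as the single-request example: text, preset, api_name.
        audio_file, status = predict(text, voice_preset, api_name="/predict")
        results.append((text, audio_file, status))
    return results
```

With a real client this would be called as `synthesize_texts(client.predict, ["First line.", "Second line."])`. The requests run one after another; it does not use any server-side batch endpoint.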
### MCP Protocol API

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def use_tts():
    # Launch app.py as an MCP server that communicates over stdio.
    server_params = StdioServerParameters(
        command="python",
        args=["app.py", "--mcp-only"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("tts_synthesize", {
                "text": "Hello from MCP!",
                "voice_preset": "v2/en_speaker_6",
            })
            return result


asyncio.run(use_tts())
```
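
The example above stops at the raw result of `call_tool`. In the MCP Python SDK a tool result carries a `content` list of typed parts and an `isError` flag, so a small helper can pull out the text parts. What `tts_synthesize` actually places in `content` is not documented here, so treat this as a sketch: `text_from_tool_result` is a hypothetical helper, and the stand-in object and its message are made up for illustration.

```python
from types import SimpleNamespace


def text_from_tool_result(result) -> str:
    """Join the text parts of an MCP tool result; raise if the tool failed."""
    texts = [
        part.text
        for part in result.content
        if getattr(part, "type", None) == "text"
    ]
    message = "\n".join(texts)
    if getattr(result, "isError", False):
        raise RuntimeError(message or "tool call failed")
    return message


# Stand-in object with the same attributes, so the helper can be tried offline.
fake_result = SimpleNamespace(
    isError=False,
    content=[SimpleNamespace(type="text", text="synthesis complete")],
)
print(text_from_tool_result(fake_result))
```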
## MCP Tools

- **`tts_synthesize`**: Convert a single text to speech
- **`tts_batch_synthesize`**: Convert multiple texts to speech
- **`tts_get_info`**: Get system status and capabilities
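
On the wire, MCP invokes a tool with a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below builds such messages for two of the tools listed above, mainly to show the message shape. `tools_call_request` is a hypothetical helper. The `tts_synthesize` arguments come from the client example in this README; that `tts_get_info` takes no arguments is an assumption. The argument names for `tts_batch_synthesize` are not documented here, so it is left out.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC request ids only need to be unique per session


def tools_call_request(name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    })


print(tools_call_request("tts_synthesize", {
    "text": "Hello from MCP!",
    "voice_preset": "v2/en_speaker_6",
}))
# Assumption: tts_get_info takes no arguments.
print(tools_call_request("tts_get_info"))
```

A client library such as the MCP Python SDK builds these messages itself; writing them by hand is only useful for debugging or for understanding what `session.call_tool` sends.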
## Running Modes

### Dual Mode (Default)

```bash
python app.py
```

- Gradio UI: http://localhost:7860
- MCP server: available on stdio
### MCP-Only Mode

```bash
python app.py --mcp-only
```

- For integration with AI assistants
- No web interface, only the MCP protocol
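
How `app.py` switches between the two modes is not shown in this README. A minimal sketch of one way to do it with `argparse` follows; `parse_args` and `select_mode` are hypothetical names, and only the `--mcp-only` flag itself comes from the commands above.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; --mcp-only is the only flag documented above."""
    parser = argparse.ArgumentParser(description="ZeroGPU TTS Service")
    parser.add_argument(
        "--mcp-only",
        action="store_true",
        help="serve MCP over stdio without starting the Gradio web UI",
    )
    return parser.parse_args(argv)


def select_mode(args: argparse.Namespace) -> str:
    """Return 'mcp-only' when the flag is set, otherwise the default 'dual'."""
    return "mcp-only" if args.mcp_only else "dual"
```

For example, `select_mode(parse_args(["--mcp-only"]))` selects the MCP-only mode, while an empty argument list selects dual mode.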