# TTS Structure

This directory contains a Text-to-Speech (TTS) implementation that supports three models:

1. Kokoro: https://github.com/hexgrad/kokoro
2. Dia: https://github.com/nari-labs/dia
3. CosyVoice2: https://github.com/FunAudioLLM/CosyVoice

## Structure

The TTS implementation is split into one file per engine plus a shared base and entry point:

- `tts.py`: Contains the `TTSBase` abstract base class and the `DummyTTS` fallback implementation
- `tts_kokoro.py`: Kokoro TTS implementation
- `tts_dia.py`: Dia TTS implementation
- `tts_cosyvoice2.py`: CosyVoice2 TTS implementation
- `tts_main.py`: Main entry point for TTS functionality

## Usage

```python
# Import the main TTS functions
from utils.tts_main import generate_speech, generate_speech_stream, get_tts_engine

# Generate speech using the best available engine
audio_path = generate_speech("Hello, world!")

# Generate speech using a specific engine
audio_path = generate_speech("Hello, world!", engine_type="kokoro")

# Generate speech with specific parameters
audio_path = generate_speech(
    "Hello, world!",
    engine_type="dia",
    lang_code="en",
    voice="default",
    speed=1.0
)

# Generate a speech stream; each item is a (sample_rate, audio_data) pair
for sample_rate, audio_data in generate_speech_stream("Hello, world!"):
    # Process audio data
    pass

# Get a specific TTS engine instance
engine = get_tts_engine("kokoro")
audio_path = engine.generate_speech("Hello, world!")
```

## Error Handling

All TTS implementations handle failures the same way:

1. Each implementation checks whether its dependencies are available
2. If a specific engine fails, the code automatically falls back to the `DummyTTS` implementation
3. The main module prioritizes engines based on availability

## Adding New Engines

To add a new TTS engine:

1. Create a new file `tts_<engine>.py`, following the naming of the existing engine files
2. Implement a class that inherits from `TTSBase`
3. Add the engine to the available engines list in `tts_main.py`
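
The steps for adding an engine, together with the fallback behaviour described under Error Handling, can be sketched roughly as follows. This is a self-contained illustration, not the code in this directory: `TTSBase` and `DummyTTS` here are simplified stand-ins whose interface is assumed from the usage example, and `MyEngineTTS`, `myengine_backend`, `ENGINE_REGISTRY`, and `pick_engine` are hypothetical names introduced only for the example.

```python
import importlib.util
import tempfile
import wave
from abc import ABC, abstractmethod


class TTSBase(ABC):
    """Stand-in for the base class in tts.py (interface is an assumption)."""

    @abstractmethod
    def generate_speech(self, text: str, **kwargs) -> str:
        """Synthesize `text` and return the path of the written audio file."""


class DummyTTS(TTSBase):
    """Stand-in fallback engine: writes one second of silence to a WAV file."""

    sample_rate = 16000

    def generate_speech(self, text: str, **kwargs) -> str:
        # Reserve a temporary file name, then write the WAV data to it.
        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
            path = tmp.name
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)       # mono
            wav.setsampwidth(2)       # 16-bit samples
            wav.setframerate(self.sample_rate)
            wav.writeframes(b"\x00\x00" * self.sample_rate)
        return path


class MyEngineTTS(TTSBase):
    """Hypothetical new engine, as it might appear in tts_myengine.py."""

    def __init__(self) -> None:
        # Check the dependency up front so the caller can fall back early.
        if importlib.util.find_spec("myengine_backend") is None:
            raise ImportError("myengine_backend is not installed")

    def generate_speech(self, text: str, **kwargs) -> str:
        raise NotImplementedError("call the real backend here")


# Hypothetical registry; the real list of engines lives in tts_main.py.
ENGINE_REGISTRY = {"myengine": MyEngineTTS, "dummy": DummyTTS}


def pick_engine(engine_type: str = "dummy") -> TTSBase:
    """Instantiate the requested engine, falling back to DummyTTS on failure."""
    engine_cls = ENGINE_REGISTRY.get(engine_type, DummyTTS)
    try:
        return engine_cls()
    except Exception as exc:  # missing dependency, failed model load, etc.
        print(f"{engine_type} unavailable ({exc}); using DummyTTS")
        return DummyTTS()


if __name__ == "__main__":
    engine = pick_engine("myengine")
    print(type(engine).__name__, engine.generate_speech("Hello, world!"))
```

Checking the dependency in the constructor keeps the fallback decision in one place: the selection function only needs to catch the failure and return `DummyTTS`, so callers always receive an object with the same `generate_speech` method.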