Transcribe audio to text with timestamps and visualization
Generate custom speech from text, voice description, or audio