Running on Zero • Featured • Qwen3-TTS Demo 🎙 • 1.36k • Generate speech from text with voice design, cloning, or speakers
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published Feb 18, 2025 • 86 • 5
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens Paper • 2503.01710 • Published Mar 3, 2025 • 6
Llasa Collection TTS foundation model compatible with Llama framework (160k hours tokenized speech data released) • 11 items • Updated May 11, 2025 • 20
From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages Article • Feb 11, 2025 • 33
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis Paper • 2502.04128 • Published Feb 6, 2025 • 27
Autoregressive Speech Synthesis with Next-Distribution Prediction Paper • 2412.16846 • Published Dec 22, 2024 • 1
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 87
Running on Zero • Featured • Whisper 📉 • 2.7k • Transcribe audio or YouTube videos into text with Whisper