---
title: LFM2-Audio Real-time Speech-to-Speech
emoji: 🎙️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: other
---
# LFM2-Audio Real-time Speech-to-Speech Chat

Real-time WebRTC streaming demo of LFM2-Audio-1.5B, Liquid AI's first end-to-end audio foundation model.

## ✨ Features

- **🔴 Real-time WebRTC streaming** - Responses begin streaming with minimal latency
- **🎙️ Continuous listening** - Natural conversation flow with automatic pause detection
- **💬 Interleaved output** - Simultaneous text and audio generation
- **🔄 Multi-turn memory** - Context-aware conversations
- **⚡ Low latency** - Optimized for real-time interaction

## 🚀 How to Use

1. **Grant microphone access** when prompted by your browser
2. **Start speaking** - The model listens continuously
3. **Pause briefly** - The model detects the pause and responds automatically
4. **Continue the conversation** - Build multi-turn dialogues naturally

## 🎛️ Parameters

### Temperature

- **0**: Greedy decoding (most deterministic)
- **1.0**: Default (balanced creativity and coherence)
- **2.0**: Maximum creativity (more diverse outputs)

### Top-k

- **0**: No filtering (full vocabulary)
- **4**: Default (conservative, high quality)
- **Higher values**: More diverse but potentially less coherent
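The two parameters above combine into a single sampling step. The sketch below is illustrative only: `sample_token` and its defaults are made-up names, not the demo's actual decoding code, which lives in the liquid-audio repository.

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=4, rng=None):
    """Pick the next token id from raw logits (illustrative sketch).

    temperature=0 falls back to greedy argmax; top_k=0 disables filtering.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0:               # greedy decoding: most deterministic
        return int(np.argmax(logits))
    logits = logits / temperature      # >1.0 flattens, <1.0 sharpens the distribution
    if top_k > 0:                      # keep only the k most likely tokens
        k = min(top_k, logits.size)
        cutoff = np.sort(logits)[-k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    probs = np.exp(logits - logits.max())   # softmax (filtered tokens get p=0)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With temperature 0 the function always returns the single most likely token, which is why that setting is the most deterministic; raising top-k widens the pool of candidates the sampler may draw from.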
## 🏗️ Technical Details

- **Model**: LFM2-Audio-1.5B
- **Generation mode**: Interleaved (optimized for real-time)
- **Audio codec**: Mimi (24 kHz)
- **Streaming**: WebRTC via fastrtc
- **Backend**: PyTorch with CUDA acceleration
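Because the Mimi codec operates at 24 kHz, incoming microphone audio at other rates has to be resampled first. A minimal sketch using plain linear interpolation is shown below; the function name is hypothetical, and the real pipeline presumably uses a proper polyphase resampler (e.g. torchaudio) rather than `np.interp`.

```python
import numpy as np

def resample_to_24k(audio, src_rate, dst_rate=24_000):
    """Linearly resample mono float audio to the Mimi codec's 24 kHz rate.

    Illustrative only: linear interpolation aliases high frequencies;
    a production pipeline would band-limit before downsampling.
    """
    if src_rate == dst_rate:
        return audio
    n_out = int(round(len(audio) * dst_rate / src_rate))
    x_old = np.linspace(0.0, 1.0, num=len(audio), endpoint=False)
    x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(x_new, x_old, audio).astype(np.float32)
```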
## 🔧 Differences from the Standard Demo

This demo uses **fastrtc** for WebRTC streaming, enabling:

- Continuous audio streaming without manual recording
- Automatic voice activity detection (VAD)
- Lower latency through chunked processing
- A more natural conversation flow
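The pause detection above can be illustrated with a toy energy-based VAD. This is only a sketch of the idea: fastrtc's built-in voice-activity model is more robust, and the helper names, frame size (20 ms), and thresholds below are made up for illustration.

```python
import numpy as np

def is_pause(frame, threshold=0.01):
    """True if a short audio frame is silent (RMS energy below threshold)."""
    rms = np.sqrt(np.mean(np.square(frame, dtype=np.float64)))
    return rms < threshold

def detect_end_of_turn(frames, min_silent_frames=25):
    """Trigger a reply once ~0.5 s (25 x 20 ms frames) of consecutive silence is seen."""
    silent = 0
    for frame in frames:
        silent = silent + 1 if is_pause(frame) else 0
        if silent >= min_silent_frames:
            return True
    return False
```

Counting *consecutive* silent frames (rather than total silence) is what lets the model wait out brief gaps between words but still respond promptly when the speaker actually stops.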
## 📚 Resources

- [Liquid AI Website](https://www.liquid.ai/)
- [GitHub Repository](https://github.com/Liquid4All/liquid-audio/)
- [Model on Hugging Face](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B)
- [fastrtc Documentation](https://github.com/freddyaboulton/fastrtc)

## 📄 License

Licensed under the LFM Open License v1.0.

## 💡 Tips

- Speak clearly and pause briefly between thoughts
- Use a good-quality microphone for best results
- Adjust the temperature to trade coherence for creativity
- Lower top-k values produce more consistent responses
- GPU acceleration is recommended for real-time performance