---
title: LFM2-Audio Real-time Speech-to-Speech
emoji: 🎙️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: other
---
# LFM2-Audio Real-time Speech-to-Speech Chat
Real-time WebRTC streaming demo of LFM2-Audio-1.5B, Liquid AI's first end-to-end audio foundation model.
## ✨ Features
- **🔴 Real-time WebRTC streaming** - Instant response with minimal latency
- **🎙️ Continuous listening** - Natural conversation flow with automatic pause detection
- **💬 Interleaved output** - Simultaneous text and audio generation
- **🔁 Multi-turn memory** - Context-aware conversations
- **⚡ Low latency** - Optimized for real-time interaction
## 🚀 How to Use
1. **Grant microphone access** when prompted by your browser
2. **Start speaking** - The model listens continuously
3. **Pause briefly** - The model detects pauses and responds automatically
4. **Continue conversation** - Build multi-turn dialogues naturally
## 🎛️ Parameters
### Temperature
- **0**: Greedy decoding (most deterministic)
- **1.0**: Default (balanced creativity and coherence)
- **2.0**: Maximum creativity (more diverse outputs)
### Top-k
- **0**: No filtering (full vocabulary)
- **4**: Default (conservative, high quality)
- **Higher values**: More diverse but potentially less coherent
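The interaction between the two sliders can be sketched in a few lines. This is a hypothetical illustration of top-k filtering followed by temperature-scaled sampling, using the same parameter conventions as the sliders above (temperature `0` = greedy, top-k `0` = no filtering); it is not the model's actual sampler.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=4):
    """Illustrative decoding step: top-k filtering, then
    temperature-scaled softmax sampling over the survivors."""
    indexed = list(enumerate(logits))
    # top_k == 0 means no filtering (full vocabulary)
    if top_k > 0:
        indexed = sorted(indexed, key=lambda p: p[1], reverse=True)[:top_k]
    # temperature == 0 means greedy decoding: pick the argmax
    if temperature == 0:
        return max(indexed, key=lambda p: p[1])[0]
    # softmax over the surviving logits, scaled by temperature
    scaled = [(i, math.exp(l / temperature)) for i, l in indexed]
    total = sum(w for _, w in scaled)
    r = random.random() * total
    for i, w in scaled:
        r -= w
        if r <= 0:
            return i
    return scaled[-1][0]

# Greedy decoding always returns the highest-scoring token
print(sample_token([0.1, 2.0, -1.0, 0.5], temperature=0))  # prints 1
```

Lower top-k shrinks the candidate pool before sampling, which is why low values produce more consistent responses.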
## 🏗️ Technical Details
- **Model**: LFM2-Audio-1.5B
- **Generation Mode**: Interleaved (optimized for real-time)
- **Audio Codec**: Mimi (24kHz)
- **Streaming**: WebRTC via fastrtc
- **Backend**: PyTorch with CUDA acceleration
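Chunk timing follows directly from the codec's 24 kHz sample rate: a chunk of N samples covers N/24000 seconds. A minimal sketch (the 1920-sample / 80 ms chunk size is an assumed example, not a measured value from this demo):

```python
SAMPLE_RATE = 24_000  # Mimi codec operates at 24 kHz

def chunk_duration_ms(num_samples, sample_rate=SAMPLE_RATE):
    """Duration of one audio chunk in milliseconds."""
    return num_samples / sample_rate * 1000

def samples_for_ms(ms, sample_rate=SAMPLE_RATE):
    """Number of samples needed to cover `ms` milliseconds."""
    return int(sample_rate * ms / 1000)

print(chunk_duration_ms(1920))  # prints 80.0 (ms per 1920-sample chunk)
print(samples_for_ms(80))       # prints 1920
```

Smaller chunks mean the pipeline can start responding sooner, at the cost of more per-chunk overhead.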
## 🔧 Differences from Standard Demo
This demo uses **fastrtc** for WebRTC streaming, enabling:
- Continuous audio streaming without manual recording
- Automatic voice activity detection (VAD)
- Lower latency through chunked processing
- More natural conversation flow
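fastrtc ships its own voice activity detection, but the underlying idea can be illustrated with a toy energy-based detector: report a pause once enough consecutive frames fall below an RMS energy threshold. The function name and threshold values here are made-up for illustration.

```python
import math

def detect_pause(frames, energy_threshold=0.01, min_silent_frames=5):
    """Toy energy-based VAD: a pause is `min_silent_frames`
    consecutive frames whose RMS energy is below the threshold."""
    silent_run = 0
    for frame in frames:
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        silent_run = silent_run + 1 if rms < energy_threshold else 0
        if silent_run >= min_silent_frames:
            return True  # speaker has paused; trigger a response
    return False

speech = [[0.2, -0.3, 0.25]] * 10     # loud frames
silence = [[0.001, -0.002, 0.0]] * 6  # quiet frames
print(detect_pause(speech + silence))  # prints True
print(detect_pause(speech))            # prints False
```

Real VADs are more robust (spectral features, learned models), but the trigger-on-silence structure is the same mechanism that lets the demo respond automatically after you pause.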
## 📚 Resources
- [Liquid AI Website](https://www.liquid.ai/)
- [GitHub Repository](https://github.com/Liquid4All/liquid-audio/)
- [Model on Hugging Face](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B)
- [fastrtc Documentation](https://github.com/freddyaboulton/fastrtc)
## 📄 License
Licensed under the LFM Open License v1.0
## 💡 Tips
- Speak clearly and pause briefly between thoughts
- Use a good quality microphone for best results
- Adjust temperature for different creativity levels
- Lower top-k values produce more consistent responses
- GPU acceleration is recommended for real-time performance