Rcarvalo / speech-to-speech / README.md

title: LFM2-Audio Real-time Speech-to-Speech
emoji: 🎙️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: other

LFM2-Audio Real-time Speech-to-Speech Chat

Real-time WebRTC streaming demo of LFM2-Audio-1.5B, Liquid AI's first end-to-end audio foundation model.

✨ Features

🔴 Real-time WebRTC streaming - Responses stream back with minimal latency
🎙️ Continuous listening - Natural conversation flow with automatic pause detection
💬 Interleaved output - Simultaneous text and audio generation
🔄 Multi-turn memory - Context-aware conversations
⚡ Low latency - Optimized for real-time interaction

🚀 How to Use

1. Grant microphone access when prompted by your browser
2. Start speaking - The model listens continuously
3. Pause briefly - The model detects pauses and responds automatically
4. Continue the conversation - Build multi-turn dialogues naturally

🎛️ Parameters

Temperature

0: Greedy decoding (deterministic; always picks the most likely token)
1.0: Default (balanced creativity and coherence)
2.0: Maximum creativity (more diverse outputs)

Top-k

0: No filtering (full vocabulary)
4: Default (conservative, high quality)
Higher values: More diverse but potentially less coherent
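
To make the two parameters concrete, here is a minimal sketch of how temperature and top-k typically shape token sampling. It is an illustration only, not this demo's sampler: `sample_token` is a hypothetical helper, and treating temperature 0 as an argmax and top-k 0 as "no filtering" simply mirrors the settings described above.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=4, rng=None):
    """Pick a token index from raw logits (hypothetical helper, for illustration)."""
    if rng is None:
        rng = random.Random()
    # Temperature 0 is treated as greedy decoding: always take the argmax.
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Rank token indices from most to least likely.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    # Top-k 0 (or k >= vocabulary size) keeps the full vocabulary.
    if 0 < top_k < len(candidates):
        candidates = candidates[:top_k]
    # Scale by temperature, then apply a numerically stable softmax.
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one candidate in proportion to its weight.
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `top_k=1` the result always equals the greedy choice, while raising the temperature flattens the weights so lower-ranked candidates are drawn more often.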

🏗️ Technical Details

Model: LFM2-Audio-1.5B
Generation Mode: Interleaved (optimized for real-time)
Audio Codec: Mimi (24 kHz)
Streaming: WebRTC via fastrtc
Backend: PyTorch with CUDA acceleration
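
As a rough illustration of what streaming 24 kHz audio involves, the sketch below splits a sample buffer into fixed-duration chunks. The 80 ms chunk length is an assumption chosen for the example; this README does not state the chunk size the demo uses, and `chunk_samples` is a hypothetical helper.

```python
SAMPLE_RATE_HZ = 24_000  # from the README: Mimi operates on 24 kHz audio
CHUNK_MS = 80            # assumption: illustrative chunk length, not stated in the README

def chunk_samples(samples, sample_rate=SAMPLE_RATE_HZ, chunk_ms=CHUNK_MS):
    """Yield consecutive fixed-size chunks; the final chunk may be shorter."""
    chunk_size = sample_rate * chunk_ms // 1000  # samples per chunk
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]

# Usage: one second of silence split into 80 ms chunks.
chunks = list(chunk_samples([0.0] * SAMPLE_RATE_HZ))
```

Under these assumptions each full chunk holds 1920 samples, so smaller chunks mean the model can start working on audio sooner at the cost of more per-chunk overhead.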

🔧 Differences from Standard Demo

This demo uses fastrtc for WebRTC streaming, enabling:

Continuous audio streaming without manual recording
Automatic voice activity detection (VAD)
Lower latency through chunked processing
More natural conversation flow
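
The idea behind pause detection can be sketched with a simple energy threshold. This is not how fastrtc implements VAD; it is a minimal stand-in, and the helper names, the threshold, and the frame count are all assumptions made for the example.

```python
import math

def rms(frame):
    """Root-mean-square energy of one frame of float samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0

def detect_pause(frames, energy_threshold=0.01, min_silent_frames=5):
    """Return the index of the frame where a pause is confirmed, or None.

    A pause is confirmed once speech has been heard and is followed by
    `min_silent_frames` consecutive frames below `energy_threshold`.
    """
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= energy_threshold:
            # Speech frame: remember it and reset the silence counter.
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Quiet frame after speech: count toward a pause.
            silent_run += 1
            if silent_run >= min_silent_frames:
                return index
    return None
```

Leading silence is ignored on purpose, so the sketch only reports a pause after the speaker has actually said something.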

📚 Resources

📝 License

Licensed under the LFM Open License v1.0.

💡 Tips

Speak clearly and pause briefly between thoughts
Use a good quality microphone for best results
Adjust temperature for different creativity levels
Lower top-k values produce more consistent responses
GPU acceleration is recommended for real-time performance