---
title: TARS Conversation App
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: static
short_description: Real-time AI voice assistant for TARS
pinned: false
---
# TARS Conversation App
Real-time voice AI with transcription, vision, and intelligent conversation using Speechmatics/Deepgram, Qwen3-TTS/ElevenLabs, DeepInfra LLM, and Moondream.
## Features
- Dual Operation Modes
  - WebRTC Mode (`src/bot.py`) - Browser-based voice AI with real-time metrics dashboard
  - Robot Mode (`src/tars_bot.py`) - Connect to a Raspberry Pi TARS robot via WebRTC and gRPC
- Real-time Transcription - Speechmatics or Deepgram with smart turn detection
- Dual TTS Options - Qwen3-TTS (local, free, voice cloning) or ElevenLabs (cloud)
- LLM Integration - Any model via DeepInfra
- Vision Analysis - Moondream for image understanding
- Smart Gating Layer - AI-powered decision system for natural conversation flow
- Hybrid Memory - SQLite-based hybrid search (70% vector + 30% BM25)
- Emotional Monitoring - Real-time detection of confusion, hesitation, and frustration
- Gradio Dashboard - Live TTFB metrics, latency charts, and conversation transcription
- WebRTC Transport - Low-latency peer-to-peer audio
- gRPC Robot Control - Hardware control with 5-10ms latency (robot mode only)
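The 70% vector + 30% BM25 blend behind the hybrid memory can be illustrated with a small sketch. This is not the app's actual implementation; the weights come from the feature list above, while the min-max normalization, function names, and sample data are assumptions:

```python
def hybrid_score(vector_sim: float, bm25_score: float, max_bm25: float) -> float:
    """Blend a cosine similarity (assumed already in [0, 1]) with a
    min-max-normalized BM25 score using the 70/30 weighting."""
    bm25_norm = bm25_score / max_bm25 if max_bm25 > 0 else 0.0
    return 0.7 * vector_sim + 0.3 * bm25_norm

# Rank candidate memories by blended score (illustrative data)
candidates = [
    {"text": "user likes dry humor", "vec": 0.92, "bm25": 4.1},
    {"text": "user's dog is named Rex", "vec": 0.55, "bm25": 7.8},
]
max_bm25 = max(c["bm25"] for c in candidates)
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["vec"], c["bm25"], max_bm25),
    reverse=True,
)
```

Blending rewards results that score well on either signal: the first candidate wins here on semantic similarity even though the second has the higher raw BM25 score.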
## Project Structure
```
tars-conversation-app/
├── src/bot.py               # WebRTC mode - Browser voice AI
├── src/tars_bot.py          # Robot mode - Raspberry Pi hardware
├── src/pipecat_service.py   # FastAPI backend (WebRTC signaling)
├── config.py                # Configuration management
├── config.ini               # User configuration file
├── requirements.txt         # Python dependencies
│
├── src/                     # Backend
│   ├── observers/           # Pipeline observers (metrics, transcription)
│   ├── processors/          # Pipeline processors (silence filter, gating)
│   ├── services/            # Services (STT, TTS, Memory, Robot)
│   ├── tools/               # LLM callable functions
│   ├── transport/           # WebRTC transport (aiortc)
│   ├── character/           # TARS personality and prompts
│   └── shared_state.py      # Shared metrics storage
│
├── ui/                      # Frontend
│   └── app.py               # Gradio dashboard (metrics + transcription)
│
├── tests/                   # Tests
│   └── gradio/
│       └── test_gradio.py   # UI integration test
│
└── character/               # TARS character data
    ├── TARS.json            # Character definition
    └── persona.ini          # Personality parameters
```
## Operation Modes
### WebRTC Mode (`src/bot.py`)
- Use case: Browser-based voice AI conversations
- Transport: SmallWebRTC (browser ↔ Pipecat)
- Features: Full pipeline with STT, LLM, TTS, Memory
- UI: Gradio dashboard for metrics and transcription
- Best for: Development, testing, remote conversations
### Robot Mode (`src/tars_bot.py`)
- Use case: Physical TARS robot on Raspberry Pi
- Transport: aiortc (RPi ↔ Pipecat) + gRPC (commands)
- Features: Same pipeline + robot control (eyes, gestures, movement)
- Hardware: Requires TARS robot with servos and display
- Best for: Physical robot interactions, demos
## Quick Start
### Installation on TARS Robot (Recommended)
Install directly from HuggingFace Space via the TARS dashboard:
- Open the TARS dashboard at `http://your-pi:8000`
- Go to the App Store tab
- Enter Space ID: `latishab/tars-conversation-app`
- Click Install from HuggingFace
- Configure API keys in `.env.local`
- Click Start
- Access the metrics dashboard at `http://your-pi:7860`
The app will:
- Auto-install dependencies
- Set up virtual environment
- Configure for robot mode
- Start Gradio dashboard
### Easy Installation (Manual)
For first-time setup on Raspberry Pi:
```bash
# Clone and install
git clone https://github.com/latishab/tars-conversation-app.git
cd tars-conversation-app
bash install.sh
```
The installer handles:
- System dependencies (portaudio, ffmpeg)
- Python virtual environment
- All Python packages
- Configuration file setup
### Manual Installation

#### 1. Install Dependencies

```bash
# Python dependencies
pip install -r requirements.txt

# For robot mode, install TARS SDK
pip install tars-robot[sdk]
```
#### 2. Configure Environment
```bash
# Copy and edit environment file with your API keys
cp env.example .env.local

# Copy and edit configuration file
cp config.ini.example config.ini
```
Required API Keys (in `.env.local`):

- `SPEECHMATICS_API_KEY` or `DEEPGRAM_API_KEY` - For speech-to-text
- `DEEPINFRA_API_KEY` - For LLM
- `ELEVENLABS_API_KEY` - Optional (if using ElevenLabs TTS)
Settings (in `config.ini`):

```ini
[LLM]
model = meta-llama/Llama-3.3-70B-Instruct

[STT]
provider = deepgram  # or speechmatics

[TTS]
provider = qwen3  # or elevenlabs

[Memory]
type = hybrid  # SQLite-based hybrid search (vector + BM25)
```
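`config.py` presumably loads these settings with Python's standard `configparser`; a minimal sketch of that pattern, using an inline string in place of the real `config.ini` file (the fallback values here are assumptions, not the app's actual defaults):

```python
import configparser

# inline_comment_prefixes lets values like "deepgram  # or speechmatics" parse cleanly
config = configparser.ConfigParser(inline_comment_prefixes=("#",))
config.read_string("""
[LLM]
model = meta-llama/Llama-3.3-70B-Instruct

[STT]
provider = deepgram  # or speechmatics

[TTS]
provider = qwen3  # or elevenlabs
""")  # in the real app this would be config.read("config.ini")

stt_provider = config.get("STT", "provider", fallback="deepgram")
tts_provider = config.get("TTS", "provider", fallback="qwen3")
llm_model = config.get("LLM", "model", fallback="meta-llama/Llama-3.3-70B-Instruct")
```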
#### 3. Run
**WebRTC Mode (Browser)**

```bash
# Terminal 1: Python backend
python src/pipecat_service.py

# Terminal 2: Gradio UI (optional)
python ui/app.py
```
Then:
- Open WebRTC client in browser (connect to pipecat_service)
- Open Gradio dashboard at http://localhost:7861 (for metrics)
- Start talking
**Robot Mode (Raspberry Pi)**
Prerequisites:
- Raspberry Pi TARS robot running `tars_daemon.py`
- Network connection (LAN or Tailscale)
- TARS SDK installed
Configuration in `config.ini`:

```ini
[Connection]
mode = robot
rpi_url = http://<your-rpi-ip>:8001
rpi_grpc = <your-rpi-ip>:50051
auto_connect = true

[Display]
enabled = true
```
Deployment detection:
- Remote (Mac/computer): Uses configured addresses
- Local (on RPi): Auto-detects localhost:50051
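The deployment detection could work roughly like this; a hedged sketch that assumes detection is based on whether a gRPC daemon is listening locally (the real logic in `src/tars_bot.py` may differ, and the default address is illustrative):

```python
import socket

def resolve_grpc_target(configured: str = "<your-rpi-ip>:50051") -> str:
    """Prefer localhost:50051 when something is listening on that port
    locally (i.e. we are presumably running on the RPi itself);
    otherwise fall back to the address from config.ini."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.2)
        if s.connect_ex(("127.0.0.1", 50051)) == 0:
            return "localhost:50051"
    return configured

target = resolve_grpc_target("192.168.1.50:50051")
```

`connect_ex` returns 0 only if the TCP connection succeeds, so on a remote machine with no local daemon the configured address is used.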
Run:

```bash
python src/tars_bot.py
```
## Gradio Dashboard
The Gradio UI (`ui/app.py`) provides real-time monitoring:
### Latency Dashboard
- Service configuration (STT, Memory, LLM, TTS)
- TTFB metrics with min/max/avg/last stats
- Line chart: Latency trends over time
- Bar chart: Stacked latency breakdown
- Metrics table: Last 15 turns
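The min/max/avg/last stats above can be derived from the raw per-turn TTFB samples; a minimal sketch of that computation (the dashboard's actual code in `ui/app.py` may differ, and the function name is hypothetical):

```python
def ttfb_stats(samples: list[float]) -> dict[str, float]:
    """Summarize per-turn time-to-first-byte measurements (seconds)."""
    if not samples:
        return {}
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": sum(samples) / len(samples),
        "last": samples[-1],
    }

stats = ttfb_stats([0.42, 0.38, 0.51])
```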
### Conversation Tab
- Live user and assistant transcriptions
- Auto-updates every second
### Connection Tab
- Architecture documentation
- Usage instructions
## Architecture
### WebRTC Mode Data Flow
```
Browser (WebRTC client)
        ↓ (audio)
SmallWebRTC Transport
        ↓
Pipeline: STT → Memory → LLM → TTS
        ↓
Observers (metrics, transcription, assistant)
        ↓
shared_state.py
        ↓
Gradio UI (http://localhost:7861)
```
### Robot Mode Data Flow
```
RPi Mic → WebRTC → Pipecat Pipeline → WebRTC → RPi Speaker
(audio)                  ↓                       (audio)
            STT → Memory → LLM → TTS
                         ↓
          LLM Tools (set_emotion, do_gesture)
                         ↓
                 gRPC → RPi Hardware
                (eyes, servos, display)
```
Communication channels (Robot Mode):
| Channel | Protocol | Purpose | Latency |
|---|---|---|---|
| Audio | WebRTC (aiortc) | Voice conversation | ~20ms |
| Commands | gRPC | Hardware control | ~5-10ms |
| State | DataChannel | Battery, movement status | ~10ms |
## Development
See `docs/DEVELOPING_APPS.md` for a comprehensive guide on creating TARS SDK apps.
### Adding Metrics
- Emit a `MetricsFrame` in your service/processor
- `MetricsObserver` will capture it automatically
- Metrics appear in the Gradio dashboard
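The flow above follows the observer pattern; here is a framework-agnostic sketch of the idea. The class and method names mirror the ones mentioned in this README but are illustrative stand-ins, not Pipecat's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class MetricsFrame:
    """Illustrative stand-in for a pipeline metrics frame."""
    service: str
    ttfb: float  # time to first byte, seconds

@dataclass
class MetricsObserver:
    """Watches every frame that passes through the pipeline and
    appends metrics to a shared store the dashboard can read."""
    store: list = field(default_factory=list)

    def on_frame(self, frame) -> None:
        if isinstance(frame, MetricsFrame):
            self.store.append({"service": frame.service, "ttfb": frame.ttfb})

observer = MetricsObserver()
observer.on_frame(MetricsFrame(service="tts", ttfb=0.31))
observer.on_frame("not-a-metrics-frame")  # silently ignored
```

The key property is that services only emit frames; the observer decides which ones to record, so adding a metric never requires touching the dashboard.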
### Adding Tools
- Create a function in `src/tools/`
- Create a schema with `create_*_schema()`
- Register it in `src/bot.py` or `src/tars_bot.py`
- The LLM can now call your tool
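These steps can be sketched with a plain function plus a JSON-style schema of the kind most LLM tool-calling APIs accept. `do_gesture` is a tool name mentioned in this README, but the function body, schema shape, and registration dict here are illustrative, not the app's actual `create_*_schema()` output:

```python
def do_gesture(gesture: str) -> dict:
    """Hypothetical tool body: in robot mode this would forward the
    gesture to the RPi over gRPC. Here it just echoes a result."""
    return {"status": "ok", "gesture": gesture}

def create_do_gesture_schema() -> dict:
    """Schema telling the LLM when and how to call the tool."""
    return {
        "name": "do_gesture",
        "description": "Perform a physical gesture on the TARS robot.",
        "parameters": {
            "type": "object",
            "properties": {
                "gesture": {"type": "string", "enum": ["wave", "nod", "shrug"]},
            },
            "required": ["gesture"],
        },
    }

# Registration pairs each schema name with its callable
TOOLS = {create_do_gesture_schema()["name"]: do_gesture}
```

At runtime the bot would look up the callable by the name the LLM returns and invoke it with the parsed arguments.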
### Modifying UI
- Edit `ui/app.py`
- Gradio hot-reloads automatically
- Access `metrics_store` for data
## Uninstalling
```bash
bash uninstall.sh
```
Removes virtual environment and optionally data/config files.
## Troubleshooting
### No metrics in Gradio UI
- Ensure the bot is running (`src/bot.py` or `src/tars_bot.py`)
- Check that the WebRTC client is connected
- Verify at least one conversation turn has completed
### Robot mode connection issues
- Check the RPi is reachable: `ping <rpi-ip>`
- Verify `tars_daemon` is running on the RPi
- Check that gRPC port 50051 is open
- Review the addresses in `config.ini`
### Import errors
```bash
pip install -r requirements.txt
pip install gradio plotly  # For UI
```
### Audio issues (robot mode)
- Check the RPi mic/speaker with `arecord`/`aplay`
- Verify the WebRTC connection in the logs
- Test with `tests/test_hardware.py`
## Contributing
Contributions welcome.
- Fork the repository
- Create a feature branch
- Make your changes
- Test with `python tests/gradio/test_gradio.py`
- Commit with clear messages (see CLAUDE.md for style)
- Push to your fork
- Open a Pull Request
**Code Style:**
- Python: Follow PEP 8
- Add comments for complex logic
- Update docs for new features
- See CLAUDE.md for guidelines (concise, technical, no fluff)
## License
MIT License - see LICENSE file for details