---
title: Moltbot Body
emoji: 🤖
colorFrom: green
colorTo: blue
sdk: static
pinned: false
short_description: Give Moltbot a physical presence with Reachy Mini
tags:
  - reachy_mini
  - reachy_mini_python_app
  - clawdbot
  - moltbot
---
# Moltbot's Body
> **Security Warning**: This project uses Moltbot, which runs AI-generated code with access to your system. Ensure you understand the security implications before installation. Only run Moltbot from trusted sources and review its permissions carefully. See the [Moltbot Security documentation](https://docs.molt.bot/gateway/security) for details.
Reachy Mini integration with Moltbot: giving Moltbot a physical presence.
## What is Moltbot?

[Moltbot](https://docs.molt.bot/start/getting-started) is an AI assistant platform that can connect to various chat surfaces (WhatsApp, Telegram, Discord, etc.) and execute tasks autonomously. This project extends Moltbot by giving it a physical robot body using [Reachy Mini](https://huggingface.co/spaces/pollen-robotics/Reachy_Mini), a small expressive robot from Pollen Robotics.

With this integration, Moltbot can:

- Listen to speech via the robot's microphone
- Transcribe speech locally using Whisper
- Generate responses through the Moltbot gateway
- Speak responses through ElevenLabs TTS
- Move its head expressively while speaking
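The turn loop implied by that list can be sketched end to end. Every function below is a stub standing in for the real component (VAD, faster-whisper, the Moltbot gateway, ElevenLabs); only the data flow reflects the integration described here.

```python
"""End-to-end sketch of one conversation turn.

Each stage is a stub standing in for the real component; only the flow
(mic -> STT -> gateway -> TTS -> speaker) matches the integration above.
"""


def transcribe(audio: bytes) -> str:
    """Stub for local Whisper speech-to-text."""
    return "hello robot"


def ask_gateway(text: str) -> str:
    """Stub for a request to the Moltbot gateway."""
    return f"You said: {text}"


def synthesize(text: str) -> bytes:
    """Stub for ElevenLabs TTS streaming."""
    return text.encode()


def conversation_turn(audio: bytes) -> bytes:
    """Mic audio in, speaker audio out."""
    user_text = transcribe(audio)        # Whisper STT
    reply_text = ask_gateway(user_text)  # Moltbot gateway
    return synthesize(reply_text)        # ElevenLabs TTS; head wobble runs alongside playback


if __name__ == "__main__":
    print(conversation_turn(b"\x00\x01"))
```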
## Architecture

```
Microphone → VAD → Whisper STT → Moltbot Gateway → ElevenLabs TTS → Speaker
                                                          ↓
                                                   MovementManager
                                                   HeadWobbler (speech-driven head movement)
```
## Prerequisites

Before running this project, you need:

### 1. Moltbot Gateway (Required)

Moltbot must be installed and the gateway must be running. Follow the [Moltbot Getting Started guide](https://docs.molt.bot/start/getting-started) to:

1. Install the CLI: `curl -fsSL https://molt.bot/install.sh | bash`
2. Run the onboarding wizard: `moltbot onboard --install-daemon`
3. Start the gateway: `moltbot gateway --port 18789`

Verify it's running:

```bash
moltbot gateway status
```
### 2. Reachy Mini Robot (Required)

You need a [Reachy Mini](https://huggingface.co/spaces/pollen-robotics/Reachy_Mini) robot from Pollen Robotics with its daemon running.

Verify the daemon is running:

```bash
curl -s http://localhost:8000/api/daemon/status | jq .state
```
### 3. ElevenLabs Account (Required)

Sign up at [ElevenLabs](https://elevenlabs.io/) and get an API key for text-to-speech.

### 4. Python 3.12+ and uv

This project requires Python 3.12 or later and uses [uv](https://docs.astral.sh/uv/) for package management.
## Setup

```bash
git clone <this-repo>
cd reachy
uv sync
```
### Environment Variables

Create a `.env` file:

```bash
CLAWDBOT_TOKEN=your_gateway_token
ELEVENLABS_API_KEY=your_elevenlabs_key
```
Get your gateway token from the Moltbot configuration; if these variables are not set, they are pulled from the Moltbot config automatically.
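The env-var-with-config-fallback behavior can be expressed in a few lines. This is a sketch, not the project's loader; the config file path (`~/.moltbot/config.json`) and the `gateway_token` key are hypothetical.

```python
"""Resolve credentials: prefer environment variables, fall back to config.

The config path and key names below are assumptions for illustration.
"""
import json
import os
from pathlib import Path


def resolve_token(env: dict, config: dict, env_key: str, config_key: str):
    """Return the env var if set, otherwise the value from the parsed config."""
    return env.get(env_key) or config.get(config_key)


if __name__ == "__main__":
    config_path = Path.home() / ".moltbot" / "config.json"  # hypothetical location
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    token = resolve_token(dict(os.environ), config, "CLAWDBOT_TOKEN", "gateway_token")
    print("token set:", token is not None)
```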
## Running

```bash
# Make sure the Reachy Mini daemon is running
curl -s http://localhost:8000/api/daemon/status | jq .state

# Make sure the Moltbot gateway is running
moltbot gateway status

# Start Moltbot's body
uv run moltbot-body
```
## CLI Options

| Flag | Description |
|------|-------------|
| `--debug` | Enable debug logging (verbose output) |
| `--profile` | Enable the timing profiler; prints a detailed timing breakdown after each conversation turn |
| `--profile-once` | Profile one conversation turn, then exit (useful for benchmarking) |
| `--robot-name NAME` | Robot name for the connection (if you have multiple robots) |
| `--gateway-url URL` | Moltbot gateway URL (default: `http://localhost:18789`) |
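For reference, the documented flags map onto a parser like the following. This mirrors the table with `argparse` and is not the project's actual CLI code.

```python
"""Sketch of the documented CLI flags using argparse (illustrative only)."""
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="moltbot-body")
    parser.add_argument("--debug", action="store_true",
                        help="enable debug logging (verbose output)")
    parser.add_argument("--profile", action="store_true",
                        help="print a timing breakdown after each turn")
    parser.add_argument("--profile-once", action="store_true",
                        help="profile one conversation turn, then exit")
    parser.add_argument("--robot-name", metavar="NAME",
                        help="robot to connect to, if you have several")
    parser.add_argument("--gateway-url", metavar="URL",
                        default="http://localhost:18789",
                        help="Moltbot gateway URL")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```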
### Examples

```bash
# Run with debug logging
uv run moltbot-body --debug

# Profile a single conversation turn
uv run moltbot-body --profile-once

# Connect to a specific robot and gateway
uv run moltbot-body --robot-name my-reachy --gateway-url http://192.168.1.100:18789
```
### Profiling Output

When using `--profile` or `--profile-once`, you'll see a detailed timing breakdown after each turn:
```
============================================================
              CONVERSATION TIMING PROFILE
============================================================
User: "Hello, how are you?"
Assistant: "I'm doing well, thank you for asking!"
------------------------------------------------------------
TIMING BREAKDOWN
------------------------------------------------------------
Speech Detection:
  Duration spoken: 1.23s
Whisper Transcription:
  Time: 0.45s
LLM (Moltbot):
  Time to first token: 0.32s
  Streaming time: 1.15s
  Total time: 1.47s
  Tokens: 42 (36.5 tok/s)
TTS (ElevenLabs):
  Time to first audio: 0.28s
  Total streaming: 1.82s
  Audio chunks: 15
------------------------------------------------------------
END-TO-END LATENCY
------------------------------------------------------------
Speech end → First audio: 1.05s
Total turn time: 4.50s
============================================================
```
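A breakdown like the one above can be collected with plain wall-clock timestamps around each stage. This is a minimal sketch using `time.perf_counter()`; the stage names and report format are illustrative, not the project's profiler.

```python
"""Collect per-stage wall-clock timings for one conversation turn.

A sketch of how a breakdown like the one above could be measured;
not the project's actual profiler.
"""
import time
from contextlib import contextmanager


class TurnProfiler:
    def __init__(self) -> None:
        self.durations: dict = {}

    @contextmanager
    def measure(self, stage: str):
        """Record the elapsed wall-clock time of one stage under `stage`."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.durations[stage] = time.perf_counter() - start

    def report(self) -> str:
        """Render a per-stage breakdown plus a total turn time."""
        lines = [f"{stage}: {secs:.2f}s" for stage, secs in self.durations.items()]
        lines.append(f"Total turn time: {sum(self.durations.values()):.2f}s")
        return "\n".join(lines)


if __name__ == "__main__":
    prof = TurnProfiler()
    with prof.measure("Whisper Transcription"):
        time.sleep(0.05)  # stand-in for the real STT call
    print(prof.report())
```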
## Features

- **Voice Activation**: Listens for speech, processes when silence is detected
- **Whisper STT**: Local speech-to-text transcription using faster-whisper
- **Moltbot Brain**: Claude-powered responses via the Moltbot gateway API
- **ElevenLabs TTS**: Natural voice output with streaming
- **Head Wobble**: Audio-driven head movement while speaking for natural expressiveness
- **Movement Manager**: 100 Hz control loop for smooth robot motion
- **Breathing Animation**: Gentle idle breathing when not actively engaged
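The Movement Manager and breathing animation can be pictured as a fixed-rate loop that applies a small sinusoidal head offset while idle. The amplitude, period, and `apply_offset` callback below are hypothetical; the real code drives Reachy Mini's head through its SDK.

```python
"""Fixed-rate control loop with an idle breathing offset (sketch only).

The 100 Hz rate comes from the feature list above; the breathing period,
amplitude, and robot interface are assumptions for illustration.
"""
import math
import time

RATE_HZ = 100          # control loop frequency from the feature list
BREATH_PERIOD_S = 4.0  # assumed breathing cycle length
BREATH_AMPL_M = 0.002  # assumed vertical amplitude in metres


def breathing_offset(t: float) -> float:
    """Vertical head offset at time t for the idle breathing animation."""
    return BREATH_AMPL_M * math.sin(2 * math.pi * t / BREATH_PERIOD_S)


def run_loop(apply_offset, duration_s: float = 0.05) -> int:
    """Run the fixed-rate loop for duration_s seconds; return the tick count."""
    dt = 1.0 / RATE_HZ
    start = time.perf_counter()
    ticks = 0
    while (now := time.perf_counter() - start) < duration_s:
        apply_offset(breathing_offset(now))  # on real hardware: a head pose command
        ticks += 1
        time.sleep(dt)  # simple pacing; a real loop would compensate for drift
    return ticks


if __name__ == "__main__":
    run_loop(lambda z: None)
```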
## Tips for a Better Experience

### Use a Low-Latency Inference Provider

For natural, conversational interactions, response latency is critical. The time from when you stop speaking to when the robot starts responding should ideally be under 1 second.

Consider using a fast inference provider such as [Groq](https://groq.com/), which offers extremely low latency for supported models. You can configure this in your Moltbot settings. Use the `--profile` flag to measure your end-to-end latency and identify bottlenecks.
### Let Moltbot Help You Set Up

Since Moltbot is an AI coding assistant, you can chat with it to help configure and customize the robot body! Try asking Moltbot (via any of its chat surfaces) to:

- Help you tune the head movement parameters
- Adjust the voice activation sensitivity
- Add new expressions or gestures
- Debug connection issues

Moltbot can read and modify this codebase, so it's a great collaborator for extending the robot's capabilities.
## Roadmap

- [ ] Face tracking (look at the person speaking)
- [ ] DoA-based head tracking (direction of arrival for speaker localization)
- [ ] Wake word detection
- [ ] Expression gestures

## License

MIT License - see [LICENSE](LICENSE) for details.