---
title: Reachy Mini Daedalus
emoji: 🧙‍♂️
colorFrom: red
colorTo: blue
sdk: static
pinned: false
short_description: Voice conversation app for Reachy Mini
tags:
  - reachy_mini
  - reachy_mini_python_app
---
# Reachy Mini Daedalus

A real-time voice conversation application for the Reachy Mini robot, powered by OpenAI's Realtime API. The robot can listen, respond with speech, use its camera to see, perform dances, express emotions, and track heads.
## Features

- **Real-time Voice Conversations** – Bidirectional audio streaming with OpenAI's GPT Realtime model
- **Vision Capabilities** – Camera integration with optional local vision processing or GPT-realtime vision
- **Head Tracking** – YOLO- or MediaPipe-based head detection and tracking
- **Expressive Behaviors** – Dance moves, emotional expressions, and head movements
- **Personality Profiles** – Switchable robot personalities (detective, scientist, butler, and more)
- **Dual Interface** – Gradio web UI or headless console mode
- **Head Wobble** – Subtle head movements synchronized with speech
## Requirements

- Python 3.12+
- [uv](https://docs.astral.sh/uv/) (recommended) or pip
- Reachy Mini robot (or simulator)
- OpenAI API key with Realtime API access
## Setup

### 1. Clone the repository

```bash
git clone <repository-url>
cd reachy_mini_daedalus
```

### 2. Install uv (if not already installed)

```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or via Homebrew
brew install uv
```
### 3. Create virtual environment and install dependencies

```bash
# Create a virtual environment with Python 3.12.1 and activate it
uv venv --python 3.12.1
source .venv/bin/activate

# Install dependencies
uv sync
```
> [!NOTE]
> To reproduce the exact dependency set from this repo's `uv.lock`, run `uv sync` with `--locked` (or `--frozen`). This ensures `uv` installs directly from the lockfile without re-resolving or updating any versions.
To include optional dependencies:

```bash
uv sync --extra reachy_mini_wireless  # For wireless Reachy Mini with GStreamer support
uv sync --extra local_vision          # For local PyTorch/Transformers vision
uv sync --extra yolo_vision           # For YOLO-based vision
uv sync --extra mediapipe_vision      # For MediaPipe-based vision
uv sync --extra all_vision            # For all vision features
uv sync --extra all                   # For all features
```
You can combine extras or include dev dependencies:

```bash
uv sync --extra all_vision --group dev
```
### 4. Configure environment variables

Create a `.env` file in the project root:

```bash
# Required
OPENAI_API_KEY=sk-your-api-key-here

# Optional
MODEL_NAME=gpt-realtime  # OpenAI model (default: gpt-realtime)
REACHY_MINI_CUSTOM_PROFILE=  # Personality profile name
HF_HOME=./cache  # Hugging Face cache directory
LOCAL_VISION_MODEL=HuggingFaceTB/SmolVLM2-2.2B-Instruct  # Local vision model
HF_TOKEN=  # Hugging Face token (optional)
```
## Usage

### Run with Gradio UI

```bash
python -m reachy_mini_daedalus.main --gradio
```

Then open http://localhost:7860 in your browser.

### Run in headless console mode

```bash
python -m reachy_mini_daedalus.main
```

### Command Line Options

| Flag | Description |
|------|-------------|
| `--gradio` | Launch with Gradio web interface |
| `--head-tracker {yolo,mediapipe}` | Enable head tracking with specified backend |
| `--no-camera` | Disable camera usage |
| `--local-vision` | Use local vision model instead of GPT-realtime vision |
| `--wireless-version` | Use WebRTC backend for wireless Reachy Mini |
| `--on-device` | Run on the same device as Reachy Mini daemon |
| `--debug` | Enable debug logging |
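The real parser lives in `reachy_mini_daedalus/main.py` and may differ in detail; the following is a hypothetical `argparse` reconstruction of the options table above, useful mainly as a compact reference:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI described in the table above.
    parser = argparse.ArgumentParser(prog="reachy_mini_daedalus")
    parser.add_argument("--gradio", action="store_true",
                        help="Launch with Gradio web interface")
    parser.add_argument("--head-tracker", choices=["yolo", "mediapipe"],
                        help="Enable head tracking with specified backend")
    parser.add_argument("--no-camera", action="store_true",
                        help="Disable camera usage")
    parser.add_argument("--local-vision", action="store_true",
                        help="Use local vision model instead of GPT-realtime vision")
    parser.add_argument("--wireless-version", action="store_true",
                        help="Use WebRTC backend for wireless Reachy Mini")
    parser.add_argument("--on-device", action="store_true",
                        help="Run on the same device as Reachy Mini daemon")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser
```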
### Examples

```bash
# Gradio UI with MediaPipe head tracking
python -m reachy_mini_daedalus.main --gradio --head-tracker mediapipe

# Headless mode with local vision processing
python -m reachy_mini_daedalus.main --local-vision

# Wireless robot with WebRTC
python -m reachy_mini_daedalus.main --gradio --wireless-version
```
## Personality Profiles

The robot can take on different personalities. Available profiles are located in `reachy_mini_daedalus/profiles/`:

- `short_noir_detective` – Film noir detective persona
- `short_victorian_butler` – Refined Victorian butler
- `short_mad_scientist_assistant` – Eccentric scientist's helper
- `short_nature_documentarian` – Wildlife documentary narrator
- `short_time_traveler` – Visitor from another era
- And more...

Set a profile via environment variable:

```bash
export REACHY_MINI_CUSTOM_PROFILE=short_noir_detective
```

Or select one dynamically through the Gradio UI.
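Internally, profile selection presumably falls back to the bundled `default` profile (see the profiles directory in the project structure) when the variable is unset or empty. A minimal sketch of that pattern, with a hypothetical `active_profile` helper:

```python
import os


def active_profile(default: str = "default") -> str:
    # Empty string (e.g. `REACHY_MINI_CUSTOM_PROFILE=` in .env) also falls
    # back to the default, since `or` treats "" as falsy.
    return os.environ.get("REACHY_MINI_CUSTOM_PROFILE") or default
```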
## Tools

The robot has access to several tools during conversation:

| Tool | Description |
|------|-------------|
| `camera` | Take a photo and analyze what the robot sees |
| `dance` | Perform a dance routine |
| `play_emotion` | Express an emotion (happy, sad, surprised, etc.) |
| `move_head` | Move the head to look in a direction |
| `head_tracking` | Enable/disable automatic head tracking |
| `stop_dance` | Stop the current dance |
| `stop_emotion` | Stop the current emotion animation |
| `do_nothing` | Idle action (used during quiet moments) |
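Tools like these are typically registered with the Realtime API as function definitions with a JSON Schema for their arguments. The sketch below shows what the `dance` tool's definition might look like in that format; the app's actual schemas live under `reachy_mini_daedalus/tools/` and may differ (the `move` parameter here is an assumption for illustration):

```python
# Hypothetical function-tool definition in the flat format used by the
# OpenAI Realtime API session configuration.
dance_tool = {
    "type": "function",
    "name": "dance",
    "description": "Perform a dance routine.",
    "parameters": {
        "type": "object",
        "properties": {
            "move": {
                "type": "string",
                "description": "Optional name of a specific dance move.",
            },
        },
        "required": [],
    },
}
```

When the model calls the tool, the app receives the function name and JSON arguments, executes the corresponding robot behavior, and returns a result to the conversation.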
## Project Structure

```
reachy_mini_daedalus/
├── main.py                 # Application entrypoint
├── openai_realtime.py      # OpenAI Realtime API handler
├── console.py              # Headless console mode
├── gradio_personality.py   # Gradio UI components
├── moves.py                # Movement manager
├── config.py               # Configuration and environment
├── prompts.py              # Prompt loading and templating
├── audio/
│   ├── head_wobbler.py     # Speech-synchronized head movement
│   └── speech_tapper.py    # Audio level detection
├── tools/                  # Available robot tools
│   ├── camera.py
│   ├── dance.py
│   ├── play_emotion.py
│   └── ...
├── vision/                 # Vision processing
│   ├── processors.py
│   └── yolo_head_tracker.py
├── profiles/               # Personality profiles
│   ├── default/
│   ├── short_noir_detective/
│   └── ...
└── prompts/                # Prompt templates
    ├── default_prompt.txt
    ├── identities/
    └── behaviors/
```
## Development

```bash
# Install dev dependencies
uv sync --group dev

# Run linting
ruff check .

# Run type checking
mypy .

# Run tests
pytest
```
## License

See the LICENSE file for details.