# Reachy Mini DanceML Architecture

## System Architecture

```mermaid
flowchart TB
    subgraph Input["🎤 Input Layer"]
        USER["User Voice"]
        MIC["Browser Microphone
        (Laptop/Mobile)"]
    end

    subgraph Streaming["⚡ Streaming Layer"]
        GRADIO["Gradio UI
        :8042"]
    end

    subgraph AI["🧠 AI Layer (OpenAI Realtime)"]
        ASR["Speech-to-Text
        (Whisper)"]
        REASON["gpt-realtime
        + SYSTEM_INSTRUCTIONS"]
        TTS["Text-to-Speech"]
    end

    subgraph Tools["🔧 11 Tools"]
        direction TB
        subgraph Core["Core Movement"]
            GOTO["goto_pose"]
            LOOK["look_at"]
            STOP["stop_movement"]
        end
        subgraph Library["Library Moves"]
            SEARCH["search_moves"]
            PLAY["play_move"]
        end
        subgraph Procedural["Procedural Motion"]
            GENMOTION["generate_motion"]
        end
        subgraph Sequences["Multi-Step"]
            EXECSEQ["execute_sequence"]
        end
        subgraph BuiltIn["Lifecycle & Control"]
            WAKE["wake_up"]
            SLEEP["goto_sleep"]
            MOTOR["motor_control"]
        end
        subgraph Reference["Reference"]
            GUIDE["get_choreography_guide"]
        end
    end

    subgraph Planner["🤖 Sequence Planner (GPT-4.1)"]
        PLAN["SequencePlanner
        + PLANNER_SYSTEM_PROMPT"]
    end

    subgraph Backend["📦 Backend"]
        HANDLER["RealtimeHandler
        (tool dispatch)"]
        GENERATOR["MovementGenerator
        (50Hz motor thread)"]
        EXECUTOR["SequenceExecutor"]
        PROCMOVE["ProceduralMove"]
        MOVELIBRARY["MoveLibrary
        (101 moves)"]
    end

    subgraph Robot["🤖 Reachy Mini"]
        HEAD["Head
        roll/pitch/yaw"]
        BODY["Body
        yaw ±180°"]
        ANTENNAS["Antennas
        left/right"]
        SPEAKER["Speaker"]
    end

    %% Flow
    USER --> MIC --> GRADIO --> ASR --> REASON
    REASON --> TTS --> SPEAKER
    REASON -->|"function_call"| Tools
    Tools --> HANDLER

    %% Tool routing
    HANDLER --> GENERATOR
    HANDLER --> EXECUTOR
    HANDLER --> MOVELIBRARY
    HANDLER --> PROCMOVE

    %% Sequence planning
    EXECSEQ -.->|"plan request"| PLAN
    PLAN -.->|"SequencePlan"| EXECUTOR

    %% Execution to hardware
    GENERATOR --> HEAD
    GENERATOR --> BODY
    GENERATOR --> ANTENNAS
```

---

## Tool Reference (11 Tools)

| Tool | Category | Description |
|------|----------|-------------|
| `goto_pose` | Core | Move to specific head/body angles over a given duration |
| `look_at` | Core | Look in a direction (up/down/left/right/floor/ceiling) or at a 3D point |
| `stop_movement` | Core | Stop all movement and return to neutral |
| `search_moves` | Library | Semantic search over 101 pre-recorded moves |
| `play_move` | Library | Play a named library move |
| `generate_motion` | Procedural | Continuous procedural motion with waveforms, drifts, and antenna control |
| `execute_sequence` | Sequences | Multi-step choreography with timing (uses the GPT-4.1 planner) |
| `wake_up` | Lifecycle | Play the built-in wake animation |
| `goto_sleep` | Lifecycle | Play the built-in sleep animation |
| `motor_control` | Control | Enable/disable motors or gravity compensation |
| `get_choreography_guide` | Reference | Load the choreography guide for custom movements |

---

## Tool Selection Flow

```mermaid
flowchart TD
    START(("🎤 User
    Request")) --> INTENT{"Classify
    Intent"}

    INTENT -->|"look left
    tilt head"| SIMPLE["🎯 SIMPLE"]
    INTENT -->|"stop
    freeze"| EMERGENCY["🛑 STOP"]
    INTENT -->|"show happy
    do a dance"| EMOTION["🎭 EMOTION"]
    INTENT -->|"spiral motion
    wiggle antenna"| PROCEDURAL["🌊 PROCEDURAL"]
    INTENT -->|"peek-a-boo
    multi-step"| SEQUENCE["🎬 SEQUENCE"]

    SIMPLE --> GOTO_POSE["goto_pose()"]
    EMERGENCY --> STOP_MOVE["stop_movement()"]
    EMOTION --> SEARCH_LIB["search_moves()"]
    SEARCH_LIB --> FOUND{"Results?"}
    FOUND -->|"Yes"| PLAY_MOVE["play_move()"]
    FOUND -->|"No"| GEN_MOTION
    PROCEDURAL --> GEN_MOTION["generate_motion()"]
    SEQUENCE --> EXEC_SEQ["execute_sequence()"]

    GOTO_POSE --> EXECUTE["⚡ Execute"]
    STOP_MOVE --> EXECUTE
    PLAY_MOVE --> EXECUTE
    GEN_MOTION --> EXECUTE
    EXEC_SEQ --> EXECUTE
    EXECUTE --> ROBOT(("🤖 Robot
    Moves"))
```

---

## Component Summary

| Layer | Component | Purpose |
|-------|-----------|---------|
| **Input** | Gradio UI | Web interface + audio capture |
| **AI** | OpenAI Realtime API | Speech recognition, reasoning, TTS |
| **AI** | GPT-4.1 (Planner) | Sequence planning for multi-step actions |
| **Tools** | 11 functions | Intent execution via function calling |
| **Backend** | MoveLibrary | 101 pre-recorded HuggingFace moves |
| **Backend** | MovementGenerator | 50Hz motor control thread |
| **Backend** | ProceduralMove | Waveform-based motion generation |
| **Backend** | SequenceExecutor | Step-by-step sequence execution |
| **Output** | Reachy Mini SDK | Motor control, audio playback |

---

## System Prompts

The agent uses **two system prompts**:

1. **SYSTEM_INSTRUCTIONS** ([realtime_handler.py](../reachy_mini_danceml/realtime_handler.py#L19))
   - Main conversational AI instructions
   - Tool selection guide, physical conventions, physics envelope
   - ~200 lines

2. **PLANNER_SYSTEM_PROMPT** ([sequence_planner.py](../reachy_mini_danceml/sequence_planner.py#L56))
   - GPT-4.1 sequence planning instructions
   - Step types: move, wait, speak, motion
   - ~35 lines
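
---

## Appendix: Illustrative Sketches

The 11 tools reach the model as function-calling declarations, and RealtimeHandler dispatches each `function_call` to a backend handler. A minimal sketch of what a `goto_pose` declaration and dispatch could look like — the parameter names and the `dispatch` helper here are illustrative assumptions, not the schema actually defined in `realtime_handler.py`:

```python
# Hypothetical tool declaration; real parameter names live in realtime_handler.py.
GOTO_POSE_TOOL = {
    "type": "function",
    "name": "goto_pose",
    "description": "Move the head/body to specific angles over a duration.",
    "parameters": {
        "type": "object",
        "properties": {
            "head_roll":  {"type": "number", "description": "degrees"},
            "head_pitch": {"type": "number", "description": "degrees"},
            "head_yaw":   {"type": "number", "description": "degrees"},
            "body_yaw":   {"type": "number", "description": "degrees, ±180"},
            "duration":   {"type": "number", "description": "seconds"},
        },
        "required": ["duration"],
    },
}

def dispatch(call_name: str, args: dict) -> str:
    """Toy dispatcher in the spirit of RealtimeHandler: route a
    function_call from the model to the matching backend handler."""
    handlers = {"goto_pose": lambda a: f"moving over {a['duration']}s"}
    return handlers[call_name](args)

print(dispatch("goto_pose", {"head_yaw": 30, "duration": 1.5}))
```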
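MovementGenerator runs a 50 Hz motor thread, and ProceduralMove evaluates waveforms as a function of elapsed time. The following is a self-contained sketch of that pattern only — the `send_pose` callback and the single sine channel are assumptions for illustration, not the project's actual interfaces:

```python
import math
import time

TICK_HZ = 50            # matches the 50Hz MovementGenerator thread rate
DT = 1.0 / TICK_HZ

def sine_wave(t: float, amplitude_deg: float, freq_hz: float) -> float:
    """One procedural waveform channel: target angle in degrees at time t."""
    return amplitude_deg * math.sin(2 * math.pi * freq_hz * t)

def run_motor_loop(duration_s: float, send_pose) -> int:
    """Fixed-rate loop: evaluate the waveform and push one target per tick."""
    ticks = int(duration_s * TICK_HZ)
    start = time.perf_counter()
    for i in range(ticks):
        t = i * DT
        send_pose({"head_yaw": sine_wave(t, amplitude_deg=20, freq_hz=0.5)})
        # Sleep until the next tick boundary to hold a steady 50 Hz,
        # rather than a raw sleep(DT) that would accumulate drift.
        next_deadline = start + (i + 1) * DT
        time.sleep(max(0.0, next_deadline - time.perf_counter()))
    return ticks

poses = []
run_motor_loop(0.1, poses.append)   # 0.1 s at 50 Hz → 5 ticks
```

Sleeping to an absolute deadline (rather than a fixed interval) is what keeps a software control loop at its nominal rate even when each tick's work takes variable time.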
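The planner returns a SequencePlan built from the four step types listed under PLANNER_SYSTEM_PROMPT (move, wait, speak, motion), which SequenceExecutor then walks step by step. A rough model of that shape — the field names and the example plan are hypothetical, not the project's actual dataclasses:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    kind: str              # one of: "move", "wait", "speak", "motion"
    name: str = ""         # library move or motion preset, if any
    text: str = ""         # utterance for "speak" steps
    duration_s: float = 0.0

@dataclass
class SequencePlan:
    description: str
    steps: list[Step] = field(default_factory=list)

# A "peek-a-boo"-style plan as execute_sequence might hand to the executor:
plan = SequencePlan(
    description="peek-a-boo",
    steps=[
        Step(kind="move", name="hide_face", duration_s=1.0),
        Step(kind="wait", duration_s=0.5),
        Step(kind="speak", text="Peek-a-boo!"),
        Step(kind="move", name="reveal", duration_s=0.8),
    ],
)
print(len(plan.steps))  # 4
```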