# Reachy Mini DanceML Architecture
## System Architecture
```mermaid
flowchart TB
subgraph Input["🎤 Input Layer"]
USER["User Voice"]
MIC["Browser Microphone
(Laptop/Mobile)"]
end
subgraph Streaming["⚡ Streaming Layer"]
GRADIO["Gradio UI
:8042"]
end
subgraph AI["🧠 AI Layer (OpenAI Realtime)"]
ASR["Speech-to-Text
(Whisper)"]
REASON["gpt-realtime
+ SYSTEM_INSTRUCTIONS"]
TTS["Text-to-Speech"]
end
subgraph Tools["🔧 11 Tools"]
direction TB
subgraph Core["Core Movement"]
GOTO["goto_pose"]
LOOK["look_at"]
STOP["stop_movement"]
end
subgraph Library["Library Moves"]
SEARCH["search_moves"]
PLAY["play_move"]
end
subgraph Procedural["Procedural Motion"]
GENMOTION["generate_motion"]
end
subgraph Sequences["Multi-Step"]
EXECSEQ["execute_sequence"]
end
subgraph BuiltIn["Lifecycle & Control"]
WAKE["wake_up"]
SLEEP["goto_sleep"]
MOTOR["motor_control"]
end
subgraph Reference["Reference"]
GUIDE["get_choreography_guide"]
end
end
subgraph Planner["🤖 Sequence Planner (GPT-4.1)"]
PLAN["SequencePlanner
+ PLANNER_SYSTEM_PROMPT"]
end
subgraph Backend["📦 Backend"]
HANDLER["RealtimeHandler
(tool dispatch)"]
GENERATOR["MovementGenerator
(50Hz motor thread)"]
EXECUTOR["SequenceExecutor"]
PROCMOVE["ProceduralMove"]
MOVELIBRARY["MoveLibrary
(101 moves)"]
end
subgraph Robot["🤖 Reachy Mini"]
HEAD["Head
roll/pitch/yaw"]
BODY["Body
yaw ±180°"]
ANTENNAS["Antennas
left/right"]
SPEAKER["Speaker"]
end
%% Flow
USER --> MIC --> GRADIO --> ASR --> REASON
REASON --> TTS --> SPEAKER
REASON -->|"function_call"| Tools
Tools --> HANDLER
%% Tool routing
HANDLER --> GENERATOR
HANDLER --> EXECUTOR
HANDLER --> MOVELIBRARY
HANDLER --> PROCMOVE
%% Sequence planning
EXECSEQ -.->|"plan request"| PLAN
PLAN -.->|"SequencePlan"| EXECUTOR
%% Execution to hardware
GENERATOR --> HEAD
GENERATOR --> BODY
GENERATOR --> ANTENNAS
```
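The `function_call` edge from `REASON` into the tools layer can be sketched as a dispatch table in the handler. This is a minimal illustration, not the actual `RealtimeHandler` API: the method names, argument names, and return shapes here are assumptions.

```python
import json

class ToolDispatcher:
    """Illustrative stand-in for RealtimeHandler's tool dispatch."""

    def __init__(self):
        # Each tool name maps to a callable taking keyword arguments.
        self.tools = {
            "goto_pose": self.goto_pose,
            "stop_movement": self.stop_movement,
        }

    def goto_pose(self, head_pitch=0.0, head_yaw=0.0, duration=1.0):
        # Hypothetical tool body: report the commanded target.
        return {"ok": True, "moved_to": {"pitch": head_pitch, "yaw": head_yaw}}

    def stop_movement(self):
        return {"ok": True, "stopped": True}

    def dispatch(self, name, arguments_json):
        """Decode the model's JSON arguments and invoke the matching tool."""
        if name not in self.tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        args = json.loads(arguments_json)
        return self.tools[name](**args)

handler = ToolDispatcher()
result = handler.dispatch("goto_pose", '{"head_yaw": 30.0, "duration": 0.5}')
```

The Realtime API delivers tool arguments as a JSON string, which is why the dispatcher decodes before invoking.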
---
## Tool Reference (11 Tools)
| Tool | Category | Description |
|------|----------|-------------|
| `goto_pose` | Core | Move to specific head/body angles with duration |
| `look_at` | Core | Look in a named direction (up/down/left/right/floor/ceiling) or at a 3D point |
| `stop_movement` | Core | Stop all movement and return to neutral |
| `search_moves` | Library | Semantic search of 101 pre-recorded moves |
| `play_move` | Library | Play a named library move |
| `generate_motion` | Procedural | Continuous procedural motion with waveforms, drifts, and antenna control |
| `execute_sequence` | Sequences | Multi-step choreography with timing (uses GPT-4.1 planner) |
| `wake_up` | Lifecycle | Play built-in wake animation |
| `goto_sleep` | Lifecycle | Play built-in sleep animation |
| `motor_control` | Control | Enable/disable motors or gravity compensation |
| `get_choreography_guide` | Reference | Load choreography guide for custom movements |
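The waveform-based motion behind `generate_motion` can be approximated by sampling a sinusoid plus a linear drift per joint. This is a sketch under assumed parameter names (`amplitude_deg`, `frequency_hz`, `drift_deg_per_s`), not the tool's real signature.

```python
import math

def sample_procedural_pose(t, amplitude_deg=20.0, frequency_hz=0.5, drift_deg_per_s=0.0):
    """Sample one frame of a sinusoidal head-yaw motion at time t (seconds)."""
    yaw = amplitude_deg * math.sin(2 * math.pi * frequency_hz * t) + drift_deg_per_s * t
    return {"head_yaw_deg": yaw}

# At t=0 the sine term is zero and there is no drift, so yaw is 0.0.
sample_procedural_pose(0.0)
```

Evaluated at 50 Hz (every 20 ms), a function like this yields a smooth continuous trajectory without any pre-recorded keyframes.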
---
## Tool Selection Flow
```mermaid
flowchart TD
START(("🎤 User
Request")) --> INTENT{"Classify
Intent"}
INTENT -->|"look left
tilt head"| SIMPLE["🎯 SIMPLE"]
INTENT -->|"stop
freeze"| EMERGENCY["🛑 STOP"]
INTENT -->|"show happy
do a dance"| EMOTION["🎭 EMOTION"]
INTENT -->|"spiral motion
wiggle antenna"| PROCEDURAL["🌊 PROCEDURAL"]
INTENT -->|"peek-a-boo
multi-step"| SEQUENCE["🎬 SEQUENCE"]
SIMPLE --> GOTO_POSE["goto_pose()"]
EMERGENCY --> STOP_MOVE["stop_movement()"]
EMOTION --> SEARCH_LIB["search_moves()"]
SEARCH_LIB --> FOUND{"Results?"}
FOUND -->|"Yes"| PLAY_MOVE["play_move()"]
FOUND -->|"No"| GEN_MOTION
PROCEDURAL --> GEN_MOTION["generate_motion()"]
SEQUENCE --> EXEC_SEQ["execute_sequence()"]
GOTO_POSE --> EXECUTE["⚡ Execute"]
STOP_MOVE --> EXECUTE
PLAY_MOVE --> EXECUTE
GEN_MOTION --> EXECUTE
EXEC_SEQ --> EXECUTE
EXECUTE --> ROBOT(("🤖 Robot
Moves"))
```
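The library-first fallback in the EMOTION branch (search the library, play a hit, otherwise fall back to procedural motion) can be written as a few lines. The helper and the toy tool stand-ins below are hypothetical; only the routing logic mirrors the flowchart.

```python
def handle_emotion_request(query, search_moves, play_move, generate_motion):
    """Try the move library first; fall back to procedural motion on a miss."""
    results = search_moves(query)
    if results:
        return play_move(results[0])
    return generate_motion(description=query)

# Toy stand-ins for the real tools:
library = {"happy": "happy_dance_01"}
search = lambda q: [library[q]] if q in library else []
play = lambda name: f"playing {name}"
gen = lambda description: f"generating motion for {description!r}"

handle_emotion_request("happy", search, play, gen)   # library hit
handle_emotion_request("zigzag", search, play, gen)  # procedural fallback
```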
---
## Component Summary
| Layer | Component | Purpose |
|-------|-----------|---------|
| **Input** | Gradio UI | Web interface + audio capture |
| **AI** | OpenAI Realtime API | Speech recognition, reasoning, TTS |
| **AI** | GPT-4.1 (Planner) | Sequence planning for multi-step actions |
| **Tools** | 11 functions | Intent execution via function calling |
| **Backend** | MoveLibrary | 101 pre-recorded HuggingFace moves |
| **Backend** | MovementGenerator | 50Hz motor control thread |
| **Backend** | ProceduralMove | Waveform-based motion generation |
| **Backend** | SequenceExecutor | Step-by-step sequence execution |
| **Output** | Reachy Mini SDK | Motor control, audio playback |
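The 50 Hz motor thread run by `MovementGenerator` amounts to a fixed-rate control loop: sample the current target pose, push it to the motors, then sleep away the remainder of the 20 ms period. A minimal sketch (the callback names are assumptions, not SDK calls):

```python
import time

def run_motor_loop(get_target_pose, send_to_motors, hz=50, steps=5):
    """Fixed-rate control loop: sample the current target and push it to motors."""
    period = 1.0 / hz
    sent = []
    for _ in range(steps):
        start = time.monotonic()
        pose = get_target_pose()
        send_to_motors(pose)
        sent.append(pose)
        # Sleep only for whatever time remains in this control period,
        # so compute time does not accumulate into drift.
        elapsed = time.monotonic() - start
        time.sleep(max(0.0, period - elapsed))
    return sent

poses = run_motor_loop(lambda: {"head_yaw": 0.0}, lambda p: None, steps=3)
```

Subtracting the elapsed compute time from each sleep is what keeps the loop near its nominal rate.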
---
## System Prompts
The agent uses **two system prompts**:
1. **SYSTEM_INSTRUCTIONS** ([realtime_handler.py](../reachy_mini_danceml/realtime_handler.py#L19))
- Main conversational AI instructions
- Tool selection guide, physical conventions, physics envelope
- ~200 lines
2. **PLANNER_SYSTEM_PROMPT** ([sequence_planner.py](../reachy_mini_danceml/sequence_planner.py#L56))
- GPT-4.1 sequence planning instructions
- Step types: move, wait, speak, motion
- ~35 lines
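A `SequencePlan` built from the four step types above might look like the following. The field names and classes here are illustrative guesses, not the actual `sequence_planner.py` schema.

```python
from dataclasses import dataclass, field

@dataclass
class SequenceStep:
    type: str            # one of: "move", "wait", "speak", "motion"
    duration_s: float = 0.0
    payload: dict = field(default_factory=dict)

@dataclass
class SequencePlan:
    steps: list

# A tiny peek-a-boo plan: play a move, pause, then speak.
plan = SequencePlan(steps=[
    SequenceStep(type="move", duration_s=1.0, payload={"move": "wave"}),
    SequenceStep(type="wait", duration_s=0.5),
    SequenceStep(type="speak", payload={"text": "Peek-a-boo!"}),
])
total = sum(s.duration_s for s in plan.steps)  # 1.5
```

A structured plan like this is what the executor can step through deterministically, one timed action at a time.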