# Reachy Mini DanceML SDK Documentation
This documentation covers the SDK methods for controlling Reachy Mini movements and the data format for dance sequences.
## Table of Contents
- [Overview](#overview)
- [Available Datasets](#available-datasets)
- [Core Classes](#core-classes)
- [Movement Tools](#movement-tools)
- [Hybrid AI Workflow](#hybrid-ai-workflow)
- [Usage Examples](#usage-examples)
- [AI Agent Output Format](#ai-agent-output-format)
---
## Overview
The Reachy Mini DanceML SDK enables:
- **Voice-controlled movements** via OpenAI Realtime API
- **Keyframe-based animations** with cubic spline interpolation
- **Simple pose commands** for direct head positioning
---
## Available Datasets
Pollen Robotics provides two HuggingFace datasets with pre-recorded movements:
| Dataset | Records | Description |
|---------|---------|-------------|
| `pollen-robotics/reachy-mini-dances-library` | 20 | Dance moves (pecking, bobbing, swaying) |
| `pollen-robotics/reachy-mini-emotions-library` | 81 | Emotional expressions (wonder, fear, joy, etc.) |
Both datasets share the same schema and can be used interchangeably with this SDK.
### Dataset Schema
| Field | Type | Description |
|-------|------|-------------|
| `description` | `string` | Human-readable description of the movement |
| `time` | `List[float]` | Timestamps in seconds from animation start |
| `set_target_data` | `List[TargetData]` | Array of pose targets at each timestamp |
### TargetData Structure
Each element in `set_target_data` contains:
```python
{
    "head": [[...], [...], [...], [...]],   # 4x4 homogeneous transformation matrix
    "antennas": [left_angle, right_angle],  # angles in radians
    "body_yaw": 0.0,          # body rotation (typically 0.0)
    "check_collision": False  # collision check flag
}
```
### Head Pose Matrix
The `head` field is a 4x4 homogeneous transformation matrix encoding both the head's orientation and its position:
```
[[r11, r12, r13, tx],
 [r21, r22, r23, ty],
 [r31, r32, r33, tz],
 [  0,   0,   0,  1]]
```
Where:
- The 3x3 upper-left submatrix encodes rotation (roll, pitch, yaw)
- The last column `[tx, ty, tz, 1]` encodes translation
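Such a matrix can be built by composing the three elementary rotations. A minimal NumPy sketch, assuming the common Z-Y-X (yaw-pitch-roll) convention (the SDK's exact convention is not stated here):

```python
import numpy as np

def head_pose_matrix(roll: float, pitch: float, yaw: float,
                     tx: float = 0.0, ty: float = 0.0, tz: float = 0.0) -> np.ndarray:
    """Build a 4x4 homogeneous transform from Euler angles (radians).

    Assumes the Z-Y-X (yaw-pitch-roll) convention; the SDK's actual
    convention may differ.
    """
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])  # roll about x
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])  # pitch about y
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])  # yaw about z
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation block
    T[:3, 3] = [tx, ty, tz]    # translation column
    return T
```

With zero angles and zero translation this reduces to the 4x4 identity matrix.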
### Loading the Datasets
```python
from datasets import load_dataset
# Dance moves (requires HuggingFace login)
dances = load_dataset("pollen-robotics/reachy-mini-dances-library")
# Emotions library (requires HuggingFace login)
emotions = load_dataset("pollen-robotics/reachy-mini-emotions-library")
# Access a dance move
dance = dances['train'][0]
print(f"Description: {dance['description']}")
# Output: "A sharp, forward, chicken-like pecking motion."
# Access an emotion
emotion = emotions['train'][0]
print(f"Description: {emotion['description']}")
# Output: "When you discover something extraordinary..."
# Both have the same structure
print(f"Duration: {emotion['time'][-1]} seconds")
print(f"Frames: {len(emotion['time'])}")
```
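To inspect a record's poses in human-readable terms, the 4x4 head matrices can be converted back to Euler angles. A standalone sketch; as above, the Z-Y-X angle convention is an assumption, and gimbal lock (pitch near ±90°) is not handled:

```python
import numpy as np

def matrix_to_rpy(T) -> tuple:
    """Extract (roll, pitch, yaw) in radians from a 4x4 pose matrix.

    Assumes R = Rz(yaw) @ Ry(pitch) @ Rx(roll) and no gimbal lock.
    """
    R = np.asarray(T)[:3, :3]
    pitch = -np.arcsin(R[2, 0])           # R[2,0] = -sin(pitch)
    roll = np.arctan2(R[2, 1], R[2, 2])   # R[2,1]/R[2,2] = tan(roll)
    yaw = np.arctan2(R[1, 0], R[0, 0])    # R[1,0]/R[0,0] = tan(yaw)
    return roll, pitch, yaw
```

This is the inverse of building the matrix from roll/pitch/yaw, so it is handy for printing what a dataset frame actually does with the head.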
### Example Emotion Descriptions
The emotions library includes expressive movements such as:
- **Wonder**: "When you discover something extraordinary"
- **Fear**: "You look around without really knowing where to look"
- **Joy**: Celebratory movements
- **Surprise**: Reactive startle responses
- **Curiosity**: Investigative head tilts
---
## Core Classes
### KeyFrame
A single keyframe in an animation sequence.
```python
from dataclasses import dataclass
from typing import Tuple

from reachy_mini_danceml.movement_tools import KeyFrame

# Definition (for reference):
@dataclass
class KeyFrame:
    t: float                       # Time in seconds from animation start
    head: dict                     # {"roll": 0, "pitch": 0, "yaw": 0} in degrees
    antennas: Tuple[float, float]  # (left, right) antenna angles in degrees
```
#### Methods
| Method | Description |
|--------|-------------|
| `KeyFrame.from_dict(data)` | Create KeyFrame from a dictionary |
#### Example
```python
# Create keyframes for a nodding animation
keyframes = [
    KeyFrame(t=0.0, head={"roll": 0, "pitch": 0, "yaw": 0}, antennas=(0, 0)),
    KeyFrame(t=0.3, head={"roll": 0, "pitch": -15, "yaw": 0}, antennas=(10, 10)),
    KeyFrame(t=0.6, head={"roll": 0, "pitch": 10, "yaw": 0}, antennas=(-5, -5)),
    KeyFrame(t=1.0, head={"roll": 0, "pitch": 0, "yaw": 0}, antennas=(0, 0)),
]
```
---
### GeneratedMove
A Move generated from keyframes with cubic spline interpolation.
```python
from reachy_mini_danceml.movement_generator import GeneratedMove
class GeneratedMove(Move):
    def __init__(self, keyframes: List[KeyFrame]): ...

    @property
    def duration(self) -> float: ...

    def evaluate(self, t: float) -> Tuple[np.ndarray, np.ndarray, float]: ...
```
#### Properties
| Property | Type | Description |
|----------|------|-------------|
| `duration` | `float` | Total animation duration in seconds |
#### Methods
| Method | Parameters | Returns | Description |
|--------|------------|---------|-------------|
| `evaluate(t)` | `t: float` (time in seconds) | `(head_pose, antennas, body_yaw)` | Interpolate pose at time t |
#### Return Values from `evaluate()`
- `head_pose`: 4x4 numpy array (homogeneous transformation matrix)
- `antennas`: numpy array `[left, right]` in radians
- `body_yaw`: float (always 0.0)
#### Example
```python
from reachy_mini_danceml.movement_generator import GeneratedMove
from reachy_mini_danceml.movement_tools import KeyFrame
keyframes = [
    KeyFrame(t=0.0, head={"yaw": 0}, antennas=(0, 0)),
    KeyFrame(t=1.0, head={"yaw": 30}, antennas=(0, 0)),
    KeyFrame(t=2.0, head={"yaw": 0}, antennas=(0, 0)),
]
move = GeneratedMove(keyframes)
print(f"Duration: {move.duration} seconds")
# Get pose at 0.5 seconds
head, antennas, body_yaw = move.evaluate(0.5)
```
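The interpolation behind `GeneratedMove` can be illustrated with SciPy's `CubicSpline`. This is a standalone sketch of the technique, not the SDK's actual implementation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Yaw keyframes matching the example above: 0° -> 30° -> 0° over 2 s
times = np.array([0.0, 1.0, 2.0])
yaws = np.array([0.0, 30.0, 0.0])

# Cubic spline passes exactly through each keyframe and is smooth between them
spline = CubicSpline(times, yaws)

# Sample at 100 Hz, the way a control loop might
ts = np.arange(0.0, 2.0, 0.01)
trajectory = spline(ts)
```

Unlike linear interpolation, the spline produces smooth velocity through each keyframe, which is what makes the resulting motion look natural rather than jerky.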
---
### MoveLibrary
Manages loading and indexing of dance and emotion datasets.
```python
from reachy_mini_danceml.dataset_loader import MoveLibrary
library = MoveLibrary()
library.load()
# Search
results = library.search_moves("happy")
# Get Record
record = library.get_move("joy_jump")
```
### MovementGenerator
Generates and executes movements on Reachy Mini.
```python
from reachy_mini_danceml.movement_generator import MovementGenerator
class MovementGenerator:
    def __init__(self, reachy: ReachyMini): ...

    def create_from_keyframes(self, keyframes) -> GeneratedMove: ...

    async def goto_pose(self, roll=0, pitch=0, yaw=0, duration=0.5) -> None: ...

    async def play_move(self, move: Move) -> None: ...

    async def stop(self) -> None: ...
```
#### Methods
| Method | Parameters | Description |
|--------|------------|-------------|
| `create_from_keyframes(keyframes)` | `List[KeyFrame]` or `List[dict]` | Create a GeneratedMove from keyframes |
| `goto_pose(roll, pitch, yaw, duration)` | Angles in degrees, duration in seconds | Move head to specific pose |
| `play_move(move)` | `Move` object | Play an animation asynchronously |
| `stop()` | None | Stop current movement, return to neutral |
#### Angle Limits
| Parameter | Range | Direction |
|-----------|-------|-----------|
| `roll` | -30° to 30° | Positive = tilt right |
| `pitch` | -30° to 30° | Positive = look up |
| `yaw` | -45° to 45° | Positive = look left |
| `antennas` | -60° to 60° | Each antenna independently |
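When generating poses programmatically, it is prudent to clamp commanded angles to these limits before sending them to the robot. A minimal helper using the limits from the table above (`safe_pose` is a hypothetical name, not part of the SDK):

```python
def clamp(value: float, lo: float, hi: float) -> float:
    """Clamp value into the closed interval [lo, hi]."""
    return max(lo, min(hi, value))

# Limits from the table above, in degrees
LIMITS = {"roll": 30.0, "pitch": 30.0, "yaw": 45.0, "antenna": 60.0}

def safe_pose(roll: float, pitch: float, yaw: float) -> dict:
    """Return a head pose with each angle clamped to its documented limit."""
    return {
        "roll": clamp(roll, -LIMITS["roll"], LIMITS["roll"]),
        "pitch": clamp(pitch, -LIMITS["pitch"], LIMITS["pitch"]),
        "yaw": clamp(yaw, -LIMITS["yaw"], LIMITS["yaw"]),
    }
```

Clamping silently at the boundary keeps AI-generated sequences safe even when the model requests an out-of-range angle.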
#### Example
```python
from reachy_mini import ReachyMini
from reachy_mini_danceml.movement_generator import MovementGenerator
async def demo(reachy: ReachyMini):
    generator = MovementGenerator(reachy)

    # Simple pose
    await generator.goto_pose(roll=0, pitch=10, yaw=-20, duration=0.5)

    # Keyframe animation
    keyframes = [
        {"t": 0.0, "head": {"yaw": 0}, "antennas": [0, 0]},
        {"t": 0.5, "head": {"yaw": 30}, "antennas": [20, -20]},
        {"t": 1.0, "head": {"yaw": 0}, "antennas": [0, 0]},
    ]
    move = generator.create_from_keyframes(keyframes)
    await generator.play_move(move)
```
---
## Movement Tools
These are OpenAI function-calling tool schemas for voice control integration.
### PLAY_MOVE_TOOL
Play a pre-defined movement from the library by its name/ID.
```python
{
    "type": "function",
    "name": "play_move",
    "description": "Play a pre-defined movement from the library by its name (e.g., 'joy', 'fear', 'chicken_dance'). Prefer this over creating sequences manually.",
    "parameters": {
        "properties": {
            "name": {"type": "string", "description": "Name or ID of the movement"}
        },
        "required": ["name"]
    }
}
```
### SEARCH_MOVES_TOOL
Search the library for available movements.
```python
{
    "type": "function",
    "name": "search_moves",
    "description": "Search the movement library for available expressions or dances.",
    "parameters": {
        "properties": {
            "query": {"type": "string", "description": "Keywords to search for"}
        },
        "required": ["query"]
    }
}
```
### GET_CHOREOGRAPHY_GUIDE_TOOL
Retrieve physics rules and examples for custom generation.
```python
{
    "type": "function",
    "name": "get_choreography_guide",
    "description": "Read the choreography guide to learn how to create safe and expressive custom movements. Call this BEFORE using create_sequence for new moves."
}
```
### GOTO_POSE_TOOL
Move the robot's head to a specific pose.
```python
{
    "type": "function",
    "name": "goto_pose",
    "parameters": {
        "properties": {
            "roll": {"type": "number", "description": "Roll angle (-30 to 30°)"},
            "pitch": {"type": "number", "description": "Pitch angle (-30 to 30°)"},
            "yaw": {"type": "number", "description": "Yaw angle (-45 to 45°)"},
            "duration": {"type": "number", "description": "Duration in seconds"}
        }
    }
}
```
### CREATE_SEQUENCE_TOOL
Create and play an animated movement sequence from keyframes.
```python
{
    "type": "function",
    "name": "create_sequence",
    "parameters": {
        "properties": {
            "keyframes": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "t": {"type": "number", "description": "Time in seconds"},
                        "head": {
                            "properties": {
                                "roll": {"type": "number"},
                                "pitch": {"type": "number"},
                                "yaw": {"type": "number"}
                            }
                        },
                        "antennas": {
                            "type": "array",
                            "items": {"type": "number"},
                            "description": "[left, right] in degrees (-60 to 60)"
                        }
                    },
                    "required": ["t"]
                }
            }
        },
        "required": ["keyframes"]
    }
}
```
### STOP_MOVEMENT_TOOL
Stop any currently playing movement and return to neutral position.
```python
{
    "type": "function",
    "name": "stop_movement",
    "description": "Stop current movement and return to neutral"
}
```
---
## Hybrid AI Workflow
The SDK is designed for a **Hybrid Generative/Retrieval** architecture to optimize context usage.
### Recommended Agent Logic
1. **Retrieval First**: Always try `search_moves(query)` first.
2. **Play by Name**: If a match is found, use `play_move(name)`. This uses 0 tokens for movement data.
3. **On-Demand Learning**: If no match is found, call `get_choreography_guide()` to load physics rules.
4. **Safe Generation**: Finally, use `create_sequence(keyframes)` to generate a custom move using the loaded rules.
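The steps above can be sketched as a plain dispatch function. The tool callables passed in here (`search_moves`, `play_move`, and the rest) are hypothetical stand-ins for whatever the agent framework wires up, not the library's own API:

```python
def handle_request(query, search_moves, play_move,
                   get_choreography_guide, create_sequence, generate_keyframes):
    """Retrieval-first dispatch: play a library move if one matches,
    otherwise load the guide and generate a custom sequence."""
    matches = search_moves(query)       # 1. retrieval first
    if matches:
        return play_move(matches[0])    # 2. play by name; no movement data enters context
    guide = get_choreography_guide()    # 3. on-demand learning
    keyframes = generate_keyframes(query, guide)
    return create_sequence(keyframes)   # 4. safe generation under the loaded rules
```

The point of the ordering is cost: a library hit short-circuits before any movement data or physics rules are ever loaded into the model's context.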
---
## Usage Examples
### Example 1: Wave Animation
```python
wave_keyframes = [
    {"t": 0.0, "head": {"roll": 0, "yaw": 0}, "antennas": [0, 0]},
    {"t": 0.3, "head": {"roll": 0, "yaw": 0}, "antennas": [30, -30]},
    {"t": 0.6, "head": {"roll": 0, "yaw": 0}, "antennas": [-30, 30]},
    {"t": 0.9, "head": {"roll": 0, "yaw": 0}, "antennas": [30, -30]},
    {"t": 1.2, "head": {"roll": 0, "yaw": 0}, "antennas": [0, 0]},
]
```
### Example 2: Curious Head Tilt
```python
curious_keyframes = [
    {"t": 0.0, "head": {"roll": 0, "pitch": 0, "yaw": 0}},
    {"t": 0.4, "head": {"roll": 15, "pitch": 5, "yaw": 10}},
    {"t": 1.5, "head": {"roll": 15, "pitch": 5, "yaw": 10}},
    {"t": 2.0, "head": {"roll": 0, "pitch": 0, "yaw": 0}},
]
```
### Example 3: Excited Celebration
```python
excited_keyframes = [
    {"t": 0.0, "head": {"pitch": 0}, "antennas": [0, 0]},
    {"t": 0.2, "head": {"pitch": -10}, "antennas": [40, 40]},
    {"t": 0.4, "head": {"pitch": 5}, "antennas": [-20, -20]},
    {"t": 0.6, "head": {"pitch": -10}, "antennas": [40, 40]},
    {"t": 0.8, "head": {"pitch": 5}, "antennas": [-20, -20]},
    {"t": 1.0, "head": {"pitch": 0}, "antennas": [0, 0]},
]
```
---
## AI Agent Output Format
When building an AI agent to generate movements for Reachy Mini, the output should match this format:
### For Simple Poses
```json
{
  "function": "goto_pose",
  "arguments": {
    "roll": 0,
    "pitch": 10,
    "yaw": -20,
    "duration": 0.5
  }
}
```
### For Animated Sequences
```json
{
  "function": "create_sequence",
  "arguments": {
    "keyframes": [
      {"t": 0.0, "head": {"roll": 0, "pitch": 0, "yaw": 0}, "antennas": [0, 0]},
      {"t": 0.5, "head": {"roll": 10, "pitch": -5, "yaw": 20}, "antennas": [15, -15]},
      {"t": 1.0, "head": {"roll": 0, "pitch": 0, "yaw": 0}, "antennas": [0, 0]}
    ]
  }
}
```
This format allows seamless integration with the OpenAI Realtime API for voice-controlled robot movements.
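Before executing agent output, it is worth validating the keyframes against the documented constraints (timestamps strictly increasing, angles within limits). A minimal standalone check; `validate_keyframes` is a hypothetical helper, not part of the SDK:

```python
def validate_keyframes(keyframes: list) -> list:
    """Return a list of problems found in a create_sequence payload.

    Checks: every frame has "t", timestamps strictly increase, and
    head/antenna angles stay within the documented limits (degrees).
    """
    head_limits = {"roll": 30.0, "pitch": 30.0, "yaw": 45.0}
    problems = []
    last_t = float("-inf")
    for i, kf in enumerate(keyframes):
        if "t" not in kf:
            problems.append(f"frame {i}: missing 't'")
            continue
        if kf["t"] <= last_t:
            problems.append(f"frame {i}: timestamps must strictly increase")
        last_t = kf["t"]
        for axis, limit in head_limits.items():
            angle = kf.get("head", {}).get(axis, 0.0)
            if abs(angle) > limit:
                problems.append(f"frame {i}: {axis}={angle} exceeds limit {limit}")
        for angle in kf.get("antennas", []):
            if abs(angle) > 60.0:
                problems.append(f"frame {i}: antenna angle {angle} exceeds limit 60")
    return problems
```

Rejecting (or clamping) invalid payloads at this boundary keeps model mistakes from ever reaching the robot's motors.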