Marionette Architecture
This document explains the design of the Marionette app for developers who want to understand, modify, or debug it.
What Marionette Does
Marionette records and plays back head movements (+ optional audio) for the Reachy Mini robot. A user moves the robot's head by hand while the app records the pose at 100 Hz, then replays it as an animated movement. Audio can be recorded from the mic, uploaded as a file, downloaded from YouTube, or picked from the robot's filesystem.
System Overview
```
┌──────────────────┐    HTTP (polling)     ┌───────────────────┐
│   Browser (UI)   │ ◄───────────────────► │  FastAPI Server   │
│   main.js        │   POST /api/record    │  (Uvicorn)        │
│   index.html     │   POST /api/play      │                   │
│   style.css      │   GET  /api/state     │  marionette/      │
└──────────────────┘                       │  ├── app.py       │
                                           │  ├── routes.py    │
                                           │  ├── recording.py │
                                           │  ├── datasets.py  │
                                           │  ├── audio.py     │
                                           │  ├── state.py     │
                                           │  └── models.py    │
                                           └─────────┬─────────┘
                                                     │
                                           ┌─────────▼─────────┐
                                           │    Reachy Mini    │
                                           │    (Robot SDK)    │
                                           └───────────────────┘
```
Two-Machine vs Single-Machine
- Reachy Mini: Backend runs on the robot (Linux ARM), browser runs on a laptop. Connected via WiFi.
- Reachy Mini Light: Backend and browser run on the same laptop. The robot connects via USB.
Module Map
| Module | Purpose | Key classes/functions |
|---|---|---|
| `models.py` | Data types, constants, Pydantic models | `RecordingMetadata`, `RecordingRequest`, `DatasetEntry`, all `*Payload` classes |
| `audio.py` | Stateless audio functions | `play_wav_chunked()`, `preload_wav()`, `play_preloaded_wav()` |
| `state.py` | Thread-safe state read/write | `StateMixin._serialize_state()`, `_set_state()`, `_set_idle_state()` |
| `recording.py` | Motion capture + playback | `RecordingMixin._capture_motion()`, `_perform_recording()`, `_perform_playback()` |
| `datasets.py` | Dataset filesystem + HF sync | `DatasetMixin._load_dataset_registry()`, `_sync_dataset()`, `_check_hf_login()` |
| `routes.py` | HTTP endpoint definitions | `register_routes()` (all FastAPI route closures) |
| `app.py` | Main class, run loop, robot helpers | `Marionette`, `create_app()` |
| `main.py` | Re-export hub | Imports and re-exports everything for backward compatibility |
| `motion_models.py` | Lead compensation model | `MotionModelRegistry`; shifts commands forward to counter mechanical lag |
Threading Model
The app has three types of threads:
- Uvicorn thread: runs the FastAPI HTTP server and handles all API requests. This is the thread that calls the route handlers in `routes.py`.
- Main robot thread: runs `Marionette.run()`. Polls for pending jobs (recording or playback) in a 50 ms loop and executes recording/playback synchronously.
- Audio threads: spawned as daemon threads during playback. They push audio chunks to the robot's GStreamer pipeline.
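The robot thread's claim-then-work pattern can be sketched as follows. This is a minimal illustration, not the real `Marionette.run()`; the attribute names mirror the ones described in this document, and `poll_once()` is a hypothetical helper added here so one iteration can be shown in isolation.

```python
import threading
import time


class RobotLoop:
    """Sketch of the main robot thread's 50 ms job-polling loop."""

    def __init__(self):
        self._state_lock = threading.Lock()
        self._pending_recording = None
        self._pending_playback = None
        self.handled = []  # stands in for actually executing the job

    def poll_once(self):
        """One iteration: claim any pending job under the lock, run it after."""
        with self._state_lock:
            recording, self._pending_recording = self._pending_recording, None
            playback, self._pending_playback = self._pending_playback, None
        # The lock is released before the (slow) work starts.
        if recording is not None:
            self.handled.append(("recording", recording))
        elif playback is not None:
            self.handled.append(("playback", playback))

    def run(self):
        while True:          # the real loop also checks a shutdown flag
            self.poll_once()
            time.sleep(0.05)  # 50 ms poll interval
```

Claiming the job and clearing the pending slot happen atomically, so the HTTP thread can never queue a job that is silently lost, and the lock is never held during the synchronous recording/playback work.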
Thread Safety
All shared state is protected by _state_lock (a threading.Lock):
- The HTTP thread writes: `_pending_recording`, `_pending_playback`, `_mode`
- The robot thread reads and clears: `_pending_recording`, `_pending_playback`
- Both threads read: `_mode`, `_recordings`, `_datasets`
The lock is held only briefly, never during I/O or network calls.
Cancel Events
- `_recording_cancel_event`: set by the HTTP thread (`POST /api/record/stop`), checked by the robot thread's capture loop.
- `_playback_cancel_event`: set by the HTTP thread (`POST /api/play/stop`), checked by the robot thread's playback loop.
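A cancel-aware capture loop looks roughly like this. `capture_motion` is a hypothetical stand-in for `RecordingMixin._capture_motion`, shown only to illustrate how a `threading.Event` set by the HTTP thread interrupts the robot thread's loop.

```python
import threading
import time


def capture_motion(read_pose, cancel_event, duration_s, rate_hz=100):
    """Record poses at `rate_hz` until the duration elapses or cancel is set.

    read_pose    -- callable returning the current head pose
    cancel_event -- threading.Event set on POST /api/record/stop
    """
    frames = []
    period = 1.0 / rate_hz
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        if cancel_event.is_set():   # HTTP thread requested a stop
            break
        frames.append(read_pose())
        time.sleep(period)          # real code would also compensate for drift
    return frames
```

Because the event is only ever set (never cleared) by the HTTP thread during a job, no lock is needed around the check itself.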
State Machine
```
              POST /api/record
idle ──────────────────────────► queued
 ▲                                 │
 │                    (robot thread picks up)
 │                                 ▼
 │                           countdown (3s)
 │                                 │
 │                                 ▼
 └─────────────────────────── recording
      (duration elapsed or stop)
```
```
              POST /api/play
idle ──────────────────────────► queued
 ▲                                 │
 │                    (robot thread picks up)
 │                                 ▼
 └─────────────────────────── playing
      (move ends or stop)
```
Additional states: `starting_up` (during boot animation) and `error` (transient, returns to idle).
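The transitions above can be summarized as a lookup table. This is an illustration of the legal transitions as described, not how the app actually stores its state:

```python
# Legal state transitions, keyed by source state.
# "queued" branches depending on whether the job is a recording or a playback.
TRANSITIONS = {
    "idle": {"queued"},
    "queued": {"countdown", "playing"},
    "countdown": {"recording"},
    "recording": {"idle"},
    "playing": {"idle"},
    "starting_up": {"idle"},
    "error": {"idle"},
}


def can_transition(src: str, dst: str) -> bool:
    """Return True if moving from `src` to `dst` is a legal transition."""
    return dst in TRANSITIONS.get(src, set())
```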
Frontend Architecture
The frontend is vanilla JavaScript (no framework). Key patterns:
Polling Loop
The browser polls GET /api/state at regular intervals:
- 1500ms when idle (nothing happening)
- 200ms when active (recording, playing, countdown)
Each poll returns the full app state. The frontend updates the UI accordingly.
Dirty Flag Pattern (Flickering Fix)
The moves list and dataset dropdown are expensive to rebuild (full DOM replacement). Without optimization, they flicker on every poll. The fix:
- `movesListDirty` and `datasetsDirty` flags start as `true`
- `renderMoves()` / `updateDatasetUI()` only run when their flag is `true`
- Flags are set to `true` after user actions (record, delete, switch dataset, etc.)
- During idle polling, only lightweight updates happen (mode badge, play/stop buttons)
NTP-Style Clock Sync
The backend sends server_time with each state response. The frontend computes clockOffset = serverTime - localTime and uses it to accurately display countdown timers and recording progress bars, even when the browser and robot are on different machines with unsynchronized clocks.
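The offset arithmetic is trivial but worth spelling out; it is sketched in Python here for brevity, even though the real code lives in the JavaScript frontend. The function names are illustrative:

```python
import time


def clock_offset(server_time: float, local_time: float) -> float:
    """The frontend's formula: clockOffset = serverTime - localTime."""
    return server_time - local_time


def server_now(offset: float) -> float:
    """Estimate the server's current time from the local clock."""
    return time.time() + offset
```

A countdown that ends at server time `t_end` then displays `max(0, t_end - server_now(offset))` seconds remaining, regardless of which machine's clock is "right". A fuller NTP scheme would also subtract half the request round-trip time when computing the offset.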
Phase Overlay
During countdown and recording, a full-screen overlay shows progress. This uses requestAnimationFrame for smooth 60fps animation, independent of the polling interval.
Dataset Layout
```
~/reachy_mini_datasets/            # dataset root (configurable)
├── local_dataset/                 # default dataset
│   └── data/                      # all recordings live here
│       ├── happy-dance.json       # motion trajectory
│       └── happy-dance.wav        # optional audio
├── my-custom-set/
│   └── data/
│       └── ...
└── user-community-set/            # downloaded from HF
    └── data/
        └── ...
```
The dataset_registry.json file (next to the app) tracks which datasets exist, which is active, and per-dataset metadata (uploaded move IDs, origin, etc.).
Audio Playback Pipeline
Audio is played through the Reachy Mini's GStreamer pipeline using chunk-based pushing:
- Preload: read the WAV file and resample to the output rate if needed
- Prime pipeline: call `start_playing()` to initialize GStreamer
- Wait for sync: the audio thread waits for `start_signal` (set when the first motion command is sent)
- Push chunks: feed 20 ms audio chunks at ~1.25x real-time
- Drain buffer: wait for the remaining audio to play out
- Cleanup: call `stop_playing()` with a timeout (GStreamer can hang)
This push-based approach allows stopping audio at any time, unlike `play_sound()`, which creates an uninterruptible pipeline.
Lead Compensation
Mechanical lag means the robot's actual motion trails the commanded trajectory. The lead compensation model (in motion_models.py) shifts commands forward in time so the actual motion matches the original recording. Parameters (head lead, antenna lead) are tunable in the settings UI.
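The core idea reduces to re-timing the trajectory. The helper below is a hypothetical simplification of what `MotionModelRegistry` does (the real model may treat head and antennas separately and interpolate rather than clamp):

```python
def apply_lead(trajectory, lead_s):
    """Shift a recorded trajectory so each pose is commanded `lead_s`
    seconds early, countering mechanical lag.

    trajectory -- list of (t, pose) pairs, t in seconds from start
    """
    # Clamp at zero so the first commands are not scheduled in the past.
    return [(max(0.0, t - lead_s), pose) for t, pose in trajectory]
```

With a 100 ms head lead, a pose recorded at t = 1.0 s is commanded at t = 0.9 s, so the head physically reaches it at roughly t = 1.0 s.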
Testing
Tests live in tests/test_api.py and use FastAPI's TestClient for synchronous HTTP testing without a real robot. The conftest.py creates a Marionette instance with a temporary dataset directory.
Key test patterns:
- HF functions are monkeypatched on `marionette.datasets` (the canonical location)
- Audio functions are monkeypatched on `marionette.recording` (where they're imported)
- Recording/playback tests use fake `ReachyMini` objects with stub methods
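The "patch where it's imported" rule behind the second bullet trips people up, so here is a self-contained demonstration. The `demo_*` module names are synthetic stand-ins for `marionette.audio` and `marionette.recording`:

```python
import sys
import types
from unittest import mock

# Fake "audio" module defining play(), like marionette.audio:
audio = types.ModuleType("demo_audio")
audio.play = lambda: "real"
sys.modules["demo_audio"] = audio

# Fake "recording" module that does `from demo_audio import play`,
# like recording.py importing from audio.py:
recording = types.ModuleType("demo_recording")
exec(
    "from demo_audio import play\n"
    "def do_playback():\n"
    "    return play()\n",
    recording.__dict__,
)
sys.modules["demo_recording"] = recording

# Patching the *defining* module has no effect: recording holds its
# own reference to the original function.
with mock.patch("demo_audio.play", return_value="stub"):
    assert recording.do_playback() == "real"

# Patching the *using* module works, which is why the tests patch
# marionette.recording rather than marionette.audio.
with mock.patch("demo_recording.play", return_value="stub"):
    assert recording.do_playback() == "stub"
```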