
Testing — Marionette

Run the tests

The quickest way to run everything and log results:

cd marionette
python tests/run_tests.py

This will:

  1. Auto-detect your OS (Linux / macOS / Windows)
  2. Ask which robot model you have connected (None / Lite / Wireless)
  3. Run unit tests with verbose output
  4. Run E2E tests for each available Playwright browser
  5. Print a per-class test catalog and coverage matrix
  6. Save results to tests/test_results.json
  7. Export marionette/static/test_coverage.json for the web UI
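The OS auto-detection in step 1 can be sketched with the standard library (a minimal illustration, not the runner's actual code):

```python
import platform

def detect_os() -> str:
    """Map platform.system() to the OS labels used in the coverage matrix."""
    return {"Linux": "Linux", "Darwin": "macOS", "Windows": "Windows"}.get(
        platform.system(), "Unknown"
    )
```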

After running, commit and push your results (see below).

Share your test results

After running tests, commit and push these two files:

  1. tests/test_results.json — accumulated results across all testers
  2. marionette/static/test_coverage.json — exported matrix for the web page

cd marionette
git add tests/test_results.json marionette/static/test_coverage.json
git commit -m "Test results: <your OS> / <your robot model>"
git push

If you're working on a fork, open a PR with just these two files.

Install dependencies

cd marionette
pip install -e ".[dev]"

For E2E browser tests you also need Playwright browsers:

pip install pytest-playwright
playwright install --with-deps chromium firefox

WebKit (Safari's browser engine) is available on some platforms (playwright install webkit) but is not required.

Running individual suites

Unit tests only (~4 s)

cd marionette
pytest tests/test_api.py

No hardware or running server needed — tests use FastAPI's TestClient.

E2E single browser (~20 s)

cd marionette
pytest tests/e2e --browser chromium

E2E multi-browser

cd marionette
pytest tests/e2e --browser chromium --browser firefox --browser webkit

Hardware integration tests (local)

Requires a physical Reachy Mini robot connected and powered on. This runs tests on the local machine (useful when developing directly on the robot or when the robot is accessible locally):

cd marionette
pytest tests/test_hardware.py -m hardware

Or via the test runner:

cd marionette
python tests/run_tests.py --hardware

Hardware tests on the robot (recommended)

Hardware tests should run on the robot itself because:

  • Audio playback (play_sound()) requires local speakers β€” WAV files are silently dropped over WebRTC
  • The robot has direct hardware access (local mic, speakers, motors)
  • Running locally eliminates network latency from test measurements

One-time SSH setup:

# Copy your SSH key to the robot (password: pollen)
ssh-copy-id pollen@reachy-mini.local

# Verify passwordless access
ssh pollen@reachy-mini.local echo OK

Run tests remotely:

cd marionette
python tests/run_on_robot.py

This will:

  1. rsync the marionette package + tests to /tmp/marionette_test/ on the robot
  2. Install dev deps if missing (pytest, httpx, pytest-json-report, scipy)
  3. Run pytest -m hardware on the robot, streaming output in real time
  4. Fetch the JSON report back and print a summary
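Conceptually, the remote runner builds commands like these (a hypothetical sketch; the exact flags and paths are illustrative, not the script's real invocation):

```python
def build_commands(host: str = "reachy-mini.local", user: str = "pollen"):
    """Return the rsync and ssh command lists a remote test runner might run."""
    dest = f"{user}@{host}"
    # Sync the package and tests to a scratch directory on the robot.
    sync = ["rsync", "-az", "--delete", "marionette/", "tests/",
            f"{dest}:/tmp/marionette_test/"]
    # Run the hardware-marked suite in that directory over SSH.
    run = ["ssh", dest, "cd /tmp/marionette_test && pytest -m hardware"]
    return sync, run
```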

Options:

python tests/run_on_robot.py --dry-run           # show what would be synced
python tests/run_on_robot.py --host 192.168.1.42 # custom host/IP
python tests/run_on_robot.py --user pollen       # custom SSH user
python tests/run_on_robot.py -k test_playback    # extra pytest args

Or via the main test runner:

python tests/run_tests.py --on-robot
python tests/run_tests.py --on-robot --host 192.168.1.42

All tests at once (no hardware)

cd marionette
pytest tests/ --browser chromium

What's tested

Unit tests (tests/test_api.py)

Class Tests Description
TestSlugify 7 Move label slug generation
TestStateEndpoint 9 GET /api/state shape and initial values
TestStartingUpMode 3 Starting-up mode rejects commands
TestRecordEndpoint 9 POST /api/record validation and state transitions
TestPlayEndpoint 3 POST /api/play validation
TestStopEndpoints 3 POST /api/play/stop and /api/record/stop
TestMoveDelete 4 DELETE /api/moves/:id and file cleanup
TestDatasets 11 Dataset create, select, root change, origin
TestRegistryPersistence 3 Registry file survival across restarts
TestMovesRefresh 4 Move listing and metadata
TestExperiments 4 Feature toggles and experimental settings
TestHfAutoLogin 5 HF auto-login detection and caching
TestSensorData 1 Sensor data dummy endpoint
TestUploadAudio 7 POST /api/upload-audio validation and integration
TestMotionModelEndpoint 4 POST /api/motion-model enable, set, persist
TestCommunityDatasets 3 Community dataset listing and download validation
TestCorruptData 4 Malformed JSON and corrupt registry recovery
TestConcurrentStateChanges 4 Concurrent operations rejected when busy
TestDurationEdgeCases 3 Duration boundary validation (gt=0.5, le=300)
TestSyncDatasetExtended 4 Sync endpoint edge cases and validation
TestMicAgcConfig 9 Mic AGC disable/restore with mock USB device
TestApiContracts 9 API response shapes (refactoring protection)
TestAudioOnlyRecording 5 Audio-only (mic-only, no motion) recording mode
TestStopEndpointsExtended 2 Stop endpoint edge cases
TestVersionEndpoint 3 GET /api/version shape and values

E2E browser tests (tests/e2e/test_ui.py)

Class Tests Description
TestPageLoad 3 Page loads and main sections visible
TestIdleState 4 Idle state display and controls
TestSimplifiedForm 3 Simplified form layout
TestFormValidation 5 Form fields and HTML5 validation
TestRecordingSubmission 1 Recording submission changes mode
TestDatasetUI 2 Dataset UI elements visible
TestRecordingLifecycle 4 Recording submit, stop, and injected move visibility
TestPlaybackLifecycle 3 Play button, queued playback, and move metadata
TestDeleteMove 3 Delete button, confirm/cancel dialog handling
TestCreateDataset 4 New dataset form show/hide/create/validate
TestSwitchDataset 2 Dataset switching and dropdown population
TestAudioSourceRadios 3 Audio source radio buttons and upload area toggle
TestSettingsPanel 3 Settings expand, experimental toggle, root display
TestFormEdgeCases 5 Empty/long/special labels and localStorage persistence
TestCommunitySection 3 Community datasets section expand and fetch button
TestHfUploadSection 3 HF username field and sync button presence
TestResponsiveness 3 UI timing — stop reflects immediately, dropdown doesn't revert
TestMicOnlyOption 3 Mic-only radio visible, selectable, audio-only badge
TestDatasetRootHint 1 Dataset root hint shows "on this computer"

Hardware tests (tests/test_hardware.py)

Class Tests Description
TestHardwareStartup 1 Robot reaches idle after startup
TestHardwareRecording 1 Record silent and verify motion capture
TestHardwarePlayback 2 Playback and delete (silent)
TestFullPipeline 4 Full record → verify files → replay → delete lifecycle
TestMotionAccuracy 3 Synthetic playback accuracy — reference vs observed poses
TestMultiDuration 7 Recording and playback across 1s/3s/5s/10s durations
TestPerformance 3 Startup, recording, and playback latency benchmarks
TestExistingRecordingPlayback 2 Play back existing audio and silent recordings
TestRecordingRoundTrip 2 Record → playback fidelity and timing verification
TestAntennaAndBodyYaw 2 Verify antenna and body_yaw data in recordings
TestPlaybackAntennas 1 Synthetic antenna oscillation playback
TestStopDuringPlayback 2 Stop mid-playback and during goto-start-pose
TestStopDuringRecording 1 Stop mid-recording, verify partial save
TestMultipleRecordPlayCycles 1 5 back-to-back record/play cycles
TestPlaybackWithCorruptFile 2 Deleted/corrupt JSON during playback
TestHardwareAudio 3 Audio recording and playback (may skip on mic issues)

View the matrix without running tests

cd marionette
python tests/show_matrix.py

Matrix format

Coverage Matrix (latest result per combination)

             │ Unit Tests │ Chromium  │ Firefox   │ WebKit
─────────────┼────────────┼───────────┼───────────┼──────────
Linux/None   │ ✅ Feb 20  │ ✅ Feb 20 │ —         │ —
Linux/Lite   │ —          │ —         │ —         │ —
macOS/None   │ —          │ —         │ —         │ —
Windows/None │ —          │ —         │ —         │ —
...

Each cell shows the latest result for that OS + robot + test-suite combination. A — means that combination has never been tested. A "Hardware" column appears when hardware test results are available.
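The "latest result per combination" rule amounts to keeping the newest entry per (OS, robot, suite) key. A minimal sketch (the field names are assumptions, not the actual test_results.json schema):

```python
def latest_per_combo(results):
    """Keep only the most recent result for each OS/robot/suite cell."""
    latest = {}
    # Sorting by date means later entries overwrite earlier ones per key.
    for r in sorted(results, key=lambda r: r["date"]):
        latest[(r["os"], r["robot"], r["suite"])] = r
    return latest
```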

Web coverage

The Space's index.html displays a Test Coverage section that loads marionette/static/test_coverage.json (exported by run_tests.py). This is shown on the Hugging Face Space page, not in the app's web GUI at localhost:8042.

Preview locally

To see the coverage table as it appears on the Space, open index.html in a browser. Because it fetches a JSON file, you need a local server:

cd marionette
python -m http.server 8080

Then open http://localhost:8080 — you should see the Space landing page with the Test Coverage matrix near the bottom.

Commit and push results

After running the tests, commit the updated results so the Space page reflects the latest coverage (see Share your test results above for the full command).

Markers

  • @pytest.mark.hardware β€” tests that require a physical robot (skipped by default unless you run with -m hardware)

Deploying to the robot (Wireless)

For manual testing or quick iteration on a Wireless robot, use the deploy script instead of reinstalling via the dashboard:

cd marionette
./deploy_wireless.sh                  # default: reachy-mini.local
./deploy_wireless.sh 192.168.1.42    # custom IP
./deploy_wireless.sh --sync-only     # sync without starting

This rsyncs your local marionette/ into the robot's site-packages (same location the dashboard installs to), stops the current app, starts marionette via the daemon API, and streams logs to your terminal.

To stop the app:

./stop_app.sh                         # default: reachy-mini.local
./stop_app.sh 192.168.1.42           # custom IP

See the script headers for prerequisites (SSH key, daemon, Wireless-only).

Project layout

deploy_wireless.sh   # Deploy + start on Wireless robot (dev workflow)
stop_app.sh          # Stop the running app via daemon API

tests/
├── conftest.py          # Shared fixtures (TestClient, temp paths, make_wav_bytes)
├── test_api.py          # Unit tests — backend API
├── test_hardware.py     # Hardware integration tests (real robot)
├── pose_utils.py        # Trajectory comparison utilities
├── e2e/
│   ├── conftest.py      # Playwright server fixture (port 18042)
│   └── test_ui.py       # E2E browser tests
├── run_tests.py         # Test runner + matrix logger + web export
├── run_on_robot.py      # Remote test runner (SSH + rsync to robot)
├── run_smoke_on_robot.py # Smoke test runner (SSH + rsync)
├── smoke_test_on_robot.py # Smoke test script (runs on robot)
├── show_matrix.py       # Matrix viewer
├── REMOTE_TESTING.md    # SSH/rsync workflow documentation
├── HARDWARE_TEST_PLAN.md # Hardware test plan
└── test_results.json    # Auto-generated results log (committed by testers)

marionette/static/
└── test_coverage.json   # Web matrix export (committed)