# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Environment Setup
The virtual environment lives one level up at `../reachy_mini_env`. Always activate it first:
```bash
source ../reachy_mini_env/bin/activate
```
Install the package in editable mode (required for entry-point registration):
```bash
pip install -e .
```
### System dependency (Raspberry Pi / Reachy Mini wireless)
```bash
sudo apt-get install espeak-ng # text-to-speech synthesis
```
### Face recognition model (one-time download)
Face recognition runs **locally** using ONNX Runtime (no cloud account needed).
On first run the app downloads the InsightFace MobileFaceNet model (~17 MB)
from GitHub and caches it at `recognizer/models/w600k_mbf.onnx`.
Requires internet access the first time only; fully offline thereafter.
Requires **64-bit Raspberry Pi OS** (onnxruntime ships pre-built aarch64 wheels).
## Running the App
Run directly (connects to a live Reachy Mini robot):
```bash
python recognizer/main.py
```
Or via the daemon entry point (used when the robot's daemon manages app lifecycle):
```bash
reachy-mini-app run recognizer
```
The control panel web UI is served at `http://0.0.0.0:8042` while the app runs.
## Publishing
```bash
reachy-mini-app check # validate the app before publishing
reachy-mini-app publish # publish to Hugging Face Spaces
```
## Architecture
This is a **Reachy Mini robot app**: a Python package that plugs into the `reachy_mini` SDK.
**Entry point**: `recognizer/main.py`, which defines the `Recognizer` class inheriting from `ReachyMiniApp` (ABC from `reachy_mini`).
**App lifecycle** (handled by `ReachyMiniApp.wrapped_run()`):
1. Spawns a FastAPI/uvicorn server on `custom_app_url` (port 8042) in a background thread
2. Connects to the robot daemon (auto-detects localhost vs. network; LOCAL backend on the wireless robot)
3. Calls `Recognizer.run(reachy_mini, stop_event)`, the main state-machine loop
4. On stop: sets `stop_event`, shuts down the web server
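
Steps 3 and 4 imply that `run()` should return soon after `stop_event` is set. A minimal sketch of such a loop, assuming `stop_event` behaves like a `threading.Event` (the type is not stated above). `run_loop` and `poll_once` are hypothetical names that stand in for the real robot calls:

```python
import threading


def run_loop(stop_event, poll_once, period_s=0.2):
    """Call poll_once() every period_s seconds until stop_event is set.

    Returns the number of completed iterations.
    """
    iterations = 0
    while not stop_event.is_set():
        poll_once()
        iterations += 1
        # wait() doubles as an interruptible sleep: it returns early
        # as soon as another thread sets the event.
        stop_event.wait(period_s)
    return iterations


if __name__ == "__main__":
    stop = threading.Event()
    # Simulate the framework requesting shutdown after half a second.
    threading.Timer(0.5, stop.set).start()
    count = run_loop(stop, poll_once=lambda: None)
    print(f"loop exited after {count} iterations")
```

Using `stop_event.wait()` instead of `time.sleep()` keeps shutdown latency low even when the polling period is long.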
**State machine** (`recognizer/main.py`):
```
SLEEPING ─(speech detected × 3)→ WAKING → ACTIVE → SLEEPING
                                            ↓ (unknown face)
                                        ENROLLING → SLEEPING
```
- **SLEEPING**: polls `media.get_DoA()` at 5 Hz; robot stays in sleep pose. Three consecutive `speech_detected=True` readings (debounced) trigger a wake-up.
- **WAKING**: calls `wake_up()` (built-in animation + sound), then `look_at_world()` toward the DoA angle.
- **ACTIVE**: captures camera frames every 0.5 s, runs `face_recognition.face_locations()` + `face_recognition.face_encodings()` (HOG model, 2× downsampled for speed). Gentle head-scan idle animation via `set_target()`. 15 s timeout → back to sleep.
- **ENROLLING**: robot has detected an unrecognised face; waits for name to be submitted via the web UI (`POST /set_name`). Stores encoding in `face_db.json`, says "Nice to meet you, <name>!", then sleeps.
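
The debounce and the transitions described above can be sketched without any robot hardware. This is an illustration under assumptions, not the code in `recognizer/main.py`. `State`, `SpeechDebouncer`, and `next_state` are hypothetical names, and only the `"enrolling"` state string is confirmed by the web UI description; the other strings are guesses.

```python
from enum import Enum


class State(Enum):
    SLEEPING = "sleeping"    # string value assumed
    WAKING = "waking"        # string value assumed
    ACTIVE = "active"        # string value assumed
    ENROLLING = "enrolling"  # matches the state the web UI checks for


class SpeechDebouncer:
    """Fires once `required` consecutive speech readings have been seen."""

    def __init__(self, required=3):
        self.required = required
        self.streak = 0

    def update(self, doa):
        """Feed one reading shaped like get_DoA(): (angle, speech) or None."""
        speech = doa is not None and bool(doa[1])
        # Any silent or missing reading resets the streak.
        self.streak = self.streak + 1 if speech else 0
        if self.streak >= self.required:
            self.streak = 0
            return True
        return False


def next_state(state, *, wake=False, unknown_face=False,
               timed_out=False, name_saved=False):
    """Return the state that follows `state` for the given events."""
    if state is State.SLEEPING and wake:
        return State.WAKING
    if state is State.WAKING:
        return State.ACTIVE
    if state is State.ACTIVE:
        if unknown_face:
            return State.ENROLLING
        if timed_out:
            return State.SLEEPING
    if state is State.ENROLLING and name_saved:
        return State.SLEEPING
    return state
```

Keeping the transition logic in a pure function like this makes it testable on a development machine with no robot attached.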
**Helper modules**:
- `recognizer/face_db.py`: local face recognition via ONNX Runtime. `load()` warms up the ONNX session (downloading the model on first run) and returns the embedding DB dict. `find_match(frame_bgr, db)` detects faces with an OpenCV Haar cascade, embeds them with MobileFaceNet, and matches by cosine similarity (threshold 0.35); it raises `NoFaceDetected` if no face is found. `add_face(name, frame_bgr, db)` enrolls a face. The DB is stored in `recognizer/face_db.json`.
- `recognizer/tts.py`: synthesises text via `espeak-ng -s 140 -w <tmp.wav>`, plays it via `media.play_sound()`, then sleeps to let playback finish.
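
The matching step in `find_match()` can be illustrated in isolation. The sketch below is not the code in `face_db.py`. It assumes the DB maps each name to a single embedding vector and that 0.35 is a minimum similarity; `cosine_similarity` and `best_match` are hypothetical helper names, and plain Python lists stand in for the real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(embedding, db, threshold=0.35):
    """Return (name, score) for the closest entry in db.

    The name is None when no entry reaches the threshold.
    """
    best_name, best_score = None, -1.0
    for name, stored in db.items():
        score = cosine_similarity(embedding, stored)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    toy_db = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    print(best_match([0.9, 0.1, 0.0], toy_db))  # closest to "alice"
    print(best_match([0.0, 0.0, 1.0], toy_db))  # no entry above threshold
```

Under these assumptions a lower threshold accepts more faces as known, at the cost of more false matches.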
**Settings UI** (`recognizer/static/`):
- `index.html` / `main.js` / `style.css`: polls `GET /status` every second to show current state; reveals a name-entry form when state is `"enrolling"`.
- REST endpoints defined in `run()` via `self.settings_app` (FastAPI): `GET /status`, `POST /set_name`.
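
Because the web server runs in a background thread, the two endpoints and the state-machine loop need a thread-safe way to share state. One possible shape for that hand-off is sketched below. `EnrollmentBridge` and the `{"state": ...}` payload are assumptions for illustration, and the FastAPI route handlers themselves are omitted.

```python
import threading


class EnrollmentBridge:
    """Thread-safe hand-off between web handlers and the robot loop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "sleeping"
        self._pending_name = None
        self._name_ready = threading.Event()

    def set_state(self, state):
        """Called by the robot loop whenever the state changes."""
        with self._lock:
            self._state = state

    def status(self):
        """Payload a GET /status handler could return."""
        with self._lock:
            return {"state": self._state}

    def submit_name(self, name):
        """What a POST /set_name handler could call. True if accepted."""
        name = name.strip()
        with self._lock:
            # Reject blank names and names sent outside enrollment.
            if self._state != "enrolling" or not name:
                return False
            self._pending_name = name
        self._name_ready.set()
        return True

    def wait_for_name(self, timeout=None):
        """Block the robot loop until a name arrives; None on timeout."""
        if not self._name_ready.wait(timeout):
            return None
        with self._lock:
            name, self._pending_name = self._pending_name, None
        self._name_ready.clear()
        return name
```

In the ENROLLING state the loop would call `wait_for_name()` with a short timeout so it can still check `stop_event` between waits.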
**Root-level `index.html` / `style.css`**: Hugging Face Spaces landing page, separate from the in-app UI in `recognizer/static/`.
**Entry-point registration** in `pyproject.toml`:
```toml
[project.entry-points."reachy_mini_apps"]
recognizer = "recognizer.main:Recognizer"
```
## Key APIs
```python
# Direction of Arrival from the ReSpeaker mic array
# Returns (angle_radians, speech_detected) or None
# 0 rad = left, π/2 = front/back, π = right
doa = reachy_mini.media.get_DoA()
# Camera frame (BGR uint8 numpy array)
frame = reachy_mini.media.get_frame()
# Built-in animations (blocking)
reachy_mini.wake_up()
reachy_mini.goto_sleep()
# Smooth head movement (blocking)
reachy_mini.look_at_world(x, y, z, duration=0.5) # forward=+x, right=+y
# Immediate head pose (non-blocking, use set_target for idle animation)
reachy_mini.set_target(head=pose_4x4)
# Audio
reachy_mini.media.play_sound("/abs/path/to/file.wav") # async; sleep afterward
```
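
The DoA angle convention and the `look_at_world()` axes noted in the comments above can be combined into a small conversion. The sketch below is an assumption-laden illustration: `doa_to_target` is a hypothetical helper, it resolves the front/back ambiguity by always choosing front, and the `distance` and `height` defaults are arbitrary.

```python
import math


def doa_to_target(angle_rad, distance=1.0, height=0.0):
    """Convert a DoA angle to an (x, y, z) point for look_at_world().

    Conventions taken from the comments above:
    0 rad = left, pi/2 = front, pi = right; forward = +x, right = +y.
    """
    x = distance * math.sin(angle_rad)    # forward component
    y = -distance * math.cos(angle_rad)   # negative = left, positive = right
    return (x, y, height)


if __name__ == "__main__":
    for label, angle in [("left", 0.0), ("front", math.pi / 2), ("right", math.pi)]:
        x, y, z = doa_to_target(angle)
        print(f"{label}: x={x:.2f}, y={y:.2f}, z={z:.2f}")
```

The result could then be passed on as `reachy_mini.look_at_world(*doa_to_target(angle), duration=0.5)`.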