CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Environment Setup
The shared virtual environment lives one level up at ../reachy_mini_env. Always activate it first:
source ../reachy_mini_env/bin/activate
Install the package in editable mode (required for entry-point registration):
pip install -e .
Running the App
Run directly (connects to a live Reachy Mini robot):
python talk/main.py
Or via the daemon entry point:
reachy-mini-app run talk
The control panel web UI is served at http://0.0.0.0:8042 while the app runs.
Conversation mode needs an Anthropic API key. Provide it either via the
ANTHROPIC_API_KEY environment variable, or by entering it in the app's web UI
(http://0.0.0.0:8042 → Einstellungen, the settings section). A key entered in the UI is stored at
~/.config/talk/api_key (chmod 600, outside the repo) and takes precedence over
the env var.
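The precedence rule above can be sketched as follows. This is a minimal illustration, not the app's actual code: the function name resolve_api_key and its key_file parameter are hypothetical; only the file path and the environment variable name come from the text above.

```python
import os
from pathlib import Path
from typing import Optional

# Path taken from the description above; everything else is illustrative.
API_KEY_FILE = Path.home() / ".config" / "talk" / "api_key"


def resolve_api_key(key_file: Path = API_KEY_FILE) -> Optional[str]:
    """Return the UI-stored key if present, else ANTHROPIC_API_KEY, else None."""
    try:
        stored = key_file.read_text(encoding="utf-8").strip()
    except OSError:
        # Missing or unreadable file: fall through to the environment variable.
        stored = ""
    if stored:
        return stored
    env_key = os.environ.get("ANTHROPIC_API_KEY", "").strip()
    return env_key or None
```

Treating an empty file like a missing one keeps a cleared UI field from masking a valid environment variable; whether the real app behaves this way is an assumption.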
Publishing
reachy-mini-app check # validate the app before publishing
reachy-mini-app publish # publish to Hugging Face Spaces
Architecture
This is a Reachy Mini robot app: a Python package that plugs into the reachy_mini SDK.
App lifecycle (handled by ReachyMiniApp.wrapped_run()):
- Spawns a FastAPI/uvicorn server on custom_app_url (port 8042) in a background thread
- Connects to the robot daemon
- Calls run(reachy_mini, stop_event), the main loop
- On stop: sets stop_event, shuts down the web server
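As a rough sketch of how a main loop can cooperate with this lifecycle, the function below runs until stop_event is set. It assumes stop_event is a threading.Event; the poll_interval parameter and the tick counter are illustrative additions, and the robot handle is left unused here.

```python
import threading


def run(reachy_mini, stop_event: threading.Event, poll_interval: float = 0.2) -> int:
    """Loop until stop_event is set; return the iteration count (illustration only)."""
    ticks = 0
    while not stop_event.is_set():
        # One step of the app's state machine would run here.
        ticks += 1
        # wait() returns as soon as the event is set, so shutdown stays responsive.
        stop_event.wait(poll_interval)
    return ticks
```

Using stop_event.wait() instead of time.sleep() means the loop exits without waiting out the rest of the polling interval.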
Entry-point registration in pyproject.toml:
[project.entry-points."reachy_mini_apps"]
talk = "talk.main:Talk"
State Machine
SLEEPING → (speech detected) → TIME → CONVERSING → (silence/antenna press) → SLEEPING
- SLEEPING: polls get_DoA() at 5 Hz; wakes after DOA_DEBOUNCE (3) consecutive speech-detected readings (same mechanism as the recognizer). Ignores audio for DEBOUNCE_AFTER_SPEAK (2 s) after the robot itself spoke so its own goodbye can't re-wake it.
- TIME: wake_up() → speak German datetime with gesture loop, facing the speaker via the captured DoA angle → start_recording() → enter CONVERSING.
- CONVERSING (inner loop):
  - LISTENING: record_utterance() uses RMS-energy VAD with a threshold auto-calibrated from the ambient noise floor; head tracking toward the speaker is non-blocking (set_target) so the audio loop is never frozen. Exits on antenna press, or returns empty after IDLE_TIMEOUT (25 s) of silence.
  - PROCESSING: transcribe(chunks) → Google STT; get_response(messages) → Claude API.
  - RESPONDING: _speak_with_gestures() → back to LISTENING. Recording runs continuously throughout the conversation; record_utterance() drains the echo captured during playback.
  - Exit: antenna press or idle timeout → stop_recording() → goto_sleep() → SLEEPING.
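The SLEEPING wake logic described above can be sketched as a small debouncer. The constant names and values come from the text; the WakeDetector class, its methods, and the explicit now timestamps are hypothetical and exist only to make the sketch testable without a robot.

```python
from typing import Optional, Tuple

DOA_DEBOUNCE = 3            # consecutive speech-detected readings needed to wake
DEBOUNCE_AFTER_SPEAK = 2.0  # seconds to ignore audio after the robot spoke


class WakeDetector:
    """Hypothetical helper: feed it get_DoA() readings, get a wake angle back."""

    def __init__(self, needed: int = DOA_DEBOUNCE, mute_s: float = DEBOUNCE_AFTER_SPEAK):
        self.needed = needed
        self.mute_s = mute_s
        self.count = 0
        self.last_spoke_at: Optional[float] = None

    def robot_spoke(self, now: float) -> None:
        """Record that the robot just finished speaking."""
        self.last_spoke_at = now
        self.count = 0

    def update(self, doa: Optional[Tuple[float, bool]], now: float) -> Optional[float]:
        """Process one reading; return the DoA angle on wake, else None."""
        muted = self.last_spoke_at is not None and now - self.last_spoke_at < self.mute_s
        if muted or doa is None or not doa[1]:
            # Any gap, non-speech reading, or muted reading resets the streak.
            self.count = 0
            return None
        self.count += 1
        if self.count >= self.needed:
            self.count = 0
            return doa[0]
        return None
```

In a polling loop this would be called roughly every 0.2 s (5 Hz) with the result of reachy_mini.media.get_DoA() and a monotonic timestamp.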
Helper Modules
- talk/tts.py: edge-tts (MS neural, de-DE-KatjaNeural) → MP3 → media.play_sound(). Falls back to espeak-ng. Loudness is configurable (0-200, 100 = engine default) via the tts_volume setting, applied as edge-tts volume="+X%" and espeak -a. Blocks for estimated playback duration.
- talk/stt.py: records from ReSpeaker (16 kHz float32), loudest-channel RMS-energy VAD with a threshold auto-calibrated from ambient noise (logs ambient/threshold/max-RMS for tuning), converts to mono 16-bit WAV, transcribes via Google Speech Recognition. Non-blocking DoA head tracking. Accepts an idle_timeout so a silent conversation returns to sleep.
- talk/llm.py: stateless Claude API wrapper. Caller owns the messages list. Resolves the API key via get_api_key(): the web-UI file (~/.config/talk/api_key) first, then ANTHROPIC_API_KEY. Also exposes has_api_key() and save_api_key() for the web UI.
- talk/config.py: JSON-backed non-secret settings store at ~/.config/talk/settings.json (get_setting/set_setting). Holds tts_volume. Outside the repo so it is never committed/packaged.
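One plausible way to map the 0-200 tts_volume setting onto the two engines is sketched below. The helper names are hypothetical and the exact mapping is an assumption: it treats 100 as "no change" for edge-tts (a signed percentage string) and passes the value straight to espeak's -a amplitude option, whose range is also 0-200 with a default of 100.

```python
from typing import List


def clamp_volume(tts_volume: int) -> int:
    """Keep the setting inside the documented 0-200 range."""
    return max(0, min(200, int(tts_volume)))


def edge_tts_volume(tts_volume: int) -> str:
    """Convert the setting to a signed percentage string such as '+50%'."""
    delta = clamp_volume(tts_volume) - 100
    return f"{delta:+d}%"


def espeak_amplitude_args(tts_volume: int) -> List[str]:
    """Build the amplitude arguments for an espeak-ng command line."""
    return ["-a", str(clamp_volume(tts_volume))]
```

For example, a setting of 150 yields "+50%" for edge-tts and ["-a", "150"] for espeak-ng.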
Key SDK APIs
# Direction of Arrival: (angle_radians, speech_detected) or None
# 0 rad = left, π/2 = front, π = right
doa = reachy_mini.media.get_DoA()
# Audio recording (chunks are (N, 2) float32 arrays at 16 kHz)
reachy_mini.media.start_recording()
chunk = reachy_mini.media.get_audio_sample() # None if no new data
reachy_mini.media.stop_recording()
# Audio playback (async; sleep afterward for the estimated duration)
reachy_mini.media.play_sound("/abs/path/to/file.mp3")
# Head movement
reachy_mini.look_at_world(x, y, z, duration=0.5) # forward=+x, right=+y
head_pose = reachy_mini.look_at_world(1.0, y, z, perform_movement=False)
reachy_mini.set_target(head=head_pose, antennas=[left, right])
# Built-in animations (blocking)
reachy_mini.wake_up()
reachy_mini.goto_sleep()
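To face the speaker, a DoA angle has to be turned into a look target. The sketch below does this under the axis conventions stated in the comments above (0 rad = left, π/2 = front, π = right; forward = +x, right = +y). The function name doa_to_look_target, the distance and z parameters, and the choice of a unit circle in the horizontal plane are assumptions, not SDK behaviour.

```python
import math
from typing import Tuple


def doa_to_look_target(angle_rad: float, distance: float = 1.0, z: float = 0.0) -> Tuple[float, float, float]:
    """Map a DoA angle to an (x, y, z) point with forward=+x and right=+y."""
    x = distance * math.sin(angle_rad)   # pi/2 (front) gives the largest +x
    y = -distance * math.cos(angle_rad)  # 0 (left) gives -y, pi (right) gives +y
    return x, y, z
```

The resulting point could then be passed to reachy_mini.look_at_world(x, y, z, perform_movement=False) and the returned pose handed to set_target for non-blocking tracking.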
Settings UI
The front end in talk/static/ polls GET /status every second; the endpoint returns {state, last_user, last_assistant, api_key_set, tts_volume}. The page shows a colour-coded status chip and, during CONVERSING, conversation bubbles (user on the right, assistant on the left). An Einstellungen (settings) section lets the user (a) enter the Anthropic API key → POST /set_api_key ({api_key}) → save_api_key(), with the api_key_set flag driving a "key set?" indicator; and (b) set the voice loudness with a slider → POST /set_config ({tts_volume}) → config.set_setting().
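The settings store that POST /set_config writes to could look roughly like this. The names get_setting and set_setting and the file location come from this document; the signatures, the path parameter, and the write-then-rename strategy are assumptions made for the sketch.

```python
import json
import os
from pathlib import Path
from typing import Any

SETTINGS_FILE = Path.home() / ".config" / "talk" / "settings.json"


def _load(path: Path) -> dict:
    """Read the settings file, treating a missing or corrupt file as empty."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def get_setting(key: str, default: Any = None, path: Path = SETTINGS_FILE) -> Any:
    """Return one setting, or the default when it is not stored."""
    return _load(path).get(key, default)


def set_setting(key: str, value: Any, path: Path = SETTINGS_FILE) -> None:
    """Store one setting, writing to a temp file first so readers never see a partial file."""
    data = _load(path)
    data[key] = value
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
    os.replace(tmp, path)
```

A handler for /set_config would then call set_setting("tts_volume", value), and /status would read it back with get_setting("tts_volume", 100); the default of 100 is assumed here.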