RemiFabre committed on
Commit a5e54b8 · 1 Parent(s): ed93a8e

Removed use_sim flag since the mini Daemon adapts automatically now

README.md CHANGED
@@ -1,6 +1,25 @@
# Reachy Mini conversation demo

- Working repo, we should turn this into a ReachyMini app at some point maybe ?
+ Conversational demo for the Reachy Mini desktop robot, combining the OpenAI realtime API, optional vision pipelines, and choreographed motion libraries. The project currently targets internal validation; this document captures the steps needed to run it today and highlights the gaps we still need to close before a public launch.
+ 
+ ## Overview
+ - Real-time audio conversation loop powered by the OpenAI realtime API and `fastrtc` for low-latency streaming.
+ - Motion control queue that blends scripted dances, recorded emotions, idle breathing, and speech-reactive head wobbling.
+ - Optional camera worker with YOLO or MediaPipe-based head tracking and LLM-accessible scene capture.
+ - Simulation flag and non-camera modes are stubbed in; the hardware robot remains the primary path for now.
+ 
+ ## Features
+ - Async tool dispatch integrates robot motion, camera capture, and optional facial recognition helpers.
+ - Gradio web UI provides audio chat and transcript display.
+ - Movement manager keeps real-time control in a dedicated thread with safeguards against abrupt pose changes (see the sketch after this list).
+ - `.env`-driven configuration for OpenAI credentials and Hugging Face caches.
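+ 
+ The movement manager lives in this repo and is not reproduced here; as a rough illustration of the "dedicated thread with clamped pose changes" idea, a minimal sketch follows. The class name, constants, and single-axis pose are invented for illustration, not the project's actual API:
+ 
+ ```python
+ # Illustrative sketch only; the real movement manager has a different API.
+ import queue
+ import threading
+ import time
+ 
+ MAX_STEP = 0.05  # assumed per-tick pose delta limit (radians)
+ RATE_HZ = 50  # assumed control-loop frequency
+ 
+ class MovementSketch:
+     def __init__(self) -> None:
+         self.targets: queue.Queue[float] = queue.Queue()
+         self.pose = 0.0
+         threading.Thread(target=self._loop, daemon=True).start()
+ 
+     def _loop(self) -> None:
+         target = self.pose
+         while True:
+             try:
+                 target = self.targets.get_nowait()  # next queued target, if any
+             except queue.Empty:
+                 pass
+             # Clamp each tick's change so the head never jumps abruptly.
+             step = max(-MAX_STEP, min(MAX_STEP, target - self.pose))
+             self.pose += step
+             time.sleep(1.0 / RATE_HZ)
+ ```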
+ 
+ ## Requirements
+ - Python 3.10 or newer (tested with CPython 3.12.1 via `uv`).
+ - Linux environment with build tooling and GStreamer/GTK headers for `PyGObject`:
+   - `sudo apt install build-essential pkg-config python3-venv libgirepository1.0-dev gstreamer1.0-plugins-good gstreamer1.0-pulseaudio libatlas-base-dev` (adjust to your distro).
+ - Reachy Mini robot or the simulator (simulator wiring is incomplete; see TODOs).
+ - Microphone, speakers/headphones, and optionally a USB camera supported by `opencv-python`.

## Installation

@@ -26,46 +45,120 @@ You can combine extras or include dev dependencies:
uv sync --extra all_vision --group dev
```

- ### Using pip
- Alternatively, you can install using pip in editable mode:
+ ### Using pip (Linux)

```bash
- python -m venv .venv # Create a virtual environment
+ python -m venv .venv
source .venv/bin/activate
+ pip install --upgrade pip setuptools wheel
pip install -e .
```

- To include optional vision dependencies:
- ```
+ Install optional extras depending on the feature set you need:
+ 
+ ```bash
+ # Vision stacks (choose at least one if you plan to run head tracking)
pip install -e .[local_vision]
pip install -e .[yolo_vision]
pip install -e .[mediapipe_vision]
- pip install -e .[all_vision]
- ```
+ pip install -e .[all_vision] # installs every vision extra

- To include dev dependencies:
- ```
+ # Tooling for development workflows
pip install -e .[dev]
```

- ## Run
+ Some wheels (e.g. PyTorch) are large and require compatible CUDA or CPU builds. Expect the `local_vision` extra to take significantly more disk space than YOLO or MediaPipe.
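+ 
+ If you only need CPU inference, one optional workaround (not required by this project) is to install CPU-only PyTorch wheels before the extra; whether pip keeps them depends on the version pins:
+ 
+ ```bash
+ # CPU-only wheels are much smaller than the default CUDA builds
+ pip install torch --index-url https://download.pytorch.org/whl/cpu
+ pip install -e .[local_vision]
+ ```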
+ 
+ ## Optional dependency groups
+ 
+ | Extra | Purpose | Notes |
+ |-------|---------|-------|
+ | `local_vision` | Run the local VLM (SmolVLM2) through PyTorch/Transformers. | GPU recommended; installs large packages (~2 GB). |
+ | `yolo_vision` | YOLOv8 tracking via `ultralytics` and `supervision`. | CPU friendly; supports the `--head-tracker yolo` option. |
+ | `mediapipe_vision` | Lightweight landmark tracking with MediaPipe. | Works on CPU; enables `--head-tracker mediapipe`. |
+ | `all_vision` | Convenience alias installing every vision extra. | Only needed if you want to experiment with all providers. |
+ | `dev` | Developer tooling (`pytest`, `ruff`). | Add on top of either the base or `all_vision` environment. |
+ 
+ ## Configuration
+ 
+ 1. Copy `.env.example` to `.env`.
+ 2. Fill in the required values, notably the OpenAI API key.
+ 
+ | Variable | Description |
+ |----------|-------------|
+ | `OPENAI_API_KEY` | Required. Grants access to the OpenAI realtime endpoint. |
+ | `MODEL_NAME` | Override the realtime model (defaults to `gpt-4o-realtime-preview`). |
+ | `OPENAI_VISION_MODEL` | Model used when sending captured images back to OpenAI. |
+ | `HF_HOME` | Cache directory for local Hugging Face downloads. |
+ | `HF_TOKEN` | Optional token for private Hugging Face models (needed for some emotion libraries). |
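+ 
+ For reference, a filled-in `.env` might look like the following. All values are placeholders, and the vision model shown is only an example, not a project default:
+ 
+ ```bash
+ OPENAI_API_KEY=sk-your-key-here
+ MODEL_NAME=gpt-4o-realtime-preview
+ OPENAI_VISION_MODEL=gpt-4o-mini
+ HF_HOME=~/.cache/huggingface
+ HF_TOKEN=hf_your-token-here
+ ```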
+ 
+ ## Running the demo
+ 
+ Activate your virtual environment, ensure the Reachy Mini robot (or simulator) is reachable, then launch:

```bash
reachy-mini-conversation-demo
```

+ The app starts a Gradio UI served locally. When running on a headless host, combine `--headless` with SSH port forwarding, or use a browser on the machine itself.
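+ 
+ Gradio serves on port 7860 by default; assuming that default, a typical forward from your laptop looks like this (`user@reachy-host` is a placeholder):
+ 
+ ```bash
+ # Forward the remote Gradio port to localhost:7860 on your machine
+ ssh -L 7860:localhost:7860 user@reachy-host
+ ```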
+ 
- ## Runtime Options
+ ### CLI options

- | Option | Values | Default | Description |
- |--------|--------|---------|-------------|
- | `--sim` | *(flag)* | off | Run in **simulation mode** (no physical robot required). |
- | `--vision` | *(flag)* | off | Enable the **vision system** (must be paired with `--vision-provider`). |
- | `--vision-provider` | `local`, `openai` | `local` | Select vision backend:<br>• **local** → Hugging Face VLM (SmolVLM2) runs on your machine.<br>• **openai** → OpenAI multimodal models via API (requires `OPENAI_API_KEY`). |
- | `--head-tracking` | *(flag)* | off | Enable **head tracking** (ignored when `--sim` is active). |
- | `--debug` | *(flag)* | off | Enable **debug logging** (default log level is INFO). |
+ | Option | Default | Description |
+ |--------|---------|-------------|
+ | `--sim` | `False` | Intended to toggle the simulator (currently parsed but not wired through to `ReachyMini`). |
+ | `--head-tracker {yolo,mediapipe}` | `None` | Select a head-tracking backend when a camera is available. Requires the matching optional extra. |
+ | `--no-camera` | `False` | Run without camera capture or head tracking. |
+ | `--headless` | `False` | Suppress launching the Gradio UI (useful on remote machines). |
+ | `--debug` | `False` | Enable verbose logging for troubleshooting. |
+ 
- ## Examples
- - Simulated run with OpenAI Vision:
- ```
- reachy-mini-conversation-demo --sim --vision --vision-provider=openai
- ```
+ ### Examples
+ - Run on hardware with MediaPipe head tracking:
+ 
+ ```bash
+ reachy-mini-conversation-demo --head-tracker mediapipe
+ ```
+ 
+ - Disable the camera pipeline (audio-only conversation):
+ 
+ ```bash
+ reachy-mini-conversation-demo --no-camera
+ ```
+ 
+ - Prepare for simulator work (the flag is currently a no-op but reserved):
+ 
+ ```bash
+ reachy-mini-conversation-demo --sim --headless
+ ```
+ 
+ ## LLM tools exposed to the assistant
+ 
+ | Tool | Action | Dependencies |
+ |------|--------|--------------|
+ | `move_head` | Queue a head pose change (left/right/up/down/front). | Core install only. |
+ | `camera` | Capture the latest camera frame and optionally query a vision backend. | Requires camera worker; vision analysis depends on selected extras. |
+ | `head_tracking` | Enable or disable face-tracking offsets. | Camera worker with configured head tracker. |
+ | `dance` | Queue a dance from `reachy_mini_dances_library`. | Requires access to private choreography library and movement manager. |
+ | `stop_dance` | Clear queued dances. | Core install only. |
+ | `play_emotion` | Play a recorded emotion clip via Hugging Face assets. | Needs `HF_TOKEN` and the recorded emotions dataset. |
+ | `stop_emotion` | Clear queued emotions. | Core install only. |
+ | `get_person_name` | Attempt DeepFace-based recognition of the current person. | Disabled by default (`ENABLE_FACE_RECOGNITION=False`); requires `deepface` and a local face database. |
+ | `do_nothing` | Explicitly remain idle. | Core install only. |
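+ 
+ The dispatch behind this table is, in essence, a name-to-coroutine lookup driven by the model's tool calls. The sketch below is illustrative only: the two handlers mirror table entries, but `dispatch_tool_call` and the handler signatures are invented for this example rather than taken from the repo:
+ 
+ ```python
+ # Hypothetical sketch of async tool dispatch; not the repo's real interface.
+ import asyncio
+ from typing import Any, Awaitable, Callable
+ 
+ async def move_head(direction: str) -> str:
+     # A real handler would queue a pose change on the movement manager.
+     return f"head moved {direction}"
+ 
+ async def do_nothing() -> str:
+     return "idle"
+ 
+ TOOLS: dict[str, Callable[..., Awaitable[str]]] = {
+     "move_head": move_head,
+     "do_nothing": do_nothing,
+ }
+ 
+ async def dispatch_tool_call(name: str, arguments: dict[str, Any]) -> str:
+     handler = TOOLS.get(name)
+     if handler is None:
+         return f"unknown tool: {name}"
+     return await handler(**arguments)
+ 
+ # e.g. the model requested move_head with {"direction": "left"}
+ print(asyncio.run(dispatch_tool_call("move_head", {"direction": "left"})))
+ ```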
+ 
+ ## Development workflow
+ - Install the dev group extras: `uv sync --group dev` or `pip install -e .[dev]`.
+ - Run formatting and linting: `ruff check .`.
+ - Execute the test suite: `pytest`.
+ - When iterating on robot motions, keep the control loop responsive: offload blocking work using the helpers in `tools.py` (see the sketch below).
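+ 
+ A standard way to do that offloading is `asyncio.to_thread`; the sketch below uses a stand-in `capture_frame` function rather than an actual helper from `tools.py`:
+ 
+ ```python
+ # Illustrative pattern: run a blocking call without stalling the event loop.
+ import asyncio
+ import time
+ 
+ def capture_frame() -> str:
+     time.sleep(0.5)  # stands in for a slow camera or network call
+     return "frame"
+ 
+ async def control_loop() -> None:
+     # to_thread runs the blocking call in a worker thread,
+     # keeping the event loop responsive while it waits.
+     frame = await asyncio.to_thread(capture_frame)
+     print(f"received {frame}")
+ 
+ asyncio.run(control_loop())
+ ```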
+ 
+ ## TODO before public release
+ - [ ] Wire the `--sim` flag through to `ReachyMini` and document simulator prerequisites.
+ - [ ] Replace the `git+ssh` dependencies with published wheels or read-only URLs so new users can install without deploy keys.
+ - [ ] Audit motion, audio, and prompt assets so that `package-data` lists every required non-Python file.
+ - [ ] Provide cross-platform installation notes (macOS, Windows) and verify PyGObject availability.
+ - [ ] Add integration tests that exercise the movement manager and tool dispatch in simulation mode.
+ - [ ] Record screenshots or short clips of the Gradio UI for the README once the design stabilises.
+ 
+ ## License
+ 
+ Reachy Mini Conversation Demo is released under the terms described in `LICENSE`.
src/reachy_mini_conversation_demo/main.py CHANGED
@@ -31,7 +31,7 @@ def main():
logger = setup_logger(args.debug)
logger.info("Starting Reachy Mini Conversation Demo")

- robot = ReachyMini(use_sim=False)
+ robot = ReachyMini()

camera_worker, _, vision_manager = handle_vision_stuff(args, robot)