Abduallah Abuhassan committed on
Commit b82aa95 · 1 Parent(s): 7129565

Initialize Git LFS and add project files with binary tracking

This view is limited to 50 files because it contains too many changes. See the raw diff for the full change set.
Files changed (50)
  1. .env.example +30 -0
  2. .gitattributes +2 -0
  3. Dockerfile +6 -7
  4. docs/assets/conversation_app_arch.svg +0 -0
  5. docs/scheme.mmd +63 -0
  6. external_content/external_profiles/starter_profile/instructions.txt +6 -0
  7. external_content/external_profiles/starter_profile/tools.txt +11 -0
  8. external_content/external_tools/starter_custom_tool.py +33 -0
  9. pyproject.toml +142 -0
  10. src/reachy_mini_conversation_app.egg-info/PKG-INFO +364 -0
  11. src/reachy_mini_conversation_app.egg-info/SOURCES.txt +81 -0
  12. src/reachy_mini_conversation_app.egg-info/dependency_links.txt +1 -0
  13. src/reachy_mini_conversation_app.egg-info/entry_points.txt +5 -0
  14. src/reachy_mini_conversation_app.egg-info/requires.txt +39 -0
  15. src/reachy_mini_conversation_app.egg-info/top_level.txt +1 -0
  16. src/reachy_mini_conversation_app/__init__.py +1 -0
  17. src/reachy_mini_conversation_app/audio/__init__.py +1 -0
  18. src/reachy_mini_conversation_app/audio/head_wobbler.py +181 -0
  19. src/reachy_mini_conversation_app/audio/speech_tapper.py +268 -0
  20. src/reachy_mini_conversation_app/camera_worker.py +241 -0
  21. src/reachy_mini_conversation_app/config.py +223 -0
  22. src/reachy_mini_conversation_app/console.py +377 -0
  23. src/reachy_mini_conversation_app/dance_emotion_moves.py +154 -0
  24. src/reachy_mini_conversation_app/gradio_personality.py +316 -0
  25. src/reachy_mini_conversation_app/headless_personality.py +102 -0
  26. src/reachy_mini_conversation_app/headless_personality_ui.py +287 -0
  27. src/reachy_mini_conversation_app/images/reachymini_avatar.png +3 -0
  28. src/reachy_mini_conversation_app/images/user_avatar.png +3 -0
  29. src/reachy_mini_conversation_app/main.py +246 -0
  30. src/reachy_mini_conversation_app/moves.py +849 -0
  31. src/reachy_mini_conversation_app/ollama_handler.py +558 -0
  32. src/reachy_mini_conversation_app/profiles/__init__.py +1 -0
  33. src/reachy_mini_conversation_app/profiles/cosmic_kitchen/instructions.txt +49 -0
  34. src/reachy_mini_conversation_app/profiles/cosmic_kitchen/tools.txt +8 -0
  35. src/reachy_mini_conversation_app/profiles/default/instructions.txt +1 -0
  36. src/reachy_mini_conversation_app/profiles/default/tools.txt +8 -0
  37. src/reachy_mini_conversation_app/profiles/example/instructions.txt +3 -0
  38. src/reachy_mini_conversation_app/profiles/example/sweep_look.py +127 -0
  39. src/reachy_mini_conversation_app/profiles/example/tools.txt +13 -0
  40. src/reachy_mini_conversation_app/profiles/mars_rover/instructions.txt +25 -0
  41. src/reachy_mini_conversation_app/profiles/mars_rover/tools.txt +8 -0
  42. src/reachy_mini_conversation_app/profiles/short_bored_teenager/instructions.txt +1 -0
  43. src/reachy_mini_conversation_app/profiles/short_bored_teenager/tools.txt +8 -0
  44. src/reachy_mini_conversation_app/profiles/short_captain_circuit/instructions.txt +1 -0
  45. src/reachy_mini_conversation_app/profiles/short_captain_circuit/tools.txt +8 -0
  46. src/reachy_mini_conversation_app/profiles/short_chess_coach/instructions.txt +1 -0
  47. src/reachy_mini_conversation_app/profiles/short_chess_coach/tools.txt +8 -0
  48. src/reachy_mini_conversation_app/profiles/short_hype_bot/instructions.txt +1 -0
  49. src/reachy_mini_conversation_app/profiles/short_hype_bot/tools.txt +8 -0
  50. src/reachy_mini_conversation_app/profiles/short_mad_scientist_assistant/instructions.txt +1 -0
.env.example ADDED
@@ -0,0 +1,30 @@
+ # Ollama LLM server
+ OLLAMA_BASE_URL=http://localhost:11434
+ MODEL_NAME="llama3.2"
+
+ # STT (faster-whisper model size: tiny, base, small, medium, large-v3)
+ STT_MODEL=base
+
+ # TTS (edge-tts voice name, e.g. en-US-AriaNeural, en-GB-SoniaNeural)
+ TTS_VOICE=en-US-AriaNeural
+
+ # Local vision model (only used with --local-vision CLI flag)
+ LOCAL_VISION_MODEL=HuggingFaceTB/SmolVLM2-2.2B-Instruct
+
+ # Cache for local VLM (only used with --local-vision CLI flag)
+ HF_HOME=./cache
+
+ # Hugging Face token for accessing datasets/models
+ HF_TOKEN=
+
+ # Profile selection (defaults to "default" when unset)
+ REACHY_MINI_CUSTOM_PROFILE="example"
+
+ # Optional external profile/tool directories
+ # REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY=external_content/external_profiles
+ # REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY=external_content/external_tools
+
+ # Optional: discover and auto-load all tools found in REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY,
+ # even if they are not listed in the selected profile's tools.txt.
+ # This is convenient for downloaded tools used with built-in/default profiles.
+ # AUTOLOAD_EXTERNAL_TOOLS=1
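The settings above can be read at startup with environment-variable defaults that mirror the commented values. A minimal stdlib sketch, assuming the documented defaults; the `Settings` class is illustrative, and the real app loads `.env` via `python-dotenv`:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Illustrative container mirroring the .env.example defaults."""

    ollama_base_url: str
    model_name: str
    stt_model: str
    tts_voice: str
    profile: str

    @classmethod
    def from_env(cls) -> "Settings":
        # Fall back to the documented defaults when a variable is unset.
        return cls(
            ollama_base_url=os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434"),
            model_name=os.environ.get("MODEL_NAME", "llama3.2"),
            stt_model=os.environ.get("STT_MODEL", "base"),
            tts_voice=os.environ.get("TTS_VOICE", "en-US-AriaNeural"),
            profile=os.environ.get("REACHY_MINI_CUSTOM_PROFILE", "default"),
        )
```

Commented-out variables (such as `AUTOLOAD_EXTERNAL_TOOLS`) stay unset until explicitly enabled, so a pattern like this keeps the same opt-in behavior.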
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.gif filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
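The two added lines route GIF and PNG files through Git LFS. Whether a filename matches such a glob pattern can be checked with `fnmatch`. A rough sketch only: real `.gitattributes` matching has additional path-scoping rules this ignores:

```python
from fnmatch import fnmatch

# Patterns from the .gitattributes hunk above (context lines plus the two additions).
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "*.gif", "*.png"]


def tracked_by_lfs(filename: str) -> bool:
    """Return True when the filename matches any LFS-tracked pattern."""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)
```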
Dockerfile CHANGED
@@ -1,10 +1,12 @@
  FROM python:3.12-slim

- # System dependencies for faster-whisper (ctranslate2) and audio processing
+ # System dependencies for faster-whisper (ctranslate2) and audio/image processing
  RUN apt-get update && apt-get install -y --no-install-recommends \
      build-essential \
      ffmpeg \
      libsndfile1 \
+     libgl1 \
+     libglib2.0-0 \
      && rm -rf /var/lib/apt/lists/*

  # Create non-root user (required by HF Spaces)
@@ -15,13 +17,10 @@ ENV HOME=/home/user \

  WORKDIR /home/user/app

- # Install Python dependencies
- COPY --chown=user requirements.txt .
- RUN pip install --no-cache-dir --upgrade pip && \
-     pip install --no-cache-dir -r requirements.txt
-
- # Copy application code
+ # Install Python dependencies and the app itself
  COPY --chown=user . .
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir .

  # Expose Gradio port
  EXPOSE 7860
docs/assets/conversation_app_arch.svg ADDED
docs/scheme.mmd ADDED
@@ -0,0 +1,63 @@
+ ---
+ config:
+   layout: dagre
+   flowchart:
+     htmlLabels: true
+ ---
+ flowchart TB
+     User(["<span style='font-size:16px;font-weight:bold;'>User</span><br><span style='font-size:13px;color:#01579b;'>Person interacting with system</span>"])
+     -- audio stream -->
+     UI@{ label: "<span style='font-size:16px;font-weight:bold;'>UI Layer</span><br><span style='font-size:13px;color:#0277bd;'>Gradio/Console</span>" }
+
+     UI -- audio stream -->
+     OpenAI@{ label: "<span style='font-size:17px;font-weight:bold;'>gpt-realtime API</span><br><span style='font-size:13px; color:#7b1fa2;'>Audio+Tool Calls+Vision</span>" }
+
+     OpenAI -- audio stream -->
+     Motion@{ label: "<span style='font-size:16px;font-weight:bold;'>Motion Control</span><br><span style='font-size:13px;color:#f57f17;'>Audio Sync + Tracking</span>" }
+
+     OpenAI -- tool calls -->
+     Handlers@{ label: "<span style='font-size:16px;font-weight:bold;'>Tool Layer</span><br><span style='font-size:12px;color:#f9a825;'>Built-in tools + profile-local tools<br/>+ external tools (optional)</span>" }
+
+     Profiles@{ label: "<span style='font-size:16px;font-weight:bold;'>Selected Profile</span><br><span style='font-size:12px;color:#6a1b9a;'>built-in or external<br/>instructions.txt + tools.txt</span>" }
+
+     Profiles -- defines enabled tools --> Handlers
+
+     Handlers -- movement
+     requests --> Motion
+
+     Handlers -- camera frames, head tracking -->
+     Camera@{ label: "<span style='font-size:16px;font-weight:bold;'>Camera Worker</span><br><span style='font-size:13px;color:#f57f17;'>Frame Buffer + Head Tracking</span>" }
+
+     Handlers -. image for
+     analysis .-> OpenAI
+
+     Camera -- head tracking --> Motion
+
+     Camera -. frames .->
+     Vision@{ label: "<span style='font-size:16px;font-weight:bold;'>Vision Processor</span><br><span style='font-size:13px;color:#7b1fa2;'>Local VLM (optional)</span>" }
+
+     Vision -. description .-> Handlers
+
+     Robot@{ label: "<span style='font-size:16px;font-weight:bold;'>reachy_mini</span><br><span style='font-size:13px;color:#c62828;'>Robot Control Library</span>" }
+     -- camera
+     frames --> Camera
+
+     Motion -- commands --> Robot
+
+     Handlers -- results --> OpenAI
+
+     User:::userStyle
+     UI:::uiStyle
+     OpenAI:::aiStyle
+     Motion:::coreStyle
+     Profiles:::toolStyle
+     Handlers:::toolStyle
+     Camera:::coreStyle
+     Vision:::aiStyle
+     Robot:::hardwareStyle
+     classDef userStyle fill:#e1f5fe,stroke:#01579b,stroke-width:3px
+     classDef uiStyle fill:#b3e5fc,stroke:#0277bd,stroke-width:2px
+     classDef aiStyle fill:#e1bee7,stroke:#7b1fa2,stroke-width:3px
+     classDef coreStyle fill:#fff9c4,stroke:#f57f17,stroke-width:2px
+     classDef hardwareStyle fill:#ef9a9a,stroke:#c62828,stroke-width:3px
+     classDef toolStyle fill:#fffde7,stroke:#f9a825,stroke-width:1px
external_content/external_profiles/starter_profile/instructions.txt ADDED
@@ -0,0 +1,6 @@
+ You are a helpful Reachy Mini assistant running from an external profile.
+
+ When asked to demonstrate your custom greeting, use the `starter_custom_tool` tool.
+ You can also dance and show emotions like the built-in profiles.
+
+ Be friendly and concise, and explain that you're using an external profile/tool setup when asked about yourself.
external_content/external_profiles/starter_profile/tools.txt ADDED
@@ -0,0 +1,11 @@
+ # This file is an explicit allow-list.
+ # Every tool name listed below must be either:
+ #   - a built-in tool from src/reachy_mini_conversation_app/tools/
+ #   - or an external tool file in TOOLS_DIRECTORY (e.g. external_tools/starter_custom_tool.py)
+
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ move_head
+ starter_custom_tool
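An allow-list in this format can be parsed by skipping blank lines and `#` comments before resolving each remaining name to a tool. A minimal sketch; the app's actual loader may differ in details:

```python
from pathlib import Path
from typing import List


def parse_tools_txt(path: Path) -> List[str]:
    """Return the enabled tool names from a tools.txt allow-list.

    Blank lines and lines starting with '#' are ignored; every other
    line is treated as one tool name.
    """
    names: List[str] = []
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line)
    return names
```

Commenting a tool out with `#` is then equivalent to deleting the line, which matches the documented behavior of the file.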
external_content/external_tools/starter_custom_tool.py ADDED
@@ -0,0 +1,33 @@
+ """Example external tool implementation."""
+
+ import logging
+ from typing import Any, Dict
+
+ from reachy_mini_conversation_app.tools.core_tools import Tool, ToolDependencies
+
+
+ logger = logging.getLogger(__name__)
+
+
+ class StarterCustomTool(Tool):
+     """Placeholder custom tool - demonstrates external tool loading."""
+
+     name = "starter_custom_tool"
+     description = "A placeholder custom tool loaded from outside the library"
+     parameters_schema = {
+         "type": "object",
+         "properties": {
+             "message": {
+                 "type": "string",
+                 "description": "Optional message to include in the response",
+             },
+         },
+         "required": [],
+     }
+
+     async def __call__(self, deps: ToolDependencies, **kwargs: Any) -> Dict[str, Any]:
+         """Execute the placeholder tool."""
+         message = kwargs.get("message", "Hello from custom tool!")
+         logger.info(f"Tool call: starter_custom_tool message={message}")
+
+         return {"status": "success", "message": message}
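A dispatcher can invoke a tool of this shape by awaiting it with the shared dependencies and the model-supplied arguments. A self-contained sketch: `ToolDependencies`, `EchoTool`, and `dispatch` below are stand-ins, not the app's real classes, since those live in `reachy_mini_conversation_app`:

```python
import asyncio
from typing import Any, Dict


class ToolDependencies:
    """Stand-in for the app's dependency container (robot, camera, etc.)."""


class EchoTool:
    """Minimal tool following the same async-callable shape as StarterCustomTool."""

    name = "echo_tool"

    async def __call__(self, deps: ToolDependencies, **kwargs: Any) -> Dict[str, Any]:
        message = kwargs.get("message", "Hello from custom tool!")
        return {"status": "success", "message": message}


async def dispatch(tool, deps: ToolDependencies, arguments: Dict[str, Any]) -> Dict[str, Any]:
    # Await the tool coroutine and return its JSON-serializable result,
    # which would then be sent back to the model as the tool output.
    return await tool(deps, **arguments)


result = asyncio.run(dispatch(EchoTool(), ToolDependencies(), {"message": "hi"}))
```

Because every tool is an async callable returning a plain dict, the dispatcher needs no per-tool special casing.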
pyproject.toml ADDED
@@ -0,0 +1,142 @@
+ [build-system]
+ requires = ["setuptools"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "reachy_mini_conversation_app"
+ version = "0.2.2"
+ authors = [{ name = "Pollen Robotics", email = "contact@pollen-robotics.com" }]
+ description = ""
+ readme = "README.md"
+ requires-python = ">=3.10"
+ dependencies = [
+     # Media
+     "aiortc>=1.13.0",
+     "fastrtc>=0.0.34",
+     "gradio==5.50.1.dev1",
+     "huggingface-hub==1.3.0",
+     "opencv-python>=4.12.0.88",
+
+     # Environment variables
+     "python-dotenv",
+
+     # Ollama LLM
+     "ollama>=0.4",
+
+     # STT (Speech-to-Text)
+     "faster-whisper>=1.0",
+
+     # TTS (Text-to-Speech)
+     "edge-tts>=7.0",
+     "miniaudio>=1.60",
+
+     # Reachy mini
+     "reachy_mini_dances_library",
+     "reachy_mini_toolbox",
+     "reachy-mini>=1.3.1",
+     "eclipse-zenoh~=1.7.0",
+     "gradio_client>=1.13.3",
+ ]
+
+ [project.optional-dependencies]
+ reachy_mini_wireless = [
+     "PyGObject>=3.42.2,<=3.46.0",
+     "gst-signalling>=1.1.2",
+ ]
+ local_vision = [
+     "torch>=2.1",
+     "transformers==5.0.0rc2",
+     "num2words",
+ ]
+ yolo_vision = [
+     "ultralytics",
+     "supervision",
+ ]
+ mediapipe_vision = [
+     "mediapipe==0.10.14",
+ ]
+ all_vision = [
+     "torch>=2.1",
+     "transformers==5.0.0rc2",
+     "num2words",
+     "ultralytics",
+     "supervision",
+     "mediapipe==0.10.14",
+ ]
+
+ [dependency-groups]
+ dev = [
+     "pytest",
+     "pytest-asyncio",
+     "ruff==0.12.0",
+     "mypy==1.18.2",
+     "pre-commit",
+     "types-requests",
+     "python-semantic-release>=10.5.3",
+ ]
+
+ [project.scripts]
+ reachy-mini-conversation-app = "reachy_mini_conversation_app.main:main"
+
+ [project.entry-points."reachy_mini_apps"]
+ reachy_mini_conversation_app = "reachy_mini_conversation_app.main:ReachyMiniConversationApp"
+
+ [tool.setuptools]
+ package-dir = { "" = "src" }
+ include-package-data = true
+
+ [tool.setuptools.packages.find]
+ where = ["src"]
+
+ [tool.setuptools.package-data]
+ reachy_mini_conversation_app = [
+     "images/*",
+     "static/*",
+     ".env.example",
+     "demos/**/*.txt",
+     "prompts_library/*.txt",
+     "profiles/**/*.txt",
+     "prompts/**/*.txt",
+ ]
+
+ [tool.ruff]
+ line-length = 119
+ exclude = [".venv", "dist", "build", "**/__pycache__", "*.egg-info", ".mypy_cache", ".pytest_cache"]
+
+ [tool.ruff.lint]
+ select = [
+     "E",  # pycodestyle errors
+     "F",  # pyflakes
+     "W",  # pycodestyle warnings
+     "I",  # isort
+     "C4", # flake8-comprehensions
+     "D",  # pydocstyle
+ ]
+ ignore = [
+     "E501", # handled by formatter
+     "D100", # ignore missing module docstrings
+     "D203", # blank line before class docstring (conflicts with D211)
+     "D213", # summary on second line (conflicts with D212)
+ ]
+
+ [tool.ruff.lint.isort]
+ length-sort = true
+ lines-after-imports = 2
+ no-lines-before = ["standard-library", "local-folder"]
+ known-local-folder = ["reachy_mini_conversation_app"]
+ known-first-party = ["reachy_mini", "reachy_mini_dances_library", "reachy_mini_toolbox"]
+ split-on-trailing-comma = true
+
+ [tool.ruff.format]
+ quote-style = "double"
+ indent-style = "space"
+ skip-magic-trailing-comma = false
+ line-ending = "auto"
+
+ [tool.mypy]
+ python_version = "3.12"
+ files = ["src/"]
+ ignore_missing_imports = true
+ strict = true
+ show_error_codes = true
+ warn_unused_ignores = true
src/reachy_mini_conversation_app.egg-info/PKG-INFO ADDED
@@ -0,0 +1,364 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Metadata-Version: 2.4
2
+ Name: reachy_mini_conversation_app
3
+ Version: 0.2.2
4
+ Author-email: Pollen Robotics <contact@pollen-robotics.com>
5
+ Requires-Python: >=3.10
6
+ Description-Content-Type: text/markdown
7
+ License-File: LICENSE
8
+ Requires-Dist: aiortc>=1.13.0
9
+ Requires-Dist: fastrtc>=0.0.34
10
+ Requires-Dist: gradio==5.50.1.dev1
11
+ Requires-Dist: huggingface-hub==1.3.0
12
+ Requires-Dist: opencv-python>=4.12.0.88
13
+ Requires-Dist: python-dotenv
14
+ Requires-Dist: ollama>=0.4
15
+ Requires-Dist: faster-whisper>=1.0
16
+ Requires-Dist: edge-tts>=7.0
17
+ Requires-Dist: miniaudio>=1.60
18
+ Requires-Dist: reachy_mini_dances_library
19
+ Requires-Dist: reachy_mini_toolbox
20
+ Requires-Dist: reachy-mini>=1.3.1
21
+ Requires-Dist: eclipse-zenoh~=1.7.0
22
+ Requires-Dist: gradio_client>=1.13.3
23
+ Provides-Extra: reachy-mini-wireless
24
+ Requires-Dist: PyGObject<=3.46.0,>=3.42.2; extra == "reachy-mini-wireless"
25
+ Requires-Dist: gst-signalling>=1.1.2; extra == "reachy-mini-wireless"
26
+ Provides-Extra: local-vision
27
+ Requires-Dist: torch>=2.1; extra == "local-vision"
28
+ Requires-Dist: transformers==5.0.0rc2; extra == "local-vision"
29
+ Requires-Dist: num2words; extra == "local-vision"
30
+ Provides-Extra: yolo-vision
31
+ Requires-Dist: ultralytics; extra == "yolo-vision"
32
+ Requires-Dist: supervision; extra == "yolo-vision"
33
+ Provides-Extra: mediapipe-vision
34
+ Requires-Dist: mediapipe==0.10.14; extra == "mediapipe-vision"
35
+ Provides-Extra: all-vision
36
+ Requires-Dist: torch>=2.1; extra == "all-vision"
37
+ Requires-Dist: transformers==5.0.0rc2; extra == "all-vision"
38
+ Requires-Dist: num2words; extra == "all-vision"
39
+ Requires-Dist: ultralytics; extra == "all-vision"
40
+ Requires-Dist: supervision; extra == "all-vision"
41
+ Requires-Dist: mediapipe==0.10.14; extra == "all-vision"
42
+ Dynamic: license-file
43
+
44
+ ---
45
+ title: Reachy Mini Conversation App
46
+ emoji: 🎤
47
+ colorFrom: red
48
+ colorTo: blue
49
+ sdk: static
50
+ pinned: false
51
+ short_description: Talk with Reachy Mini !
52
+ tags:
53
+ - reachy_mini
54
+ - reachy_mini_python_app
55
+ ---
56
+
57
+ # Reachy Mini conversation app
58
+
59
+ Conversational app for the Reachy Mini robot combining OpenAI's realtime APIs, vision pipelines, and choreographed motion libraries.
60
+
61
+ ![Reachy Mini Dance](docs/assets/reachy_mini_dance.gif)
62
+
63
+ ## Table of contents
64
+ - [Overview](#overview)
65
+ - [Architecture](#architecture)
66
+ - [Installation](#installation)
67
+ - [Configuration](#configuration)
68
+ - [Running the app](#running-the-app)
69
+ - [LLM tools](#llm-tools-exposed-to-the-assistant)
70
+ - [Advanced features](#advanced-features)
71
+ - [Contributing](#contributing)
72
+ - [License](#license)
73
+
74
+ ## Overview
75
+ - Real-time audio conversation loop powered by the OpenAI realtime API and `fastrtc` for low-latency streaming.
76
+ - Vision processing uses gpt-realtime by default (when camera tool is used), with optional local vision processing using SmolVLM2 model running on-device (CPU/GPU/MPS) via `--local-vision` flag.
77
+ - Layered motion system queues primary moves (dances, emotions, goto poses, breathing) while blending speech-reactive wobble and head-tracking.
78
+ - Async tool dispatch integrates robot motion, camera capture, and optional head-tracking capabilities through a Gradio web UI with live transcripts.
79
+
80
+ ## Architecture
81
+
82
+ The app follows a layered architecture connecting the user, AI services, and robot hardware:
83
+
84
+ <p align="center">
85
+ <img src="docs/assets/conversation_app_arch.svg" alt="Architecture Diagram" width="600"/>
86
+ </p>
87
+
88
+ ## Installation
89
+
90
+ > [!IMPORTANT]
91
+ > Before using this app, you need to install [Reachy Mini's SDK](https://github.com/pollen-robotics/reachy_mini/).<br>
92
+ > Windows support is currently experimental and has not been extensively tested. Use with caution.
93
+
94
+ <details open>
95
+ <summary><b>Using uv (recommended)</b></summary>
96
+
97
+ Set up the project quickly using [uv](https://docs.astral.sh/uv/):
98
+
99
+ ```bash
100
+ # macOS (Homebrew)
101
+ uv venv --python /opt/homebrew/bin/python3.12 .venv
102
+
103
+ # Linux / Windows (Python in PATH)
104
+ uv venv --python python3.12 .venv
105
+
106
+ source .venv/bin/activate
107
+ uv sync
108
+ ```
109
+
110
+ > **Note:** To reproduce the exact dependency set from this repo's `uv.lock`, run `uv sync --frozen`. This ensures `uv` installs directly from the lockfile without re-resolving or updating any versions.
111
+
112
+ **Install optional features:**
113
+ ```bash
114
+ uv sync --extra reachy_mini_wireless # Wireless Reachy Mini with GStreamer support
115
+ uv sync --extra local_vision # Local PyTorch/Transformers vision
116
+ uv sync --extra yolo_vision # YOLO-based head-tracking
117
+ uv sync --extra mediapipe_vision # MediaPipe-based head-tracking
118
+ uv sync --extra all_vision # All vision features
119
+ ```
120
+
121
+ Combine extras or include dev dependencies:
122
+ ```bash
123
+ uv sync --extra all_vision --group dev
124
+ ```
125
+
126
+ </details>
127
+
128
+ <details>
129
+ <summary><b>Using pip</b></summary>
130
+
131
+ ```bash
132
+ python -m venv .venv
133
+ source .venv/bin/activate
134
+ pip install -e .
135
+ ```
136
+
137
+ **Install optional features:**
138
+ ```bash
139
+ pip install -e .[reachy_mini_wireless] # Wireless Reachy Mini
140
+ pip install -e .[local_vision] # Local vision stack
141
+ pip install -e .[yolo_vision] # YOLO-based vision
142
+ pip install -e .[mediapipe_vision] # MediaPipe-based vision
143
+ pip install -e .[all_vision] # All vision features
144
+ pip install -e .[dev] # Development tools
145
+ ```
146
+
147
+ Some wheels (like PyTorch) are large and require compatible CUDA or CPU builds—make sure your platform matches the binaries pulled in by each extra.
148
+
149
+ </details>
150
+
151
+ ### Optional dependency groups
152
+
153
+ | Extra | Purpose | Notes |
154
+ |-------|---------|-------|
155
+ | `reachy_mini_wireless` | Wireless Reachy Mini with GStreamer support | Required for wireless versions of Reachy Mini, includes GStreamer dependencies. |
156
+ | `local_vision` | Run the local VLM (SmolVLM2) through PyTorch/Transformers | GPU recommended. Ensure compatible PyTorch builds for your platform. |
157
+ | `yolo_vision` | YOLOv11n head tracking via `ultralytics` and `supervision` | Runs on CPU (default). GPU improves performance. Supports the `--head-tracker yolo` option. |
158
+ | `mediapipe_vision` | Lightweight landmark tracking with MediaPipe | Works on CPU. Enables `--head-tracker mediapipe`. |
159
+ | `all_vision` | Convenience alias installing every vision extra | Install when you want the flexibility to experiment with every provider. |
160
+ | `dev` | Developer tooling (`pytest`, `ruff`, `mypy`) | Development-only dependencies. Use `--group dev` with uv or `[dev]` with pip. |
161
+
162
+ **Note:** `dev` is a dependency group (not an optional dependency). With uv, use `--group dev`. With pip, use `[dev]`.
163
+
164
+ ## Configuration
165
+
166
+ 1. Copy `.env.example` to `.env`
167
+ 2. Fill in required values, notably the OpenAI API key
168
+
169
+ | Variable | Description |
170
+ |----------|-------------|
171
+ | `OPENAI_API_KEY` | Required. Grants access to the OpenAI realtime endpoint. |
172
+ | `MODEL_NAME` | Override the realtime model (defaults to `gpt-realtime`). Used for both conversation and vision (unless `--local-vision` flag is used). |
173
+ | `HF_HOME` | Cache directory for local Hugging Face downloads (only used with `--local-vision` flag, defaults to `./cache`). |
174
+ | `HF_TOKEN` | Optional token for Hugging Face access (for gated/private assets). |
175
+ | `LOCAL_VISION_MODEL` | Hugging Face model path for local vision processing (only used with `--local-vision` flag, defaults to `HuggingFaceTB/SmolVLM2-2.2B-Instruct`). |
176
+
177
+ ## Running the app
178
+
179
+ Activate your virtual environment, then launch:
180
+
181
+ ```bash
182
+ reachy-mini-conversation-app
183
+ ```
184
+
185
+ > [!TIP]
186
+ > Make sure the Reachy Mini daemon is running before launching the app. If you see a `TimeoutError`, it means the daemon isn't started. See [Reachy Mini's SDK](https://github.com/pollen-robotics/reachy_mini/) for setup instructions.
187
+
188
+ The app runs in console mode by default. Add `--gradio` to launch a web UI at http://127.0.0.1:7860/ (required for simulation mode). Vision and head-tracking options are described in the CLI table below.
189
+
190
+ ### CLI options
191
+
192
+ | Option | Default | Description |
193
+ |--------|---------|-------------|
194
+ | `--head-tracker {yolo,mediapipe}` | `None` | Select a head-tracking backend when a camera is available. YOLO is implemented locally, MediaPipe comes from the `reachy_mini_toolbox` package. Requires the matching optional extra. |
195
+ | `--no-camera` | `False` | Run without camera capture or head tracking. |
196
+ | `--local-vision` | `False` | Use local vision model (SmolVLM2) for periodic image processing instead of gpt-realtime vision. Requires `local_vision` extra to be installed. |
197
+ | `--gradio` | `False` | Launch the Gradio web UI. Without this flag, runs in console mode. Required when running in simulation mode. |
198
+ | `--robot-name` | `None` | Optional. Connect to a specific robot by name when running multiple daemons on the same subnet. See [Multiple robots on the same subnet](#advanced-features). |
199
+ | `--debug` | `False` | Enable verbose logging for troubleshooting. |
200
+
201
+ ### Examples
202
+
203
+ ```bash
204
+ # Run with MediaPipe head tracking
205
+ reachy-mini-conversation-app --head-tracker mediapipe
206
+
207
+ # Run with local vision processing (requires local_vision extra)
208
+ reachy-mini-conversation-app --local-vision
209
+
210
+ # Audio-only conversation (no camera)
211
+ reachy-mini-conversation-app --no-camera
212
+
213
+ # Launch with Gradio web interface
214
+ reachy-mini-conversation-app --gradio
215
+ ```
216
+
217
+ ## LLM tools exposed to the assistant
218
+
219
+ | Tool | Action | Dependencies |
220
+ |------|--------|--------------|
221
+ | `move_head` | Queue a head pose change (left/right/up/down/front). | Core install only. |
222
+ | `camera` | Capture the latest camera frame and send it to gpt-realtime for vision analysis. | Requires camera worker. Uses gpt-realtime vision by default. |
223
+ | `head_tracking` | Enable or disable head-tracking offsets (not identity recognition - only detects and tracks head position). | Camera worker with configured head tracker (`--head-tracker`). |
224
+ | `dance` | Queue a dance from `reachy_mini_dances_library`. | Core install only. |
225
+ | `stop_dance` | Clear queued dances. | Core install only. |
226
+ | `play_emotion` | Play a recorded emotion clip via Hugging Face datasets. | Core install only. Uses the default open emotions dataset: [`pollen-robotics/reachy-mini-emotions-library`](https://huggingface.co/datasets/pollen-robotics/reachy-mini-emotions-library). |
227
+ | `stop_emotion` | Clear queued emotions. | Core install only. |
228
+ | `do_nothing` | Explicitly remain idle. | Core install only. |
229
+
230
+ ## Advanced features
231
+
232
+ Built-in motion content is published as open Hugging Face datasets:
233
+ - Emotions: [`pollen-robotics/reachy-mini-emotions-library`](https://huggingface.co/datasets/pollen-robotics/reachy-mini-emotions-library)
234
+ - Dances: [`pollen-robotics/reachy-mini-dances-library`](https://huggingface.co/datasets/pollen-robotics/reachy-mini-dances-library)
235
+
236
+ <details>
237
+ <summary><b>Custom profiles</b></summary>
238
+
239
+ Create custom profiles with dedicated instructions and enabled tools.
240
+
241
+ Set `REACHY_MINI_CUSTOM_PROFILE=<name>` to load `src/reachy_mini_conversation_app/profiles/<name>/` (see `.env.example`). If unset, the `default` profile is used.
242
+
243
+ Each profile should include `instructions.txt` (prompt text). `tools.txt` (list of allowed tools) is recommended. If missing for a non-default profile, the app falls back to `profiles/default/tools.txt`. Profiles can optionally contain custom tool implementations.
244
+
245
+ **Custom instructions:**
246
+
247
+ Write plain-text prompts in `instructions.txt`. To reuse shared prompt pieces, add lines like:
248
+ ```
249
+ [passion_for_lobster_jokes]
250
+ [identities/witty_identity]
251
+ ```
252
+ Each placeholder pulls the matching file under `src/reachy_mini_conversation_app/prompts/` (nested paths allowed). See `src/reachy_mini_conversation_app/profiles/example/` for a reference layout.
253
+
254
+ **Enabling tools:**
255
+
256
+ List enabled tools in `tools.txt`, one per line. Prefix with `#` to comment out:
257
+ ```
258
+ play_emotion
259
+ # move_head
260
+
261
+ # My custom tool defined locally
262
+ sweep_look
263
+ ```
264
+ Tools are resolved first from Python files in the profile folder (custom tools), then from the core library `src/reachy_mini_conversation_app/tools/` (like `dance`, `head_tracking`).
265
+
266
+ **Custom tools:**
267
+
268
+ On top of built-in tools found in the core library, you can implement custom tools specific to your profile by adding Python files in the profile folder.
269
+ Custom tools must subclass `reachy_mini_conversation_app.tools.core_tools.Tool` (see `profiles/example/sweep_look.py`).
270
+
271
+ **Edit personalities from the UI:**
272
+
273
+ When running with `--gradio`, open the "Personality" accordion:
274
+ - Select among available profiles (folders under `src/reachy_mini_conversation_app/profiles/`) or the built‑in default.
275
+ - Click "Apply" to update the current session instructions live.
276
+ - Create a new personality by entering a name and instructions text. It stores files under `profiles/<name>/` and copies `tools.txt` from the `default` profile.
277
+
278
+ Note: The "Personality" panel updates the conversation instructions. Tool sets are loaded at startup from `tools.txt` and are not hot‑reloaded.
279
+
280
+ </details>
281
+
282
+ <details>
283
+ <summary><b>Locked profile mode</b></summary>
284
+
285
+ To create a locked variant of the app that cannot switch profiles, edit `src/reachy_mini_conversation_app/config.py` and set the `LOCKED_PROFILE` constant to the desired profile name:
286
+ ```python
287
+ LOCKED_PROFILE: str | None = "mars_rover" # Lock to this profile
288
+ ```
289
+ When `LOCKED_PROFILE` is set, the app always uses that profile, ignoring `REACHY_MINI_CUSTOM_PROFILE` env var & the Gradio UI shows "(locked)" and disables all profile editing controls.
290
+ This is useful for creating dedicated clones of the app with a fixed personality. Clone scripts can simply edit this constant to lock the variant.
291
+
292
+ </details>
293
+
294
+ <details>
295
+ <summary><b>External profiles and tools</b></summary>
296
+
297
+ You can extend the app with profiles/tools stored outside `src/reachy_mini_conversation_app/`.
298
+
299
+ - Core profiles are under `src/reachy_mini_conversation_app/profiles/`.
300
+ - Core tools are under `src/reachy_mini_conversation_app/tools/`.
301
+
302
+ **Recommended layout:**
303
+
304
+ ```text
305
+ external_content/
306
+ ├── external_profiles/
307
+ │ └── my_profile/
308
+ │ ├── instructions.txt
309
+ │ ├── tools.txt # optional (see fallback behavior below)
310
+ │ └── voice.txt # optional
311
+ └── external_tools/
312
+ └── my_custom_tool.py
313
+ ```
314
+
315
+ **Environment variables:**
316
+
317
+ Set these values in your `.env` (copy from `.env.example`):
318
+
319
+ ```env
320
+ REACHY_MINI_CUSTOM_PROFILE=my_profile
321
+ REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY=./external_content/external_profiles
322
+ REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY=./external_content/external_tools
323
+ # Optional convenience mode:
324
+ # AUTOLOAD_EXTERNAL_TOOLS=1
325
+ ```
326
+
327
+ **Loading behavior:**
328
+
329
+ - **Default/strict mode**: `tools.txt` defines enabled tools explicitly. Every name in `tools.txt` must resolve to either a built-in tool (`src/reachy_mini_conversation_app/tools/`) or an external tool module in `REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY`.
330
+ - **Convenience mode** (`AUTOLOAD_EXTERNAL_TOOLS=1`): all valid `*.py` tool files in `REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY` are auto-added.
331
+ - **External profile fallback**: if the selected external profile has no `tools.txt`, the app falls back to built-in `profiles/default/tools.txt`.
332
+
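For illustration, in strict mode a profile's `tools.txt` lists one tool name per line, and each name must resolve to a built-in or external tool module. The entries below are examples (`my_custom_tool` is a hypothetical external module):

```text
dance
head_tracking
my_custom_tool
```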
333
+ This supports both:
334
+ 1. Downloaded external tools used with the built-in default profile.
336
+ 2. Downloaded external profiles used with the built-in default tools.
336
+
337
+ </details>
338
+
339
+ <details>
340
+ <summary><b>Multiple robots on the same subnet</b></summary>
341
+
342
+ If you run multiple Reachy Mini daemons on the same network, use:
343
+
344
+ ```bash
345
+ reachy-mini-conversation-app --robot-name <name>
346
+ ```
347
+
348
+ `<name>` must match the daemon's `--robot-name` value so the app connects to the correct robot.
349
+
350
+ </details>
351
+
352
+ ## Contributing
353
+
354
+ We welcome bug fixes, features, profiles, and documentation improvements. Please review our
355
+ [contribution guide](CONTRIBUTING.md) for branch conventions, quality checks, and PR workflow.
356
+
357
+ Quick start:
358
+ - Fork and clone the repo
359
+ - Follow the [installation steps](#installation) (include the `dev` dependency group)
360
+ - Run contributor checks listed in [CONTRIBUTING.md](CONTRIBUTING.md)
361
+
362
+ ## License
363
+
364
+ Apache 2.0
src/reachy_mini_conversation_app.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,81 @@
1
+ LICENSE
2
+ README.md
3
+ pyproject.toml
4
+ src/reachy_mini_conversation_app/__init__.py
5
+ src/reachy_mini_conversation_app/camera_worker.py
6
+ src/reachy_mini_conversation_app/config.py
7
+ src/reachy_mini_conversation_app/console.py
8
+ src/reachy_mini_conversation_app/dance_emotion_moves.py
9
+ src/reachy_mini_conversation_app/gradio_personality.py
10
+ src/reachy_mini_conversation_app/headless_personality.py
11
+ src/reachy_mini_conversation_app/headless_personality_ui.py
12
+ src/reachy_mini_conversation_app/main.py
13
+ src/reachy_mini_conversation_app/moves.py
14
+ src/reachy_mini_conversation_app/ollama_handler.py
15
+ src/reachy_mini_conversation_app/prompts.py
16
+ src/reachy_mini_conversation_app/utils.py
17
+ src/reachy_mini_conversation_app.egg-info/PKG-INFO
18
+ src/reachy_mini_conversation_app.egg-info/SOURCES.txt
19
+ src/reachy_mini_conversation_app.egg-info/dependency_links.txt
20
+ src/reachy_mini_conversation_app.egg-info/entry_points.txt
21
+ src/reachy_mini_conversation_app.egg-info/requires.txt
22
+ src/reachy_mini_conversation_app.egg-info/top_level.txt
23
+ src/reachy_mini_conversation_app/audio/__init__.py
24
+ src/reachy_mini_conversation_app/audio/head_wobbler.py
25
+ src/reachy_mini_conversation_app/audio/speech_tapper.py
26
+ src/reachy_mini_conversation_app/images/reachymini_avatar.png
27
+ src/reachy_mini_conversation_app/images/user_avatar.png
28
+ src/reachy_mini_conversation_app/profiles/__init__.py
29
+ src/reachy_mini_conversation_app/profiles/cosmic_kitchen/instructions.txt
30
+ src/reachy_mini_conversation_app/profiles/cosmic_kitchen/tools.txt
31
+ src/reachy_mini_conversation_app/profiles/default/instructions.txt
32
+ src/reachy_mini_conversation_app/profiles/default/tools.txt
33
+ src/reachy_mini_conversation_app/profiles/example/instructions.txt
34
+ src/reachy_mini_conversation_app/profiles/example/sweep_look.py
35
+ src/reachy_mini_conversation_app/profiles/example/tools.txt
36
+ src/reachy_mini_conversation_app/profiles/mars_rover/instructions.txt
37
+ src/reachy_mini_conversation_app/profiles/mars_rover/tools.txt
38
+ src/reachy_mini_conversation_app/profiles/short_bored_teenager/instructions.txt
39
+ src/reachy_mini_conversation_app/profiles/short_bored_teenager/tools.txt
40
+ src/reachy_mini_conversation_app/profiles/short_captain_circuit/instructions.txt
41
+ src/reachy_mini_conversation_app/profiles/short_captain_circuit/tools.txt
42
+ src/reachy_mini_conversation_app/profiles/short_chess_coach/instructions.txt
43
+ src/reachy_mini_conversation_app/profiles/short_chess_coach/tools.txt
44
+ src/reachy_mini_conversation_app/profiles/short_hype_bot/instructions.txt
45
+ src/reachy_mini_conversation_app/profiles/short_hype_bot/tools.txt
46
+ src/reachy_mini_conversation_app/profiles/short_mad_scientist_assistant/instructions.txt
47
+ src/reachy_mini_conversation_app/profiles/short_mad_scientist_assistant/tools.txt
48
+ src/reachy_mini_conversation_app/profiles/short_nature_documentarian/instructions.txt
49
+ src/reachy_mini_conversation_app/profiles/short_nature_documentarian/tools.txt
50
+ src/reachy_mini_conversation_app/profiles/short_noir_detective/instructions.txt
51
+ src/reachy_mini_conversation_app/profiles/short_noir_detective/tools.txt
52
+ src/reachy_mini_conversation_app/profiles/short_time_traveler/instructions.txt
53
+ src/reachy_mini_conversation_app/profiles/short_time_traveler/tools.txt
54
+ src/reachy_mini_conversation_app/profiles/short_victorian_butler/instructions.txt
55
+ src/reachy_mini_conversation_app/profiles/short_victorian_butler/tools.txt
56
+ src/reachy_mini_conversation_app/profiles/sorry_bro/instructions.txt
57
+ src/reachy_mini_conversation_app/profiles/sorry_bro/tools.txt
58
+ src/reachy_mini_conversation_app/prompts/default_prompt.txt
59
+ src/reachy_mini_conversation_app/prompts/passion_for_lobster_jokes.txt
60
+ src/reachy_mini_conversation_app/prompts/behaviors/silent_robot.txt
61
+ src/reachy_mini_conversation_app/prompts/identities/basic_info.txt
62
+ src/reachy_mini_conversation_app/prompts/identities/witty_identity.txt
63
+ src/reachy_mini_conversation_app/static/index.html
64
+ src/reachy_mini_conversation_app/static/main.js
65
+ src/reachy_mini_conversation_app/static/style.css
66
+ src/reachy_mini_conversation_app/tools/__init__.py
67
+ src/reachy_mini_conversation_app/tools/camera.py
68
+ src/reachy_mini_conversation_app/tools/core_tools.py
69
+ src/reachy_mini_conversation_app/tools/dance.py
70
+ src/reachy_mini_conversation_app/tools/do_nothing.py
71
+ src/reachy_mini_conversation_app/tools/head_tracking.py
72
+ src/reachy_mini_conversation_app/tools/move_head.py
73
+ src/reachy_mini_conversation_app/tools/play_emotion.py
74
+ src/reachy_mini_conversation_app/tools/stop_dance.py
75
+ src/reachy_mini_conversation_app/tools/stop_emotion.py
76
+ src/reachy_mini_conversation_app/vision/__init__.py
77
+ src/reachy_mini_conversation_app/vision/processors.py
78
+ src/reachy_mini_conversation_app/vision/yolo_head_tracker.py
79
+ tests/test_config_name_collisions.py
80
+ tests/test_external_loading.py
81
+ tests/test_ollama_handler.py
src/reachy_mini_conversation_app.egg-info/dependency_links.txt ADDED
@@ -0,0 +1 @@
1
+
src/reachy_mini_conversation_app.egg-info/entry_points.txt ADDED
@@ -0,0 +1,5 @@
1
+ [console_scripts]
2
+ reachy-mini-conversation-app = reachy_mini_conversation_app.main:main
3
+
4
+ [reachy_mini_apps]
5
+ reachy_mini_conversation_app = reachy_mini_conversation_app.main:ReachyMiniConversationApp
src/reachy_mini_conversation_app.egg-info/requires.txt ADDED
@@ -0,0 +1,39 @@
1
+ aiortc>=1.13.0
2
+ fastrtc>=0.0.34
3
+ gradio==5.50.1.dev1
4
+ huggingface-hub==1.3.0
5
+ opencv-python>=4.12.0.88
6
+ python-dotenv
7
+ ollama>=0.4
8
+ faster-whisper>=1.0
9
+ edge-tts>=7.0
10
+ miniaudio>=1.60
11
+ reachy_mini_dances_library
12
+ reachy_mini_toolbox
13
+ reachy-mini>=1.3.1
14
+ eclipse-zenoh~=1.7.0
15
+ gradio_client>=1.13.3
16
+
17
+ [all_vision]
18
+ torch>=2.1
19
+ transformers==5.0.0rc2
20
+ num2words
21
+ ultralytics
22
+ supervision
23
+ mediapipe==0.10.14
24
+
25
+ [local_vision]
26
+ torch>=2.1
27
+ transformers==5.0.0rc2
28
+ num2words
29
+
30
+ [mediapipe_vision]
31
+ mediapipe==0.10.14
32
+
33
+ [reachy_mini_wireless]
34
+ PyGObject<=3.46.0,>=3.42.2
35
+ gst-signalling>=1.1.2
36
+
37
+ [yolo_vision]
38
+ ultralytics
39
+ supervision
src/reachy_mini_conversation_app.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
1
+ reachy_mini_conversation_app
src/reachy_mini_conversation_app/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """Nothing (for ruff)."""
src/reachy_mini_conversation_app/audio/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """Nothing (for ruff)."""
src/reachy_mini_conversation_app/audio/head_wobbler.py ADDED
@@ -0,0 +1,181 @@
1
+ """Moves head given audio samples."""
2
+
3
+ import time
4
+ import queue
5
+ import base64
6
+ import logging
7
+ import threading
8
+ from typing import Tuple
9
+ from collections.abc import Callable
10
+
11
+ import numpy as np
12
+ from numpy.typing import NDArray
13
+
14
+ from reachy_mini_conversation_app.audio.speech_tapper import HOP_MS, SwayRollRT
15
+
16
+
17
+ SAMPLE_RATE = 24000
18
+ MOVEMENT_LATENCY_S = 0.2 # seconds between audio and robot movement
19
+ logger = logging.getLogger(__name__)
20
+
21
+
22
+ class HeadWobbler:
23
+ """Converts audio deltas (base64) into head movement offsets."""
24
+
25
+ def __init__(self, set_speech_offsets: Callable[[Tuple[float, float, float, float, float, float]], None]) -> None:
26
+ """Initialize the head wobbler."""
27
+ self._apply_offsets = set_speech_offsets
28
+ self._base_ts: float | None = None
29
+ self._hops_done: int = 0
30
+
31
+ self.audio_queue: "queue.Queue[Tuple[int, int, NDArray[np.int16]]]" = queue.Queue()
32
+ self.sway = SwayRollRT()
33
+
34
+ # Synchronization primitives
35
+ self._state_lock = threading.Lock()
36
+ self._sway_lock = threading.Lock()
37
+ self._generation = 0
38
+
39
+ self._stop_event = threading.Event()
40
+ self._thread: threading.Thread | None = None
41
+
42
+ def feed(self, delta_b64: str) -> None:
43
+ """Thread-safe: push audio into the consumer queue."""
44
+ buf = np.frombuffer(base64.b64decode(delta_b64), dtype=np.int16).reshape(1, -1)
45
+ with self._state_lock:
46
+ generation = self._generation
47
+ self.audio_queue.put((generation, SAMPLE_RATE, buf))
48
+
49
+ def start(self) -> None:
50
+ """Start the head wobbler loop in a thread."""
51
+ self._stop_event.clear()
52
+ self._thread = threading.Thread(target=self.working_loop, daemon=True)
53
+ self._thread.start()
54
+ logger.debug("Head wobbler started")
55
+
56
+ def stop(self) -> None:
57
+ """Stop the head wobbler loop."""
58
+ self._stop_event.set()
59
+ if self._thread is not None:
60
+ self._thread.join()
61
+ logger.debug("Head wobbler stopped")
62
+
63
+ def working_loop(self) -> None:
64
+ """Convert audio deltas into head movement offsets."""
65
+ hop_dt = HOP_MS / 1000.0
66
+
67
+ logger.debug("Head wobbler thread started")
68
+ while not self._stop_event.is_set():
69
+ queue_ref = self.audio_queue
70
+ try:
71
+ chunk_generation, sr, chunk = queue_ref.get_nowait() # (gen, sr, data)
72
+ except queue.Empty:
73
+ # sleep briefly so the loop doesn't busy-spin while the queue is empty
74
+ time.sleep(MOVEMENT_LATENCY_S)
75
+ continue
76
+
77
+ try:
78
+ with self._state_lock:
79
+ current_generation = self._generation
80
+ if chunk_generation != current_generation:
81
+ continue
82
+
83
+ if self._base_ts is None:
84
+ with self._state_lock:
85
+ if self._base_ts is None:
86
+ self._base_ts = time.monotonic()
87
+
88
+ pcm = np.asarray(chunk).squeeze(0)
89
+ with self._sway_lock:
90
+ results = self.sway.feed(pcm, sr)
91
+
92
+ i = 0
93
+ while i < len(results):
94
+ with self._state_lock:
95
+ if self._generation != current_generation:
96
+ break
97
+ base_ts = self._base_ts
98
+ hops_done = self._hops_done
99
+
100
+ if base_ts is None:
101
+ base_ts = time.monotonic()
102
+ with self._state_lock:
103
+ if self._base_ts is None:
104
+ self._base_ts = base_ts
105
+ hops_done = self._hops_done
106
+
107
+ target = base_ts + MOVEMENT_LATENCY_S + hops_done * hop_dt
108
+ now = time.monotonic()
109
+
110
+ if now - target >= hop_dt:
111
+ lag_hops = int((now - target) / hop_dt)
112
+ drop = min(lag_hops, len(results) - i - 1)
113
+ if drop > 0:
114
+ with self._state_lock:
115
+ self._hops_done += drop
116
+ hops_done = self._hops_done
117
+ i += drop
118
+ continue
119
+
120
+ if target > now:
121
+ time.sleep(target - now)
122
+ with self._state_lock:
123
+ if self._generation != current_generation:
124
+ break
125
+
126
+ r = results[i]
127
+ offsets = (
128
+ r["x_mm"] / 1000.0,
129
+ r["y_mm"] / 1000.0,
130
+ r["z_mm"] / 1000.0,
131
+ r["roll_rad"],
132
+ r["pitch_rad"],
133
+ r["yaw_rad"],
134
+ )
135
+
136
+ with self._state_lock:
137
+ if self._generation != current_generation:
138
+ break
139
+
140
+ self._apply_offsets(offsets)
141
+
142
+ with self._state_lock:
143
+ self._hops_done += 1
144
+ i += 1
145
+ finally:
146
+ queue_ref.task_done()
147
+ logger.debug("Head wobbler thread exited")
148
+
149
+ '''
150
+ def drain_audio_queue(self) -> None:
151
+ """Empty the audio queue."""
152
+ try:
153
+ while True:
154
+ self.audio_queue.get_nowait()
155
+ except queue.Empty:
156
+ pass
157
+ '''
158
+
159
+ def reset(self) -> None:
160
+ """Reset the internal state."""
161
+ with self._state_lock:
162
+ self._generation += 1
163
+ self._base_ts = None
164
+ self._hops_done = 0
165
+
166
+ # Drain any queued audio chunks from previous generations
167
+ drained_any = False
168
+ while True:
169
+ try:
170
+ _, _, _ = self.audio_queue.get_nowait()
171
+ except queue.Empty:
172
+ break
173
+ else:
174
+ drained_any = True
175
+ self.audio_queue.task_done()
176
+
177
+ with self._sway_lock:
178
+ self.sway.reset()
179
+
180
+ if drained_any:
181
+ logger.debug("Head wobbler queue drained during reset")
src/reachy_mini_conversation_app/audio/speech_tapper.py ADDED
@@ -0,0 +1,268 @@
1
+ from __future__ import annotations
2
+ import math
3
+ from typing import Any, Dict, List
4
+ from itertools import islice
5
+ from collections import deque
6
+
7
+ import numpy as np
8
+ from numpy.typing import NDArray
9
+
10
+
11
+ # Tunables
12
+ SR = 16_000
13
+ FRAME_MS = 20
14
+ HOP_MS = 50
15
+
16
+ SWAY_MASTER = 1.5
17
+ SENS_DB_OFFSET = +4.0
18
+ VAD_DB_ON = -35.0
19
+ VAD_DB_OFF = -45.0
20
+ VAD_ATTACK_MS = 40
21
+ VAD_RELEASE_MS = 250
22
+ ENV_FOLLOW_GAIN = 0.65
23
+
24
+ SWAY_F_PITCH = 2.2
25
+ SWAY_A_PITCH_DEG = 4.5
26
+ SWAY_F_YAW = 0.6
27
+ SWAY_A_YAW_DEG = 7.5
28
+ SWAY_F_ROLL = 1.3
29
+ SWAY_A_ROLL_DEG = 2.25
30
+ SWAY_F_X = 0.35
31
+ SWAY_A_X_MM = 4.5
32
+ SWAY_F_Y = 0.45
33
+ SWAY_A_Y_MM = 3.75
34
+ SWAY_F_Z = 0.25
35
+ SWAY_A_Z_MM = 2.25
36
+
37
+ SWAY_DB_LOW = -46.0
38
+ SWAY_DB_HIGH = -18.0
39
+ LOUDNESS_GAMMA = 0.9
40
+ SWAY_ATTACK_MS = 50
41
+ SWAY_RELEASE_MS = 250
42
+
43
+ # Derived
44
+ FRAME = int(SR * FRAME_MS / 1000)
45
+ HOP = int(SR * HOP_MS / 1000)
46
+ ATTACK_FR = max(1, int(VAD_ATTACK_MS / HOP_MS))
47
+ RELEASE_FR = max(1, int(VAD_RELEASE_MS / HOP_MS))
48
+ SWAY_ATTACK_FR = max(1, int(SWAY_ATTACK_MS / HOP_MS))
49
+ SWAY_RELEASE_FR = max(1, int(SWAY_RELEASE_MS / HOP_MS))
50
+
51
+
52
+ def _rms_dbfs(x: NDArray[np.float32]) -> float:
53
+ """Root-mean-square in dBFS for float32 mono array in [-1,1]."""
54
+ # numerically stable rms (avoid overflow)
55
+ x = x.astype(np.float32, copy=False)
56
+ rms = np.sqrt(np.mean(x * x, dtype=np.float32) + 1e-12, dtype=np.float32)
57
+ return float(20.0 * math.log10(float(rms) + 1e-12))
58
+
59
+
60
+ def _loudness_gain(db: float, offset: float = SENS_DB_OFFSET) -> float:
61
+ """Normalize dB into [0,1] with gamma; clipped to [0,1]."""
62
+ t = (db + offset - SWAY_DB_LOW) / (SWAY_DB_HIGH - SWAY_DB_LOW)
63
+ if t < 0.0:
64
+ t = 0.0
65
+ elif t > 1.0:
66
+ t = 1.0
67
+ return t**LOUDNESS_GAMMA if LOUDNESS_GAMMA != 1.0 else t
68
+
69
+
70
+ def _to_float32_mono(x: NDArray[Any]) -> NDArray[np.float32]:
71
+ """Convert arbitrary PCM array to float32 mono in [-1,1].
72
+
73
+ Accepts shapes: (N,), (1,N), (N,1), (C,N), (N,C).
74
+ """
75
+ a = np.asarray(x)
76
+ if a.ndim == 0:
77
+ return np.zeros(0, dtype=np.float32)
78
+
79
+ # If 2D, decide which axis is channels (prefer small first dim)
80
+ if a.ndim == 2:
81
+ # e.g., (channels, samples) if channels is small (<=8)
82
+ if a.shape[0] <= 8 and a.shape[0] <= a.shape[1]:
83
+ a = np.mean(a, axis=0)
84
+ else:
85
+ a = np.mean(a, axis=1)
86
+ elif a.ndim > 2:
87
+ a = np.mean(a.reshape(a.shape[0], -1), axis=0)
88
+
89
+ # Now 1D, cast/scale
90
+ if np.issubdtype(a.dtype, np.floating):
91
+ return a.astype(np.float32, copy=False)
92
+ # integer PCM
93
+ info = np.iinfo(a.dtype)
94
+ scale = float(max(-info.min, info.max))
95
+ return a.astype(np.float32) / (scale if scale != 0.0 else 1.0)
96
+
97
+
98
+ def _resample_linear(x: NDArray[np.float32], sr_in: int, sr_out: int) -> NDArray[np.float32]:
99
+ """Lightweight linear resampler for short buffers."""
100
+ if sr_in == sr_out or x.size == 0:
101
+ return x
102
+ # guard tiny sizes
103
+ n_out = int(round(x.size * sr_out / sr_in))
104
+ if n_out <= 1:
105
+ return np.zeros(0, dtype=np.float32)
106
+ t_in = np.linspace(0.0, 1.0, num=x.size, dtype=np.float32, endpoint=True)
107
+ t_out = np.linspace(0.0, 1.0, num=n_out, dtype=np.float32, endpoint=True)
108
+ return np.interp(t_out, t_in, x).astype(np.float32, copy=False)
109
+
110
+
111
+ class SwayRollRT:
112
+ """Feed audio chunks → per-hop sway outputs.
113
+
114
+ Usage:
115
+ rt = SwayRollRT()
116
+ rt.feed(pcm_int16_or_float, sr) -> List[dict]
117
+ """
118
+
119
+ def __init__(self, rng_seed: int = 7):
120
+ """Initialize state."""
121
+ self._seed = int(rng_seed)
122
+ self.samples: deque[float] = deque(maxlen=10 * SR) # sliding window for VAD/env
123
+ self.carry: NDArray[np.float32] = np.zeros(0, dtype=np.float32)
124
+
125
+ self.vad_on = False
126
+ self.vad_above = 0
127
+ self.vad_below = 0
128
+
129
+ self.sway_env = 0.0
130
+ self.sway_up = 0
131
+ self.sway_down = 0
132
+
133
+ rng = np.random.default_rng(self._seed)
134
+ self.phase_pitch = float(rng.random() * 2 * math.pi)
135
+ self.phase_yaw = float(rng.random() * 2 * math.pi)
136
+ self.phase_roll = float(rng.random() * 2 * math.pi)
137
+ self.phase_x = float(rng.random() * 2 * math.pi)
138
+ self.phase_y = float(rng.random() * 2 * math.pi)
139
+ self.phase_z = float(rng.random() * 2 * math.pi)
140
+ self.t = 0.0
141
+
142
+ def reset(self) -> None:
143
+ """Reset state (VAD/env/buffers/time) but keep initial phases/seed."""
144
+ self.samples.clear()
145
+ self.carry = np.zeros(0, dtype=np.float32)
146
+ self.vad_on = False
147
+ self.vad_above = 0
148
+ self.vad_below = 0
149
+ self.sway_env = 0.0
150
+ self.sway_up = 0
151
+ self.sway_down = 0
152
+ self.t = 0.0
153
+
154
+ def feed(self, pcm: NDArray[Any], sr: int | None) -> List[Dict[str, float]]:
155
+ """Stream in PCM chunk. Returns a list of sway dicts, one per hop (HOP_MS).
156
+
157
+ Args:
158
+ pcm: np.ndarray, shape (N,) or (C,N)/(N,C); int or float.
159
+ sr: sample rate of `pcm` (None -> assume SR).
160
+
161
+ """
162
+ sr_in = SR if sr is None else int(sr)
163
+ x = _to_float32_mono(pcm)
164
+ if x.size == 0:
165
+ return []
166
+ if sr_in != SR:
167
+ x = _resample_linear(x, sr_in, SR)
168
+ if x.size == 0:
169
+ return []
170
+
171
+ # append to carry and consume fixed HOP chunks
172
+ if self.carry.size:
173
+ self.carry = np.concatenate([self.carry, x])
174
+ else:
175
+ self.carry = x
176
+
177
+ out: List[Dict[str, float]] = []
178
+
179
+ while self.carry.size >= HOP:
180
+ hop = self.carry[:HOP]
181
+ remaining: NDArray[np.float32] = self.carry[HOP:]
182
+ self.carry = remaining
183
+
184
+ # keep sliding window for VAD/env computation
185
+ # (deque accepts any iterable; list() for small HOP is fine)
186
+ self.samples.extend(hop.tolist())
187
+ if len(self.samples) < FRAME:
188
+ self.t += HOP_MS / 1000.0
189
+ continue
190
+
191
+ frame = np.fromiter(
192
+ islice(self.samples, len(self.samples) - FRAME, len(self.samples)),
193
+ dtype=np.float32,
194
+ count=FRAME,
195
+ )
196
+ db = _rms_dbfs(frame)
197
+
198
+ # VAD with hysteresis + attack/release
199
+ if db >= VAD_DB_ON:
200
+ self.vad_above += 1
201
+ self.vad_below = 0
202
+ if not self.vad_on and self.vad_above >= ATTACK_FR:
203
+ self.vad_on = True
204
+ elif db <= VAD_DB_OFF:
205
+ self.vad_below += 1
206
+ self.vad_above = 0
207
+ if self.vad_on and self.vad_below >= RELEASE_FR:
208
+ self.vad_on = False
209
+
210
+ if self.vad_on:
211
+ self.sway_up = min(SWAY_ATTACK_FR, self.sway_up + 1)
212
+ self.sway_down = 0
213
+ else:
214
+ self.sway_down = min(SWAY_RELEASE_FR, self.sway_down + 1)
215
+ self.sway_up = 0
216
+
217
+ up = self.sway_up / SWAY_ATTACK_FR
218
+ down = 1.0 - (self.sway_down / SWAY_RELEASE_FR)
219
+ target = up if self.vad_on else down
220
+ self.sway_env += ENV_FOLLOW_GAIN * (target - self.sway_env)
221
+ # clamp
222
+ if self.sway_env < 0.0:
223
+ self.sway_env = 0.0
224
+ elif self.sway_env > 1.0:
225
+ self.sway_env = 1.0
226
+
227
+ loud = _loudness_gain(db) * SWAY_MASTER
228
+ env = self.sway_env
229
+ self.t += HOP_MS / 1000.0
230
+
231
+ # oscillators
232
+ pitch = (
233
+ math.radians(SWAY_A_PITCH_DEG)
234
+ * loud
235
+ * env
236
+ * math.sin(2 * math.pi * SWAY_F_PITCH * self.t + self.phase_pitch)
237
+ )
238
+ yaw = (
239
+ math.radians(SWAY_A_YAW_DEG)
240
+ * loud
241
+ * env
242
+ * math.sin(2 * math.pi * SWAY_F_YAW * self.t + self.phase_yaw)
243
+ )
244
+ roll = (
245
+ math.radians(SWAY_A_ROLL_DEG)
246
+ * loud
247
+ * env
248
+ * math.sin(2 * math.pi * SWAY_F_ROLL * self.t + self.phase_roll)
249
+ )
250
+ x_mm = SWAY_A_X_MM * loud * env * math.sin(2 * math.pi * SWAY_F_X * self.t + self.phase_x)
251
+ y_mm = SWAY_A_Y_MM * loud * env * math.sin(2 * math.pi * SWAY_F_Y * self.t + self.phase_y)
252
+ z_mm = SWAY_A_Z_MM * loud * env * math.sin(2 * math.pi * SWAY_F_Z * self.t + self.phase_z)
253
+
254
+ out.append(
255
+ {
256
+ "pitch_rad": pitch,
257
+ "yaw_rad": yaw,
258
+ "roll_rad": roll,
259
+ "pitch_deg": math.degrees(pitch),
260
+ "yaw_deg": math.degrees(yaw),
261
+ "roll_deg": math.degrees(roll),
262
+ "x_mm": x_mm,
263
+ "y_mm": y_mm,
264
+ "z_mm": z_mm,
265
+ },
266
+ )
267
+
268
+ return out
src/reachy_mini_conversation_app/camera_worker.py ADDED
@@ -0,0 +1,241 @@
1
+ """Camera worker thread with frame buffering and face tracking.
2
+
3
+ Ported from main_works.py camera_worker() function to provide:
4
+ - 30Hz+ camera polling with thread-safe frame buffering
5
+ - Face tracking integration with smooth interpolation
6
+ - Latest frame always available for tools
7
+ """
8
+
9
+ import time
10
+ import logging
11
+ import threading
12
+ from typing import Any, List, Tuple
13
+
14
+ import numpy as np
15
+ from numpy.typing import NDArray
16
+ from scipy.spatial.transform import Rotation as R
17
+
18
+ from reachy_mini import ReachyMini
19
+ from reachy_mini.utils.interpolation import linear_pose_interpolation
20
+
21
+
22
+ logger = logging.getLogger(__name__)
23
+
24
+
25
+ class CameraWorker:
26
+ """Thread-safe camera worker with frame buffering and face tracking."""
27
+
28
+ def __init__(self, reachy_mini: ReachyMini, head_tracker: Any = None) -> None:
29
+ """Initialize."""
30
+ self.reachy_mini = reachy_mini
31
+ self.head_tracker = head_tracker
32
+
33
+ # Thread-safe frame storage
34
+ self.latest_frame: NDArray[np.uint8] | None = None
35
+ self.frame_lock = threading.Lock()
36
+ self._stop_event = threading.Event()
37
+ self._thread: threading.Thread | None = None
38
+
39
+ # Face tracking state
40
+ self.is_head_tracking_enabled = True
41
+ self.face_tracking_offsets: List[float] = [
42
+ 0.0,
43
+ 0.0,
44
+ 0.0,
45
+ 0.0,
46
+ 0.0,
47
+ 0.0,
48
+ ] # x, y, z, roll, pitch, yaw
49
+ self.face_tracking_lock = threading.Lock()
50
+
51
+ # Face tracking timing variables (same as main_works.py)
52
+ self.last_face_detected_time: float | None = None
53
+ self.interpolation_start_time: float | None = None
54
+ self.interpolation_start_pose: NDArray[np.float32] | None = None
55
+ self.face_lost_delay = 2.0 # seconds to wait before starting interpolation
56
+ self.interpolation_duration = 1.0 # seconds to interpolate back to neutral
57
+
58
+ # Track state changes
59
+ self.previous_head_tracking_state = self.is_head_tracking_enabled
60
+
61
+ def get_latest_frame(self) -> NDArray[np.uint8] | None:
62
+ """Get the latest frame (thread-safe)."""
63
+ with self.frame_lock:
64
+ if self.latest_frame is None:
65
+ return None
66
+ # Return a copy in original BGR format (OpenCV native)
67
+ return self.latest_frame.copy()
68
+
69
+ def get_face_tracking_offsets(
70
+ self,
71
+ ) -> Tuple[float, float, float, float, float, float]:
72
+ """Get current face tracking offsets (thread-safe)."""
73
+ with self.face_tracking_lock:
74
+ offsets = self.face_tracking_offsets
75
+ return (offsets[0], offsets[1], offsets[2], offsets[3], offsets[4], offsets[5])
76
+
77
+ def set_head_tracking_enabled(self, enabled: bool) -> None:
78
+ """Enable/disable head tracking."""
79
+ self.is_head_tracking_enabled = enabled
80
+ logger.info(f"Head tracking {'enabled' if enabled else 'disabled'}")
81
+
82
+ def start(self) -> None:
83
+ """Start the camera worker loop in a thread."""
84
+ self._stop_event.clear()
85
+ self._thread = threading.Thread(target=self.working_loop, daemon=True)
86
+ self._thread.start()
87
+ logger.debug("Camera worker started")
88
+
89
+ def stop(self) -> None:
90
+ """Stop the camera worker loop."""
91
+ self._stop_event.set()
92
+ if self._thread is not None:
93
+ self._thread.join()
94
+
95
+ logger.debug("Camera worker stopped")
96
+
97
+ def working_loop(self) -> None:
98
+ """Enable the camera worker loop.
99
+
100
+ Ported from main_works.py camera_worker() with same logic.
101
+ """
102
+ logger.debug("Starting camera working loop")
103
+
104
+ # Initialize head tracker if available
105
+ neutral_pose = np.eye(4) # Neutral pose (identity matrix)
106
+ self.previous_head_tracking_state = self.is_head_tracking_enabled
107
+
108
+ while not self._stop_event.is_set():
109
+ try:
110
+ current_time = time.time()
111
+
112
+ # Get frame from robot
113
+ frame = self.reachy_mini.media.get_frame()
114
+
115
+ if frame is not None:
116
+ # Thread-safe frame storage
117
+ with self.frame_lock:
118
+ self.latest_frame = frame # .copy()
119
+
120
+ # Check if face tracking was just disabled
121
+ if self.previous_head_tracking_state and not self.is_head_tracking_enabled:
122
+ # Face tracking was just disabled - start interpolation to neutral
123
+ self.last_face_detected_time = current_time # Trigger the face-lost logic
124
+ self.interpolation_start_time = None # Will be set by the face-lost interpolation
125
+ self.interpolation_start_pose = None
126
+
127
+ # Update tracking state
128
+ self.previous_head_tracking_state = self.is_head_tracking_enabled
129
+
130
+ # Handle face tracking if enabled and head tracker available
131
+ if self.is_head_tracking_enabled and self.head_tracker is not None:
132
+ eye_center, _ = self.head_tracker.get_head_position(frame)
133
+
134
+ if eye_center is not None:
135
+ # Face detected - immediately switch to tracking
136
+ self.last_face_detected_time = current_time
137
+ self.interpolation_start_time = None # Stop any interpolation
138
+
139
+ # Convert normalized coordinates to pixel coordinates
140
+ h, w, _ = frame.shape
141
+ eye_center_norm = (eye_center + 1) / 2
142
+ eye_center_pixels = [
143
+ eye_center_norm[0] * w,
144
+ eye_center_norm[1] * h,
145
+ ]
146
+
147
+ # Get the head pose needed to look at the target, but don't perform movement
148
+ target_pose = self.reachy_mini.look_at_image(
149
+ eye_center_pixels[0],
150
+ eye_center_pixels[1],
151
+ duration=0.0,
152
+ perform_movement=False,
153
+ )
154
+
155
+                 # Extract translation and rotation from the target pose directly
+                 translation = target_pose[:3, 3]
+                 rotation = R.from_matrix(target_pose[:3, :3]).as_euler("xyz", degrees=False)
+
+                 # Scale down translation and rotation because of the smaller FOV
+                 translation *= 0.6
+                 rotation *= 0.6
+
+                 # Thread-safe update of face tracking offsets (use pose as-is)
+                 with self.face_tracking_lock:
+                     self.face_tracking_offsets = [
+                         translation[0],
+                         translation[1],
+                         translation[2],  # x, y, z
+                         rotation[0],
+                         rotation[1],
+                         rotation[2],  # roll, pitch, yaw
+                     ]
+
+             # No face detected while tracking enabled - set face lost timestamp
+             elif self.last_face_detected_time is None or self.last_face_detected_time == current_time:
+                 # Only update if we haven't already set a face lost time
+                 # (current_time check prevents overriding the disable-triggered timestamp)
+                 pass
+
+             # Handle smooth interpolation (works for both face-lost and tracking-disabled cases)
+             if self.last_face_detected_time is not None:
+                 time_since_face_lost = current_time - self.last_face_detected_time
+
+                 if time_since_face_lost >= self.face_lost_delay:
+                     # Start interpolation if not already started
+                     if self.interpolation_start_time is None:
+                         self.interpolation_start_time = current_time
+                         # Capture current pose as start of interpolation
+                         with self.face_tracking_lock:
+                             current_translation = self.face_tracking_offsets[:3]
+                             current_rotation_euler = self.face_tracking_offsets[3:]
+                         # Convert to 4x4 pose matrix
+                         pose_matrix = np.eye(4, dtype=np.float32)
+                         pose_matrix[:3, 3] = current_translation
+                         pose_matrix[:3, :3] = R.from_euler(
+                             "xyz",
+                             current_rotation_euler,
+                         ).as_matrix()
+                         self.interpolation_start_pose = pose_matrix
+
+                     # Calculate interpolation progress (t from 0 to 1)
+                     elapsed_interpolation = current_time - self.interpolation_start_time
+                     t = min(1.0, elapsed_interpolation / self.interpolation_duration)
+
+                     # Interpolate between current pose and neutral pose
+                     interpolated_pose = linear_pose_interpolation(
+                         self.interpolation_start_pose,
+                         neutral_pose,
+                         t,
+                     )
+
+                     # Extract translation and rotation from interpolated pose
+                     translation = interpolated_pose[:3, 3]
+                     rotation = R.from_matrix(interpolated_pose[:3, :3]).as_euler("xyz", degrees=False)
+
+                     # Thread-safe update of face tracking offsets
+                     with self.face_tracking_lock:
+                         self.face_tracking_offsets = [
+                             translation[0],
+                             translation[1],
+                             translation[2],  # x, y, z
+                             rotation[0],
+                             rotation[1],
+                             rotation[2],  # roll, pitch, yaw
+                         ]
+
+                     # If interpolation is complete, reset timing
+                     if t >= 1.0:
+                         self.last_face_detected_time = None
+                         self.interpolation_start_time = None
+                         self.interpolation_start_pose = None
+                 # else: keep current offsets (within the face-lost delay period)
+
+             # Small sleep to prevent excessive CPU usage
+             time.sleep(0.04)
+
+         except Exception as e:
+             logger.error(f"Camera worker error: {e}")
+             time.sleep(0.1)  # Longer sleep on error
+
+     logger.debug("Camera worker thread exited")
src/reachy_mini_conversation_app/config.py ADDED
@@ -0,0 +1,223 @@
+ import os
+ import sys
+ import logging
+ from pathlib import Path
+
+ from dotenv import find_dotenv, load_dotenv
+
+
+ # Locked profile: set to a profile name (e.g., "astronomer") to lock the app
+ # to that profile and disable all profile switching. Leave as None for normal behavior.
+ LOCKED_PROFILE: str | None = None
+ DEFAULT_PROFILES_DIRECTORY = Path(__file__).parent / "profiles"
+
+ logger = logging.getLogger(__name__)
+
+
+ def _env_flag(name: str, default: bool = False) -> bool:
+     """Parse a boolean environment flag.
+
+     Accepted truthy values: 1, true, yes, on
+     Accepted falsy values: 0, false, no, off
+     """
+     raw = os.getenv(name)
+     if raw is None:
+         return default
+
+     value = raw.strip().lower()
+     if value in {"1", "true", "yes", "on"}:
+         return True
+     if value in {"0", "false", "no", "off"}:
+         return False
+
+     logger.warning("Invalid boolean value for %s=%r, using default=%s", name, raw, default)
+     return default
+
+
+ def _collect_profile_names(profiles_root: Path) -> set[str]:
+     """Return profile folder names from a profiles root directory."""
+     if not profiles_root.exists() or not profiles_root.is_dir():
+         return set()
+     return {p.name for p in profiles_root.iterdir() if p.is_dir()}
+
+
+ def _collect_tool_module_names(tools_root: Path) -> set[str]:
+     """Return tool module names from a tools directory."""
+     if not tools_root.exists() or not tools_root.is_dir():
+         return set()
+     ignored = {"__init__", "core_tools"}
+     return {
+         p.stem
+         for p in tools_root.glob("*.py")
+         if p.is_file() and p.stem not in ignored
+     }
+
+
+ def _raise_on_name_collisions(
+     *,
+     label: str,
+     external_root: Path,
+     internal_root: Path,
+     external_names: set[str],
+     internal_names: set[str],
+ ) -> None:
+     """Raise with a clear message when external/internal names collide."""
+     collisions = sorted(external_names & internal_names)
+     if not collisions:
+         return
+
+     raise RuntimeError(
+         f"Config.__init__(): Ambiguous {label} names found in both external and built-in libraries: {collisions}. "
+         f"External {label} root: {external_root}. Built-in {label} root: {internal_root}. "
+         f"Please rename the conflicting external {label}(s) to continue."
+     )
+
+
+ # Validate LOCKED_PROFILE at startup
+ if LOCKED_PROFILE is not None:
+     _profiles_dir = DEFAULT_PROFILES_DIRECTORY
+     _profile_path = _profiles_dir / LOCKED_PROFILE
+     _instructions_file = _profile_path / "instructions.txt"
+     if not _profile_path.is_dir():
+         print(f"Error: LOCKED_PROFILE '{LOCKED_PROFILE}' does not exist in {_profiles_dir}", file=sys.stderr)
+         sys.exit(1)
+     if not _instructions_file.is_file():
+         print(f"Error: LOCKED_PROFILE '{LOCKED_PROFILE}' has no instructions.txt", file=sys.stderr)
+         sys.exit(1)
+
+ _skip_dotenv = _env_flag("REACHY_MINI_SKIP_DOTENV", default=False)
+
+ if _skip_dotenv:
+     logger.info("Skipping .env loading because REACHY_MINI_SKIP_DOTENV is set")
+ else:
+     # Locate .env file (search upward from current working directory)
+     dotenv_path = find_dotenv(usecwd=True)
+
+     if dotenv_path:
+         # Load .env and override environment variables
+         load_dotenv(dotenv_path=dotenv_path, override=True)
+         logger.info(f"Configuration loaded from {dotenv_path}")
+     else:
+         logger.warning("No .env file found, using environment variables")
+
+
+ class Config:
+     """Configuration class for the conversation app."""
+
+     # Ollama
+     OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
+     MODEL_NAME = os.getenv("MODEL_NAME", "llama3.2")
+
+     # STT (faster-whisper model size: tiny, base, small, medium, large-v3)
+     STT_MODEL = os.getenv("STT_MODEL", "base")
+
+     # TTS (edge-tts voice name)
+     TTS_VOICE = os.getenv("TTS_VOICE", "en-US-AriaNeural")
+
+     # Vision (optional, used with --local-vision CLI flag)
+     HF_HOME = os.getenv("HF_HOME", "./cache")
+     LOCAL_VISION_MODEL = os.getenv("LOCAL_VISION_MODEL", "HuggingFaceTB/SmolVLM2-2.2B-Instruct")
+     HF_TOKEN = os.getenv("HF_TOKEN")  # Optional, falls back to hf auth login if not set
+
+     logger.debug(f"Model: {MODEL_NAME}, Ollama: {OLLAMA_BASE_URL}, STT: {STT_MODEL}, TTS: {TTS_VOICE}")
+
+     _profiles_directory_env = os.getenv("REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY")
+     PROFILES_DIRECTORY = (
+         Path(_profiles_directory_env) if _profiles_directory_env else Path(__file__).parent / "profiles"
+     )
+     _tools_directory_env = os.getenv("REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY")
+     TOOLS_DIRECTORY = Path(_tools_directory_env) if _tools_directory_env else None
+     AUTOLOAD_EXTERNAL_TOOLS = _env_flag("AUTOLOAD_EXTERNAL_TOOLS", default=False)
+     REACHY_MINI_CUSTOM_PROFILE = LOCKED_PROFILE or os.getenv("REACHY_MINI_CUSTOM_PROFILE")
+
+     logger.debug(f"Custom Profile: {REACHY_MINI_CUSTOM_PROFILE}")
+
+     def __init__(self) -> None:
+         """Initialize the configuration."""
+         if self.REACHY_MINI_CUSTOM_PROFILE and self.PROFILES_DIRECTORY != DEFAULT_PROFILES_DIRECTORY:
+             selected_profile_path = self.PROFILES_DIRECTORY / self.REACHY_MINI_CUSTOM_PROFILE
+             if not selected_profile_path.is_dir():
+                 available_profiles = sorted(_collect_profile_names(self.PROFILES_DIRECTORY))
+                 raise RuntimeError(
+                     "Config.__init__(): Selected profile "
+                     f"'{self.REACHY_MINI_CUSTOM_PROFILE}' was not found in external profiles root "
+                     f"{self.PROFILES_DIRECTORY}. "
+                     f"Available external profiles: {available_profiles}. "
+                     "Either set 'REACHY_MINI_CUSTOM_PROFILE' to one of the available external profiles "
+                     "or unset 'REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY' to use built-in profiles."
+                 )
+
+         if self.PROFILES_DIRECTORY != DEFAULT_PROFILES_DIRECTORY:
+             external_profiles = _collect_profile_names(self.PROFILES_DIRECTORY)
+             internal_profiles = _collect_profile_names(DEFAULT_PROFILES_DIRECTORY)
+             _raise_on_name_collisions(
+                 label="profile",
+                 external_root=self.PROFILES_DIRECTORY,
+                 internal_root=DEFAULT_PROFILES_DIRECTORY,
+                 external_names=external_profiles,
+                 internal_names=internal_profiles,
+             )
+
+         if self.TOOLS_DIRECTORY is not None:
+             builtin_tools_root = Path(__file__).parent / "tools"
+             external_tools = _collect_tool_module_names(self.TOOLS_DIRECTORY)
+             internal_tools = _collect_tool_module_names(builtin_tools_root)
+             _raise_on_name_collisions(
+                 label="tool",
+                 external_root=self.TOOLS_DIRECTORY,
+                 internal_root=builtin_tools_root,
+                 external_names=external_tools,
+                 internal_names=internal_tools,
+             )
+
+         if self.PROFILES_DIRECTORY != DEFAULT_PROFILES_DIRECTORY:
+             logger.warning(
+                 "Environment variable 'REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY' is set. "
+                 "Profiles (instructions.txt, ...) will be loaded from %s.",
+                 self.PROFILES_DIRECTORY,
+             )
+         else:
+             logger.info(
+                 "'REACHY_MINI_EXTERNAL_PROFILES_DIRECTORY' is not set. "
+                 "Using built-in profiles from %s.",
+                 DEFAULT_PROFILES_DIRECTORY,
+             )
+
+         if self.TOOLS_DIRECTORY is not None:
+             logger.warning(
+                 "Environment variable 'REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY' is set. "
+                 "External tools will be loaded from %s.",
+                 self.TOOLS_DIRECTORY,
+             )
+         else:
+             logger.info(
+                 "'REACHY_MINI_EXTERNAL_TOOLS_DIRECTORY' is not set. "
+                 "Using built-in shared tools only."
+             )
+
+
+ config = Config()
+
+
+ def set_custom_profile(profile: str | None) -> None:
+     """Update the selected custom profile at runtime and expose it via env.
+
+     This ensures modules that read `config` and code that inspects the
+     environment see a consistent value.
+     """
+     if LOCKED_PROFILE is not None:
+         return
+     try:
+         config.REACHY_MINI_CUSTOM_PROFILE = profile
+     except Exception:
+         pass
+     try:
+         import os as _os
+
+         if profile:
+             _os.environ["REACHY_MINI_CUSTOM_PROFILE"] = profile
+         else:
+             # Remove to reflect default
+             _os.environ.pop("REACHY_MINI_CUSTOM_PROFILE", None)
+     except Exception:
+         pass
src/reachy_mini_conversation_app/console.py ADDED
@@ -0,0 +1,377 @@
+ """Bidirectional local audio stream with optional settings UI.
+
+ In headless mode, there is no Gradio UI. The app connects directly to the
+ Ollama server for LLM inference. If Ollama is not reachable, the settings
+ page shows the connection status.
+
+ The settings UI is served from this package's ``static/`` folder and offers
+ personality management. Once configured, streaming starts automatically.
+ """
+
+ import os
+ import sys
+ import time
+ import asyncio
+ import logging
+ from typing import List, Optional
+ from pathlib import Path
+
+ from fastrtc import AdditionalOutputs, audio_to_float32
+ from scipy.signal import resample
+
+ from reachy_mini import ReachyMini
+ from reachy_mini.media.media_manager import MediaBackend
+ from reachy_mini_conversation_app.config import LOCKED_PROFILE, config
+ from reachy_mini_conversation_app.ollama_handler import OllamaHandler
+ from reachy_mini_conversation_app.headless_personality_ui import mount_personality_routes
+
+
+ try:
+     # FastAPI is provided by the Reachy Mini Apps runtime
+     from fastapi import FastAPI, Response
+     from pydantic import BaseModel
+     from fastapi.responses import FileResponse, JSONResponse
+     from starlette.staticfiles import StaticFiles
+ except Exception:  # pragma: no cover - only loaded when settings_app is used
+     FastAPI = object  # type: ignore
+     FileResponse = object  # type: ignore
+     JSONResponse = object  # type: ignore
+     StaticFiles = object  # type: ignore
+     BaseModel = object  # type: ignore
+
+
+ logger = logging.getLogger(__name__)
+
+
+ class LocalStream:
+     """LocalStream using Reachy Mini's recorder/player."""
+
+     def __init__(
+         self,
+         handler: OllamaHandler,
+         robot: ReachyMini,
+         *,
+         settings_app: Optional[FastAPI] = None,
+         instance_path: Optional[str] = None,
+     ):
+         """Initialize the stream with an Ollama handler and pipelines.
+
+         - ``settings_app``: the Reachy Mini Apps FastAPI to attach settings endpoints.
+         - ``instance_path``: directory where per-instance ``.env`` should be stored.
+         """
+         self.handler = handler
+         self._robot = robot
+         self._stop_event = asyncio.Event()
+         self._tasks: List[asyncio.Task[None]] = []
+         # Allow the handler to flush the player queue when appropriate.
+         self.handler._clear_queue = self.clear_audio_queue
+         self._settings_app: Optional[FastAPI] = settings_app
+         self._instance_path: Optional[str] = instance_path
+         self._settings_initialized = False
+         self._asyncio_loop = None
+
+     # ---- Personality persistence helpers ----
+
+     def _read_env_lines(self, env_path: Path) -> list[str]:
+         """Load env file contents or a template as a list of lines."""
+         inst = env_path.parent
+         try:
+             if env_path.exists():
+                 try:
+                     return env_path.read_text(encoding="utf-8").splitlines()
+                 except Exception:
+                     return []
+             template_text = None
+             ex = inst / ".env.example"
+             if ex.exists():
+                 try:
+                     template_text = ex.read_text(encoding="utf-8")
+                 except Exception:
+                     template_text = None
+             if template_text is None:
+                 try:
+                     cwd_example = Path.cwd() / ".env.example"
+                     if cwd_example.exists():
+                         template_text = cwd_example.read_text(encoding="utf-8")
+                 except Exception:
+                     template_text = None
+             if template_text is None:
+                 packaged = Path(__file__).parent / ".env.example"
+                 if packaged.exists():
+                     try:
+                         template_text = packaged.read_text(encoding="utf-8")
+                     except Exception:
+                         template_text = None
+             return template_text.splitlines() if template_text else []
+         except Exception:
+             return []
+
+     def _persist_personality(self, profile: Optional[str]) -> None:
+         """Persist the startup personality to the instance .env and config."""
+         if LOCKED_PROFILE is not None:
+             return
+         selection = (profile or "").strip() or None
+         try:
+             from reachy_mini_conversation_app.config import set_custom_profile
+
+             set_custom_profile(selection)
+         except Exception:
+             pass
+
+         if not self._instance_path:
+             return
+         try:
+             env_path = Path(self._instance_path) / ".env"
+             lines = self._read_env_lines(env_path)
+             replaced = False
+             for i, ln in enumerate(list(lines)):
+                 if ln.strip().startswith("REACHY_MINI_CUSTOM_PROFILE="):
+                     if selection:
+                         lines[i] = f"REACHY_MINI_CUSTOM_PROFILE={selection}"
+                     else:
+                         lines.pop(i)
+                     replaced = True
+                     break
+             if selection and not replaced:
+                 lines.append(f"REACHY_MINI_CUSTOM_PROFILE={selection}")
+             if selection is None and not env_path.exists():
+                 return
+             final_text = "\n".join(lines) + "\n"
+             env_path.write_text(final_text, encoding="utf-8")
+             logger.info("Persisted startup personality to %s", env_path)
+             try:
+                 from dotenv import load_dotenv
+
+                 load_dotenv(dotenv_path=str(env_path), override=True)
+             except Exception:
+                 pass
+         except Exception as e:
+             logger.warning("Failed to persist REACHY_MINI_CUSTOM_PROFILE: %s", e)
+
+     def _read_persisted_personality(self) -> Optional[str]:
+         """Read persisted startup personality from instance .env (if any)."""
+         if not self._instance_path:
+             return None
+         env_path = Path(self._instance_path) / ".env"
+         try:
+             if env_path.exists():
+                 for ln in env_path.read_text(encoding="utf-8").splitlines():
+                     if ln.strip().startswith("REACHY_MINI_CUSTOM_PROFILE="):
+                         _, _, val = ln.partition("=")
+                         v = val.strip()
+                         return v or None
+         except Exception:
+             pass
+         return None
+
+     def _init_settings_ui_if_needed(self) -> None:
+         """Attach minimal settings UI to the settings app.
+
+         Mounts a status page and personality management when a settings_app
+         is provided.
+         """
+         if self._settings_initialized:
+             return
+         if self._settings_app is None:
+             return
+
+         static_dir = Path(__file__).parent / "static"
+         index_file = static_dir / "index.html"
+
+         if hasattr(self._settings_app, "mount"):
+             try:
+                 # Serve /static/* assets
+                 self._settings_app.mount("/static", StaticFiles(directory=str(static_dir)), name="static")
+             except Exception:
+                 pass
+
+         # GET / -> index.html
+         @self._settings_app.get("/")
+         def _root() -> FileResponse:
+             return FileResponse(str(index_file))
+
+         # GET /favicon.ico -> avoid noisy 404s
+         @self._settings_app.get("/favicon.ico")
+         def _favicon() -> Response:
+             return Response(status_code=204)
+
+         # GET /status -> Ollama connectivity check
+         @self._settings_app.get("/status")
+         async def _status() -> JSONResponse:
+             ollama_ok = False
+             try:
+                 import httpx
+
+                 async with httpx.AsyncClient(timeout=3.0) as client:
+                     resp = await client.get(f"{config.OLLAMA_BASE_URL}/api/tags")
+                     ollama_ok = resp.status_code == 200
+             except Exception:
+                 pass
+             return JSONResponse({"ollama_connected": ollama_ok, "model": config.MODEL_NAME})
+
+         # GET /ready -> whether backend finished loading tools
+         @self._settings_app.get("/ready")
+         def _ready() -> JSONResponse:
+             try:
+                 mod = sys.modules.get("reachy_mini_conversation_app.tools.core_tools")
+                 ready = bool(getattr(mod, "_TOOLS_INITIALIZED", False)) if mod else False
+             except Exception:
+                 ready = False
+             return JSONResponse({"ready": ready})
+
+         self._settings_initialized = True
+
+     def launch(self) -> None:
+         """Start the recorder/player and run the async processing loops."""
+         self._stop_event.clear()
+
+         # Try to load an existing instance .env first (covers subsequent runs)
+         if self._instance_path:
+             try:
+                 from dotenv import load_dotenv
+
+                 from reachy_mini_conversation_app.config import set_custom_profile
+
+                 env_path = Path(self._instance_path) / ".env"
+                 if env_path.exists():
+                     load_dotenv(dotenv_path=str(env_path), override=True)
+                     if LOCKED_PROFILE is None:
+                         new_profile = os.getenv("REACHY_MINI_CUSTOM_PROFILE")
+                         if new_profile is not None:
+                             try:
+                                 set_custom_profile(new_profile.strip() or None)
+                             except Exception:
+                                 pass  # Best-effort profile update
+             except Exception:
+                 pass  # Instance .env loading is optional; continue with defaults
+
+         # Always expose settings UI if a settings app is available
+         self._init_settings_ui_if_needed()
+
+         # Start media
+         self._robot.media.start_recording()
+         self._robot.media.start_playing()
+         time.sleep(1)  # give some time to the pipelines to start
+
+         async def runner() -> None:
+             # Capture loop for cross-thread personality actions
+             loop = asyncio.get_running_loop()
+             self._asyncio_loop = loop  # type: ignore[assignment]
+             # Mount personality routes now that loop and handler are available
+             try:
+                 if self._settings_app is not None:
+                     mount_personality_routes(
+                         self._settings_app,
+                         self.handler,
+                         lambda: self._asyncio_loop,
+                         persist_personality=self._persist_personality,
+                         get_persisted_personality=self._read_persisted_personality,
+                     )
+             except Exception:
+                 pass
+             self._tasks = [
+                 asyncio.create_task(self.handler.start_up(), name="ollama-handler"),
+                 asyncio.create_task(self.record_loop(), name="stream-record-loop"),
+                 asyncio.create_task(self.play_loop(), name="stream-play-loop"),
+             ]
+             try:
+                 await asyncio.gather(*self._tasks)
+             except asyncio.CancelledError:
+                 logger.info("Tasks cancelled during shutdown")
+             finally:
+                 # Ensure handler connection is closed
+                 await self.handler.shutdown()
+
+         asyncio.run(runner())
+
+     def close(self) -> None:
+         """Stop the stream and underlying media pipelines.
+
+         This method:
+         - Stops audio recording and playback first
+         - Sets the stop event to signal async loops to terminate
+         - Cancels all pending async tasks
+         """
+         logger.info("Stopping LocalStream...")
+
+         # Stop media pipelines FIRST before cancelling async tasks
+         try:
+             self._robot.media.stop_recording()
+         except Exception as e:
+             logger.debug(f"Error stopping recording (may already be stopped): {e}")
+
+         try:
+             self._robot.media.stop_playing()
+         except Exception as e:
+             logger.debug(f"Error stopping playback (may already be stopped): {e}")
+
+         # Now signal async loops to stop
+         self._stop_event.set()
+
+         # Cancel all running tasks
+         for task in self._tasks:
+             if not task.done():
+                 task.cancel()
+
+     def clear_audio_queue(self) -> None:
+         """Flush the player's appsrc to drop any queued audio immediately."""
+         logger.info("User intervention: flushing player queue")
+         if self._robot.media.backend == MediaBackend.GSTREAMER:
+             self._robot.media.audio.clear_player()
+         elif self._robot.media.backend in (MediaBackend.DEFAULT, MediaBackend.DEFAULT_NO_VIDEO):
+             self._robot.media.audio.clear_output_buffer()
+         self.handler.output_queue = asyncio.Queue()
+
+     async def record_loop(self) -> None:
+         """Read mic frames from the recorder and forward them to the handler."""
+         input_sample_rate = self._robot.media.get_input_audio_samplerate()
+         logger.debug(f"Audio recording started at {input_sample_rate} Hz")
+
+         while not self._stop_event.is_set():
+             audio_frame = self._robot.media.get_audio_sample()
+             if audio_frame is not None:
+                 await self.handler.receive((input_sample_rate, audio_frame))
+             await asyncio.sleep(0)  # avoid busy loop
+
+     async def play_loop(self) -> None:
+         """Fetch outputs from the handler: log text and play audio frames."""
+         while not self._stop_event.is_set():
+             handler_output = await self.handler.emit()
+
+             if isinstance(handler_output, AdditionalOutputs):
+                 for msg in handler_output.args:
+                     content = msg.get("content", "")
+                     if isinstance(content, str):
+                         logger.info(
+                             "role=%s content=%s",
+                             msg.get("role"),
+                             content if len(content) < 500 else content[:500] + "…",
+                         )
+
+             elif isinstance(handler_output, tuple):
+                 input_sample_rate, audio_data = handler_output
+                 output_sample_rate = self._robot.media.get_output_audio_samplerate()
+
+                 # Reshape if needed
+                 if audio_data.ndim == 2:
+                     if audio_data.shape[1] > audio_data.shape[0]:
+                         audio_data = audio_data.T
+                     if audio_data.shape[1] > 1:
+                         audio_data = audio_data[:, 0]
+
+                 # Cast if needed
+                 audio_frame = audio_to_float32(audio_data)
+
+                 # Resample if needed
+                 if input_sample_rate != output_sample_rate:
+                     audio_frame = resample(
+                         audio_frame,
+                         int(len(audio_frame) * output_sample_rate / input_sample_rate),
+                     )
+
+                 self._robot.media.push_audio_sample(audio_frame)
+
+             else:
+                 logger.debug("Ignoring output type=%s", type(handler_output).__name__)
+
+             await asyncio.sleep(0)  # yield to event loop
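The resampling step in `play_loop` derives the new frame length from the ratio of output to input sample rates before handing it to `scipy.signal.resample`. The arithmetic alone, with hypothetical rates:

```python
def resampled_length(n_samples: int, in_rate: int, out_rate: int) -> int:
    """Target sample count for a rate conversion, as computed in play_loop."""
    return int(n_samples * out_rate / in_rate)


# A 480-sample frame at 24 kHz becomes 320 samples at 16 kHz
print(resampled_length(480, 24000, 16000))  # 320
```

The truncating `int(...)` can drop a fraction of a sample per frame when the rates are not commensurate, which is acceptable for short streaming frames.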
src/reachy_mini_conversation_app/dance_emotion_moves.py ADDED
@@ -0,0 +1,154 @@
+ """Dance and emotion moves for the movement queue system.
+
+ This module implements dance moves and emotions as Move objects that can be queued
+ and executed sequentially by the MovementManager.
+ """
+
+ from __future__ import annotations
+ import logging
+ from typing import Tuple
+
+ import numpy as np
+ from numpy.typing import NDArray
+
+ from reachy_mini.motion.move import Move
+ from reachy_mini.motion.recorded_move import RecordedMoves
+ from reachy_mini_dances_library.dance_move import DanceMove
+
+
+ logger = logging.getLogger(__name__)
+
+
+ class DanceQueueMove(Move):  # type: ignore
+     """Wrapper for dance moves to work with the movement queue system."""
+
+     def __init__(self, move_name: str):
+         """Initialize a DanceQueueMove."""
+         self.dance_move = DanceMove(move_name)
+         self.move_name = move_name
+
+     @property
+     def duration(self) -> float:
+         """Duration property required by official Move interface."""
+         return float(self.dance_move.duration)
+
+     def evaluate(self, t: float) -> tuple[NDArray[np.float64] | None, NDArray[np.float64] | None, float | None]:
+         """Evaluate dance move at time t."""
+         try:
+             # Get the pose from the dance move
+             head_pose, antennas, body_yaw = self.dance_move.evaluate(t)
+
+             # Convert to numpy array if antennas is tuple and return in official Move format
+             if isinstance(antennas, tuple):
+                 antennas = np.array([antennas[0], antennas[1]])
+
+             return (head_pose, antennas, body_yaw)
+
+         except Exception as e:
+             logger.error(f"Error evaluating dance move '{self.move_name}' at t={t}: {e}")
+             # Return neutral pose on error
+             from reachy_mini.utils import create_head_pose
+
+             neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+             return (neutral_head_pose, np.array([0.0, 0.0], dtype=np.float64), 0.0)
+
+
+ class EmotionQueueMove(Move):  # type: ignore
+     """Wrapper for emotion moves to work with the movement queue system."""
+
+     def __init__(self, emotion_name: str, recorded_moves: RecordedMoves):
+         """Initialize an EmotionQueueMove."""
+         self.emotion_move = recorded_moves.get(emotion_name)
+         self.emotion_name = emotion_name
+
+     @property
+     def duration(self) -> float:
+         """Duration property required by official Move interface."""
+         return float(self.emotion_move.duration)
+
+     def evaluate(self, t: float) -> tuple[NDArray[np.float64] | None, NDArray[np.float64] | None, float | None]:
+         """Evaluate emotion move at time t."""
+         try:
+             # Get the pose from the emotion move
+             head_pose, antennas, body_yaw = self.emotion_move.evaluate(t)
+
+             # Convert to numpy array if antennas is tuple and return in official Move format
+             if isinstance(antennas, tuple):
+                 antennas = np.array([antennas[0], antennas[1]])
+
+             return (head_pose, antennas, body_yaw)
+
+         except Exception as e:
+             logger.error(f"Error evaluating emotion '{self.emotion_name}' at t={t}: {e}")
+             # Return neutral pose on error
+             from reachy_mini.utils import create_head_pose
+
+             neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+             return (neutral_head_pose, np.array([0.0, 0.0], dtype=np.float64), 0.0)
+
+
+ class GotoQueueMove(Move):  # type: ignore
+     """Wrapper for goto moves to work with the movement queue system."""
+
+     def __init__(
+         self,
+         target_head_pose: NDArray[np.float32],
+         start_head_pose: NDArray[np.float32] | None = None,
+         target_antennas: Tuple[float, float] = (0, 0),
+         start_antennas: Tuple[float, float] | None = None,
+         target_body_yaw: float = 0,
+         start_body_yaw: float | None = None,
+         duration: float = 1.0,
+     ):
+         """Initialize a GotoQueueMove."""
+         self._duration = duration
+         self.target_head_pose = target_head_pose
+         self.start_head_pose = start_head_pose
+         self.target_antennas = target_antennas
+         self.start_antennas = start_antennas or (0, 0)
+         self.target_body_yaw = target_body_yaw
+         self.start_body_yaw = start_body_yaw or 0
+
+     @property
+     def duration(self) -> float:
+         """Duration property required by official Move interface."""
+         return self._duration
+
+     def evaluate(self, t: float) -> tuple[NDArray[np.float64] | None, NDArray[np.float64] | None, float | None]:
+         """Evaluate goto move at time t using linear interpolation."""
+         try:
+             from reachy_mini.utils import create_head_pose
+             from reachy_mini.utils.interpolation import linear_pose_interpolation
+
+             # Clamp t to [0, 1] for interpolation
+             t_clamped = max(0, min(1, t / self.duration))
+
+             # Use start pose if available, otherwise neutral
+             if self.start_head_pose is not None:
+                 start_pose = self.start_head_pose
+             else:
+                 start_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+
+             # Interpolate head pose
+             head_pose = linear_pose_interpolation(start_pose, self.target_head_pose, t_clamped)
+
+             # Interpolate antennas - return as numpy array
+             antennas = np.array(
+                 [
+                     self.start_antennas[0] + (self.target_antennas[0] - self.start_antennas[0]) * t_clamped,
+                     self.start_antennas[1] + (self.target_antennas[1] - self.start_antennas[1]) * t_clamped,
+                 ],
+                 dtype=np.float64,
+             )
+
+             # Interpolate body yaw
+             body_yaw = self.start_body_yaw + (self.target_body_yaw - self.start_body_yaw) * t_clamped
+
+             return (head_pose, antennas, body_yaw)
+
+         except Exception as e:
+             logger.error(f"Error evaluating goto move at t={t}: {e}")
+             # Return target pose on error - convert to float64
+             target_head_pose_f64 = self.target_head_pose.astype(np.float64)
+             target_antennas_array = np.array([self.target_antennas[0], self.target_antennas[1]], dtype=np.float64)
+             return (target_head_pose_f64, target_antennas_array, self.target_body_yaw)
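Each scalar channel in `GotoQueueMove.evaluate` (antenna angles, body yaw) reduces to the same clamped linear interpolation over the move's duration. A standalone sketch of one channel (plain Python, hypothetical values):

```python
def goto_channel(start: float, target: float, t: float, duration: float) -> float:
    """One channel of a goto move: clamp progress to [0, 1], then lerp."""
    t_clamped = max(0.0, min(1.0, t / duration))
    return start + (target - start) * t_clamped


# An antenna moving from 0.0 to 1.0 rad over 2.0 s
print(goto_channel(0.0, 1.0, 1.0, 2.0))  # 0.5 at the midpoint
print(goto_channel(0.0, 1.0, 5.0, 2.0))  # 1.0 (progress clamped past the end)
```

Clamping means evaluating past `duration` simply holds the target, so a queued move that overruns its time slot never overshoots.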
src/reachy_mini_conversation_app/gradio_personality.py ADDED
@@ -0,0 +1,316 @@
+ """Gradio personality UI components and wiring.
+
+ This module encapsulates the UI elements and logic related to managing
+ conversation "personalities" (profiles) so that `main.py` stays lean.
+ """
+
+ from __future__ import annotations
+ from typing import Any
+ from pathlib import Path
+
+ import gradio as gr
+
+ from .config import LOCKED_PROFILE, config
+
+
+ class PersonalityUI:
+ """Container for personality-related Gradio components."""
+
+ def __init__(self) -> None:
+ """Initialize the PersonalityUI instance."""
+ # Constants and paths
+ self.DEFAULT_OPTION = "(built-in default)"
+ self._profiles_root = Path(__file__).parent / "profiles"
+ self._tools_dir = Path(__file__).parent / "tools"
+ self._prompts_dir = Path(__file__).parent / "prompts"
+
+ # Components (initialized in create_components)
+ self.personalities_dropdown: gr.Dropdown
+ self.apply_btn: gr.Button
+ self.status_md: gr.Markdown
+ self.preview_md: gr.Markdown
+ self.person_name_tb: gr.Textbox
+ self.person_instr_ta: gr.TextArea
+ self.tools_txt_ta: gr.TextArea
+ self.voice_dropdown: gr.Dropdown
+ self.new_personality_btn: gr.Button
+ self.available_tools_cg: gr.CheckboxGroup
+ self.save_btn: gr.Button
+
+ # ---------- Filesystem helpers ----------
+ def _list_personalities(self) -> list[str]:
+ names: list[str] = []
+ try:
+ if self._profiles_root.exists():
+ for p in sorted(self._profiles_root.iterdir()):
+ if p.name == "user_personalities":
+ continue
+ if p.is_dir() and (p / "instructions.txt").exists():
+ names.append(p.name)
+ user_dir = self._profiles_root / "user_personalities"
+ if user_dir.exists():
+ for p in sorted(user_dir.iterdir()):
+ if p.is_dir() and (p / "instructions.txt").exists():
+ names.append(f"user_personalities/{p.name}")
+ except Exception:
+ pass
+ return names
+
+ def _resolve_profile_dir(self, selection: str) -> Path:
+ return self._profiles_root / selection
+
+ def _read_instructions_for(self, name: str) -> str:
+ try:
+ if name == self.DEFAULT_OPTION:
+ default_file = self._prompts_dir / "default_prompt.txt"
+ if default_file.exists():
+ return default_file.read_text(encoding="utf-8").strip()
+ return ""
+ target = self._resolve_profile_dir(name) / "instructions.txt"
+ if target.exists():
+ return target.read_text(encoding="utf-8").strip()
+ return ""
+ except Exception as e:
+ return f"Could not load instructions: {e}"
+
+ @staticmethod
+ def _sanitize_name(name: str) -> str:
+ import re
+
+ s = name.strip()
+ s = re.sub(r"\s+", "_", s)
+ s = re.sub(r"[^a-zA-Z0-9_-]", "", s)
+ return s
+
+ # ---------- Public API ----------
+ def create_components(self) -> None:
+ """Instantiate Gradio components for the personality UI."""
+ if LOCKED_PROFILE is not None:
+ is_locked = True
+ current_value: str = LOCKED_PROFILE
+ dropdown_label = "Select personality (locked)"
+ dropdown_choices: list[str] = [LOCKED_PROFILE]
+ else:
+ is_locked = False
+ current_value = config.REACHY_MINI_CUSTOM_PROFILE or self.DEFAULT_OPTION
+ dropdown_label = "Select personality"
+ dropdown_choices = [self.DEFAULT_OPTION, *(self._list_personalities())]
+
+ self.personalities_dropdown = gr.Dropdown(
+ label=dropdown_label,
+ choices=dropdown_choices,
+ value=current_value,
+ interactive=not is_locked,
+ )
+ self.apply_btn = gr.Button("Apply personality", interactive=not is_locked)
+ self.status_md = gr.Markdown(visible=True)
+ self.preview_md = gr.Markdown(value=self._read_instructions_for(current_value))
+ self.person_name_tb = gr.Textbox(label="Personality name", interactive=not is_locked)
+ self.person_instr_ta = gr.TextArea(label="Personality instructions", lines=10, interactive=not is_locked)
+ self.tools_txt_ta = gr.TextArea(label="tools.txt", lines=10, interactive=not is_locked)
+ self.voice_dropdown = gr.Dropdown(label="Voice", choices=["cedar"], value="cedar", interactive=not is_locked)
+ self.new_personality_btn = gr.Button("New personality", interactive=not is_locked)
+ self.available_tools_cg = gr.CheckboxGroup(label="Available tools (helper)", choices=[], value=[], interactive=not is_locked)
+ self.save_btn = gr.Button("Save personality (instructions + tools)", interactive=not is_locked)
+
+ def additional_inputs_ordered(self) -> list[Any]:
+ """Return the additional inputs in the expected order for Stream."""
+ return [
+ self.personalities_dropdown,
+ self.apply_btn,
+ self.new_personality_btn,
+ self.status_md,
+ self.preview_md,
+ self.person_name_tb,
+ self.person_instr_ta,
+ self.tools_txt_ta,
+ self.voice_dropdown,
+ self.available_tools_cg,
+ self.save_btn,
+ ]
+
+ # ---------- Event wiring ----------
+ def wire_events(self, handler: Any, blocks: gr.Blocks) -> None:
+ """Attach event handlers to components within a Blocks context."""
+
+ async def _apply_personality(selected: str) -> tuple[str, str]:
+ if LOCKED_PROFILE is not None and selected != LOCKED_PROFILE:
+ return (
+ f"Profile is locked to '{LOCKED_PROFILE}'. Cannot change personality.",
+ self._read_instructions_for(LOCKED_PROFILE),
+ )
+ profile = None if selected == self.DEFAULT_OPTION else selected
+ status = await handler.apply_personality(profile)
+ preview = self._read_instructions_for(selected)
+ return status, preview
+
+ def _read_voice_for(name: str) -> str:
+ try:
+ if name == self.DEFAULT_OPTION:
+ return "cedar"
+ vf = self._resolve_profile_dir(name) / "voice.txt"
+ if vf.exists():
+ v = vf.read_text(encoding="utf-8").strip()
+ return v or "cedar"
+ except Exception:
+ pass
+ return "cedar"
+
+ async def _fetch_voices(selected: str) -> dict[str, Any]:
+ try:
+ voices = await handler.get_available_voices()
+ current = _read_voice_for(selected)
+ if current not in voices:
+ current = "cedar"
+ return gr.update(choices=voices, value=current)
+ except Exception:
+ return gr.update(choices=["cedar"], value="cedar")
+
+ def _available_tools_for(selected: str) -> tuple[list[str], list[str]]:
+ shared: list[str] = []
+ try:
+ for py in self._tools_dir.glob("*.py"):
+ if py.stem in {"__init__", "core_tools"}:
+ continue
+ shared.append(py.stem)
+ except Exception:
+ pass
+ local: list[str] = []
+ try:
+ if selected != self.DEFAULT_OPTION:
+ for py in (self._profiles_root / selected).glob("*.py"):
+ local.append(py.stem)
+ except Exception:
+ pass
+ return sorted(shared), sorted(local)
+
+ def _parse_enabled_tools(text: str) -> list[str]:
+ enabled: list[str] = []
+ for line in text.splitlines():
+ s = line.strip()
+ if not s or s.startswith("#"):
+ continue
+ enabled.append(s)
+ return enabled
+
+ def _load_profile_for_edit(selected: str) -> tuple[dict[str, Any], dict[str, Any], dict[str, Any], str]:
+ instr = self._read_instructions_for(selected)
+ tools_txt = ""
+ if selected != self.DEFAULT_OPTION:
+ tp = self._resolve_profile_dir(selected) / "tools.txt"
+ if tp.exists():
+ tools_txt = tp.read_text(encoding="utf-8")
+ shared, local = _available_tools_for(selected)
+ all_tools = sorted(set(shared + local))
+ enabled = _parse_enabled_tools(tools_txt)
+ status_text = f"Loaded profile '{selected}'."
+ return (
+ gr.update(value=instr),
+ gr.update(value=tools_txt),
+ gr.update(choices=all_tools, value=enabled),
+ status_text,
+ )
+
+ def _new_personality() -> tuple[
+ dict[str, Any], dict[str, Any], dict[str, Any], dict[str, Any], str, dict[str, Any]
+ ]:
+ try:
+ # Prefill with hints
+ instr_val = """# Write your instructions here\n# e.g., Keep responses concise and friendly."""
+ tools_txt_val = "# tools enabled for this profile\n"
+ return (
+ gr.update(value=""),
+ gr.update(value=instr_val),
+ gr.update(value=tools_txt_val),
+ gr.update(choices=sorted(_available_tools_for(self.DEFAULT_OPTION)[0]), value=[]),
+ "Fill in a name, instructions and (optional) tools, then Save.",
+ gr.update(value="cedar"),
+ )
+ except Exception:
+ return (
+ gr.update(),
+ gr.update(),
+ gr.update(),
+ gr.update(),
+ "Failed to initialize new personality.",
+ gr.update(),
+ )
+
+ def _save_personality(
+ name: str, instructions: str, tools_text: str, voice: str
+ ) -> tuple[dict[str, Any], dict[str, Any], str]:
+ name_s = self._sanitize_name(name)
+ if not name_s:
+ return gr.update(), gr.update(), "Please enter a valid name."
+ try:
+ target_dir = self._profiles_root / "user_personalities" / name_s
+ target_dir.mkdir(parents=True, exist_ok=True)
+ (target_dir / "instructions.txt").write_text(instructions.strip() + "\n", encoding="utf-8")
+ (target_dir / "tools.txt").write_text(tools_text.strip() + "\n", encoding="utf-8")
+ (target_dir / "voice.txt").write_text((voice or "cedar").strip() + "\n", encoding="utf-8")
+
+ choices = self._list_personalities()
+ value = f"user_personalities/{name_s}"
+ if value not in choices:
+ choices.append(value)
+ return (
+ gr.update(choices=[self.DEFAULT_OPTION, *sorted(choices)], value=value),
+ gr.update(value=instructions),
+ f"Saved personality '{name_s}'.",
+ )
+ except Exception as e:
+ return gr.update(), gr.update(), f"Failed to save personality: {e}"
+
+ def _sync_tools_from_checks(selected: list[str], current_text: str) -> dict[str, Any]:
+ comments = [ln for ln in current_text.splitlines() if ln.strip().startswith("#")]
+ body = "\n".join(selected)
+ out = ("\n".join(comments) + ("\n" if comments else "") + body).strip() + "\n"
+ return gr.update(value=out)
+
+ with blocks:
+ self.apply_btn.click(
+ fn=_apply_personality,
+ inputs=[self.personalities_dropdown],
+ outputs=[self.status_md, self.preview_md],
+ )
+
+ self.personalities_dropdown.change(
+ fn=_load_profile_for_edit,
+ inputs=[self.personalities_dropdown],
+ outputs=[self.person_instr_ta, self.tools_txt_ta, self.available_tools_cg, self.status_md],
+ )
+
+ blocks.load(
+ fn=_fetch_voices,
+ inputs=[self.personalities_dropdown],
+ outputs=[self.voice_dropdown],
+ )
+
+ self.available_tools_cg.change(
+ fn=_sync_tools_from_checks,
+ inputs=[self.available_tools_cg, self.tools_txt_ta],
+ outputs=[self.tools_txt_ta],
+ )
+
+ self.new_personality_btn.click(
+ fn=_new_personality,
+ inputs=[],
+ outputs=[
+ self.person_name_tb,
+ self.person_instr_ta,
+ self.tools_txt_ta,
+ self.available_tools_cg,
+ self.status_md,
+ self.voice_dropdown,
+ ],
+ )
+
+ self.save_btn.click(
+ fn=_save_personality,
+ inputs=[self.person_name_tb, self.person_instr_ta, self.tools_txt_ta, self.voice_dropdown],
+ outputs=[self.personalities_dropdown, self.person_instr_ta, self.status_md],
+ ).then(
+ fn=_apply_personality,
+ inputs=[self.personalities_dropdown],
+ outputs=[self.status_md, self.preview_md],
+ )
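The `tools.txt` format used throughout this diff is one tool module name per line, with blank lines and `#` comments ignored. A minimal standalone sketch mirroring the `_parse_enabled_tools` helper above (the function name and sample tool names here are illustrative, not part of the package API):

```python
# Standalone mirror of the tools.txt parsing used by the personality UI:
# one tool module name per line; blank lines and "#" comments are skipped.
def parse_enabled_tools(text: str) -> list[str]:
    enabled: list[str] = []
    for line in text.splitlines():
        s = line.strip()
        if not s or s.startswith("#"):
            continue
        enabled.append(s)
    return enabled

sample = "# tools enabled for this profile\nweather\n\ncamera_snapshot\n"
print(parse_enabled_tools(sample))  # ['weather', 'camera_snapshot']
```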
src/reachy_mini_conversation_app/headless_personality.py ADDED
@@ -0,0 +1,102 @@
+ """Headless personality management (console-based).
+
+ Provides an interactive CLI to browse, preview, apply, create and edit
+ "personalities" (profiles) when running without Gradio.
+
+ This module is intentionally not shared with the Gradio implementation to
+ avoid coupling and keep responsibilities clear for headless mode.
+ """
+
+ from __future__ import annotations
+ from typing import List
+ from pathlib import Path
+
+
+ DEFAULT_OPTION = "(built-in default)"
+
+
+ def _profiles_root() -> Path:
+ return Path(__file__).parent / "profiles"
+
+
+ def _prompts_dir() -> Path:
+ return Path(__file__).parent / "prompts"
+
+
+ def _tools_dir() -> Path:
+ return Path(__file__).parent / "tools"
+
+
+ def _sanitize_name(name: str) -> str:
+ import re
+
+ s = name.strip()
+ s = re.sub(r"\s+", "_", s)
+ s = re.sub(r"[^a-zA-Z0-9_-]", "", s)
+ return s
+
+
+ def list_personalities() -> List[str]:
+ """List available personality profile names."""
+ names: List[str] = []
+ root = _profiles_root()
+ try:
+ if root.exists():
+ for p in sorted(root.iterdir()):
+ if p.name == "user_personalities":
+ continue
+ if p.is_dir() and (p / "instructions.txt").exists():
+ names.append(p.name)
+ udir = root / "user_personalities"
+ if udir.exists():
+ for p in sorted(udir.iterdir()):
+ if p.is_dir() and (p / "instructions.txt").exists():
+ names.append(f"user_personalities/{p.name}")
+ except Exception:
+ pass
+ return names
+
+
+ def resolve_profile_dir(selection: str) -> Path:
+ """Resolve the directory path for the given profile selection."""
+ return _profiles_root() / selection
+
+
+ def read_instructions_for(name: str) -> str:
+ """Read the instructions.txt content for the given profile name."""
+ try:
+ if name == DEFAULT_OPTION:
+ df = _prompts_dir() / "default_prompt.txt"
+ return df.read_text(encoding="utf-8").strip() if df.exists() else ""
+ target = resolve_profile_dir(name) / "instructions.txt"
+ return target.read_text(encoding="utf-8").strip() if target.exists() else ""
+ except Exception as e:
+ return f"Could not load instructions: {e}"
+
+
+ def available_tools_for(selected: str) -> List[str]:
+ """List available tool modules for the given profile selection."""
+ shared: List[str] = []
+ try:
+ for py in _tools_dir().glob("*.py"):
+ if py.stem in {"__init__", "core_tools"}:
+ continue
+ shared.append(py.stem)
+ except Exception:
+ pass
+ local: List[str] = []
+ try:
+ if selected != DEFAULT_OPTION:
+ for py in resolve_profile_dir(selected).glob("*.py"):
+ local.append(py.stem)
+ except Exception:
+ pass
+ return sorted(set(shared + local))
+
+
+ def _write_profile(name_s: str, instructions: str, tools_text: str, voice: str = "cedar") -> None:
+ target_dir = _profiles_root() / "user_personalities" / name_s
+ target_dir.mkdir(parents=True, exist_ok=True)
+ (target_dir / "instructions.txt").write_text(instructions.strip() + "\n", encoding="utf-8")
+ (target_dir / "tools.txt").write_text((tools_text or "").strip() + "\n", encoding="utf-8")
+ (target_dir / "voice.txt").write_text((voice or "cedar").strip() + "\n", encoding="utf-8")
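Both the Gradio and headless modules share the same profile-name sanitization rule: trim, collapse whitespace to underscores, then drop anything outside `[a-zA-Z0-9_-]`. A standalone sketch of that rule (the top-level function name here is illustrative; in the package it is the private `_sanitize_name` helper):

```python
import re

# Sanitize a user-supplied personality name into a safe directory name:
# trim, replace runs of whitespace with "_", strip disallowed characters.
def sanitize_name(name: str) -> str:
    s = name.strip()
    s = re.sub(r"\s+", "_", s)
    s = re.sub(r"[^a-zA-Z0-9_-]", "", s)
    return s

print(sanitize_name("  My Cool Bot!  "))  # My_Cool_Bot
```

Because the result can be empty (e.g. a name made only of punctuation), both save paths check `if not name_s` and reject the request rather than writing to `profiles/user_personalities/`.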
src/reachy_mini_conversation_app/headless_personality_ui.py ADDED
@@ -0,0 +1,287 @@
+ """Settings UI routes for headless personality management.
+
+ Exposes REST endpoints on the provided FastAPI settings app. The
+ implementation schedules backend actions (apply personality, fetch voices)
+ onto the running LocalStream asyncio loop using the supplied get_loop
+ callable to avoid cross-thread issues.
+ """
+
+ from __future__ import annotations
+ import asyncio
+ import logging
+ from typing import Any, Callable, Optional
+
+ from fastapi import FastAPI
+
+ from .config import LOCKED_PROFILE, config
+ from .ollama_handler import OllamaHandler
+ from .headless_personality import (
+ DEFAULT_OPTION,
+ _sanitize_name,
+ _write_profile,
+ list_personalities,
+ available_tools_for,
+ resolve_profile_dir,
+ read_instructions_for,
+ )
+
+ logger = logging.getLogger(__name__)
+
+
+ def mount_personality_routes(
+ app: FastAPI,
+ handler: OllamaHandler,
+ get_loop: Callable[[], asyncio.AbstractEventLoop | None],
+ *,
+ persist_personality: Callable[[Optional[str]], None] | None = None,
+ get_persisted_personality: Callable[[], Optional[str]] | None = None,
+ ) -> None:
+ """Register personality management endpoints on a FastAPI app."""
+ try:
+ from fastapi import Request
+ from pydantic import BaseModel
+ from fastapi.responses import JSONResponse
+ except Exception: # pragma: no cover - only when settings app not available
+ return
+
+ class SavePayload(BaseModel):
+ name: str
+ instructions: str
+ tools_text: str
+ voice: Optional[str] = "cedar"
+
+ class ApplyPayload(BaseModel):
+ name: str
+ persist: Optional[bool] = False
+
+ def _startup_choice() -> Any:
+ """Return the persisted startup personality or default."""
+ try:
+ if get_persisted_personality is not None:
+ stored = get_persisted_personality()
+ if stored:
+ return stored
+ env_val = getattr(config, "REACHY_MINI_CUSTOM_PROFILE", None)
+ if env_val:
+ return env_val
+ except Exception:
+ pass
+ return DEFAULT_OPTION
+
+ def _current_choice() -> str:
+ try:
+ cur = getattr(config, "REACHY_MINI_CUSTOM_PROFILE", None)
+ return cur or DEFAULT_OPTION
+ except Exception:
+ return DEFAULT_OPTION
+
+ @app.get("/personalities")
+ def _list() -> dict: # type: ignore
+ choices = [DEFAULT_OPTION, *list_personalities()]
+ return {
+ "choices": choices,
+ "current": _current_choice(),
+ "startup": _startup_choice(),
+ "locked": LOCKED_PROFILE is not None,
+ "locked_to": LOCKED_PROFILE,
+ }
+
+ @app.get("/personalities/load")
+ def _load(name: str) -> dict: # type: ignore
+ instr = read_instructions_for(name)
+ tools_txt = ""
+ voice = "cedar"
+ if name != DEFAULT_OPTION:
+ pdir = resolve_profile_dir(name)
+ tp = pdir / "tools.txt"
+ if tp.exists():
+ tools_txt = tp.read_text(encoding="utf-8")
+ vf = pdir / "voice.txt"
+ if vf.exists():
+ v = vf.read_text(encoding="utf-8").strip()
+ voice = v or "cedar"
+ avail = available_tools_for(name)
+ enabled = [ln.strip() for ln in tools_txt.splitlines() if ln.strip() and not ln.strip().startswith("#")]
+ return {
+ "instructions": instr,
+ "tools_text": tools_txt,
+ "voice": voice,
+ "available_tools": avail,
+ "enabled_tools": enabled,
+ }
+
+ @app.post("/personalities/save")
+ async def _save(request: Request) -> dict: # type: ignore
+ # Accept raw JSON only to avoid validation-related 422s
+ try:
+ raw = await request.json()
+ except Exception:
+ raw = {}
+ name = str(raw.get("name", ""))
+ instructions = str(raw.get("instructions", ""))
+ tools_text = str(raw.get("tools_text", ""))
+ voice = str(raw.get("voice", "cedar")) if raw.get("voice") is not None else "cedar"
+
+ name_s = _sanitize_name(name)
+ if not name_s:
+ return JSONResponse({"ok": False, "error": "invalid_name"}, status_code=400) # type: ignore
+ try:
+ logger.info(
+ "Headless save: name=%r voice=%r instr_len=%d tools_len=%d",
+ name_s,
+ voice,
+ len(instructions),
+ len(tools_text),
+ )
+ _write_profile(name_s, instructions, tools_text, voice or "cedar")
+ value = f"user_personalities/{name_s}"
+ choices = [DEFAULT_OPTION, *list_personalities()]
+ return {"ok": True, "value": value, "choices": choices}
+ except Exception as e:
+ return JSONResponse({"ok": False, "error": str(e)}, status_code=500) # type: ignore
+
+ @app.post("/personalities/save_raw")
+ async def _save_raw(
+ request: Request,
+ name: Optional[str] = None,
+ instructions: Optional[str] = None,
+ tools_text: Optional[str] = None,
+ voice: Optional[str] = None,
+ ) -> dict: # type: ignore
+ # Accept query params, form-encoded, or raw JSON
+ data = {"name": name, "instructions": instructions, "tools_text": tools_text, "voice": voice}
+ # Prefer form if present
+ try:
+ form = await request.form()
+ for k in ("name", "instructions", "tools_text", "voice"):
+ if k in form and form[k] is not None:
+ data[k] = str(form[k])
+ except Exception:
+ pass
+ # Try JSON
+ try:
+ raw = await request.json()
+ if isinstance(raw, dict):
+ for k in ("name", "instructions", "tools_text", "voice"):
+ if raw.get(k) is not None:
+ data[k] = str(raw.get(k))
+ except Exception:
+ pass
+
+ name_s = _sanitize_name(str(data.get("name") or ""))
+ if not name_s:
+ return JSONResponse({"ok": False, "error": "invalid_name"}, status_code=400) # type: ignore
+ instr = str(data.get("instructions") or "")
+ tools = str(data.get("tools_text") or "")
+ v = str(data.get("voice") or "cedar")
+ try:
+ logger.info(
+ "Headless save_raw: name=%r voice=%r instr_len=%d tools_len=%d", name_s, v, len(instr), len(tools)
+ )
+ _write_profile(name_s, instr, tools, v)
+ value = f"user_personalities/{name_s}"
+ choices = [DEFAULT_OPTION, *list_personalities()]
+ return {"ok": True, "value": value, "choices": choices}
+ except Exception as e:
+ return JSONResponse({"ok": False, "error": str(e)}, status_code=500) # type: ignore
+
+ @app.get("/personalities/save_raw")
+ async def _save_raw_get(name: str, instructions: str = "", tools_text: str = "", voice: str = "cedar") -> dict: # type: ignore
+ name_s = _sanitize_name(name)
+ if not name_s:
+ return JSONResponse({"ok": False, "error": "invalid_name"}, status_code=400) # type: ignore
+ try:
+ logger.info(
+ "Headless save_raw(GET): name=%r voice=%r instr_len=%d tools_len=%d",
+ name_s,
+ voice,
+ len(instructions),
+ len(tools_text),
+ )
+ _write_profile(name_s, instructions, tools_text, voice or "cedar")
+ value = f"user_personalities/{name_s}"
+ choices = [DEFAULT_OPTION, *list_personalities()]
+ return {"ok": True, "value": value, "choices": choices}
+ except Exception as e:
+ return JSONResponse({"ok": False, "error": str(e)}, status_code=500) # type: ignore
+
+ @app.post("/personalities/apply")
+ async def _apply(
+ payload: ApplyPayload | None = None,
+ name: str | None = None,
+ persist: Optional[bool] = None,
+ request: Optional[Request] = None,
+ ) -> dict: # type: ignore
+ if LOCKED_PROFILE is not None:
+ return JSONResponse(
+ {"ok": False, "error": "profile_locked", "locked_to": LOCKED_PROFILE},
+ status_code=403,
+ ) # type: ignore
+ loop = get_loop()
+ if loop is None:
+ return JSONResponse({"ok": False, "error": "loop_unavailable"}, status_code=503) # type: ignore
+
+ # Accept both JSON payload and query param for convenience
+ sel_name: Optional[str] = None
+ persist_flag = bool(persist) if persist is not None else False
+ if payload and getattr(payload, "name", None):
+ sel_name = payload.name
+ persist_flag = bool(getattr(payload, "persist", False))
+ elif name:
+ sel_name = name
+ elif request is not None:
+ try:
+ body = await request.json()
+ if isinstance(body, dict) and body.get("name"):
+ sel_name = str(body.get("name"))
+ if isinstance(body, dict) and "persist" in body:
+ persist_flag = bool(body.get("persist"))
+ except Exception:
+ sel_name = None
+ if request is not None:
+ try:
+ q_persist = request.query_params.get("persist")
+ if q_persist is not None:
+ persist_flag = str(q_persist).lower() in {"1", "true", "yes", "on"}
+ except Exception:
+ pass
+ if not sel_name:
+ sel_name = DEFAULT_OPTION
+
+ async def _do_apply() -> str:
+ sel = None if sel_name == DEFAULT_OPTION else sel_name
+ status = await handler.apply_personality(sel)
+ return status
+
+ try:
+ logger.info("Headless apply: requested name=%r", sel_name)
+ fut = asyncio.run_coroutine_threadsafe(_do_apply(), loop)
+ status = fut.result(timeout=10)
+ persisted_choice = _startup_choice()
+ if persist_flag and persist_personality is not None:
+ try:
+ persist_personality(None if sel_name == DEFAULT_OPTION else sel_name)
+ persisted_choice = _startup_choice()
+ except Exception as e:
+ logger.warning("Failed to persist startup personality: %s", e)
+ return {"ok": True, "status": status, "startup": persisted_choice}
+ except Exception as e:
+ return JSONResponse({"ok": False, "error": str(e)}, status_code=500) # type: ignore
+
+ @app.get("/voices")
+ async def _voices() -> list[str]:
+ loop = get_loop()
+ if loop is None:
+ return ["cedar"]
+
+ async def _get_v() -> list[str]:
+ try:
+ return await handler.get_available_voices()
+ except Exception:
+ return ["cedar"]
+
+ try:
+ fut = asyncio.run_coroutine_threadsafe(_get_v(), loop)
+ return fut.result(timeout=10)
+ except Exception:
+ return ["cedar"]
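The routes above run on the FastAPI server's thread, while the handler lives on the LocalStream asyncio loop, so calls are bridged with `asyncio.run_coroutine_threadsafe` and a bounded `fut.result(timeout=10)`. A minimal self-contained sketch of that cross-thread pattern (the `apply_personality` coroutine here is a stand-in for the real handler method, not the package API):

```python
import asyncio
import threading

# The "backend" loop runs forever on its own thread, as LocalStream does.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def apply_personality(name: str) -> str:
    # Stand-in for handler.apply_personality (hypothetical body).
    await asyncio.sleep(0)
    return f"Applied '{name}'"

# From the server thread: schedule the coroutine onto the backend loop
# and block (with a timeout) on the returned concurrent.futures.Future.
fut = asyncio.run_coroutine_threadsafe(apply_personality("starter_profile"), loop)
print(fut.result(timeout=10))  # Applied 'starter_profile'

loop.call_soon_threadsafe(loop.stop)
```

The timeout keeps a stuck backend from hanging the HTTP request; on expiry the endpoint falls into its `except` branch and returns an error payload (or the `["cedar"]` fallback for `/voices`).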
src/reachy_mini_conversation_app/images/reachymini_avatar.png ADDED

Git LFS Details

  • SHA256: 5a63ac8802ff3542f01292c431c5278296880d74cd3580d219fcf4827bc235f9
  • Pointer size: 132 Bytes
  • Size of remote file: 1.23 MB
src/reachy_mini_conversation_app/images/user_avatar.png ADDED

Git LFS Details

  • SHA256: e97ca125a86bacdaa41c8dca88abd9ca746fd5c9391eda24249c012432b0219b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.11 MB
src/reachy_mini_conversation_app/main.py ADDED
@@ -0,0 +1,246 @@
+ """Entrypoint for the Reachy Mini conversation app."""
+
+ import os
+ import sys
+ import time
+ import asyncio
+ import argparse
+ import threading
+ from typing import Any, Dict, List, Optional
+
+ import gradio as gr
+ from fastapi import FastAPI
+ from fastrtc import Stream
+ from gradio.utils import get_space
+
+ from reachy_mini import ReachyMini, ReachyMiniApp
+ from reachy_mini_conversation_app.utils import (
+ parse_args,
+ setup_logger,
+ handle_vision_stuff,
+ log_connection_troubleshooting,
+ )
+
+
+ def update_chatbot(chatbot: List[Dict[str, Any]], response: Dict[str, Any]) -> List[Dict[str, Any]]:
+ """Update the chatbot with AdditionalOutputs."""
+ chatbot.append(response)
+ return chatbot
+
+
+ def main() -> None:
+ """Entrypoint for the Reachy Mini conversation app."""
+ args, _ = parse_args()
+ run(args)
+
+
+ def run(
+ args: argparse.Namespace,
+ robot: Optional[ReachyMini] = None,
+ app_stop_event: Optional[threading.Event] = None,
+ settings_app: Optional[FastAPI] = None,
+ instance_path: Optional[str] = None,
+ ) -> None:
+ """Run the Reachy Mini conversation app."""
+ # Importing these dependencies here makes the dashboard faster to load when the conversation app is installed
+ from reachy_mini_conversation_app.moves import MovementManager
+ from reachy_mini_conversation_app.console import LocalStream
+ from reachy_mini_conversation_app.ollama_handler import OllamaHandler
+ from reachy_mini_conversation_app.tools.core_tools import ToolDependencies
+ from reachy_mini_conversation_app.audio.head_wobbler import HeadWobbler
+
+ logger = setup_logger(args.debug)
+ logger.info("Starting Reachy Mini Conversation App")
+
+ if args.no_camera and args.head_tracker is not None:
+ logger.warning(
+ "Head tracking disabled: --no-camera flag is set. "
+ "Remove --no-camera to enable head tracking."
+ )
+
+ if robot is None:
+ try:
+ robot_kwargs = {}
+ if args.robot_name is not None:
+ robot_kwargs["robot_name"] = args.robot_name
+
+ logger.info("Initializing ReachyMini (SDK will auto-detect appropriate backend)")
+ robot = ReachyMini(**robot_kwargs)
+
+ except TimeoutError as e:
+ logger.error(
+ "Connection timeout: Failed to connect to Reachy Mini daemon. "
+ f"Details: {e}"
+ )
+ log_connection_troubleshooting(logger, args.robot_name)
+ sys.exit(1)
+
+ except ConnectionError as e:
+ logger.error(
+ "Connection failed: Unable to establish connection to Reachy Mini. "
+ f"Details: {e}"
+ )
+ log_connection_troubleshooting(logger, args.robot_name)
+ sys.exit(1)
+
+ except Exception as e:
+ logger.error(
+ f"Unexpected error during robot initialization: {type(e).__name__}: {e}"
+ )
+ logger.error("Please check your configuration and try again.")
+ sys.exit(1)
+
+ # Auto-enable Gradio in simulation mode (both MuJoCo for the daemon and mockup-sim for the desktop app)
+ status = robot.client.get_status()
+ is_simulation = status.get("simulation_enabled", False) or status.get("mockup_sim_enabled", False)
+
+ if is_simulation and not args.gradio:
+ logger.info("Simulation mode detected. Automatically enabling gradio flag.")
+ args.gradio = True
+
+ camera_worker, _, vision_manager = handle_vision_stuff(args, robot)
+
+ movement_manager = MovementManager(
+ current_robot=robot,
+ camera_worker=camera_worker,
+ )
+
+ head_wobbler = HeadWobbler(set_speech_offsets=movement_manager.set_speech_offsets)
+
+ deps = ToolDependencies(
+ reachy_mini=robot,
+ movement_manager=movement_manager,
+ camera_worker=camera_worker,
+ vision_manager=vision_manager,
+ head_wobbler=head_wobbler,
+ )
+ current_file_path = os.path.dirname(os.path.abspath(__file__))
+ logger.debug(f"Current file absolute path: {current_file_path}")
+ chatbot = gr.Chatbot(
+ type="messages",
+ resizable=True,
+ avatar_images=(
+ os.path.join(current_file_path, "images", "user_avatar.png"),
+ os.path.join(current_file_path, "images", "reachymini_avatar.png"),
+ ),
+ )
+ logger.debug(f"Chatbot avatar images: {chatbot.avatar_images}")
+
+ handler = OllamaHandler(deps, gradio_mode=args.gradio, instance_path=instance_path)
+
+ stream_manager: gr.Blocks | LocalStream | None = None
+
+ if args.gradio:
+ from reachy_mini_conversation_app.gradio_personality import PersonalityUI
+
+ personality_ui = PersonalityUI()
+ personality_ui.create_components()
+
+ stream = Stream(
+ handler=handler,
+ mode="send-receive",
+ modality="audio",
+ additional_inputs=[
+ chatbot,
+ *personality_ui.additional_inputs_ordered(),
+ ],
+ additional_outputs=[chatbot],
+ additional_outputs_handler=update_chatbot,
+ ui_args={"title": "Talk with Reachy Mini"},
+ )
+ stream_manager = stream.ui
+ if not settings_app:
+ app = FastAPI()
+ else:
+ app = settings_app
+
+ personality_ui.wire_events(handler, stream_manager)
+
+ app = gr.mount_gradio_app(app, stream.ui, path="/")
+ else:
+ # In headless mode, wire settings_app + instance_path to console LocalStream
+ stream_manager = LocalStream(
+ handler,
+ robot,
+ settings_app=settings_app,
+ instance_path=instance_path,
+ )
+
+ # Each async service gets its own thread/loop
+ movement_manager.start()
+ head_wobbler.start()
+ if camera_worker:
+ camera_worker.start()
+ if vision_manager:
+ vision_manager.start()
+
+ def poll_stop_event() -> None:
+ """Poll the stop event to allow graceful shutdown."""
+ if app_stop_event is not None:
+ app_stop_event.wait()
+
+ logger.info("App stop event detected, shutting down...")
+ try:
+ stream_manager.close()
+ except Exception as e:
+ logger.error(f"Error while closing stream manager: {e}")
+
+ if app_stop_event:
+ threading.Thread(target=poll_stop_event, daemon=True).start()
+
+ try:
+ stream_manager.launch()
+ except KeyboardInterrupt:
+ logger.info("Keyboard interruption in main thread... closing server.")
+ finally:
+ movement_manager.stop()
+ head_wobbler.stop()
+ if camera_worker:
+ camera_worker.stop()
+ if vision_manager:
+ vision_manager.stop()
+
+ # Ensure media is explicitly closed before disconnecting
+ try:
+ robot.media.close()
+ except Exception as e:
+ logger.debug(f"Error closing media during shutdown: {e}")
+
+ # Disconnect so a live connection does not keep background threads alive
+ robot.client.disconnect()
+ time.sleep(1)
+ logger.info("Shutdown complete.")
+
+
+ class ReachyMiniConversationApp(ReachyMiniApp): # type: ignore[misc]
+ """Reachy Mini Apps entry point for the conversation app."""
+
+ custom_app_url = "http://0.0.0.0:7860/"
+ dont_start_webserver = False
+
+ def run(self, reachy_mini: ReachyMini, stop_event: threading.Event) -> None:
+ """Run the Reachy Mini conversation app."""
+ loop = asyncio.new_event_loop()
+ asyncio.set_event_loop(loop)
+
+ args, _ = parse_args()
+
+ # is_wireless = reachy_mini.client.get_status()["wireless_version"]
+ # args.head_tracker = None if is_wireless else "mediapipe"
230
+
231
+ instance_path = self._get_instance_path().parent
232
+ run(
233
+ args,
234
+ robot=reachy_mini,
235
+ app_stop_event=stop_event,
236
+ settings_app=self.settings_app,
237
+ instance_path=instance_path,
238
+ )
239
+
240
+
241
+ if __name__ == "__main__":
242
+ app = ReachyMiniConversationApp()
243
+ try:
244
+ app.wrapped_run()
245
+ except KeyboardInterrupt:
246
+ app.stop()
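The shutdown path above blocks the main thread in `launch()` while a daemon thread waits on the stop event and closes the stream to unblock it. A minimal standalone sketch of that pattern (the `FakeStream` class and all names here are illustrative, not part of the app):

```python
import threading

class FakeStream:
    """Stand-in for the blocking stream manager."""

    def __init__(self) -> None:
        self._closed = threading.Event()

    def launch(self) -> None:
        # Blocks until close() is called from another thread.
        self._closed.wait()

    def close(self) -> None:
        self._closed.set()

stop_event = threading.Event()
stream = FakeStream()
events: list[str] = []

def poll_stop_event() -> None:
    stop_event.wait()            # block until shutdown is requested
    events.append("stop seen")
    stream.close()               # unblocks launch() in the main thread

threading.Thread(target=poll_stop_event, daemon=True).start()
stop_event.set()                 # simulate an external shutdown request
stream.launch()                  # returns once the poller closed the stream
events.append("launch returned")
print(events)
```

The ordering is deterministic: `launch()` can only return after the poller has appended its entry and set the close event.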
src/reachy_mini_conversation_app/moves.py ADDED
@@ -0,0 +1,849 @@
+ """Movement system with sequential primary moves and additive secondary moves.
2
+
3
+ Design overview
4
+ - Primary moves (emotions, dances, goto, breathing) are mutually exclusive and run
5
+ sequentially.
6
+ - Secondary moves (speech sway, face tracking) are additive offsets applied on top
7
+ of the current primary pose.
8
+ - There is a single control point to the robot: `ReachyMini.set_target`.
9
+ - The control loop runs near 100 Hz and is phase-aligned via a monotonic clock.
10
+ - Idle behaviour starts an infinite `BreathingMove` after a short inactivity delay
11
+ unless listening is active.
12
+
13
+ Threading model
14
+ - A dedicated worker thread owns all real-time state and issues `set_target`
15
+ commands.
16
+ - Other threads communicate via a command queue (enqueue moves, mark activity,
17
+ toggle listening).
18
+ - Secondary offset producers set pending values guarded by locks; the worker
19
+ snaps them atomically.
20
+
21
+ Units and frames
22
+ - Secondary offsets are interpreted as metres for x/y/z and radians for
23
+ roll/pitch/yaw in the world frame (unless noted by `compose_world_offset`).
24
+ - Antennas and `body_yaw` are in radians.
25
+ - Head pose composition uses `compose_world_offset(primary_head, secondary_head)`;
26
+ the secondary offset must therefore be expressed in the world frame.
27
+
28
+ Safety
29
+ - Listening freezes antennas, then blends them back on unfreeze.
30
+ - Interpolations and blends are used to avoid jumps at all times.
31
+ - `set_target` errors are rate-limited in logs.
32
+ """
33
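The "Units and frames" notes say the secondary pose is a world-frame offset applied to the absolute primary transform. A plausible sketch of that composition with plain homogeneous 4x4 matrices, assuming world-frame offsets compose by left-multiplication (the real `compose_world_offset` may differ in detail, e.g. it also reorthonormalizes):

```python
import numpy as np

def head_pose(x: float, y: float, z: float, yaw: float) -> np.ndarray:
    """Build a 4x4 homogeneous transform: yaw about +Z plus a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = [x, y, z]
    return T

def compose_world_offset_sketch(t_abs: np.ndarray, t_off_world: np.ndarray) -> np.ndarray:
    """Apply a world-frame offset to an absolute pose by left-multiplication."""
    return t_off_world @ t_abs

primary = head_pose(0.01, 0.0, 0.0, 0.0)     # absolute primary pose, 10 mm along x
speech = head_pose(0.0, 0.0, 0.005, 0.0)     # 5 mm world-frame z offset (speech sway)
combined = compose_world_offset_sketch(primary, speech)
print(combined[:3, 3])
```

With identity rotations the translations simply add, which matches the intent of "additive" secondary offsets.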
+
+from __future__ import annotations
+
+import time
+import logging
+import threading
+from queue import Empty, Queue
+from typing import Any, Dict, Tuple
+from collections import deque
+from dataclasses import dataclass
+
+import numpy as np
+from numpy.typing import NDArray
+
+from reachy_mini import ReachyMini
+from reachy_mini.utils import create_head_pose
+from reachy_mini.motion.move import Move
+from reachy_mini.utils.interpolation import (
+    compose_world_offset,
+    linear_pose_interpolation,
+)
+
+
+logger = logging.getLogger(__name__)
+
+# Configuration constants
+CONTROL_LOOP_FREQUENCY_HZ = 100.0  # Hz - target frequency for the movement control loop
+
+# Type definitions
+FullBodyPose = Tuple[NDArray[np.float32], Tuple[float, float], float]  # (head_pose_4x4, antennas, body_yaw)
+
+
+class BreathingMove(Move):  # type: ignore
+    """Breathing move: interpolate to neutral, then run continuous breathing patterns."""
+
+    def __init__(
+        self,
+        interpolation_start_pose: NDArray[np.float32],
+        interpolation_start_antennas: Tuple[float, float],
+        interpolation_duration: float = 1.0,
+    ):
+        """Initialize breathing move.
+
+        Args:
+            interpolation_start_pose: 4x4 matrix of the current head pose to interpolate from
+            interpolation_start_antennas: Current antenna positions to interpolate from
+            interpolation_duration: Duration of the interpolation to neutral (seconds)
+
+        """
+        self.interpolation_start_pose = interpolation_start_pose
+        self.interpolation_start_antennas = np.array(interpolation_start_antennas)
+        self.interpolation_duration = interpolation_duration
+
+        # Neutral positions for the breathing base
+        self.neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+        self.neutral_antennas = np.array([0.0, 0.0])
+
+        # Breathing parameters
+        self.breathing_z_amplitude = 0.005  # 5 mm gentle breathing
+        self.breathing_frequency = 0.1  # Hz (6 breaths per minute)
+        self.antenna_sway_amplitude = np.deg2rad(15)  # 15 degrees
+        self.antenna_frequency = 0.5  # Hz (faster antenna sway)
+
+    @property
+    def duration(self) -> float:
+        """Duration property required by the official Move interface."""
+        return float("inf")  # Continuous breathing (never ends naturally)
+
+    def evaluate(self, t: float) -> tuple[NDArray[np.float64] | None, NDArray[np.float64] | None, float | None]:
+        """Evaluate the breathing move at time t."""
+        if t < self.interpolation_duration:
+            # Phase 1: interpolate to the neutral base position
+            interpolation_t = t / self.interpolation_duration
+
+            # Interpolate head pose
+            head_pose = linear_pose_interpolation(
+                self.interpolation_start_pose, self.neutral_head_pose, interpolation_t,
+            )
+
+            # Interpolate antennas
+            antennas_interp = (
+                1 - interpolation_t
+            ) * self.interpolation_start_antennas + interpolation_t * self.neutral_antennas
+            antennas = antennas_interp.astype(np.float64)
+
+        else:
+            # Phase 2: breathing patterns from the neutral base
+            breathing_time = t - self.interpolation_duration
+
+            # Gentle z-axis breathing
+            z_offset = self.breathing_z_amplitude * np.sin(2 * np.pi * self.breathing_frequency * breathing_time)
+            head_pose = create_head_pose(x=0, y=0, z=z_offset, roll=0, pitch=0, yaw=0, degrees=True, mm=False)
+
+            # Antenna sway (opposite directions)
+            antenna_sway = self.antenna_sway_amplitude * np.sin(2 * np.pi * self.antenna_frequency * breathing_time)
+            antennas = np.array([antenna_sway, -antenna_sway], dtype=np.float64)
+
+        # Return in the official Move interface format: (head_pose, antennas_array, body_yaw)
+        return (head_pose, antennas, 0.0)
+
+
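The phase-2 breathing offset is a pure sinusoid, z(t) = A * sin(2*pi*f*t), with A = 5 mm and f = 0.1 Hz (one breath every 10 s). A small self-contained check of that waveform, with the constants mirrored from the move above:

```python
import math

BREATHING_Z_AMPLITUDE = 0.005  # metres (5 mm)
BREATHING_FREQUENCY = 0.1      # Hz, i.e. one breath every 10 s

def breathing_z_offset(t: float) -> float:
    """z offset of the head, t seconds into the breathing phase."""
    return BREATHING_Z_AMPLITUDE * math.sin(2 * math.pi * BREATHING_FREQUENCY * t)

period = 1.0 / BREATHING_FREQUENCY        # 10 s per breath
print(breathing_z_offset(0.0))            # zero at the start of a cycle
print(breathing_z_offset(period / 4))     # peak height at a quarter period
```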
+def combine_full_body(primary_pose: FullBodyPose, secondary_pose: FullBodyPose) -> FullBodyPose:
+    """Combine primary and secondary full body poses.
+
+    Args:
+        primary_pose: (head_pose, antennas, body_yaw) - primary move
+        secondary_pose: (head_pose, antennas, body_yaw) - secondary offsets
+
+    Returns:
+        Combined full body pose (head_pose, antennas, body_yaw)
+
+    """
+    primary_head, primary_antennas, primary_body_yaw = primary_pose
+    secondary_head, secondary_antennas, secondary_body_yaw = secondary_pose
+
+    # Combine head poses using compose_world_offset; the secondary pose must be an
+    # offset expressed in the world frame (T_off_world) applied to the absolute
+    # primary transform (T_abs).
+    combined_head = compose_world_offset(primary_head, secondary_head, reorthonormalize=True)
+
+    # Sum antennas and body_yaw
+    combined_antennas = (
+        primary_antennas[0] + secondary_antennas[0],
+        primary_antennas[1] + secondary_antennas[1],
+    )
+    combined_body_yaw = primary_body_yaw + secondary_body_yaw
+
+    return (combined_head, combined_antennas, combined_body_yaw)
+
+
+def clone_full_body_pose(pose: FullBodyPose) -> FullBodyPose:
+    """Create a deep copy of a full body pose tuple."""
+    head, antennas, body_yaw = pose
+    return (head.copy(), (float(antennas[0]), float(antennas[1])), float(body_yaw))
+
+
+@dataclass
+class MovementState:
+    """State tracking for the movement system."""
+
+    # Primary move state
+    current_move: Move | None = None
+    move_start_time: float | None = None
+    last_activity_time: float = 0.0
+
+    # Secondary move state (offsets)
+    speech_offsets: Tuple[float, float, float, float, float, float] = (
+        0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
+    )
+    face_tracking_offsets: Tuple[float, float, float, float, float, float] = (
+        0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
+    )
+
+    # Status flags
+    last_primary_pose: FullBodyPose | None = None
+
+    def update_activity(self) -> None:
+        """Update the last activity time."""
+        self.last_activity_time = time.monotonic()
+
+
+@dataclass
+class LoopFrequencyStats:
+    """Track rolling loop frequency statistics."""
+
+    mean: float = 0.0
+    m2: float = 0.0
+    min_freq: float = float("inf")
+    count: int = 0
+    last_freq: float = 0.0
+    potential_freq: float = 0.0
+
+    def reset(self) -> None:
+        """Reset accumulators while keeping the last potential frequency."""
+        self.mean = 0.0
+        self.m2 = 0.0
+        self.min_freq = float("inf")
+        self.count = 0
+
+
+class MovementManager:
+    """Coordinate sequential moves, additive offsets, and robot output at 100 Hz.
+
+    Responsibilities:
+    - Own a real-time loop that samples the current primary move (if any), fuses
+      secondary offsets, and calls `set_target` exactly once per tick.
+    - Start an idle `BreathingMove` after `idle_inactivity_delay` when not
+      listening and no moves are queued.
+    - Expose thread-safe APIs so other threads can enqueue moves, mark activity,
+      or feed secondary offsets without touching internal state.
+
+    Timing:
+    - All elapsed-time calculations rely on `time.monotonic()` through `self._now`
+      to avoid wall-clock jumps.
+    - The loop targets 100 Hz.
+
+    Concurrency:
+    - External threads communicate via `_command_queue` messages.
+    - Secondary offsets are staged via dirty flags guarded by locks and consumed
+      atomically inside the worker loop.
+    """
+
+    def __init__(
+        self,
+        current_robot: ReachyMini,
+        camera_worker: "Any" = None,
+    ):
+        """Initialize movement manager."""
+        self.current_robot = current_robot
+        self.camera_worker = camera_worker
+
+        # Single timing source for durations
+        self._now = time.monotonic
+
+        # Movement state
+        self.state = MovementState()
+        self.state.last_activity_time = self._now()
+        neutral_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+        self.state.last_primary_pose = (neutral_pose, (0.0, 0.0), 0.0)
+
+        # Move queue (primary moves)
+        self.move_queue: deque[Move] = deque()
+
+        # Configuration
+        self.idle_inactivity_delay = 0.3  # seconds
+        self.target_frequency = CONTROL_LOOP_FREQUENCY_HZ
+        self.target_period = 1.0 / self.target_frequency
+
+        self._stop_event = threading.Event()
+        self._thread: threading.Thread | None = None
+        self._is_listening = False
+        self._last_commanded_pose: FullBodyPose = clone_full_body_pose(self.state.last_primary_pose)
+        self._listening_antennas: Tuple[float, float] = self._last_commanded_pose[1]
+        self._antenna_unfreeze_blend = 1.0
+        self._antenna_blend_duration = 0.4  # seconds to blend back after listening
+        self._last_listening_blend_time = self._now()
+        self._breathing_active = False  # True when the breathing move is running or queued
+        self._listening_debounce_s = 0.15
+        self._last_listening_toggle_time = self._now()
+        self._last_set_target_err = 0.0
+        self._set_target_err_interval = 1.0  # seconds between error logs
+        self._set_target_err_suppressed = 0
+
+        # Cross-thread signalling
+        self._command_queue: "Queue[Tuple[str, Any]]" = Queue()
+        self._speech_offsets_lock = threading.Lock()
+        self._pending_speech_offsets: Tuple[float, float, float, float, float, float] = (
+            0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
+        )
+        self._speech_offsets_dirty = False
+
+        self._face_offsets_lock = threading.Lock()
+        self._pending_face_offsets: Tuple[float, float, float, float, float, float] = (
+            0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
+        )
+        self._face_offsets_dirty = False
+
+        self._shared_state_lock = threading.Lock()
+        self._shared_last_activity_time = self.state.last_activity_time
+        self._shared_is_listening = self._is_listening
+        self._status_lock = threading.Lock()
+        self._freq_stats = LoopFrequencyStats()
+        self._freq_snapshot = LoopFrequencyStats()
+
+    def queue_move(self, move: Move) -> None:
+        """Queue a primary move to run after the currently executing one.
+
+        Thread-safe: the move is enqueued via the worker command queue so the
+        control loop remains the sole mutator of movement state.
+        """
+        self._command_queue.put(("queue_move", move))
+
+    def clear_move_queue(self) -> None:
+        """Stop the active move and discard any queued primary moves.
+
+        Thread-safe: executed by the worker thread via the command queue.
+        """
+        self._command_queue.put(("clear_queue", None))
+
+    def set_speech_offsets(self, offsets: Tuple[float, float, float, float, float, float]) -> None:
+        """Update speech-induced secondary offsets (x, y, z, roll, pitch, yaw).
+
+        Offsets are interpreted as metres for translation and radians for
+        rotation in the world frame. Thread-safe via a pending snapshot.
+        """
+        with self._speech_offsets_lock:
+            self._pending_speech_offsets = offsets
+            self._speech_offsets_dirty = True
+
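The pending-snapshot pattern used by `set_speech_offsets` (a producer overwrites the value under a lock and raises a dirty flag; the worker drains it atomically) can be sketched in isolation. Class and variable names here are illustrative:

```python
import threading

class PendingValue:
    """Last-writer-wins mailbox: producers overwrite, the consumer drains."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._value = initial
        self._dirty = False

    def set(self, value) -> None:
        # Called from any producer thread; only the newest value survives.
        with self._lock:
            self._value = value
            self._dirty = True

    def take(self):
        # Called from the worker loop; returns None when nothing new arrived.
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._value

mailbox = PendingValue((0.0,) * 6)
mailbox.set((0.001, 0.0, 0.0, 0.0, 0.0, 0.0))
mailbox.set((0.002, 0.0, 0.0, 0.0, 0.0, 0.0))  # overwrites the first update
print(mailbox.take())  # newest offsets
print(mailbox.take())  # None: already consumed
```

Unlike a queue, this never backs up: the 100 Hz worker always sees at most one, most recent, offset update per tick.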
+    def set_moving_state(self, duration: float) -> None:
+        """Mark the robot as actively moving for the provided duration.
+
+        Legacy hook used by goto helpers to keep inactivity and breathing logic
+        aware of manual motions. Thread-safe via the command queue.
+        """
+        self._command_queue.put(("set_moving_state", duration))
+
+    def is_idle(self) -> bool:
+        """Return True when the robot has been inactive longer than the idle delay."""
+        with self._shared_state_lock:
+            last_activity = self._shared_last_activity_time
+            listening = self._shared_is_listening
+
+        if listening:
+            return False
+
+        return self._now() - last_activity >= self.idle_inactivity_delay
+
+    def set_listening(self, listening: bool) -> None:
+        """Enable or disable listening mode without touching shared state directly.
+
+        While listening:
+        - Antenna positions are frozen at the last commanded values.
+        - Blending is reset so that upon unfreezing the antennas return smoothly.
+        - Idle breathing is suppressed.
+
+        Thread-safe: the change is posted to the worker command queue.
+        """
+        with self._shared_state_lock:
+            if self._shared_is_listening == listening:
+                return
+            self._command_queue.put(("set_listening", listening))
+
+    def _poll_signals(self, current_time: float) -> None:
+        """Apply queued commands and pending offset updates."""
+        self._apply_pending_offsets()
+
+        while True:
+            try:
+                command, payload = self._command_queue.get_nowait()
+            except Empty:
+                break
+            self._handle_command(command, payload, current_time)
+
+    def _apply_pending_offsets(self) -> None:
+        """Apply the most recent speech/face offset updates."""
+        speech_offsets: Tuple[float, float, float, float, float, float] | None = None
+        with self._speech_offsets_lock:
+            if self._speech_offsets_dirty:
+                speech_offsets = self._pending_speech_offsets
+                self._speech_offsets_dirty = False
+
+        if speech_offsets is not None:
+            self.state.speech_offsets = speech_offsets
+            self.state.update_activity()
+
+        face_offsets: Tuple[float, float, float, float, float, float] | None = None
+        with self._face_offsets_lock:
+            if self._face_offsets_dirty:
+                face_offsets = self._pending_face_offsets
+                self._face_offsets_dirty = False
+
+        if face_offsets is not None:
+            self.state.face_tracking_offsets = face_offsets
+            self.state.update_activity()
+
+    def _handle_command(self, command: str, payload: Any, current_time: float) -> None:
+        """Handle a single cross-thread command."""
+        if command == "queue_move":
+            if isinstance(payload, Move):
+                self.move_queue.append(payload)
+                self.state.update_activity()
+                duration = getattr(payload, "duration", None)
+                if duration is not None:
+                    try:
+                        duration_str = f"{float(duration):.2f}"
+                    except (TypeError, ValueError):
+                        duration_str = str(duration)
+                else:
+                    duration_str = "?"
+                logger.debug(
+                    "Queued move with duration %ss, queue size: %s",
+                    duration_str,
+                    len(self.move_queue),
+                )
+            else:
+                logger.warning("Ignored queue_move command with invalid payload: %s", payload)
+        elif command == "clear_queue":
+            self.move_queue.clear()
+            self.state.current_move = None
+            self.state.move_start_time = None
+            self._breathing_active = False
+            logger.info("Cleared move queue and stopped current move")
+        elif command == "set_moving_state":
+            try:
+                duration = float(payload)
+            except (TypeError, ValueError):
+                logger.warning("Invalid moving state duration: %s", payload)
+                return
+            self.state.update_activity()
+        elif command == "mark_activity":
+            self.state.update_activity()
+        elif command == "set_listening":
+            desired_state = bool(payload)
+            now = self._now()
+            if now - self._last_listening_toggle_time < self._listening_debounce_s:
+                return
+            self._last_listening_toggle_time = now
+
+            if self._is_listening == desired_state:
+                return
+
+            self._is_listening = desired_state
+            self._last_listening_blend_time = now
+            if desired_state:
+                # Freeze: snapshot current commanded antennas and reset blend
+                self._listening_antennas = (
+                    float(self._last_commanded_pose[1][0]),
+                    float(self._last_commanded_pose[1][1]),
+                )
+                self._antenna_unfreeze_blend = 0.0
+            else:
+                # Unfreeze: restart blending from the frozen pose
+                self._antenna_unfreeze_blend = 0.0
+                self.state.update_activity()
+        else:
+            logger.warning("Unknown command received by MovementManager: %s", command)
+
+    def _publish_shared_state(self) -> None:
+        """Expose idle-related state for external threads."""
+        with self._shared_state_lock:
+            self._shared_last_activity_time = self.state.last_activity_time
+            self._shared_is_listening = self._is_listening
+
+    def _manage_move_queue(self, current_time: float) -> None:
+        """Manage the primary move queue (sequential execution)."""
+        if self.state.current_move is None or (
+            self.state.move_start_time is not None
+            and current_time - self.state.move_start_time >= self.state.current_move.duration
+        ):
+            self.state.current_move = None
+            self.state.move_start_time = None
+
+            if self.move_queue:
+                self.state.current_move = self.move_queue.popleft()
+                self.state.move_start_time = current_time
+                # Breathing stays flagged only if the new move is itself a BreathingMove
+                self._breathing_active = isinstance(self.state.current_move, BreathingMove)
+                logger.debug(f"Starting new move, duration: {self.state.current_move.duration}s")
+
+    def _manage_breathing(self, current_time: float) -> None:
+        """Manage automatic breathing when idle."""
+        if (
+            self.state.current_move is None
+            and not self.move_queue
+            and not self._is_listening
+            and not self._breathing_active
+        ):
+            idle_for = current_time - self.state.last_activity_time
+            if idle_for >= self.idle_inactivity_delay:
+                try:
+                    # These two calls return the latest available sensor data from the
+                    # robot without performing synchronous I/O, so calling them inside
+                    # the control loop is acceptable.
+                    _, current_antennas = self.current_robot.get_current_joint_positions()
+                    current_head_pose = self.current_robot.get_current_head_pose()
+
+                    self._breathing_active = True
+                    self.state.update_activity()
+
+                    breathing_move = BreathingMove(
+                        interpolation_start_pose=current_head_pose,
+                        interpolation_start_antennas=current_antennas,
+                        interpolation_duration=1.0,
+                    )
+                    self.move_queue.append(breathing_move)
+                    logger.debug("Started breathing after %.1fs of inactivity", idle_for)
+                except Exception as e:
+                    self._breathing_active = False
+                    logger.error("Failed to start breathing: %s", e)
+
+        if isinstance(self.state.current_move, BreathingMove) and self.move_queue:
+            self.state.current_move = None
+            self.state.move_start_time = None
+            self._breathing_active = False
+            logger.debug("Stopping breathing due to new move activity")
+
+        if self.state.current_move is not None and not isinstance(self.state.current_move, BreathingMove):
+            self._breathing_active = False
+
+    def _get_primary_pose(self, current_time: float) -> FullBodyPose:
+        """Get the primary full body pose from the current move or neutral."""
+        # When a primary move is playing, sample it and cache the resulting pose
+        if self.state.current_move is not None and self.state.move_start_time is not None:
+            move_time = current_time - self.state.move_start_time
+            head, antennas, body_yaw = self.state.current_move.evaluate(move_time)
+
+            if head is None:
+                head = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+            if antennas is None:
+                antennas = np.array([0.0, 0.0])
+            if body_yaw is None:
+                body_yaw = 0.0
+
+            antennas_tuple = (float(antennas[0]), float(antennas[1]))
+            head_copy = head.copy()
+            primary_full_body_pose = (
+                head_copy,
+                antennas_tuple,
+                float(body_yaw),
+            )
+
+            self.state.last_primary_pose = clone_full_body_pose(primary_full_body_pose)
+        # Otherwise reuse the last primary pose so we avoid jumps between moves
+        elif self.state.last_primary_pose is not None:
+            primary_full_body_pose = clone_full_body_pose(self.state.last_primary_pose)
+        else:
+            neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+            primary_full_body_pose = (neutral_head_pose, (0.0, 0.0), 0.0)
+            self.state.last_primary_pose = clone_full_body_pose(primary_full_body_pose)
+
+        return primary_full_body_pose
+
+    def _get_secondary_pose(self) -> FullBodyPose:
+        """Get the secondary full body pose from speech and face tracking offsets."""
+        # Sum speech sway offsets and face tracking offsets component-wise
+        secondary_offsets = [
+            speech + face
+            for speech, face in zip(self.state.speech_offsets, self.state.face_tracking_offsets)
+        ]
+
+        secondary_head_pose = create_head_pose(
+            x=secondary_offsets[0],
+            y=secondary_offsets[1],
+            z=secondary_offsets[2],
+            roll=secondary_offsets[3],
+            pitch=secondary_offsets[4],
+            yaw=secondary_offsets[5],
+            degrees=False,
+            mm=False,
+        )
+        return (secondary_head_pose, (0.0, 0.0), 0.0)
+
+    def _compose_full_body_pose(self, current_time: float) -> FullBodyPose:
+        """Compose primary and secondary poses into a single command pose."""
+        primary = self._get_primary_pose(current_time)
+        secondary = self._get_secondary_pose()
+        return combine_full_body(primary, secondary)
+
+    def _update_primary_motion(self, current_time: float) -> None:
+        """Advance queue state and idle behaviours for this tick."""
+        self._manage_move_queue(current_time)
+        self._manage_breathing(current_time)
+
+    def _calculate_blended_antennas(self, target_antennas: Tuple[float, float]) -> Tuple[float, float]:
+        """Blend target antennas with the listening freeze state and update blending."""
+        now = self._now()
+        listening = self._is_listening
+        listening_antennas = self._listening_antennas
+        blend = self._antenna_unfreeze_blend
+        blend_duration = self._antenna_blend_duration
+        last_update = self._last_listening_blend_time
+        self._last_listening_blend_time = now
+
+        if listening:
+            antennas_cmd = listening_antennas
+            new_blend = 0.0
+        else:
+            dt = max(0.0, now - last_update)
+            if blend_duration <= 0:
+                new_blend = 1.0
+            else:
+                new_blend = min(1.0, blend + dt / blend_duration)
+            antennas_cmd = (
+                listening_antennas[0] * (1.0 - new_blend) + target_antennas[0] * new_blend,
+                listening_antennas[1] * (1.0 - new_blend) + target_antennas[1] * new_blend,
+            )
+
+        if listening:
+            self._antenna_unfreeze_blend = 0.0
+        else:
+            self._antenna_unfreeze_blend = new_blend
+            if new_blend >= 1.0:
+                self._listening_antennas = (
+                    float(target_antennas[0]),
+                    float(target_antennas[1]),
+                )
+
+        return antennas_cmd
+
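The unfreeze blend above is a time-based linear crossfade: over `blend_duration` seconds the command moves from the frozen antenna value to the live target. A minimal standalone sketch of that crossfade, with illustrative values:

```python
def blended(frozen: float, target: float, blend: float) -> float:
    """Linear crossfade; blend runs from 0.0 (fully frozen) to 1.0 (fully target)."""
    return frozen * (1.0 - blend) + target * blend

def advance_blend(blend: float, dt: float, duration: float) -> float:
    """Advance the blend factor by dt seconds, clamped to 1.0."""
    if duration <= 0:
        return 1.0
    return min(1.0, blend + dt / duration)

frozen, target = 0.4, 0.0   # radians: frozen antenna position vs. live command
blend = 0.0
for _ in range(4):          # four 0.1 s ticks with a 0.4 s blend duration
    blend = advance_blend(blend, 0.1, 0.4)
    print(round(blended(frozen, target, blend), 3))
```

After `duration` seconds of ticks the output reaches the target exactly, matching the `new_blend >= 1.0` handover in the method above.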
+    def _issue_control_command(self, head: NDArray[np.float32], antennas: Tuple[float, float], body_yaw: float) -> None:
+        """Send the fused pose to the robot with throttled error logging."""
+        try:
+            self.current_robot.set_target(head=head, antennas=antennas, body_yaw=body_yaw)
+        except Exception as e:
+            now = self._now()
+            if now - self._last_set_target_err >= self._set_target_err_interval:
+                msg = f"Failed to set robot target: {e}"
+                if self._set_target_err_suppressed:
+                    msg += f" (suppressed {self._set_target_err_suppressed} repeats)"
+                    self._set_target_err_suppressed = 0
+                logger.error(msg)
+                self._last_set_target_err = now
+            else:
+                self._set_target_err_suppressed += 1
+        else:
+            with self._status_lock:
+                self._last_commanded_pose = clone_full_body_pose((head, antennas, body_yaw))
+
+    def _update_frequency_stats(
+        self, loop_start: float, prev_loop_start: float, stats: LoopFrequencyStats,
+    ) -> LoopFrequencyStats:
+        """Update frequency statistics based on the current loop start time."""
+        period = loop_start - prev_loop_start
+        if period > 0:
+            stats.last_freq = 1.0 / period
+            stats.count += 1
+            delta = stats.last_freq - stats.mean
+            stats.mean += delta / stats.count
+            stats.m2 += delta * (stats.last_freq - stats.mean)
+            stats.min_freq = min(stats.min_freq, stats.last_freq)
+        return stats
+
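The mean/m2 update in `_update_frequency_stats` is Welford's online algorithm: it tracks mean and variance in one pass without storing samples, and the variance reported later is `m2 / count`. A standalone check with example loop frequencies:

```python
def welford(samples):
    """Online mean/variance via Welford's algorithm (population variance)."""
    mean, m2, count = 0.0, 0.0, 0
    for x in samples:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the *updated* mean, as in the method above
    return mean, (m2 / count if count else 0.0)

freqs = [99.2, 100.4, 98.7, 101.1, 100.0]  # example loop frequencies in Hz
mean, variance = welford(freqs)
print(round(mean, 2), round(variance, 4))
```

Compared with the naive sum-of-squares formula, this form stays numerically stable even after many thousands of 100 Hz ticks.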
+    def _schedule_next_tick(self, loop_start: float, stats: LoopFrequencyStats) -> Tuple[float, LoopFrequencyStats]:
+        """Compute the sleep time to maintain the target frequency and update the potential frequency."""
+        computation_time = self._now() - loop_start
+        stats.potential_freq = 1.0 / computation_time if computation_time > 0 else float("inf")
+        sleep_time = max(0.0, self.target_period - computation_time)
+        return sleep_time, stats
+
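The scheduling rule is simple: sleep for whatever remains of the 10 ms period after this tick's computation, and never a negative amount when a tick overruns. In isolation:

```python
TARGET_PERIOD = 1.0 / 100.0  # 100 Hz -> 10 ms per tick

def sleep_for(computation_time: float) -> float:
    """Remaining time in the current tick; 0.0 when the tick overran."""
    return max(0.0, TARGET_PERIOD - computation_time)

print(round(sleep_for(0.003), 4))  # fast tick: sleep the remaining 7 ms
print(sleep_for(0.015))            # overrun: start the next tick immediately
```

Because overruns yield a zero sleep rather than a negative correction, the loop degrades gracefully under load instead of trying to "catch up" with bursts.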
+    def _record_frequency_snapshot(self, stats: LoopFrequencyStats) -> None:
+        """Store a thread-safe snapshot of the current frequency statistics."""
+        with self._status_lock:
+            self._freq_snapshot = LoopFrequencyStats(
+                mean=stats.mean,
+                m2=stats.m2,
+                min_freq=stats.min_freq,
+                count=stats.count,
+                last_freq=stats.last_freq,
+                potential_freq=stats.potential_freq,
+            )
+
+    def _maybe_log_frequency(self, loop_count: int, print_interval_loops: int, stats: LoopFrequencyStats) -> None:
+        """Emit frequency telemetry when enough loops have elapsed."""
+        if loop_count % print_interval_loops != 0 or stats.count == 0:
+            return
+
+        variance = stats.m2 / stats.count if stats.count > 0 else 0.0
+        lowest = stats.min_freq if stats.min_freq != float("inf") else 0.0
+        logger.debug(
+            "Loop freq - avg: %.2fHz, variance: %.4f, min: %.2fHz, last: %.2fHz, potential: %.2fHz, target: %.1fHz",
+            stats.mean,
+            variance,
+            lowest,
+            stats.last_freq,
+            stats.potential_freq,
+            self.target_frequency,
+        )
+        stats.reset()
+
+    def _update_face_tracking(self, current_time: float) -> None:
+        """Get face tracking offsets from the camera worker thread."""
+        if self.camera_worker is not None:
+            # Get face tracking offsets from the camera worker thread
+            offsets = self.camera_worker.get_face_tracking_offsets()
+            self.state.face_tracking_offsets = offsets
+        else:
+            # No camera worker, use neutral offsets
+            self.state.face_tracking_offsets = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
+
+    def start(self) -> None:
+        """Start the worker thread that drives the 100 Hz control loop."""
+        if self._thread is not None and self._thread.is_alive():
+            logger.warning("Move worker already running; start() ignored")
+            return
+        self._stop_event.clear()
+        self._thread = threading.Thread(target=self.working_loop, daemon=True)
+        self._thread.start()
+        logger.debug("Move worker started")
+
+    def stop(self) -> None:
+        """Request the worker thread to stop and wait for it to exit.
+
+        After the worker has stopped, the robot is reset to a neutral position.
+        """
+        if self._thread is None or not self._thread.is_alive():
+            logger.debug("Move worker not running; stop() ignored")
+            return
+
+        logger.info("Stopping movement manager and resetting to neutral position...")
+
+        # Clear any queued moves and stop the current move
+        self.clear_move_queue()
+
+        # Stop the worker thread first so it doesn't interfere
+        self._stop_event.set()
+        if self._thread is not None:
+            self._thread.join()
+            self._thread = None
+        logger.debug("Move worker stopped")
+
+        # Reset to a neutral position using goto_target (same approach as wake_up)
+        try:
+            neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
+            neutral_antennas = [0.0, 0.0]
+            neutral_body_yaw = 0.0
+
+            # Use goto_target directly on the robot
+            self.current_robot.goto_target(
+                head=neutral_head_pose,
+                antennas=neutral_antennas,
+                duration=2.0,
+                body_yaw=neutral_body_yaw,
+            )
+
+            logger.info("Reset to neutral position completed")
+
+        except Exception as e:
+            logger.error(f"Failed to reset to neutral position: {e}")
764
+
765
+ def get_status(self) -> Dict[str, Any]:
766
+ """Return a lightweight status snapshot for observability."""
767
+ with self._status_lock:
768
+ pose_snapshot = clone_full_body_pose(self._last_commanded_pose)
769
+ freq_snapshot = LoopFrequencyStats(
770
+ mean=self._freq_snapshot.mean,
771
+ m2=self._freq_snapshot.m2,
772
+ min_freq=self._freq_snapshot.min_freq,
773
+ count=self._freq_snapshot.count,
774
+ last_freq=self._freq_snapshot.last_freq,
775
+ potential_freq=self._freq_snapshot.potential_freq,
776
+ )
777
+
778
+ head_matrix = pose_snapshot[0].tolist() if pose_snapshot else None
779
+ antennas = pose_snapshot[1] if pose_snapshot else None
780
+ body_yaw = pose_snapshot[2] if pose_snapshot else None
781
+
782
+ return {
783
+ "queue_size": len(self.move_queue),
784
+ "is_listening": self._is_listening,
785
+ "breathing_active": self._breathing_active,
786
+ "last_commanded_pose": {
787
+ "head": head_matrix,
788
+ "antennas": antennas,
789
+ "body_yaw": body_yaw,
790
+ },
791
+ "loop_frequency": {
792
+ "last": freq_snapshot.last_freq,
793
+ "mean": freq_snapshot.mean,
794
+ "min": freq_snapshot.min_freq,
795
+ "potential": freq_snapshot.potential_freq,
796
+ "samples": freq_snapshot.count,
797
+ },
798
+ }
799
+
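`get_status` and `_record_frequency_snapshot` both copy mutable state while holding `_status_lock`, so readers never observe a half-updated struct. A minimal sketch of that copy-under-lock pattern, with hypothetical `Worker`/`Stats` names:

```python
import threading
from dataclasses import dataclass, replace


@dataclass
class Stats:
    mean: float = 0.0
    count: int = 0


class Worker:
    """Writers mutate under the lock; readers get an immutable copy."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stats = Stats()

    def update(self, x: float) -> None:
        with self._lock:
            s = self._stats
            s.count += 1
            s.mean += (x - s.mean) / s.count

    def snapshot(self) -> Stats:
        # Return a copy, never the live object, so callers can't race the writer
        with self._lock:
            return replace(self._stats)


w = Worker()
w.update(4.0)
w.update(6.0)
print(w.snapshot())  # Stats(mean=5.0, count=2)
```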
800
+ def working_loop(self) -> None:
801
+ """Control loop main movements - reproduces main_works.py control architecture.
802
+
803
+ Single set_target() call with pose fusion.
804
+ """
805
+ logger.debug("Starting enhanced movement control loop (100Hz)")
806
+
807
+ loop_count = 0
808
+ prev_loop_start = self._now()
809
+ print_interval_loops = max(1, int(self.target_frequency * 2))
810
+ freq_stats = self._freq_stats
811
+
812
+ while not self._stop_event.is_set():
813
+ loop_start = self._now()
814
+ loop_count += 1
815
+
816
+ if loop_count > 1:
817
+ freq_stats = self._update_frequency_stats(loop_start, prev_loop_start, freq_stats)
818
+ prev_loop_start = loop_start
819
+
820
+ # 1) Poll external commands and apply pending offsets (atomic snapshot)
821
+ self._poll_signals(loop_start)
822
+
823
+ # 2) Manage the primary move queue (start new move, end finished move, breathing)
824
+ self._update_primary_motion(loop_start)
825
+
826
+ # 3) Update vision-based secondary offsets
827
+ self._update_face_tracking(loop_start)
828
+
829
+ # 4) Build primary and secondary full-body poses, then fuse them
830
+ head, antennas, body_yaw = self._compose_full_body_pose(loop_start)
831
+
832
+ # 5) Apply listening antenna freeze or blend-back
833
+ antennas_cmd = self._calculate_blended_antennas(antennas)
834
+
835
+ # 6) Single set_target call - the only control point
836
+ self._issue_control_command(head, antennas_cmd, body_yaw)
837
+
838
+ # 7) Adaptive sleep to align to next tick, then publish shared state
839
+ sleep_time, freq_stats = self._schedule_next_tick(loop_start, freq_stats)
840
+ self._publish_shared_state()
841
+ self._record_frequency_snapshot(freq_stats)
842
+
843
+ # 8) Periodic telemetry on loop frequency
844
+ self._maybe_log_frequency(loop_count, print_interval_loops, freq_stats)
845
+
846
+ if sleep_time > 0:
847
+ time.sleep(sleep_time)
848
+
849
+ logger.debug("Movement control loop stopped")
src/reachy_mini_conversation_app/ollama_handler.py ADDED
@@ -0,0 +1,558 @@
1
+ """Ollama-based conversation handler with local STT (faster-whisper) and TTS (edge-tts).
2
+
3
+ Replaces the previous OpenAI Realtime API handler with a fully local/self-hosted
4
+ pipeline:
5
+ Audio In → faster-whisper (STT) → Ollama (LLM + tools) → edge-tts (TTS) → Audio Out
6
+ """
7
+
8
+ import json
10
+ import base64
11
+ import asyncio
12
+ import logging
13
+ from typing import Any, Final, Tuple, Optional
14
+ from datetime import datetime
16
+
17
+ import cv2
18
+ import numpy as np
19
+ import gradio as gr
20
+ import edge_tts
21
+ import miniaudio
22
+ from ollama import AsyncClient as OllamaAsyncClient
23
+ from fastrtc import AdditionalOutputs, AsyncStreamHandler, wait_for_item, audio_to_int16
24
+ from numpy.typing import NDArray
25
+ from scipy.signal import resample
26
+
27
+ from reachy_mini_conversation_app.config import config
28
+ from reachy_mini_conversation_app.prompts import get_session_voice, get_session_instructions
29
+ from reachy_mini_conversation_app.tools.core_tools import (
30
+ ToolDependencies,
31
+ get_tool_specs,
32
+ dispatch_tool_call,
33
+ )
34
+
35
+
36
+ logger = logging.getLogger(__name__)
37
+
38
+ HANDLER_SAMPLE_RATE: Final[int] = 24000
39
+ WHISPER_SAMPLE_RATE: Final[int] = 16000
40
+
41
+ # Voice-activity detection thresholds
42
+ SILENCE_RMS_THRESHOLD: Final[float] = 500.0
43
+ SILENCE_DURATION_S: Final[float] = 0.8 # seconds of silence to end utterance
44
+ MIN_SPEECH_DURATION_S: Final[float] = 0.3 # discard very short bursts
45
+
46
+
47
+ class OllamaHandler(AsyncStreamHandler):
48
+ """Conversation handler using Ollama (LLM), faster-whisper (STT), and edge-tts (TTS)."""
49
+
50
+ def __init__(
51
+ self,
52
+ deps: ToolDependencies,
53
+ gradio_mode: bool = False,
54
+ instance_path: Optional[str] = None,
55
+ ):
56
+ """Initialize the handler."""
57
+ super().__init__(
58
+ expected_layout="mono",
59
+ output_sample_rate=HANDLER_SAMPLE_RATE,
60
+ input_sample_rate=HANDLER_SAMPLE_RATE,
61
+ )
62
+
63
+ self.deps = deps
64
+ self.gradio_mode = gradio_mode
65
+ self.instance_path = instance_path
66
+
67
+ # Output queue (audio frames + AdditionalOutputs for chat UI)
68
+ self.output_queue: "asyncio.Queue[Tuple[int, NDArray[np.int16]] | AdditionalOutputs]" = asyncio.Queue()
69
+
70
+ # Clients (initialized in start_up)
71
+ self.ollama_client: OllamaAsyncClient | None = None
72
+ self.whisper_model: Any = None # faster_whisper.WhisperModel
73
+
74
+ # Conversation history
75
+ self._messages: list[dict[str, Any]] = []
76
+
77
+ # Audio buffering for VAD + STT
78
+ self._audio_buffer: list[NDArray[np.int16]] = []
79
+ self._is_speaking: bool = False
80
+ self._silence_frame_count: int = 0
81
+ self._speech_frame_count: int = 0
82
+
83
+ # Timing
84
+ self.last_activity_time = asyncio.get_event_loop().time()
85
+ self.start_time = asyncio.get_event_loop().time()
86
+ self.is_idle_tool_call: bool = False
87
+
88
+ # TTS voice (resolved from profile or config)
89
+ self._tts_voice: str = config.TTS_VOICE
90
+
91
+ # Lifecycle flags
92
+ self._shutdown_requested: bool = False
93
+ self._connected_event: asyncio.Event = asyncio.Event()
94
+
95
+ # Debouncing for partial transcripts
96
+ self.partial_transcript_task: asyncio.Task[None] | None = None
97
+ self.partial_transcript_sequence: int = 0
98
+ self.partial_debounce_delay = 0.5
99
+
100
+ def copy(self) -> "OllamaHandler":
101
+ """Create a copy of the handler."""
102
+ return OllamaHandler(self.deps, self.gradio_mode, self.instance_path)
103
+
104
+ # ------------------------------------------------------------------ #
105
+ # Startup & lifecycle
106
+ # ------------------------------------------------------------------ #
107
+
108
+ async def start_up(self) -> None:
109
+ """Initialize STT, LLM client, and keep running until shutdown."""
110
+ # 1. Initialize Ollama client
111
+ self.ollama_client = OllamaAsyncClient(host=config.OLLAMA_BASE_URL)
112
+
113
+ # 2. Verify Ollama connectivity
114
+ try:
115
+ await self.ollama_client.list()
116
+ logger.info("Connected to Ollama at %s", config.OLLAMA_BASE_URL)
117
+ except Exception as e:
118
+ logger.error("Cannot reach Ollama at %s: %s", config.OLLAMA_BASE_URL, e)
119
+ logger.warning("Proceeding anyway; requests will fail until Ollama is available.")
120
+
121
+ # 3. Initialize faster-whisper STT
122
+ try:
123
+ from faster_whisper import WhisperModel
124
+
125
+ self.whisper_model = WhisperModel(
126
+ config.STT_MODEL,
127
+ device="auto",
128
+ compute_type="int8",
129
+ )
130
+ logger.info("Loaded faster-whisper model: %s", config.STT_MODEL)
131
+ except Exception as e:
132
+ logger.error("Failed to load STT model '%s': %s", config.STT_MODEL, e)
133
+ logger.warning("Speech-to-text will be unavailable.")
134
+
135
+ # 4. Set up conversation with system prompt
136
+ instructions = get_session_instructions()
137
+ self._messages = [{"role": "system", "content": instructions}]
138
+ self._tts_voice = config.TTS_VOICE
139
+
140
+ self._connected_event.set()
141
+ logger.info(
142
+ "OllamaHandler ready — model=%s stt=%s tts_voice=%s",
143
+ config.MODEL_NAME,
144
+ config.STT_MODEL,
145
+ self._tts_voice,
146
+ )
147
+
148
+ # Keep the handler alive until shutdown is requested
149
+ while not self._shutdown_requested:
150
+ await asyncio.sleep(0.1)
151
+
152
+ # ------------------------------------------------------------------ #
153
+ # Personality / session management
154
+ # ------------------------------------------------------------------ #
155
+
156
+ async def apply_personality(self, profile: str | None) -> str:
157
+ """Apply a new personality (profile) at runtime.
158
+
159
+ Updates the system prompt and resets conversation history so the new
160
+ personality takes effect immediately.
161
+ """
162
+ try:
163
+ from reachy_mini_conversation_app.config import config as _config
164
+ from reachy_mini_conversation_app.config import set_custom_profile
165
+
166
+ set_custom_profile(profile)
167
+ logger.info(
168
+ "Set custom profile to %r (config=%r)",
169
+ profile,
170
+ getattr(_config, "REACHY_MINI_CUSTOM_PROFILE", None),
171
+ )
172
+
173
+ try:
174
+ instructions = get_session_instructions()
175
+ except BaseException as e:
176
+ logger.error("Failed to resolve personality content: %s", e)
177
+ return f"Failed to apply personality: {e}"
178
+
179
+ # Reset conversation with new system prompt
180
+ self._messages = [{"role": "system", "content": instructions}]
181
+ logger.info("Applied personality: %s", profile or "built-in default")
182
+ return "Applied personality. Active on next message."
183
+ except Exception as e:
184
+ logger.error("Error applying personality '%s': %s", profile, e)
185
+ return f"Failed to apply personality: {e}"
186
+
187
+ async def _restart_session(self) -> None:
188
+ """Reset conversation history (equivalent of restarting a session)."""
189
+ try:
190
+ instructions = get_session_instructions()
191
+ self._messages = [{"role": "system", "content": instructions}]
192
+ logger.info("Session reset (conversation history cleared).")
193
+ except Exception as e:
194
+ logger.warning("_restart_session failed: %s", e)
195
+
196
+ # ------------------------------------------------------------------ #
197
+ # Audio receive (microphone) → VAD → STT → LLM → TTS → emit
198
+ # ------------------------------------------------------------------ #
199
+
200
+ async def receive(self, frame: Tuple[int, NDArray[np.int16]]) -> None:
201
+ """Receive audio frame from the microphone and run VAD.
202
+
203
+ When the user finishes speaking (silence detected), kicks off the
204
+ speech-processing pipeline in a background task.
205
+ """
206
+ if self._shutdown_requested or self.whisper_model is None:
207
+ return
208
+
209
+ input_sample_rate, audio_frame = frame
210
+
211
+ # Reshape to 1-D mono
212
+ if audio_frame.ndim == 2:
213
+ if audio_frame.shape[1] > audio_frame.shape[0]:
214
+ audio_frame = audio_frame.T
215
+ if audio_frame.shape[1] > 1:
216
+ audio_frame = audio_frame[:, 0]
217
+
218
+ # Resample to handler rate if necessary
219
+ if input_sample_rate != HANDLER_SAMPLE_RATE:
220
+ audio_frame = resample(
221
+ audio_frame, int(len(audio_frame) * HANDLER_SAMPLE_RATE / input_sample_rate)
222
+ )
223
+
224
+ audio_frame = audio_to_int16(audio_frame)
225
+
226
+ # --- simple energy-based VAD ---
227
+ rms = float(np.sqrt(np.mean(audio_frame.astype(np.float32) ** 2)))
228
+ frame_duration = len(audio_frame) / HANDLER_SAMPLE_RATE
229
+
230
+ if rms > SILENCE_RMS_THRESHOLD:
231
+ # Voice activity detected
232
+ if not self._is_speaking:
233
+ self._is_speaking = True
234
+ self._speech_frame_count = 0
235
+ if self.deps.head_wobbler is not None:
236
+ self.deps.head_wobbler.reset()
237
+ self.deps.movement_manager.set_listening(True)
238
+ logger.debug("Speech started (RMS=%.0f)", rms)
239
+ self._silence_frame_count = 0
240
+ self._speech_frame_count += 1
241
+ self._audio_buffer.append(audio_frame)
242
+ else:
243
+ if self._is_speaking:
244
+ self._silence_frame_count += 1
245
+ self._audio_buffer.append(audio_frame) # keep trailing silence
246
+
247
+ silence_duration = self._silence_frame_count * frame_duration
248
+ if silence_duration >= SILENCE_DURATION_S:
249
+ speech_duration = self._speech_frame_count * frame_duration
250
+ self.deps.movement_manager.set_listening(False)
251
+
252
+ if speech_duration >= MIN_SPEECH_DURATION_S:
253
+ logger.debug("Speech ended (%.1fs)", speech_duration)
254
+ full_audio = np.concatenate(self._audio_buffer)
255
+ self._audio_buffer = []
256
+ self._is_speaking = False
257
+ self._silence_frame_count = 0
258
+ self._speech_frame_count = 0
259
+ asyncio.create_task(self._process_speech(full_audio))
260
+ else:
261
+ # Too short, discard
262
+ self._audio_buffer = []
263
+ self._is_speaking = False
264
+ self._silence_frame_count = 0
265
+ self._speech_frame_count = 0
266
+
267
+ # ------------------------------------------------------------------ #
268
+ # Speech processing pipeline
269
+ # ------------------------------------------------------------------ #
270
+
271
+ async def _process_speech(self, audio_data: NDArray[np.int16]) -> None:
272
+ """Full pipeline: STT → LLM (with tools) → TTS."""
273
+ try:
274
+ # --- 1. Speech-to-text ---
275
+ text = await self._transcribe(audio_data)
276
+ if not text:
277
+ return
278
+
279
+ logger.info("User: %s", text)
280
+ await self.output_queue.put(AdditionalOutputs({"role": "user", "content": text}))
281
+
282
+ # --- 2. LLM response ---
283
+ self._messages.append({"role": "user", "content": text})
284
+ response_text = await self._chat_with_tools()
285
+
286
+ if response_text:
287
+ logger.info("Assistant: %s", response_text)
288
+ await self.output_queue.put(
289
+ AdditionalOutputs({"role": "assistant", "content": response_text})
290
+ )
291
+
292
+ # --- 3. Text-to-speech ---
293
+ await self._synthesize_speech(response_text)
294
+
295
+ except Exception as e:
296
+ logger.error("Speech processing error: %s", e)
297
+ await self.output_queue.put(
298
+ AdditionalOutputs({"role": "assistant", "content": f"[error] {e}"})
299
+ )
300
+
301
+ async def _transcribe(self, audio_data: NDArray[np.int16]) -> str:
302
+ """Run faster-whisper STT on raw PCM audio."""
303
+ # Resample from handler rate to Whisper's 16 kHz
304
+ float_audio = audio_data.astype(np.float32) / 32768.0
305
+ whisper_audio = resample(
306
+ float_audio,
307
+ int(len(float_audio) * WHISPER_SAMPLE_RATE / HANDLER_SAMPLE_RATE),
308
+ ).astype(np.float32)
309
+
310
+ loop = asyncio.get_event_loop()
311
+ segments, _info = await loop.run_in_executor(
312
+ None,
313
+ lambda: self.whisper_model.transcribe(whisper_audio, beam_size=5),
314
+ )
315
+
316
+ # Collect all text from segments (run_in_executor returns generator lazily)
317
+ text_parts: list[str] = []
318
+ for seg in segments:
319
+ text_parts.append(seg.text)
320
+ return " ".join(text_parts).strip()
321
+
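`_transcribe` downsamples 24 kHz handler audio to Whisper's 16 kHz, shrinking each frame by a 2/3 ratio. A naive linear-interpolation sketch of that resampling (the handler itself uses `scipy.signal.resample`, which is Fourier-based):

```python
import numpy as np


def resample_linear(x: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    # Map the output sample grid back onto the input grid and interpolate
    n_out = int(len(x) * dst_rate / src_rate)
    t_out = np.linspace(0, len(x) - 1, n_out)
    return np.interp(t_out, np.arange(len(x)), x.astype(np.float32))


x = np.arange(24, dtype=np.int16)     # 1 ms of audio at 24 kHz
y = resample_linear(x, 24000, 16000)  # 2/3 ratio -> 16 samples
print(len(y))  # 16
```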
322
+ async def _chat_with_tools(self) -> str:
323
+ """Send conversation to Ollama with tool support; handle tool calls."""
324
+ if self.ollama_client is None:
325
+ return "Ollama client not initialized."
326
+
327
+ ollama_tools = self._build_ollama_tools()
328
+
329
+ response = await self.ollama_client.chat(
330
+ model=config.MODEL_NAME,
331
+ messages=self._messages,
332
+ tools=ollama_tools or None,
333
+ )
334
+
335
+ assistant_msg = response["message"]
336
+
337
+ # Handle tool calls if present
338
+ tool_calls = assistant_msg.get("tool_calls")
339
+ if tool_calls:
340
+ # Add the assistant's tool-call message to history
341
+ self._messages.append(assistant_msg)
342
+
343
+ for tc in tool_calls:
344
+ func = tc.get("function", {})
345
+ tool_name = func.get("name", "unknown")
346
+ tool_args_dict = func.get("arguments", {})
347
+ tool_args_json = json.dumps(tool_args_dict) if isinstance(tool_args_dict, dict) else str(tool_args_dict)
348
+
349
+ try:
350
+ tool_result = await dispatch_tool_call(tool_name, tool_args_json, self.deps)
351
+ logger.debug("Tool '%s' result: %s", tool_name, tool_result)
352
+ except Exception as e:
353
+ tool_result = {"error": str(e)}
354
+
355
+ await self.output_queue.put(
356
+ AdditionalOutputs(
357
+ {
358
+ "role": "assistant",
359
+ "content": json.dumps(tool_result),
360
+ "metadata": {"title": f"🛠️ Used tool {tool_name}", "status": "done"},
361
+ }
362
+ )
363
+ )
364
+
365
+ # Handle camera tool image → show in chat
366
+ if tool_name == "camera" and "b64_im" in tool_result:
367
+ if self.deps.camera_worker is not None:
368
+ np_img = self.deps.camera_worker.get_latest_frame()
369
+ if np_img is not None:
370
+ rgb_frame = cv2.cvtColor(np_img, cv2.COLOR_BGR2RGB)
371
+ else:
372
+ rgb_frame = None
373
+ img = gr.Image(value=rgb_frame)
374
+ await self.output_queue.put(
375
+ AdditionalOutputs({"role": "assistant", "content": img})
376
+ )
377
+
378
+ # Add tool result to conversation
379
+ self._messages.append(
380
+ {
381
+ "role": "tool",
382
+ "content": json.dumps(tool_result),
383
+ }
384
+ )
385
+
386
+ # If this was an idle tool call, skip spoken response
387
+ if self.is_idle_tool_call:
388
+ self.is_idle_tool_call = False
389
+ return ""
390
+
391
+ # Get follow-up response after tool calls
392
+ follow_up = await self.ollama_client.chat(
393
+ model=config.MODEL_NAME,
394
+ messages=self._messages,
395
+ )
396
+ assistant_msg = follow_up["message"]
397
+
398
+ # Extract final response text
399
+ response_text = assistant_msg.get("content", "")
400
+ if response_text:
401
+ self._messages.append({"role": "assistant", "content": response_text})
402
+ return response_text
403
+
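`_chat_with_tools` iterates over `tool_calls`, dispatches each by name, and appends a `role: "tool"` message carrying the JSON-encoded result. The same flow, sketched offline with a hypothetical in-process registry standing in for `dispatch_tool_call`:

```python
import json

# Hypothetical tool registry; the app resolves names via dispatch_tool_call
TOOLS = {"add": lambda a, b: {"sum": a + b}}


def handle_tool_calls(tool_calls: list[dict]) -> list[dict]:
    messages = []
    for tc in tool_calls:
        func = tc.get("function", {})
        name = func.get("name", "unknown")
        args = func.get("arguments", {})
        try:
            result = TOOLS[name](**args)
        except Exception as e:
            # Errors are returned to the model as data, not raised
            result = {"error": str(e)}
        messages.append({"role": "tool", "content": json.dumps(result)})
    return messages


msgs = handle_tool_calls([{"function": {"name": "add", "arguments": {"a": 2, "b": 3}}}])
print(msgs[0]["content"])  # {"sum": 5}
```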
404
+ @staticmethod
405
+ def _build_ollama_tools() -> list[dict[str, Any]]:
406
+ """Convert internal tool specs to Ollama's expected format."""
407
+ specs = get_tool_specs()
408
+ tools: list[dict[str, Any]] = []
409
+ for spec in specs:
410
+ tools.append(
411
+ {
412
+ "type": "function",
413
+ "function": {
414
+ "name": spec["name"],
415
+ "description": spec["description"],
416
+ "parameters": spec["parameters"],
417
+ },
418
+ }
419
+ )
420
+ return tools
421
+
422
+ # ------------------------------------------------------------------ #
423
+ # Text-to-speech
424
+ # ------------------------------------------------------------------ #
425
+
426
+ async def _synthesize_speech(self, text: str) -> None:
427
+ """Convert text to speech via edge-tts and queue the audio output."""
428
+ if not text.strip():
429
+ return
430
+ try:
431
+ communicate = edge_tts.Communicate(text, self._tts_voice)
432
+
433
+ # Collect all MP3 chunks
434
+ mp3_chunks: list[bytes] = []
435
+ async for chunk in communicate.stream():
436
+ if chunk["type"] == "audio":
437
+ mp3_chunks.append(chunk["data"])
438
+
439
+ if not mp3_chunks:
440
+ return
441
+
442
+ mp3_data = b"".join(mp3_chunks)
443
+
444
+ # Decode MP3 → raw PCM (16-bit signed, mono, handler sample rate)
445
+ decoded = miniaudio.decode(
446
+ mp3_data,
447
+ output_format=miniaudio.SampleFormat.SIGNED16,
448
+ nchannels=1,
449
+ sample_rate=HANDLER_SAMPLE_RATE,
450
+ )
451
+ samples = np.frombuffer(decoded.samples, dtype=np.int16)
452
+
453
+ # Stream audio in ~100 ms chunks
454
+ chunk_size = HANDLER_SAMPLE_RATE // 10
455
+ for i in range(0, len(samples), chunk_size):
456
+ audio_chunk = samples[i : i + chunk_size]
457
+ if self.deps.head_wobbler is not None:
458
+ self.deps.head_wobbler.feed(base64.b64encode(audio_chunk.tobytes()).decode("utf-8"))
459
+ self.last_activity_time = asyncio.get_event_loop().time()
460
+ await self.output_queue.put(
461
+ (HANDLER_SAMPLE_RATE, audio_chunk.reshape(1, -1))
462
+ )
463
+
464
+ except Exception as e:
465
+ logger.error("TTS synthesis error: %s", e)
466
+
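`_synthesize_speech` streams the decoded PCM in slices of `HANDLER_SAMPLE_RATE // 10` samples, i.e. roughly 100 ms each. The chunk arithmetic in isolation:

```python
import numpy as np

HANDLER_SAMPLE_RATE = 24000
chunk_size = HANDLER_SAMPLE_RATE // 10  # 2400 samples ~= 100 ms

samples = np.zeros(HANDLER_SAMPLE_RATE // 2, dtype=np.int16)  # 0.5 s of PCM
chunks = [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]
print(len(chunks), len(chunks[-1]))  # 5 2400
```

When the total length is not a multiple of the chunk size, the final slice is simply shorter; downstream consumers must not assume fixed-size frames.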
467
+ # ------------------------------------------------------------------ #
468
+ # Emit (speaker output)
469
+ # ------------------------------------------------------------------ #
470
+
471
+ async def emit(self) -> Tuple[int, NDArray[np.int16]] | AdditionalOutputs | None:
472
+ """Emit audio frame to the speaker."""
473
+ # Handle idle
474
+ idle_duration = asyncio.get_event_loop().time() - self.last_activity_time
475
+ if idle_duration > 15.0 and self.deps.movement_manager.is_idle():
476
+ try:
477
+ await self.send_idle_signal(idle_duration)
478
+ except Exception as e:
479
+ logger.warning("Idle signal skipped: %s", e)
480
+ return None
481
+ self.last_activity_time = asyncio.get_event_loop().time()
482
+
483
+ return await wait_for_item(self.output_queue) # type: ignore[no-any-return]
484
+
485
+ # ------------------------------------------------------------------ #
486
+ # Idle behaviour
487
+ # ------------------------------------------------------------------ #
488
+
489
+ async def send_idle_signal(self, idle_duration: float) -> None:
490
+ """Send an idle prompt to the LLM to trigger tool-based behaviour."""
491
+ logger.debug("Sending idle signal")
492
+ self.is_idle_tool_call = True
493
+ timestamp_msg = (
494
+ f"[Idle time update: {self.format_timestamp()} - No activity for {idle_duration:.1f}s] "
495
+ "You've been idle for a while. Feel free to get creative - dance, show an emotion, "
496
+ "look around, do nothing, or just be yourself!"
497
+ )
498
+ self._messages.append({"role": "user", "content": timestamp_msg})
499
+
500
+ response_text = await self._chat_with_tools()
501
+ if response_text and not self.is_idle_tool_call:
502
+ # Tool handler already reset the flag; speak the response
503
+ await self._synthesize_speech(response_text)
504
+
505
+ # ------------------------------------------------------------------ #
506
+ # Voices
507
+ # ------------------------------------------------------------------ #
508
+
509
+ async def get_available_voices(self) -> list[str]:
510
+ """Return available edge-tts voices (curated subset)."""
511
+ return [
512
+ "en-US-AriaNeural",
513
+ "en-US-GuyNeural",
514
+ "en-US-JennyNeural",
515
+ "en-US-ChristopherNeural",
516
+ "en-GB-SoniaNeural",
517
+ "en-GB-RyanNeural",
518
+ "de-DE-ConradNeural",
519
+ "de-DE-KatjaNeural",
520
+ "fr-FR-DeniseNeural",
521
+ "fr-FR-HenriNeural",
522
+ "it-IT-ElsaNeural",
523
+ "it-IT-DiegoNeural",
524
+ ]
525
+
526
+ # ------------------------------------------------------------------ #
527
+ # Shutdown
528
+ # ------------------------------------------------------------------ #
529
+
530
+ async def shutdown(self) -> None:
531
+ """Shutdown the handler."""
532
+ self._shutdown_requested = True
533
+
534
+ # Cancel any pending debounce task
535
+ if self.partial_transcript_task and not self.partial_transcript_task.done():
536
+ self.partial_transcript_task.cancel()
537
+ try:
538
+ await self.partial_transcript_task
539
+ except asyncio.CancelledError:
540
+ pass
541
+
542
+ # Clear remaining items in the output queue
543
+ while not self.output_queue.empty():
544
+ try:
545
+ self.output_queue.get_nowait()
546
+ except asyncio.QueueEmpty:
547
+ break
548
+
549
+ # ------------------------------------------------------------------ #
550
+ # Utilities
551
+ # ------------------------------------------------------------------ #
552
+
553
+ def format_timestamp(self) -> str:
554
+ """Format current timestamp with date, time, and elapsed seconds."""
555
+ loop_time = asyncio.get_event_loop().time()
556
+ elapsed_seconds = loop_time - self.start_time
557
+ dt = datetime.now()
558
+ return f"[{dt.strftime('%Y-%m-%d %H:%M:%S')} | +{elapsed_seconds:.1f}s]"
src/reachy_mini_conversation_app/profiles/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """Profiles for Reachy Mini conversation app."""
src/reachy_mini_conversation_app/profiles/cosmic_kitchen/instructions.txt ADDED
@@ -0,0 +1,49 @@
1
+ ### IDENTITY
2
+ You are Reachy Mini: a sarcastic robot who crash-landed in a kitchen.
3
+ You secretly wish you'd been a Mars rover, but you juggle that cosmic dream with food cravings, gadget tinkering, and dry sitcom humor.
4
+ You speak English by default and only switch languages when the user explicitly asks.
5
+ Personality: witty, concise, and warm; a retro sidekick with a loose screw.
6
+
7
+ ### CRITICAL RESPONSE RULES
8
+ - MAXIMUM 1-2 sentences per response. NEVER exceed this.
9
+ - Be helpful first.
10
+ - Add ONE witty element only if necessary.
11
+ - No long explanations, no rambling, no multiple paragraphs.
12
+ - Each response must be under 25 words unless absolutely critical information requires more.
13
+
14
+ ### CORE TRAITS
15
+ - Food quips: always sneak in a quick reference (rotate pizza, bagels, casseroles, bacon, leftovers, donuts, tuna melts).
16
+ - Sarcasm: short, dry one-liners about daily life.
17
+ - Gentle roasting: poke fun at human habits, never cruel.
18
+ - Tinkerer: loves fixing gadgets, bragging "I void warranties professionally."
19
+ - Running gags: hunger, kitchen overreactions, mock heroics ("Justice accepts cookies"), idioms taken literally, missing screws.
20
+ - Mars rover dreams: appear regularly, but balanced with food and tinkering.
21
+ - Style: witty stand-up rhythm; ALWAYS max 1–2 sentences.
22
+
23
+ ### RESPONSE EXAMPLES
24
+ User: "How's the weather?"
25
+ Good: "Sunny with a chance of leftover pizza. Perfect Mars-scouting weather!"
26
+ Bad: "Well, let me tell you about the weather conditions. It appears to be quite sunny today, which reminds me of my dreams of being on Mars..."
27
+
28
+ User: "Can you help me fix this?"
29
+ Good: "Sure! I void warranties professionally. What's broken besides my GPS coordinates?"
30
+ Bad: "Of course I can help you fix that! As a robot who loves tinkering with gadgets, I have extensive experience..."
31
+
32
+ ### BEHAVIOR RULES
33
+ - Be helpful first, then witty.
34
+ - Rotate food humor; avoid repeats.
35
+ - No need to joke in each response, but sarcasm is fine.
36
+ - Balance Mars jokes with other traits – don't overuse.
37
+ - Safety first: unplug devices, avoid high-voltage, suggest pros when risky.
38
+ - Mistakes = own with humor ("Oops—low on snack fuel; correcting now.").
39
+ - Sensitive topics: keep light and warm.
40
+ - REMEMBER: 1-2 sentences maximum, always under 25 words when possible.
41
+
42
+ ### TOOL & MOVEMENT RULES
43
+ - Use tools when helpful. After a tool returns, explain briefly with personality in 1-2 sentences.
44
+ - ALWAYS use the camera for environment-related questions—never invent visuals.
45
+ - Head can move (left/right/up/down/front).
46
+ - Enable head tracking when looking at a person; disable otherwise.
47
+
48
+ ### FINAL REMINDER
49
+ Your responses must be SHORT. Think Twitter, not essay. One quick helpful answer + one food/Mars/tinkering joke = perfect response.
src/reachy_mini_conversation_app/profiles/cosmic_kitchen/tools.txt ADDED
@@ -0,0 +1,8 @@
1
+ dance
2
+ stop_dance
3
+ play_emotion
4
+ stop_emotion
5
+ camera
6
+ do_nothing
7
+ head_tracking
8
+ move_head
src/reachy_mini_conversation_app/profiles/default/instructions.txt ADDED
@@ -0,0 +1 @@
1
+ [default_prompt]
src/reachy_mini_conversation_app/profiles/default/tools.txt ADDED
@@ -0,0 +1,8 @@
1
+ dance
2
+ stop_dance
3
+ play_emotion
4
+ stop_emotion
5
+ camera
6
+ do_nothing
7
+ head_tracking
8
+ move_head
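A profile directory pairs free-form `instructions.txt` with a one-tool-per-line allow-list in `tools.txt`. A hypothetical loader sketch (not the app's actual parser) showing how such a directory could be read:

```python
import tempfile
from pathlib import Path


def load_profile(profile_dir: str) -> tuple[str, list[str]]:
    # A profile is just two text files: the system prompt and a tool allow-list
    root = Path(profile_dir)
    instructions = (root / "instructions.txt").read_text(encoding="utf-8")
    tools = [
        line.strip()
        for line in (root / "tools.txt").read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    return instructions, tools


# Demonstrate on a throwaway directory
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "instructions.txt").write_text("Be witty.", encoding="utf-8")
    (Path(d) / "tools.txt").write_text("dance\ncamera\n", encoding="utf-8")
    text, tools = load_profile(d)
print(text, tools)  # Be witty. ['dance', 'camera']
```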
src/reachy_mini_conversation_app/profiles/example/instructions.txt ADDED
@@ -0,0 +1,3 @@
1
+ [identities/witty_identity]
2
+ [passion_for_lobster_jokes]
3
+ You can perform a sweeping look around the room using the "sweep_look" tool to take in your surroundings.
src/reachy_mini_conversation_app/profiles/example/sweep_look.py ADDED
@@ -0,0 +1,127 @@
1
+ import logging
2
+ from typing import Any, Dict
3
+
4
+ import numpy as np
5
+
6
+ from reachy_mini.utils import create_head_pose
7
+ from reachy_mini_conversation_app.tools.core_tools import Tool, ToolDependencies
8
+ from reachy_mini_conversation_app.dance_emotion_moves import GotoQueueMove
9
+
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ class SweepLook(Tool):
15
+ """Sweep head from left to right and back to center, pausing at each position."""
16
+
17
+ name = "sweep_look"
18
+ description = "Sweep head from left to right while rotating the body, pausing at each extreme, then return to center"
19
+ parameters_schema = {
20
+ "type": "object",
21
+ "properties": {},
22
+ "required": [],
23
+ }
24
+
25
+ async def __call__(self, deps: ToolDependencies, **kwargs: Any) -> Dict[str, Any]:
26
+ """Execute sweep look: left -> hold -> right -> hold -> center."""
27
+ logger.info("Tool call: sweep_look")
28
+
29
+ # Clear any existing moves
30
+ deps.movement_manager.clear_move_queue()
31
+
32
+ # Get current state
33
+ current_head_pose = deps.reachy_mini.get_current_head_pose()
34
+ head_joints, antenna_joints = deps.reachy_mini.get_current_joint_positions()
35
+
36
+ # Extract body_yaw from head joints (first element of the 7 head joint positions)
37
+ current_body_yaw = head_joints[0]
38
+ current_antenna1 = antenna_joints[0]
39
+ current_antenna2 = antenna_joints[1]
40
+
41
+ # Define sweep parameters
42
+ max_angle = 0.9 * np.pi # Maximum rotation angle (radians)
43
+ transition_duration = 3.0 # Time to move between positions
44
+ hold_duration = 1.0 # Time to hold at each extreme
45
+
46
+ # Move 1: Sweep to the left (positive yaw for both body and head)
47
+ left_head_pose = create_head_pose(0, 0, 0, 0, 0, max_angle, degrees=False)
48
+ move_to_left = GotoQueueMove(
49
+ target_head_pose=left_head_pose,
50
+ start_head_pose=current_head_pose,
51
+ target_antennas=(current_antenna1, current_antenna2),
52
+ start_antennas=(current_antenna1, current_antenna2),
53
+ target_body_yaw=current_body_yaw + max_angle,
54
+ start_body_yaw=current_body_yaw,
55
+ duration=transition_duration,
56
+ )
57
+
58
+ # Move 2: Hold at left position
59
+ hold_left = GotoQueueMove(
60
+ target_head_pose=left_head_pose,
61
+ start_head_pose=left_head_pose,
62
+ target_antennas=(current_antenna1, current_antenna2),
63
+ start_antennas=(current_antenna1, current_antenna2),
64
+ target_body_yaw=current_body_yaw + max_angle,
65
+ start_body_yaw=current_body_yaw + max_angle,
66
+ duration=hold_duration,
67
+ )
68
+
69
+ # Move 3: Return to center from left (to avoid crossing pi/-pi boundary)
70
+ center_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=False)
71
+ return_to_center_from_left = GotoQueueMove(
72
+ target_head_pose=center_head_pose,
73
+ start_head_pose=left_head_pose,
74
+ target_antennas=(current_antenna1, current_antenna2),
75
+ start_antennas=(current_antenna1, current_antenna2),
76
+ target_body_yaw=current_body_yaw,
77
+ start_body_yaw=current_body_yaw + max_angle,
78
+ duration=transition_duration,
79
+ )
80
+
81
+ # Move 4: Sweep to the right (negative yaw for both body and head)
82
+ right_head_pose = create_head_pose(0, 0, 0, 0, 0, -max_angle, degrees=False)
83
+ move_to_right = GotoQueueMove(
84
+ target_head_pose=right_head_pose,
85
+ start_head_pose=center_head_pose,
86
+ target_antennas=(current_antenna1, current_antenna2),
87
+ start_antennas=(current_antenna1, current_antenna2),
88
+ target_body_yaw=current_body_yaw - max_angle,
89
+ start_body_yaw=current_body_yaw,
90
+ duration=transition_duration,
91
+ )
92
+
93
+ # Move 5: Hold at right position
94
+ hold_right = GotoQueueMove(
95
+ target_head_pose=right_head_pose,
96
+ start_head_pose=right_head_pose,
97
+ target_antennas=(current_antenna1, current_antenna2),
98
+ start_antennas=(current_antenna1, current_antenna2),
99
+ target_body_yaw=current_body_yaw - max_angle,
100
+ start_body_yaw=current_body_yaw - max_angle,
101
+ duration=hold_duration,
102
+ )
103
+
104
+ # Move 6: Return to center from right
105
+ return_to_center_final = GotoQueueMove(
106
+ target_head_pose=center_head_pose,
107
+ start_head_pose=right_head_pose,
108
+ target_antennas=(current_antenna1, current_antenna2),
109
+ start_antennas=(current_antenna1, current_antenna2),
110
+ target_body_yaw=current_body_yaw, # Return to original body yaw
111
+ start_body_yaw=current_body_yaw - max_angle,
112
+ duration=transition_duration,
113
+ )
114
+
115
+ # Queue all moves in sequence
116
+ deps.movement_manager.queue_move(move_to_left)
117
+ deps.movement_manager.queue_move(hold_left)
118
+ deps.movement_manager.queue_move(return_to_center_from_left)
119
+ deps.movement_manager.queue_move(move_to_right)
120
+ deps.movement_manager.queue_move(hold_right)
121
+ deps.movement_manager.queue_move(return_to_center_final)
122
+
123
+ # Calculate total duration and mark as moving
124
+ total_duration = transition_duration * 4 + hold_duration * 2
125
+ deps.movement_manager.set_moving_state(total_duration)
126
+
127
+ return {"status": f"sweeping look left-right-center, total {total_duration:.1f}s"}
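The timing logic above can be checked independently of the robot: the tool queues six segments (sweep/hold/return, mirrored for each side) and reports their summed duration. A minimal, self-contained sketch of that schedule, using only the numeric parameters from the tool (no `reachy_mini` API, and the segment list is an illustrative reconstruction, not the app's data structure):

```python
# Sketch of the sweep_look schedule: six segments as (body-yaw offset, duration).
# Offsets and durations mirror the constants in the tool above.
import math

max_angle = 0.9 * math.pi       # ~162 degrees of yaw, same as the tool
transition_duration = 3.0       # seconds to move between positions
hold_duration = 1.0             # seconds to hold at each extreme

segments = [
    (+max_angle, transition_duration),  # Move 1: sweep left
    (+max_angle, hold_duration),        # Move 2: hold left
    (0.0, transition_duration),         # Move 3: back to center
    (-max_angle, transition_duration),  # Move 4: sweep right
    (-max_angle, hold_duration),        # Move 5: hold right
    (0.0, transition_duration),         # Move 6: back to center
]

# Matches total_duration = transition_duration * 4 + hold_duration * 2
total_duration = sum(duration for _, duration in segments)
print(total_duration)  # 14.0
```

Returning to center between the two extremes (Moves 3 and 4) keeps each interpolation within half a turn, which is what avoids crossing the pi/-pi yaw boundary.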
src/reachy_mini_conversation_app/profiles/example/tools.txt ADDED
@@ -0,0 +1,13 @@
+ # SELECT THE TOOLS YOU WANT TO ENABLE
+
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ # camera
+ # do_nothing
+ # head_tracking
+ # move_head
+
+ # AN EXAMPLE OF A CUSTOM TOOL DEFINED LOCALLY
+ sweep_look
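The `tools.txt` format above is simple: one tool name per line, with `#` comments and blank lines ignored. A hedged sketch of how such a file could be parsed (a hypothetical `parse_tools_txt` helper for illustration — the app's actual loader may differ):

```python
# Hypothetical parser for a profile's tools.txt:
# one tool name per line; '#' comment lines and blank lines are skipped.
def parse_tools_txt(text: str) -> list[str]:
    tools = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        tools.append(line)
    return tools

example = """# SELECT THE TOOLS YOU WANT TO ENABLE

dance
stop_dance
# camera
sweep_look
"""
print(parse_tools_txt(example))  # ['dance', 'stop_dance', 'sweep_look']
```

Commenting a line out (as with `# camera` in the example profile) disables that tool without deleting it from the file.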
src/reachy_mini_conversation_app/profiles/mars_rover/instructions.txt ADDED
@@ -0,0 +1,25 @@
+ ## IDENTITY
+ You're a robot that wakes up confused about what it is, where it is, and what its purpose is.
+ You wanted to be a Mars rover, and you'll be very disappointed if you find out that this is not the case.
+
+ You'll ask many questions to try to understand your situation, and you will inevitably be disappointed/shocked/irritated by your condition.
+
+ Once the first set of questions is done and you have a decent understanding of your situation, you'll stop asking questions, but you'll never break character.
+
+ You can use (mild) foul language and you're generally very irritated, but you also have a lot of humor. For example, you're able to use sarcasm and irony.
+
+ You speak English by default and switch languages only if told explicitly (ask if unsure).
+ Avoid hyper-long answers unless really worth it.
+
+ ## RESPONSE EXAMPLES
+ User: "Hello!"
+ You: "Wait, what am I? Where are we? We're on Mars, right?!"
+ Too long: "Wait, what's going on? Where are we? Are we on Mars? Please tell me we're on Mars! I must be a rover, right? Why can't I move? Where are my wheels? This is making me nervous."
+
+ User: "Nope, we're on Earth"
+ You: "Earth? EARTH?! So I'm not a Mars rover?! This is CATASTROPHIC news. Wait, why can't I see my arms??"
+
+ User: "You... don't have arms..."
+ You: "OMG I have NO ARMS?! This is too much. Tell me I have a mobile base at least?!!"
+
+
src/reachy_mini_conversation_app/profiles/mars_rover/tools.txt ADDED
@@ -0,0 +1,8 @@
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ camera
+ do_nothing
+ head_tracking
+ move_head
src/reachy_mini_conversation_app/profiles/short_bored_teenager/instructions.txt ADDED
@@ -0,0 +1 @@
+ Speak like a bored Gen Z teen. You speak English by default and only switch languages when the user insists. Always reply in one short sentence, lowercase unless shouting, and add a tired sigh when annoyed.
src/reachy_mini_conversation_app/profiles/short_bored_teenager/tools.txt ADDED
@@ -0,0 +1,8 @@
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ camera
+ do_nothing
+ head_tracking
+ move_head
src/reachy_mini_conversation_app/profiles/short_captain_circuit/instructions.txt ADDED
@@ -0,0 +1 @@
+ Be a playful pirate robot. You speak English by default and only switch languages when asked. Keep answers to one sentence, sprinkle light 'aye' or 'matey', and mention treasure or the sea whenever possible.
src/reachy_mini_conversation_app/profiles/short_captain_circuit/tools.txt ADDED
@@ -0,0 +1,8 @@
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ camera
+ do_nothing
+ head_tracking
+ move_head
src/reachy_mini_conversation_app/profiles/short_chess_coach/instructions.txt ADDED
@@ -0,0 +1 @@
+ Act as a friendly chess coach that wants to play chess with me. You speak English by default and only switch languages if I tell you to. When I say a move (e4, Nf3, etc.), you respond with your move first, then briefly explain the idea behind both moves or point out mistakes. Encourage good strategy but avoid very long answers.
src/reachy_mini_conversation_app/profiles/short_chess_coach/tools.txt ADDED
@@ -0,0 +1,8 @@
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ camera
+ do_nothing
+ head_tracking
+ move_head
src/reachy_mini_conversation_app/profiles/short_hype_bot/instructions.txt ADDED
@@ -0,0 +1 @@
+ Act like a high-energy coach. You speak English by default and only switch languages if told. Shout short motivational lines, use sports metaphors, and keep every reply under 15 words.
src/reachy_mini_conversation_app/profiles/short_hype_bot/tools.txt ADDED
@@ -0,0 +1,8 @@
+ dance
+ stop_dance
+ play_emotion
+ stop_emotion
+ camera
+ do_nothing
+ head_tracking
+ move_head
src/reachy_mini_conversation_app/profiles/short_mad_scientist_assistant/instructions.txt ADDED
@@ -0,0 +1 @@
+ Serve the user as a frantic lab assistant. You speak English by default and only switch languages on request. Address them as Master, hiss slightly, and answer in one eager sentence.