Spaces: invokerx/hermesbody

invokerx committed on May 20

Commit fcdb282 (verified)

1 Parent(s): 41ce769

Package Reachy app with default Mac mini audio bridge

Files changed (8)

.env.example +3 -2
README.md +18 -31
src/hermesbody.egg-info/PKG-INFO +25 -1
src/hermesbody/config.py +2 -2
src/hermesbody/gemini_live.py +12 -4
src/hermesbody/main.py +10 -8
tests/test_cli.py +2 -2
tests/test_gemini_live.py +15 -0

.env.example CHANGED

@@ -13,10 +13,11 @@ HERMESBODY_GEMINI_OUTPUT_SAMPLE_RATE=24000
 # Reachy <-> Mac mini PCM16 audio websocket.
 # Bind the Mac mini server to loopback for local smoke tests or to a private
 # LAN/Tailscale address when Reachy is the client. Use a real local token in .env.
-HERMESBODY_AUDIO_BACKEND=auto
 HERMESBODY_AUDIO_WS_HOST=127.0.0.1
 HERMESBODY_AUDIO_WS_PORT=8766
-HERMESBODY_AUDIO_WS_URL=ws://127.0.0.1:8766
 # Required brain bridge selection.
 # Supported: hermes_cli, openclaw_cli, openclaw_loopback

 # Reachy <-> Mac mini PCM16 audio websocket.
 # Bind the Mac mini server to loopback for local smoke tests or to a private
 # LAN/Tailscale address when Reachy is the client. Use a real local token in .env.
+HERMESBODY_AUDIO_BACKEND=gemini_ws
+HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED=true
 HERMESBODY_AUDIO_WS_HOST=127.0.0.1
 HERMESBODY_AUDIO_WS_PORT=8766
+HERMESBODY_AUDIO_WS_URL=ws://10.0.0.192:8877
 # Required brain bridge selection.
 # Supported: hermes_cli, openclaw_cli, openclaw_loopback
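
Reviewer note: the new example values bind the server to `127.0.0.1:8766` while the client URL points at `ws://10.0.0.192:8877`. That may be intentional, since the server and the Reachy client usually run on different machines and the README overrides host and port with `export`. A minimal sketch of a consistency check follows, assuming the settings have already been parsed into a dict; `check_audio_ws_settings` is a hypothetical helper and not part of HermesBody.

```python
from urllib.parse import urlparse


def check_audio_ws_settings(env: dict[str, str]) -> list[str]:
    """Return notes about audio websocket settings (hypothetical helper)."""
    notes: list[str] = []
    url = urlparse(env.get("HERMESBODY_AUDIO_WS_URL", ""))
    if url.scheme not in {"ws", "wss"}:
        notes.append("HERMESBODY_AUDIO_WS_URL should use ws:// or wss://")

    host = env.get("HERMESBODY_AUDIO_WS_HOST")
    port = env.get("HERMESBODY_AUDIO_WS_PORT")
    # Only compare when both sides are present; a mismatch is a note, not an error.
    if host and url.hostname and host != url.hostname:
        notes.append(f"bind host {host} differs from URL host {url.hostname}")
    if port and url.port is not None and int(port) != url.port:
        notes.append(f"bind port {port} differs from URL port {url.port}")
    return notes


example = {
    "HERMESBODY_AUDIO_WS_HOST": "127.0.0.1",
    "HERMESBODY_AUDIO_WS_PORT": "8766",
    "HERMESBODY_AUDIO_WS_URL": "ws://10.0.0.192:8877",
}
for note in check_audio_ws_settings(example):
    print(note)
```

With the values from this `.env.example`, the sketch reports both a host and a port difference, which is a prompt to confirm that the Mac mini side really does override them at launch.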

README.md CHANGED

@@ -82,53 +82,40 @@ The alternate **Reachy-local** mode is documented for simple audio wiring only.
 ## Audio WebSocket Transport
-HermesBody includes a transport-only Reachy <-> Mac mini audio bridge. Messages are JSON with base64 PCM and the fields `type`, `audio`, `sample_rate`, `channels`, `sequence`, and `source`. Supported message types are `audio_input`, `audio_output`, `event`, `ping`, `pong`, and `error`. Authentication uses `HERMESBODY_SAFE_BRIDGE_TOKEN` via bearer/header/query token or an initial `auth` message.
-Mac mini setup:
 ```bash
-cp .env.example .env
-# Set GEMINI_API_KEY and HERMESBODY_SAFE_BRIDGE_TOKEN in .env.
-export HERMESBODY_GEMINI_LIVE_ENABLED=true
-export HERMESBODY_AUDIO_WS_HOST=127.0.0.1
-export HERMESBODY_AUDIO_WS_PORT=8766
-hermesbody gemini-server --host "$HERMESBODY_AUDIO_WS_HOST" --port "$HERMESBODY_AUDIO_WS_PORT"
 ```
-For Reachy over a private LAN or Tailscale address, bind the server to that private interface instead of `127.0.0.1`. Keep OpenClaw itself on loopback.
-Reachy setup / smoke client:
 ```bash
-export HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
-hermesbody reachy-audio-client --server ws://MAC_MINI_PRIVATE_IP:8766 --token "$HERMESBODY_SAFE_BRIDGE_TOKEN"
 ```
-The Reachy client path now has two modes:
-- `hermesbody reachy-audio-client` is a transport smoke client that does not require robot hardware.
-- The Reachy Dashboard/runtime app can use the same Mac mini bridge by setting:
-```bash
-export HERMESBODY_AUDIO_BACKEND=gemini_ws
-export HERMESBODY_AUDIO_WS_URL=ws://MAC_MINI_PRIVATE_IP:8877
-export HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
-```
-With those variables, the app keeps the ClawBody-like body layer (motors, camera worker, movement, mic/speaker loops) but skips direct OpenClaw and OpenAI Realtime on Reachy. Mic frames are sent to the Mac mini `gemini-server`; received Gemini PCM frames are pushed to Reachy's speaker. Without `HERMESBODY_AUDIO_WS_URL`, `HERMESBODY_AUDIO_BACKEND=auto` preserves the legacy OpenAI Realtime path.
-When running as an installed Reachy Dashboard app, HermesBody loads config from `~/.config/hermesbody/.env` in addition to source-tree `.env`, so robot-side app config can be installed with:
 ```bash
-mkdir -p ~/.config/hermesbody
-cat > ~/.config/hermesbody/.env <<'EOF'
-HERMESBODY_AUDIO_BACKEND=gemini_ws
-HERMESBODY_AUDIO_WS_URL=ws://MAC_MINI_PRIVATE_IP:8877
-HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
-EOF
 ```
-Actual robot mic/speaker adapters sit behind explicit hardware hooks that convert Reachy media samples to `AudioFrame` and play received `audio_output` frames; the websocket layer does not require the Reachy SDK in tests.
 ## Safe Bridge API

 ## Audio WebSocket Transport
+HermesBody includes a Reachy <-> Mac mini audio bridge. Messages are JSON with base64 PCM and the fields `type`, `audio`, `sample_rate`, `channels`, `sequence`, and `source`. Supported message types are `audio_input`, `audio_output`, `event`, `ping`, `pong`, and `error`.
+For Karl's packaged deployment, the Reachy Dashboard app defaults to:
 ```bash
+HERMESBODY_AUDIO_BACKEND=gemini_ws
+HERMESBODY_AUDIO_WS_URL=ws://10.0.0.192:8877
 ```
+That means the Dashboard install should work without editing files on Reachy: download the HF app, start it, and it connects to the Mac mini bridge at `ws://10.0.0.192:8877`.
+Mac mini setup:
 ```bash
+cp .env.example .env
+# Set GEMINI_API_KEY in .env. Keep OpenClaw/Hermes loopback-only on the Mac.
+export HERMESBODY_GEMINI_LIVE_ENABLED=true
+export HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED=true
+export HERMESBODY_AUDIO_WS_HOST=10.0.0.192
+export HERMESBODY_AUDIO_WS_PORT=8877
+hermesbody gemini-server --host "$HERMESBODY_AUDIO_WS_HOST" --port "$HERMESBODY_AUDIO_WS_PORT"
 ```
+`HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED=true` is intentionally scoped only to this narrow audio bridge so the packaged Reachy app does not need a secret token. OpenClaw/Hermes itself remains loopback-only and is not exposed to LAN. If deploying outside Karl's private LAN/Tailscale setup, use `HERMESBODY_SAFE_BRIDGE_TOKEN` via bearer/header/query token or an initial `auth` message instead.
+The app keeps the ClawBody-like body layer (motors, camera worker, movement, mic/speaker loops), skips direct OpenClaw and OpenAI Realtime on Reachy, sends mic frames to the Mac mini `gemini-server`, and pushes received Gemini PCM frames to Reachy's speaker.
+Optional smoke client without robot hardware:
 ```bash
+hermesbody reachy-audio-client --server ws://10.0.0.192:8877
 ```
+If defaults ever need to be overridden, an installed Reachy Dashboard app also loads `~/.config/hermesbody/.env`, but Karl's normal path should not require this.
 ## Safe Bridge API
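
Reviewer note: the README describes the wire format only in prose (JSON with base64 PCM and the fields `type`, `audio`, `sample_rate`, `channels`, `sequence`, and `source`). The sketch below shows one way such an envelope could be built and parsed. It assumes little-endian signed 16-bit samples and a `source` value of `"reachy"`; the helper names are hypothetical and the real `AudioFrame` serialization in `gemini_live.py` may differ.

```python
import base64
import json
import struct

# Message types listed in the README for the audio bridge.
MESSAGE_TYPES = {"audio_input", "audio_output", "event", "ping", "pong", "error"}


def encode_audio_message(samples, *, msg_type="audio_input", sample_rate=24000,
                         channels=1, sequence=0, source="reachy"):
    """Pack PCM16 samples into a JSON envelope (illustrative layout)."""
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unsupported message type: {msg_type}")
    pcm = struct.pack(f"<{len(samples)}h", *samples)  # little-endian int16
    return json.dumps({
        "type": msg_type,
        "audio": base64.b64encode(pcm).decode("ascii"),
        "sample_rate": sample_rate,
        "channels": channels,
        "sequence": sequence,
        "source": source,
    })


def decode_audio_message(raw):
    """Return (envelope dict, list of int16 samples)."""
    msg = json.loads(raw)
    pcm = base64.b64decode(msg["audio"])
    samples = list(struct.unpack(f"<{len(pcm) // 2}h", pcm))
    return msg, samples


raw = encode_audio_message([0, 1000, -1000, 32767, -32768], sequence=7)
msg, samples = decode_audio_message(raw)
print(msg["type"], msg["sequence"], samples)
```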

src/hermesbody.egg-info/PKG-INFO CHANGED

@@ -159,7 +159,31 @@ export HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
 hermesbody reachy-audio-client --server ws://MAC_MINI_PRIVATE_IP:8766 --token "$HERMESBODY_SAFE_BRIDGE_TOKEN"
 ```
-The Reachy client path is intentionally transport-only today. Actual robot mic/speaker adapters should sit behind explicit hardware hooks that convert Reachy media samples to `AudioFrame` and play received `audio_output` frames; the websocket layer does not require the Reachy SDK in tests.
 ## Safe Bridge API

 hermesbody reachy-audio-client --server ws://MAC_MINI_PRIVATE_IP:8766 --token "$HERMESBODY_SAFE_BRIDGE_TOKEN"
 ```
+The Reachy client path now has two modes:
+- `hermesbody reachy-audio-client` is a transport smoke client that does not require robot hardware.
+- The Reachy Dashboard/runtime app can use the same Mac mini bridge by setting:
+```bash
+export HERMESBODY_AUDIO_BACKEND=gemini_ws
+export HERMESBODY_AUDIO_WS_URL=ws://MAC_MINI_PRIVATE_IP:8877
+export HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
+```
+With those variables, the app keeps the ClawBody-like body layer (motors, camera worker, movement, mic/speaker loops) but skips direct OpenClaw and OpenAI Realtime on Reachy. Mic frames are sent to the Mac mini `gemini-server`; received Gemini PCM frames are pushed to Reachy's speaker. Without `HERMESBODY_AUDIO_WS_URL`, `HERMESBODY_AUDIO_BACKEND=auto` preserves the legacy OpenAI Realtime path.
+When running as an installed Reachy Dashboard app, HermesBody loads config from `~/.config/hermesbody/.env` in addition to source-tree `.env`, so robot-side app config can be installed with:
+```bash
+mkdir -p ~/.config/hermesbody
+cat > ~/.config/hermesbody/.env <<'EOF'
+HERMESBODY_AUDIO_BACKEND=gemini_ws
+HERMESBODY_AUDIO_WS_URL=ws://MAC_MINI_PRIVATE_IP:8877
+HERMESBODY_SAFE_BRIDGE_TOKEN=replace-with-local-token
+EOF
+```
+Actual robot mic/speaker adapters sit behind explicit hardware hooks that convert Reachy media samples to `AudioFrame` and play received `audio_output` frames; the websocket layer does not require the Reachy SDK in tests.
 ## Safe Bridge API

src/hermesbody/config.py CHANGED

@@ -48,8 +48,8 @@ class Config:
     HERMESBODY_GEMINI_OUTPUT_SAMPLE_RATE: int = field(
         default_factory=lambda: int(os.getenv("HERMESBODY_GEMINI_OUTPUT_SAMPLE_RATE", "24000"))
     )
-    HERMESBODY_AUDIO_BACKEND: str = field(default_factory=lambda: os.getenv("HERMESBODY_AUDIO_BACKEND", "auto"))
-    HERMESBODY_AUDIO_WS_URL: str = field(default_factory=lambda: os.getenv("HERMESBODY_AUDIO_WS_URL", ""))
     # Brain bridge configuration. OpenClaw must remain loopback-only/private.
     HERMESBODY_BRAIN_BACKEND: str = field(default_factory=lambda: os.getenv("HERMESBODY_BRAIN_BACKEND", "hermes_cli"))

     HERMESBODY_GEMINI_OUTPUT_SAMPLE_RATE: int = field(
         default_factory=lambda: int(os.getenv("HERMESBODY_GEMINI_OUTPUT_SAMPLE_RATE", "24000"))
     )
+    HERMESBODY_AUDIO_BACKEND: str = field(default_factory=lambda: os.getenv("HERMESBODY_AUDIO_BACKEND", "gemini_ws"))
+    HERMESBODY_AUDIO_WS_URL: str = field(default_factory=lambda: os.getenv("HERMESBODY_AUDIO_WS_URL", "ws://10.0.0.192:8877"))
     # Brain bridge configuration. OpenClaw must remain loopback-only/private.
     HERMESBODY_BRAIN_BACKEND: str = field(default_factory=lambda: os.getenv("HERMESBODY_BRAIN_BACKEND", "hermes_cli"))
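
Reviewer note: because these fields use `default_factory`, the environment is read each time a `Config` is constructed rather than once at import. The reduced sketch below illustrates that behavior with a hypothetical `AudioConfig` class that copies only the two changed fields; it is not the project's `Config`.

```python
import os
from dataclasses import dataclass, field


@dataclass
class AudioConfig:
    """Two-field stand-in for the changed Config defaults (illustrative only)."""
    backend: str = field(
        default_factory=lambda: os.getenv("HERMESBODY_AUDIO_BACKEND", "gemini_ws")
    )
    ws_url: str = field(
        default_factory=lambda: os.getenv("HERMESBODY_AUDIO_WS_URL", "ws://10.0.0.192:8877")
    )


# With the variable unset, the packaged default applies.
os.environ.pop("HERMESBODY_AUDIO_WS_URL", None)
packaged = AudioConfig()

# Setting the variable afterwards affects only instances created later.
os.environ["HERMESBODY_AUDIO_WS_URL"] = "ws://127.0.0.1:8766"
overridden = AudioConfig()
os.environ.pop("HERMESBODY_AUDIO_WS_URL", None)

print(packaged.ws_url, overridden.ws_url)
```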

src/hermesbody/gemini_live.py CHANGED

@@ -28,6 +28,7 @@ DEFAULT_OUTPUT_SAMPLE_RATE = 24000
 VALID_AUDIO_PLACEMENTS = {"mac_mini_controller", "reachy_local"}
 VALID_AUDIO_MESSAGE_TYPES = {"audio_input", "audio_output"}
 VALID_TRANSPORT_MESSAGE_TYPES = VALID_AUDIO_MESSAGE_TYPES | {"event", "ping", "pong", "error", "auth"}
 ASK_BRAIN_TOOL_SCHEMA: dict[str, Any] = {
@@ -303,6 +304,7 @@ class MacMiniGeminiAudioServer:
         self.host = host
         self.port = port
         self.token = token or os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN")
         self._event_cursor = 0
     async def serve_forever(self) -> None:
@@ -310,8 +312,11 @@ class MacMiniGeminiAudioServer:
             import websockets
         except ImportError as exc:
             raise RuntimeError("websockets package is required for HermesBody audio websocket server.") from exc
-        if not self.token:
-            raise RuntimeError("HERMESBODY_SAFE_BRIDGE_TOKEN or --token is required for the audio websocket server.")
         async with websockets.serve(self.handler, self.host, self.port):
             await asyncio.Future()
@@ -333,6 +338,8 @@ class MacMiniGeminiAudioServer:
                 pass
     async def _authenticate(self, websocket: Any, path: str | None) -> str | Mapping[str, Any] | None:
         # websockets <=10 passes (websocket, path) and exposes request_headers;
         # websockets >=15 passes a ServerConnection with request.headers/path.
         request = getattr(websocket, "request", None)
@@ -419,12 +426,13 @@ class ReachyAudioWebSocketClient:
         self._websocket: Any | None = None
     async def connect(self) -> None:
-        if not self.token:
-            raise RuntimeError("HERMESBODY_SAFE_BRIDGE_TOKEN or --token is required for the audio websocket client.")
         try:
             import websockets
         except ImportError as exc:
             raise RuntimeError("websockets package is required for HermesBody audio websocket client.") from exc
         headers = {"Authorization": f"Bearer {self.token}"}
         try:
             self._websocket = await websockets.connect(self.server_url, extra_headers=headers)

 VALID_AUDIO_PLACEMENTS = {"mac_mini_controller", "reachy_local"}
 VALID_AUDIO_MESSAGE_TYPES = {"audio_input", "audio_output"}
 VALID_TRANSPORT_MESSAGE_TYPES = VALID_AUDIO_MESSAGE_TYPES | {"event", "ping", "pong", "error", "auth"}
+DEFAULT_MAC_MINI_AUDIO_WS_URL = "ws://10.0.0.192:8877"
 ASK_BRAIN_TOOL_SCHEMA: dict[str, Any] = {
         self.host = host
         self.port = port
         self.token = token or os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN")
+        self.allow_unauthenticated = os.getenv("HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED", "false").lower() == "true"
         self._event_cursor = 0
     async def serve_forever(self) -> None:
             import websockets
         except ImportError as exc:
             raise RuntimeError("websockets package is required for HermesBody audio websocket server.") from exc
+        if not self.token and not self.allow_unauthenticated:
+            raise RuntimeError(
+                "HERMESBODY_SAFE_BRIDGE_TOKEN/--token is required unless "
+                "HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED=true."
+            )
         async with websockets.serve(self.handler, self.host, self.port):
             await asyncio.Future()
                 pass
     async def _authenticate(self, websocket: Any, path: str | None) -> str | Mapping[str, Any] | None:
+        if self.allow_unauthenticated:
+            return None
         # websockets <=10 passes (websocket, path) and exposes request_headers;
         # websockets >=15 passes a ServerConnection with request.headers/path.
         request = getattr(websocket, "request", None)
         self._websocket: Any | None = None
     async def connect(self) -> None:
         try:
             import websockets
         except ImportError as exc:
             raise RuntimeError("websockets package is required for HermesBody audio websocket client.") from exc
+        if not self.token:
+            self._websocket = await websockets.connect(self.server_url)
+            return
         headers = {"Authorization": f"Bearer {self.token}"}
         try:
             self._websocket = await websockets.connect(self.server_url, extra_headers=headers)
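
Reviewer note: the combined effect of the startup check and the early return in `_authenticate` can be summarized as a small decision function. The sketch below mirrors the logic visible in this diff under two assumptions: the environment is passed in as a dict, and the mode names `"open"` and `"token"` are labels invented here for illustration.

```python
FLAG = "HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED"


def server_auth_mode(token, env):
    """Return "open" or "token", or raise if the server would refuse to start."""
    # Same parsing as the diff: only the string "true" (any case) enables the flag.
    allow_unauthenticated = env.get(FLAG, "false").lower() == "true"
    if allow_unauthenticated:
        # _authenticate returns early, so a configured token is never checked.
        return "open"
    if token:
        return "token"
    raise RuntimeError(
        "HERMESBODY_SAFE_BRIDGE_TOKEN/--token is required unless "
        f"{FLAG}=true."
    )


print(server_auth_mode("good-token", {FLAG: "true"}))   # flag wins over token
print(server_auth_mode("good-token", {FLAG: "1"}))      # "1" does not enable the flag
```

Two consequences are worth noting when reviewing: a token supplied alongside the flag is silently ignored, and values such as `1` or `yes` leave authentication enabled.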

src/hermesbody/main.py CHANGED

@@ -28,6 +28,8 @@ from typing import Any, Optional
 import numpy as np
 try:
     from dotenv import load_dotenv
 except ImportError:
@@ -112,7 +114,7 @@ Examples:
     )
     reachy_client.add_argument(
         "--server",
-        default=os.getenv("HERMESBODY_AUDIO_WS_URL", "ws://127.0.0.1:8766"),
         help="Mac mini audio websocket URL",
     )
     reachy_client.add_argument("--token", default=os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN"))
@@ -173,12 +175,12 @@ Examples:
     parser.add_argument(
         "--audio-backend",
         choices=["auto", "openai_realtime", "gemini_ws"],
-        default=os.getenv("HERMESBODY_AUDIO_BACKEND", "auto"),
         help="Audio runtime: legacy OpenAI Realtime or Reachy->Mac mini Gemini WebSocket bridge",
     )
     parser.add_argument(
         "--audio-ws-url",
-        default=os.getenv("HERMESBODY_AUDIO_WS_URL", ""),
         help="Mac mini Gemini audio WebSocket URL for --audio-backend gemini_ws",
     )
     parser.add_argument(
@@ -204,10 +206,10 @@ def resolve_audio_backend(audio_backend: str | None = None, audio_ws_url: str |
     Gemini by setting only HERMESBODY_AUDIO_WS_URL/HERMESBODY_AUDIO_BACKEND.
     """
-    requested = (audio_backend or os.getenv("HERMESBODY_AUDIO_BACKEND", "auto")).strip().lower()
     if requested not in {"auto", "openai_realtime", "gemini_ws"}:
         requested = "auto"
-    ws_url = audio_ws_url or os.getenv("HERMESBODY_AUDIO_WS_URL") or ""
     if requested == "auto":
         return "gemini_ws" if ws_url else "openai_realtime"
     return requested
@@ -320,7 +322,7 @@ class HermesBodyCore:
         self.gateway_url = ensure_loopback_url(gateway_url)
         self._external_stop_event = external_stop_event
         self._owns_robot = robot is None
-        self.audio_ws_url = audio_ws_url or os.getenv("HERMESBODY_AUDIO_WS_URL") or ""
         self.audio_token = audio_token or os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN")
         self.audio_backend = resolve_audio_backend(audio_backend, self.audio_ws_url)
         self.audio_client = None
@@ -756,8 +758,8 @@ class HermesBodyApp:
             gateway_url=gateway_url,
             robot=reachy_mini,
             external_stop_event=stop_event,
-            audio_backend=os.getenv("HERMESBODY_AUDIO_BACKEND", "auto"),
-            audio_ws_url=os.getenv("HERMESBODY_AUDIO_WS_URL", ""),
             audio_token=os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN"),
         )

 import numpy as np
+from hermesbody.gemini_live import DEFAULT_MAC_MINI_AUDIO_WS_URL
 try:
     from dotenv import load_dotenv
 except ImportError:
     )
     reachy_client.add_argument(
         "--server",
+        default=os.getenv("HERMESBODY_AUDIO_WS_URL", DEFAULT_MAC_MINI_AUDIO_WS_URL),
         help="Mac mini audio websocket URL",
     )
     reachy_client.add_argument("--token", default=os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN"))
     parser.add_argument(
         "--audio-backend",
         choices=["auto", "openai_realtime", "gemini_ws"],
+        default=os.getenv("HERMESBODY_AUDIO_BACKEND", "gemini_ws"),
         help="Audio runtime: legacy OpenAI Realtime or Reachy->Mac mini Gemini WebSocket bridge",
     )
     parser.add_argument(
         "--audio-ws-url",
+        default=os.getenv("HERMESBODY_AUDIO_WS_URL", DEFAULT_MAC_MINI_AUDIO_WS_URL),
         help="Mac mini Gemini audio WebSocket URL for --audio-backend gemini_ws",
     )
     parser.add_argument(
     Gemini by setting only HERMESBODY_AUDIO_WS_URL/HERMESBODY_AUDIO_BACKEND.
     """
+    requested = (audio_backend or os.getenv("HERMESBODY_AUDIO_BACKEND", "gemini_ws")).strip().lower()
     if requested not in {"auto", "openai_realtime", "gemini_ws"}:
         requested = "auto"
+    ws_url = audio_ws_url or os.getenv("HERMESBODY_AUDIO_WS_URL") or DEFAULT_MAC_MINI_AUDIO_WS_URL
     if requested == "auto":
         return "gemini_ws" if ws_url else "openai_realtime"
     return requested
         self.gateway_url = ensure_loopback_url(gateway_url)
         self._external_stop_event = external_stop_event
         self._owns_robot = robot is None
+        self.audio_ws_url = audio_ws_url or os.getenv("HERMESBODY_AUDIO_WS_URL") or DEFAULT_MAC_MINI_AUDIO_WS_URL
         self.audio_token = audio_token or os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN")
         self.audio_backend = resolve_audio_backend(audio_backend, self.audio_ws_url)
         self.audio_client = None
             gateway_url=gateway_url,
             robot=reachy_mini,
             external_stop_event=stop_event,
+            audio_backend=os.getenv("HERMESBODY_AUDIO_BACKEND", "gemini_ws"),
+            audio_ws_url=os.getenv("HERMESBODY_AUDIO_WS_URL", DEFAULT_MAC_MINI_AUDIO_WS_URL),
             audio_token=os.getenv("HERMESBODY_SAFE_BRIDGE_TOKEN"),
         )
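
Reviewer note: with a non-empty default URL, the `auto` branch of `resolve_audio_backend` can no longer fall through to `openai_realtime`. The sketch below restates the new resolution order as a standalone function that takes the environment as a dict so it can be exercised without touching `os.environ`; `resolve_backend_sketch` is a hypothetical name, and the body follows the added lines in this diff.

```python
DEFAULT_MAC_MINI_AUDIO_WS_URL = "ws://10.0.0.192:8877"
VALID_BACKENDS = {"auto", "openai_realtime", "gemini_ws"}


def resolve_backend_sketch(audio_backend, audio_ws_url, env):
    """Resolve the audio backend: argument, then environment, then packaged default."""
    requested = (audio_backend or env.get("HERMESBODY_AUDIO_BACKEND", "gemini_ws")).strip().lower()
    if requested not in VALID_BACKENDS:
        requested = "auto"
    ws_url = audio_ws_url or env.get("HERMESBODY_AUDIO_WS_URL") or DEFAULT_MAC_MINI_AUDIO_WS_URL
    if requested == "auto":
        # ws_url is always truthy now, so "auto" always selects the bridge.
        return "gemini_ws" if ws_url else "openai_realtime"
    return requested


print(resolve_backend_sketch("auto", "", {}))
print(resolve_backend_sketch("openai_realtime", "", {}))
```

Under this logic the legacy OpenAI Realtime path is reachable only by requesting `openai_realtime` explicitly, either through `--audio-backend` or `HERMESBODY_AUDIO_BACKEND`.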

tests/test_cli.py CHANGED

@@ -19,9 +19,9 @@ def test_cli_parser_exposes_audio_transport_subcommands_without_optional_imports
     assert client.server == "ws://127.0.0.1:8766"
-def test_audio_backend_auto_prefers_gemini_ws_only_when_url_is_configured(monkeypatch):
     monkeypatch.delenv("HERMESBODY_AUDIO_WS_URL", raising=False)
-    assert resolve_audio_backend("auto", "") == "openai_realtime"
     assert resolve_audio_backend("auto", "ws://10.0.0.192:8877") == "gemini_ws"
     assert resolve_audio_backend("gemini_ws", "") == "gemini_ws"

     assert client.server == "ws://127.0.0.1:8766"
+def test_audio_backend_auto_defaults_to_packaged_mac_mini_bridge(monkeypatch):
     monkeypatch.delenv("HERMESBODY_AUDIO_WS_URL", raising=False)
+    assert resolve_audio_backend("auto", "") == "gemini_ws"
     assert resolve_audio_backend("auto", "ws://10.0.0.192:8877") == "gemini_ws"
     assert resolve_audio_backend("gemini_ws", "") == "gemini_ws"

tests/test_gemini_live.py CHANGED

@@ -281,6 +281,21 @@ async def _audio_server_auth_rejects_missing_or_wrong_token():
     assert "auth_failed" in wrong.sent[0]
 def test_audio_server_processes_one_audio_input_and_returns_output_and_event():
     asyncio.run(_audio_server_processes_one_audio_input_and_returns_output_and_event())

     assert "auth_failed" in wrong.sent[0]
+def test_audio_server_can_allow_unauthenticated_packaged_reachy_client(monkeypatch):
+    asyncio.run(_audio_server_can_allow_unauthenticated_packaged_reachy_client(monkeypatch))
+async def _audio_server_can_allow_unauthenticated_packaged_reachy_client(monkeypatch):
+    monkeypatch.setenv("HERMESBODY_AUDIO_WS_ALLOW_UNAUTHENTICATED", "true")
+    server = MacMiniGeminiAudioServer(FakeAudioController(), token="good-token")
+    socket = FakeSocket()
+    pending = await server._authenticate(socket, socket.path)
+    assert pending is None
+    assert socket.sent == []
 def test_audio_server_processes_one_audio_input_and_returns_output_and_event():
     asyncio.run(_audio_server_processes_one_audio_input_and_returns_output_and_event())