Spaces:
Sleeping
Sleeping
File size: 5,481 Bytes
caea1dc | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 | ---
summary: "Camera capture (iOS node + macOS app) for agent use: photos (jpg) and short video clips (mp4)"
read_when:
- Adding or modifying camera capture on iOS nodes or macOS
- Extending agent-accessible MEDIA temp-file workflows
title: "Camera Capture"
---
# Camera capture (agent)
OpenClaw supports **camera capture** for agent workflows:
- **iOS node** (paired via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
- **Android node** (paired via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
- **macOS app** (node via Gateway): capture a **photo** (`jpg`) or **short video clip** (`mp4`, with optional audio) via `node.invoke`.
All camera access is gated behind **user-controlled settings**.
## iOS node
### User setting (default on)
- iOS Settings tab → **Camera** → **Allow Camera** (`camera.enabled`)
- Default: **on** (missing key is treated as enabled).
- When off: `camera.*` commands return `CAMERA_DISABLED`.
### Commands (via Gateway `node.invoke`)
- `camera.list`
- Response payload:
- `devices`: array of `{ id, name, position, deviceType }`
- `camera.snap`
- Params:
- `facing`: `front|back` (default: `front`)
- `maxWidth`: number (optional; default `1600` on the iOS node)
- `quality`: `0..1` (optional; default `0.9`)
- `format`: currently `jpg`
- `delayMs`: number (optional; default `0`)
- `deviceId`: string (optional; from `camera.list`)
- Response payload:
- `format: "jpg"`
- `base64: "<...>"`
- `width`, `height`
- Payload guard: photos are recompressed to keep the base64 payload under 5 MB.
- `camera.clip`
- Params:
- `facing`: `front|back` (default: `front`)
- `durationMs`: number (default `3000`, clamped to a max of `60000`)
- `includeAudio`: boolean (default `true`)
- `format`: currently `mp4`
- `deviceId`: string (optional; from `camera.list`)
- Response payload:
- `format: "mp4"`
- `base64: "<...>"`
- `durationMs`
- `hasAudio`
### Foreground requirement
Like `canvas.*`, the iOS node only allows `camera.*` commands in the **foreground**. Background invocations return `NODE_BACKGROUND_UNAVAILABLE`.
### CLI helper (temp files + MEDIA)
The easiest way to get attachments is via the CLI helper, which writes decoded media to a temp file and prints `MEDIA:<path>`.
Examples:
```bash
openclaw nodes camera snap --node <id> # default: both front + back (2 MEDIA lines)
openclaw nodes camera snap --node <id> --facing front
openclaw nodes camera clip --node <id> --duration 3000
openclaw nodes camera clip --node <id> --no-audio
```
Notes:
- `nodes camera snap` defaults to **both** facings to give the agent both views.
- Output files are temporary (in the OS temp directory) unless you build your own wrapper.
## Android node
### User setting (default on)
- Android Settings sheet → **Camera** → **Allow Camera** (`camera.enabled`)
- Default: **on** (missing key is treated as enabled).
- When off: `camera.*` commands return `CAMERA_DISABLED`.
### Permissions
- Android requires runtime permissions:
- `CAMERA` for both `camera.snap` and `camera.clip`.
- `RECORD_AUDIO` for `camera.clip` when `includeAudio=true`.
If permissions are missing, the app will prompt when possible; if denied, `camera.*` requests fail with a
`*_PERMISSION_REQUIRED` error.
### Foreground requirement
Like `canvas.*`, the Android node only allows `camera.*` commands in the **foreground**. Background invocations return `NODE_BACKGROUND_UNAVAILABLE`.
### Payload guard
Photos are recompressed to keep the base64 payload under 5 MB.
## macOS app
### User setting (default off)
The macOS companion app exposes a checkbox:
- **Settings → General → Allow Camera** (`openclaw.cameraEnabled`)
- Default: **off**
- When off: camera requests return “Camera disabled by user”.
### CLI helper (node invoke)
Use the main `openclaw` CLI to invoke camera commands on the macOS node.
Examples:
```bash
openclaw nodes camera list --node <id> # list camera ids
openclaw nodes camera snap --node <id> # prints MEDIA:<path>
openclaw nodes camera snap --node <id> --max-width 1280
openclaw nodes camera snap --node <id> --delay-ms 2000
openclaw nodes camera snap --node <id> --device-id <id>
openclaw nodes camera clip --node <id> --duration 10s # prints MEDIA:<path>
openclaw nodes camera clip --node <id> --duration-ms 3000 # prints MEDIA:<path> (legacy flag)
openclaw nodes camera clip --node <id> --device-id <id>
openclaw nodes camera clip --node <id> --no-audio
```
Notes:
- `openclaw nodes camera snap` defaults to `maxWidth=1600` unless overridden.
- On macOS, `camera.snap` waits `delayMs` (default 2000ms) after warm-up/exposure settle before capturing.
- Photo payloads are recompressed to keep base64 under 5 MB.
## Safety + practical limits
- Camera and microphone access trigger the usual OS permission prompts (and require usage strings in Info.plist).
- Video clips are capped (currently `<= 60s`) to avoid oversized node payloads (base64 overhead + message limits).
## macOS screen video (OS-level)
For _screen_ video (not camera), use the macOS companion:
```bash
openclaw nodes screen record --node <id> --duration 10s --fps 15 # prints MEDIA:<path>
```
Notes:
- Requires macOS **Screen Recording** permission (TCC).
|