---
title: fe
emoji: 🦞
colorFrom: blue
colorTo: indigo
sdk: docker
sdk_version: 29.0.4
python_version: 3.14.4
app_port: 7860
app_file: mian.py
pinned: false
---

# OpenClaw on Hugging Face Space (Docker)

> **Languages:** [English](./README.md) · [简体中文](./README_zh.md)
> **Deployment Guide:** [DEPLOY_GUIDE.md](./DEPLOY_GUIDE.md) | [中文部署指南](./DEPLOY_GUIDE_zh.md)

This setup is designed to provide the following:

- Build the OpenClaw container on top of `ubuntu:24.04`
- Serve the OpenClaw dashboard directly on port `7860` (default Space access port)
- Use third-party OpenAI-compatible `base_url + api_key` by default (injected via environment variables)
- Store OpenClaw config/workspace under `/root/.openclaw`
- Restore state automatically from a Hugging Face Dataset on startup
- Run scheduled backups of OpenClaw data to a Hugging Face Dataset via `cron` (as `root` user)
- Support incremental backups, a dynamic backup strategy, AES-256-CBC encryption, and large-file splitting
- Run a backup watchdog that triggers a backup when cron fails
- Run an SSH service with an auto-healing watchdog and host key generation
- Include CCMR (Claude Code Model Router) with support for 10 platform API keys
- Support multi-dataset restore (restore from a dataset other than the backup target)
- Preinstall `python3`, `uv`, `vim`, `neovim`, `chromium` (via Chrome for Testing archive), `gh`, `hf`, `opencode`, `codex`, `claude` (Claude Code CLI), `@larksuite/cli` (with `npx skills add larksuite/cli -y -g`), and `sshx` in the image for interactive terminal use

## Repository Layout

- `Dockerfile`: Runtime image for the Space
- `scripts/openclaw-entrypoint.sh`: Main startup flow (restore, config generation, cron setup, gateway start)
- `scripts/hf-entrypoint.sh`: HF Spaces container entrypoint (PID 1, manages supervisord + SSH + PM2 + BT Panel)
- `scripts/supervisord.conf`: Supervisord config, manages cron, backup-watchdog, openclaw-gateway, ccmr-gateway
- `openclaw_hf/backup.py`: Backup/restore implementation (full/incremental, encryption, split, dynamic strategy, resume)
- `scripts/openclaw-backup-cron.sh`: Cron entrypoint for backup jobs
- `scripts/openclaw-backup-watchdog.sh`: Backup watchdog, auto-triggers backup when overdue
- `scripts/openclaw-backup-health.sh`: Backup health check & auto-repair
- `scripts/openclaw-restore.sh`: Startup restore entrypoint
- `scripts/openclaw-gateway-ctl`: Gateway process management (start/stop/restart/reload)
- `scripts/openclaw-env-sync.sh`: Sync environment variables from HF API
- `scripts/update-env-from-secrets.sh`: Fetch latest env vars from HF API
- `scripts/bt_install_panel_custom.sh`: BT Panel installation script
- `scripts/bootstrap-hf.sh`: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (macOS/Linux)
- `scripts/bootstrap-hf.ps1`: Interactive bootstrap for Space/Dataset creation, upload, and Space variables/secrets setup (Windows PowerShell)
- `scripts/rebuild-space.sh`: Force push latest code to Space and trigger rebuild
- `scripts/delete-backups.sh`: Batch cleanup old backups from Dataset
- `scripts/delete-hf.py`: HF resource deletion tool (Space/Dataset/files/storage)
- `scripts/find-largest-backup.py`: Find best backup in Dataset
- `scripts/ssh_service_watchdog.sh`: SSH service watchdog (process monitor + auto-recovery)
- `scripts/check_ssh_health.sh`: SSH health check (used by Docker HEALTHCHECK)
- `scripts/ssh-agent-autostart.sh`: SSH agent auto-start and key loading
- `scripts/optimize_ssh.sh`: SSH configuration optimization
- `scripts/save-env.sh`: Save environment to `/etc/profile.d`
- `scripts/hf-storage.sh` / `scripts/hf-storage.py`: HuggingFace storage utilities
- `scripts/ccmr-setup.sh`: CCMR configuration generation
- `scripts/ccmr-wrapper.sh`: CCMR Supervisor wrapper (hot-reload + crash recovery)
- `scripts/server.js`: PID 1 keep-alive HTTP server
- `pm2/ecosystem.config.js`: PM2 configuration (optional extension)
- `tests/test_backup.py`: Unit tests for the backup module
- `tests/test_entrypoint_config.py`: Unit tests for gateway config generation behavior

## Required Variables (Space Settings)

In your Hugging Face Space (`Settings -> Variables and secrets`), configure at least:

- Variable: `OPENCLAW_BACKUP_DATASET_REPO`: Backup target Dataset in `username/dataset-name` format
- Secret: `HF_TOKEN`: Used to write backups to the Dataset (must have write permission to that Dataset)
- Secret: `OPENCLAW_GATEWAY_TOKEN`: Gateway token (recommended; if you omit it during deployment, generate a random 32-character value, which the bootstrap scripts do automatically)
- Secret: `OPENCLAW_GATEWAY_PASSWORD`: Gateway password (optional; if you omit it during deployment, generate a random 16-character value, which the bootstrap scripts do automatically)

When using `./scripts/bootstrap-hf.sh` (macOS/Linux) or `./scripts/bootstrap-hf.ps1` (Windows PowerShell), these values are configured automatically on the target Space.

## Optional LLM Variables (All-Or-None)

Set all of these together only when you want OpenClaw to preconfigure a custom third-party model:

- Variable: `OPENCLAW_LLM_BASE_URL`: Third-party base URL (for example OpenAI-compatible `/v1`)
- Variable: `OPENCLAW_LLM_MODEL`: Third-party model ID
- Secret: `OPENCLAW_LLM_API_KEY`: Third-party API key

If any of the three is missing, the entrypoint skips custom model configuration.
In that case, you can still configure a model from inside the container (for example via `sshx`).
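
The all-or-none rule can be sketched as follows. This is a minimal illustration, not the entrypoint's actual code: the function name `resolve_llm_settings` is hypothetical, and treating a blank value as missing is an assumption.

```python
from typing import Dict, Mapping, Optional

# The three variables named in this section.
LLM_KEYS = (
    "OPENCLAW_LLM_BASE_URL",
    "OPENCLAW_LLM_MODEL",
    "OPENCLAW_LLM_API_KEY",
)


def resolve_llm_settings(env: Mapping[str, str]) -> Optional[Dict[str, str]]:
    """Return all three settings, or None if any is missing or blank.

    Assumption: whitespace-only values count as missing.
    """
    values = {key: env.get(key, "").strip() for key in LLM_KEYS}
    if not all(values.values()):
        return None  # skip custom model configuration
    return values
```

Calling `resolve_llm_settings(os.environ)` at startup would then yield either a complete triplet or `None`, so a partially configured model is never written.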

## Common Optional Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OPENCLAW_VERSION` | `latest` | OpenClaw version for Docker install |
| `OPENCLAW_GATEWAY_PORT` | `18789` | Gateway listen port |
| `OPENCLAW_GATEWAY_BIND` | `lan` | Gateway bind mode (`lan`/`local`) |
| `OPENCLAW_STATE_DIR` | `/root/.openclaw` | OpenClaw state directory |
| `OPENCLAW_USER` | `root` | Runtime user for gateway and cron |
| `OPENCLAW_GROUP` | `root` | Runtime group |
| `OPENCLAW_CONFIG_PATH` | `/root/.openclaw/openclaw.json` | Gateway config path |
| `OPENCLAW_WORKSPACE_DIR` | `/root/.openclaw/workspace` | Workspace directory |
| `OPENCLAW_BACKUP_CRON` | `*/10 * * * *` | Backup cron expression |
| `OPENCLAW_BACKUP_SOURCE_DIR` | `/root/.openclaw` | Backup/restore base directory |
| `OPENCLAW_BACKUP_ROOT_*_DIR` | Various | Extra backup dirs (config, codex, claude, agents, ssh, env, npm, lark-cli) |
| `OPENCLAW_BACKUP_PATH_PREFIX` | `backups` | Backup path prefix |
| `OPENCLAW_BACKUP_KEEP_COUNT` | `24` | Number of backups to keep |
| `OPENCLAW_BACKUP_ENCRYPTION_ENABLED` | `false` | Enable AES-256-CBC encryption |
| `OPENCLAW_BACKUP_SPLIT_SIZE` | `500M` | Large file split volume size |
| `OPENCLAW_INCREMENTAL_BACKUP` | `true` | Enable incremental backup |
| `OPENCLAW_DYNAMIC_BACKUP` | `true` | Enable dynamic backup strategy |
| `OPENCLAW_FULL_BACKUP_INTERVAL_HOURS` | `1` | Interval between forced full backups (hours) |
| `OPENCLAW_MAX_INCREMENTAL_BACKUPS` | `15` | Max incremental backups before a full backup is forced |
| `OPENCLAW_RESTORE_TIMEOUT` | `5400` | Restore timeout (seconds; 90 min) |
| `WATCHDOG_INTERVAL` | `600` | Backup watchdog check interval (seconds) |
| `MAX_BACKUP_AGE_MINUTES` | `30` | Max backup age before the watchdog triggers a backup (minutes) |
| `FORCE_BACKUP_INTERVAL` | `14400` | Forced backup interval (seconds; 4 hours) |
| `OPENCLAW_SSHX_AUTO_START` | `false` | Auto-start `sshx` on boot |
| `OPENCLAW_GATEWAY_AUTH_MODE` | `token` | Auth mode (`token`/`password`) |
| `ROOT_PASSWORD` | `lauer3912` | SSH root password |
| `CCMR_ENABLED` | `false` | Enable Claude Code Model Router |
| `CCMR_PORT` | `8080` | CCMR gateway port |

## Quick Deployment

Run the interactive bootstrap script from the repo root (`bootstrap-hf.sh` on macOS/Linux, `bootstrap-hf.ps1` on Windows PowerShell):

```bash
./scripts/bootstrap-hf.sh
```

```powershell
powershell -ExecutionPolicy ByPass -File .\scripts\bootstrap-hf.ps1
```

`bootstrap-hf.sh` / `bootstrap-hf.ps1` will:

- Check/install `hf` CLI:
  - macOS/Linux: `curl -LsSf https://hf.co/cli/install.sh | bash`
  - Windows PowerShell: `powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"`
- Resolve HF auth first (before all other variables):
  - if `hf auth whoami` reports no logged-in user: prompt for `HF_TOKEN` and run `hf auth login --token <HF_TOKEN>`
  - if already logged in: ask whether to use the current user
    - choose `yes`: continue
    - choose `no`: back up the current token, prompt for a new `HF_TOKEN`, run `hf auth login --token <HF_TOKEN>`, and restore the previous token at the end
- Ask for `space_name`, `dataset_name`, `OPENCLAW_VERSION`, gateway token/password, and optional LLM settings
- Default `OPENCLAW_VERSION` to the latest version detected from the npm registry (`openclaw`), falling back to `latest` when detection fails
- Auto-generate `OPENCLAW_GATEWAY_TOKEN` (32 chars) and `OPENCLAW_GATEWAY_PASSWORD` (16 chars) if left empty
- Create private Space + Dataset and upload this repository
- Configure Space `Variables and secrets` automatically, including:
  - `OPENCLAW_BACKUP_DATASET_REPO`
  - `OPENCLAW_VERSION`
  - `HF_TOKEN`
  - `OPENCLAW_GATEWAY_TOKEN`
  - `OPENCLAW_GATEWAY_PASSWORD`
  - `OPENCLAW_GATEWAY_CONTROLUI_ALLOW_INSECURE_AUTH=false`
  - `OPENCLAW_GATEWAY_CONTROLUI_DANGEROUSLY_DISABLE_DEVICE_AUTH=false`
- Optionally configure LLM triplet and set `OPENCLAW_SSHX_AUTO_START` from prompt choice (`true`/`false`)
- Print planned deployment settings and require a final confirmation before creating/updating Space/Dataset resources
- Print Hugging Face Space page URL, app URL, and `/healthz`

If gateway token/password were auto-generated, the script prints them at the end.
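
For reference, credentials of the stated lengths can be generated like this. It is a sketch only: the bootstrap scripts are shell and PowerShell, and their actual generation method and character set are not documented here, so the alphanumeric alphabet and the name `random_credential` are assumptions.

```python
import secrets
import string

# Assumption: letters and digits only, so values paste safely into Space settings.
ALPHABET = string.ascii_letters + string.digits


def random_credential(length: int) -> str:
    """Return a random string of `length` characters from a CSPRNG."""
    if length <= 0:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


gateway_token = random_credential(32)     # OPENCLAW_GATEWAY_TOKEN
gateway_password = random_credential(16)  # OPENCLAW_GATEWAY_PASSWORD
```

`secrets` draws from the operating system's secure random source, which is the appropriate choice for tokens; the `random` module is not.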

## Agent Hand-off Prompt

Copy and send to your agent:

```
Please deploy OpenClaw to Hugging Face by strictly following the deployment skill in https://github.com/tenfyzhong/openclaw-hf/blob/main/SKILL.md
```

## Hugging Face Keep-Alive

How to keep a Space available depends on hardware tier:

- Free `cpu-basic`: the Space sleeps after inactivity (currently around 48h). It cannot be configured to run forever on free hardware.
- Paid hardware: the Space runs continuously by default. In `Settings -> Hardware`, set `Sleep time` to `Never` (or use API with `sleep_time=-1`) for true 24/7 availability.
- Cost-saving mode on paid hardware: set a custom `Sleep time` (for example `3600` seconds) so it auto-sleeps and auto-wakes on the next visit.

Space URL composition:

- Space repo ID format: `<owner>/<space_name>` (example: `tenfyzhong/openclaw-hf`)
- Public runtime host format: `https://<owner>-<space_name>.hf.space`
- OpenClaw health check URL: `https://<owner>-<space_name>.hf.space/healthz`
- Inside the Space runtime, Hugging Face also provides `SPACE_HOST`, so the health URL can be built as `https://${SPACE_HOST}/healthz`.

Example:

```bash
OPENCLAW_HF_SPACE_ID="tenfyzhong/openclaw-hf"
SPACE_HOST="${OPENCLAW_HF_SPACE_ID/\//-}.hf.space"
HEALTH_URL="https://${SPACE_HOST}/healthz"
echo "$HEALTH_URL"
```

Keep-alive by periodic health checks:

```bash
*/12 * * * * HF_TOKEN=hf_xxx /path/to/repo/scripts/check-space-health.sh tenfyzhong/openclaw-hf >/dev/null || true
```

Notes:

- For private Spaces, unauthenticated calls to `https://<owner>-<space_name>.hf.space/healthz` return a Hub 404 page. This is expected access control behavior.
- For private Spaces, include `Authorization: Bearer <HF_TOKEN>` (the helper script above does this automatically via `HF_TOKEN` or `HUGGINGFACE_HUB_TOKEN`).
- This ping strategy is a practical workaround for reducing idle sleep on free hardware, but it is not a guaranteed always-on method.
- If you need strict 24/7 uptime, use paid hardware and set sleep time to `Never`.
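
The authenticated health check described in the notes can be sketched in Python with the standard library alone. This is not the helper script's implementation; the function names are hypothetical, and the host is built with the simple `<owner>-<space_name>` rule shown above.

```python
import urllib.request
from typing import Optional


def space_health_url(space_id: str) -> str:
    """Build the /healthz URL from a '<owner>/<space_name>' repo ID."""
    owner, name = space_id.split("/", 1)
    return f"https://{owner}-{name}.hf.space/healthz"


def build_health_request(space_id: str, token: Optional[str] = None) -> urllib.request.Request:
    """Create the GET request; private Spaces need the bearer token."""
    request = urllib.request.Request(space_health_url(space_id), method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def is_healthy(space_id: str, token: Optional[str] = None, timeout: float = 10.0) -> bool:
    """Return True on a 2xx response, False on any HTTP or network error."""
    try:
        with urllib.request.urlopen(build_health_request(space_id, token), timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False
```

For a private Space, calling `is_healthy("tenfyzhong/openclaw-hf")` without a token would return `False`, because the 404 described above is raised as an HTTP error.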

References:

- <https://huggingface.co/docs/hub/spaces-gpus#sleep-time>
- <https://huggingface.co/docs/huggingface_hub/package_reference/space_runtime>
- <https://huggingface.co/docs/hub/spaces-overview>

Programmatic options (owner token required):

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_xxx")
repo_id = "your-username/your-space"

# Keep running (paid hardware)
api.set_space_sleep_time(repo_id=repo_id, sleep_time=-1)

# Or sleep after 1 hour of inactivity
api.set_space_sleep_time(repo_id=repo_id, sleep_time=3600)

# Manual control
api.pause_space(repo_id=repo_id)
api.restart_space(repo_id=repo_id)
```

For this project, if you need stable dashboard access without cold starts, use paid hardware and set sleep time to `Never`.

## SSH Service

The container supervises its SSH service at several levels to keep it available:

- **Auto-start**: Entrypoint generates host keys, cleans stale PID files, starts sshd
- **SSH Watchdog** (`ssh_service_watchdog.sh`): Monitors sshd every 30s, auto-recovers on failure
- **Multi-level repair**: On config corruption, falls back in order: backup config → minimal config → auto-reinstall of `openssh-server`
- **Exponential backoff**: Gradually increases wait time on consecutive failures
- **Health check** (`check_ssh_health.sh`): Used by Docker HEALTHCHECK
- **SSH Agent auto-load**: Auto-starts ssh-agent and loads keys from `/root/.ssh/`
- **Root password**: Set via `ROOT_PASSWORD` environment variable
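
The exponential backoff can be illustrated with a small delay function. This is a sketch under assumptions: the 30-second base matches the watchdog interval above, but the doubling factor, the 600-second cap, and the name `backoff_delay` are hypothetical and may differ from `ssh_service_watchdog.sh`.

```python
def backoff_delay(consecutive_failures: int, base: float = 30.0, cap: float = 600.0) -> float:
    """Seconds to wait before the next recovery attempt.

    With no failures the watchdog waits the normal interval; each
    consecutive failure doubles the wait, up to `cap`.
    """
    if consecutive_failures <= 0:
        return base
    return min(cap, base * (2 ** consecutive_failures))
```

Capping the delay keeps a long outage from pushing the next retry arbitrarily far into the future.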

## CCMR (Claude Code Model Router)

CCMR gateway is integrated and managed by Supervisord with hot-reload support:

- **Auto-config**: Set `CCMR_*_API_KEY` env vars to enable
- **10 API Key slots**: DeepSeek, Qwen, Kimi, GLM, MiniMax (CN/Global), MiMo (SGP/CN/AMS/PAYG)
- **File hot-reload**: Edit `/root/.env.d/ccmr.env` and changes apply immediately without restart
- **Crash recovery**: Supervisord auto-restarts CCMR process
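
As an illustration of how configured key slots could be detected from an env file such as `/root/.env.d/ccmr.env`, consider the sketch below. Only the `CCMR_*_API_KEY` pattern comes from this document; the parser, the function names, and concrete variable names like `CCMR_DEEPSEEK_API_KEY` are assumptions, not the behavior of `ccmr-setup.sh`.

```python
import re
from typing import Dict, List

KEY_PATTERN = re.compile(r"^CCMR_([A-Z0-9_]+)_API_KEY$")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def configured_slots(values: Dict[str, str]) -> List[str]:
    """Return the sorted slot names whose API key is non-empty."""
    slots = []
    for key, value in values.items():
        match = KEY_PATTERN.match(key)
        if match and value:
            slots.append(match.group(1))
    return sorted(slots)
```

A hot-reload loop could re-run this parse whenever the file's modification time changes and enable CCMR only when at least one slot is configured.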

## Backup/Restore Flow

### Restore

**Automatic restore on startup** (always runs on container restart/rebuild):

- `openclaw-state` -> `OPENCLAW_BACKUP_SOURCE_DIR` (default `/root/.openclaw`)
- `root-config` -> `OPENCLAW_BACKUP_ROOT_CONFIG_DIR` (default `/root/.config`)
- `root-codex` -> `OPENCLAW_BACKUP_ROOT_CODEX_DIR` (default `/root/.codex`)
- `root-claude` -> `OPENCLAW_BACKUP_ROOT_CLAUDE_DIR` (default `/root/.claude`)
- `root-agents` -> `OPENCLAW_BACKUP_ROOT_AGENTS_DIR` (default `/root/.agents`)
- `root-ssh` -> `OPENCLAW_BACKUP_ROOT_SSH_DIR` (default `/root/.ssh`)
- `root-env` -> `OPENCLAW_BACKUP_ROOT_ENV_DIR` (default `/root/.env.d`)
- `root-npm` -> `OPENCLAW_BACKUP_ROOT_NPM_DIR` (default `/root/.npm`)
- `root-lark-cli` -> `OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR` (default `/root/.lark-cli`)

Multi-dataset restore: set `OPENCLAW_RESTORE_DATASET_REPO` to restore from a different dataset.
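
The component-to-directory mapping above can be expressed as a lookup table. This is a sketch, not the code in `openclaw_hf/backup.py`: the function names are hypothetical, and falling back to `OPENCLAW_BACKUP_DATASET_REPO` when `OPENCLAW_RESTORE_DATASET_REPO` is unset is an assumption.

```python
from typing import Dict, Mapping, Optional

# component name -> (environment variable, default directory)
RESTORE_TARGETS = {
    "openclaw-state": ("OPENCLAW_BACKUP_SOURCE_DIR", "/root/.openclaw"),
    "root-config": ("OPENCLAW_BACKUP_ROOT_CONFIG_DIR", "/root/.config"),
    "root-codex": ("OPENCLAW_BACKUP_ROOT_CODEX_DIR", "/root/.codex"),
    "root-claude": ("OPENCLAW_BACKUP_ROOT_CLAUDE_DIR", "/root/.claude"),
    "root-agents": ("OPENCLAW_BACKUP_ROOT_AGENTS_DIR", "/root/.agents"),
    "root-ssh": ("OPENCLAW_BACKUP_ROOT_SSH_DIR", "/root/.ssh"),
    "root-env": ("OPENCLAW_BACKUP_ROOT_ENV_DIR", "/root/.env.d"),
    "root-npm": ("OPENCLAW_BACKUP_ROOT_NPM_DIR", "/root/.npm"),
    "root-lark-cli": ("OPENCLAW_BACKUP_ROOT_LARK_CLI_DIR", "/root/.lark-cli"),
}


def resolve_restore_targets(env: Mapping[str, str]) -> Dict[str, str]:
    """Map each backup component to its restore directory."""
    return {
        component: env.get(variable) or default
        for component, (variable, default) in RESTORE_TARGETS.items()
    }


def resolve_restore_dataset(env: Mapping[str, str]) -> Optional[str]:
    """Pick the dataset to restore from (assumed fallback order)."""
    return (
        env.get("OPENCLAW_RESTORE_DATASET_REPO")
        or env.get("OPENCLAW_BACKUP_DATASET_REPO")
        or None
    )
```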

### Backup

- **Scheduled backup**: Runs based on `OPENCLAW_BACKUP_CRON` (default every 10 min)
- **Incremental backup** (default on): Only backs up changed files after a full backup
- **Dynamic strategy** (default on): Auto-adjusts compression and splitting based on file size and change rate
- **AES-256-CBC encryption**: Optional, allows secure storage on public datasets
- **Large file splitting**: Default 500MB per volume, avoids upload failures
- **Resume support**: Creates checkpoint files during upload, allows resume on interruption
- **Shutdown backup**: Final backup before container exit on stop signal
- **Retention**: Keeps newest `OPENCLAW_BACKUP_KEEP_COUNT` (default 24) archives, auto-deletes older ones
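
Splitting and retention can be sketched as pure functions. These are illustrations rather than the logic in `openclaw_hf/backup.py`: the function names are hypothetical, `M` is assumed to mean MiB, and archive names are assumed to embed a timestamp that sorts chronologically.

```python
import re
from typing import List, Tuple

_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size such as '500M' to bytes (binary units assumed)."""
    match = re.fullmatch(r"(\d+)([KMG]?)B?", text.strip().upper())
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def plan_volumes(total_bytes: int, volume_bytes: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs covering an archive of total_bytes."""
    if volume_bytes <= 0:
        raise ValueError("volume_bytes must be positive")
    return [
        (offset, min(volume_bytes, total_bytes - offset))
        for offset in range(0, total_bytes, volume_bytes)
    ]


def archives_to_delete(names: List[str], keep_count: int) -> List[str]:
    """Return the archives older than the newest keep_count."""
    if keep_count < 0:
        raise ValueError("keep_count must not be negative")
    newest_first = sorted(names, reverse=True)
    return newest_first[keep_count:]
```

With the defaults, `plan_volumes(size, parse_size("500M"))` yields one entry per uploaded volume, and `archives_to_delete(names, 24)` lists what retention would remove.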

### Backup Watchdog

`openclaw-backup-watchdog.sh` acts as the **last line of defense**:

- Auto-triggers a backup when no backup has completed within `MAX_BACKUP_AGE_MINUTES` (default 30 min)
- Force backup every `FORCE_BACKUP_INTERVAL` (default 4 hours)
- File lock prevents concurrent execution
- Automatic backoff on consecutive failures
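
The watchdog's two triggers can be sketched as a single decision function. This reflects one reading of the rules above, not the script itself: the function name, the return labels, and the interpretation of the forced interval as time since the last forced backup are assumptions.

```python
from typing import Optional


def backup_trigger(
    now: float,
    last_backup_at: Optional[float],
    last_forced_at: Optional[float],
    max_backup_age_minutes: int = 30,
    force_backup_interval: int = 14400,
) -> Optional[str]:
    """Decide whether the watchdog should start a backup.

    Timestamps are seconds since the epoch. Returns "overdue", "forced",
    or None when no backup is needed.
    """
    if last_backup_at is None or now - last_backup_at > max_backup_age_minutes * 60:
        return "overdue"
    if last_forced_at is None or now - last_forced_at >= force_backup_interval:
        return "forced"
    return None
```

The overdue check runs first, so a stalled cron job is handled before the periodic forced backup is considered.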

## Use sshx Inside the Container

`sshx` is preinstalled in the image.

1. Auto-start `sshx` in the background via an environment variable:

```bash
OPENCLAW_SSHX_AUTO_START=true
```

When enabled, the entrypoint starts `sshx` in the background and sends its output directly to the container's stdout/stderr logs (no file logging).

2. Manual start inside container:

```bash
sshx
```

3. Let OpenClaw start the process itself (run this in an OpenClaw terminal/tool):

```bash
nohup sshx >/proc/1/fd/1 2>/proc/1/fd/2 &
```

4. After use, stop the `sshx` process promptly:

```bash
pgrep -fa sshx
pkill -TERM -f '(^|/)sshx($| )'
```

## Local Test

```bash
python3 -m unittest discover -s tests -p 'test_*.py'
```

Pull Requests to `main` run GitHub Actions CI automatically (`.github/workflows/pr-ci.yml`):
- Unit tests: `python3 -m unittest discover -s tests -p 'test_*.py'`
- Docker image build: `docker build` (via Buildx) with `OPENCLAW_VERSION=latest`

## License

MIT. See `LICENSE`.