tao-shen and Claude Opus 4.6 committed
Commit 66d172d · 1 Parent(s): 082f8f4

security: change AUTO_CREATE_DATASET default to false

Default is now false for security — users must create the Dataset repo
manually or explicitly set AUTO_CREATE_DATASET=true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (3)
  1. .env.example +5 -4
  2. README.md +1 -1
  3. scripts/sync_hf.py +1 -1
.env.example CHANGED
@@ -50,11 +50,12 @@ HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 OPENCLAW_DATASET_REPO=your-username/HuggingClaw-data
 
 # Whether to auto-create the Dataset repo if it doesn't exist.
-# Set to false if you prefer to create the repo manually on HuggingFace.
+# Set to true to let HuggingClaw create it automatically on first startup.
+# Default is false for security — you must create the repo manually first.
 #
-# [OPTIONAL] Default: true
+# [OPTIONAL] Default: false
 #
-# AUTO_CREATE_DATASET=true
+# AUTO_CREATE_DATASET=false
 
 # How often (in seconds) to back up data to the Dataset repo.
 # Lower values = safer but more API calls to HuggingFace.
@@ -172,7 +173,7 @@ OPENROUTER_API_KEY=sk-or-v1-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 # ─── Persistence (HuggingFace Dataset) ───────────────────────────────────────
 # HF_TOKEN [REQUIRED] HF access token; must have write permission
 # OPENCLAW_DATASET_REPO [REQUIRED] Dataset repo used for backups, e.g. your-name/HuggingClaw-data
-# AUTO_CREATE_DATASET [OPTIONAL] Whether to auto-create the repo; default true
+# AUTO_CREATE_DATASET [OPTIONAL] Whether to auto-create the repo; default false (for security)
 # SYNC_INTERVAL [OPTIONAL] Backup interval (seconds); default 60
 # HF_HUB_DOWNLOAD_TIMEOUT [OPTIONAL] Download timeout (seconds); default 300
 # HF_HUB_UPLOAD_TIMEOUT [OPTIONAL] Upload timeout (seconds); default 600
README.md CHANGED
@@ -93,7 +93,7 @@ In addition to the secrets above, HuggingClaw provides environment variables to
 
 | Variable | Default | Description |
 |----------|---------|-------------|
-| `AUTO_CREATE_DATASET` | `true` | **Auto-create the Dataset repo** if it doesn't exist. When set to `true`, HuggingClaw automatically creates a **private** HuggingFace Dataset repo (using the name from `OPENCLAW_DATASET_REPO`) on first startup. Set to `false` if you prefer to [create the repo manually](https://huggingface.co/new-dataset) before deploying. Accepted values: `true`, `1`, `yes` (enabled) / `false`, `0`, `no` (disabled). |
+| `AUTO_CREATE_DATASET` | `false` | **Auto-create the Dataset repo** if it doesn't exist. Default is `false` for security — you must [create the repo manually](https://huggingface.co/new-dataset) first. Set to `true` to let HuggingClaw automatically create a **private** Dataset repo (using the name from `OPENCLAW_DATASET_REPO`) on first startup. Accepted values: `true`, `1`, `yes` (enabled) / `false`, `0`, `no` (disabled). |
 | `SYNC_INTERVAL` | `60` | **Backup interval in seconds.** How often HuggingClaw syncs the `~/.openclaw` directory (conversations, settings, credentials) to the HuggingFace Dataset repo. Lower values mean less data loss on restart but more API calls. Recommended: `60`–`300`. |
 | `NODE_MEMORY_LIMIT` | `512` | **Node.js heap memory limit in MB.** HF free tier provides 16 GB RAM; the default 512 MB is enough for most cases. Increase if you run complex agent workflows or handle very large conversations. |
 | `TZ` | `UTC` | **Timezone** for log timestamps and scheduled tasks. Example: `Asia/Shanghai`, `America/New_York`. |
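Review note: the README row says the repo is created as a private Dataset only when the flag is enabled. A minimal sketch of how such a flag might gate creation follows. The helper name `ensure_dataset_repo` and the injectable `create_fn` parameter are hypothetical and not taken from `scripts/sync_hf.py`. The only real API assumed is `huggingface_hub.create_repo` with its `token`, `repo_type`, `private`, and `exist_ok` keyword arguments.

```python
def ensure_dataset_repo(repo_id, token, auto_create, create_fn=None):
    """Create a private Dataset repo only when auto_create is enabled.

    Hypothetical helper for illustration; returns True if a create call
    was issued and False if creation was skipped.
    """
    if not auto_create:
        # Default path after this commit: the repo must already exist.
        return False
    if create_fn is None:
        # Imported lazily so the sketch loads without the library installed.
        from huggingface_hub import create_repo as create_fn
    create_fn(
        repo_id,
        token=token,
        repo_type="dataset",  # a Dataset repo, not a model or Space
        private=True,         # matches the README: the repo is private
        exist_ok=True,        # do not fail if the repo is already there
    )
    return True
```

With `auto_create=False` the function returns without touching the Hub, which is the behavior the new default is meant to give.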
scripts/sync_hf.py CHANGED
@@ -77,7 +77,7 @@ SPACE_HOST = os.environ.get("SPACE_HOST", "") # e.g. "tao-shen-huggingclaw.hf.
 SPACE_ID = os.environ.get("SPACE_ID", "") # e.g. "tao-shen/HuggingClaw"
 
 SYNC_INTERVAL = int(os.environ.get("SYNC_INTERVAL", "60"))
-AUTO_CREATE_DATASET = os.environ.get("AUTO_CREATE_DATASET", "true").lower() in ("true", "1", "yes")
+AUTO_CREATE_DATASET = os.environ.get("AUTO_CREATE_DATASET", "false").lower() in ("true", "1", "yes")
 
 # Setup logging
 log_dir = OPENCLAW_HOME / "workspace"
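Review note: the changed expression treats the variable as enabled only for `true`, `1`, or `yes` (case-insensitive). Anything else, including an unset variable, now means disabled. The sketch below reproduces that expression in a standalone function so the edge cases are easy to check. The function name `parse_auto_create` and the `env` parameter are illustrative assumptions, not part of the script.

```python
import os

TRUTHY = ("true", "1", "yes")


def parse_auto_create(env=None):
    """Mirror the expression from the diff against a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Unset falls back to "false"; values are lowercased but not stripped.
    return env.get("AUTO_CREATE_DATASET", "false").lower() in TRUTHY


if __name__ == "__main__":
    for raw in (None, "true", "TRUE", "1", "yes", "false", "0", "no", "on", " true "):
        env = {} if raw is None else {"AUTO_CREATE_DATASET": raw}
        print(repr(raw), "->", parse_auto_create(env))
```

Because the value is not stripped, a setting such as `" true "` with surrounding whitespace is read as disabled, as are unlisted spellings like `on`.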