tao-shen Claude Opus 4.6 committed on
Commit 3cf6b15 · 1 Parent(s): 39e2a50

feat: SSH access + stress test for Ubuntu desktop; always start sshd

- Dockerfile: pre-generate SSH host key at build time
- start-desktop.sh: sshd always starts (key-auth or password-less);
  configurable SSH_LISTEN (0.0.0.0 for local, 127.0.0.1 for HF)
- monitor_and_test.py: add --ssh-test with connect, command exec,
  stress (concurrent sessions), and brute-force ramp-up tests
- scripts/test_local.sh: full local Docker integration test
- scripts/verify_overnight.sh: multi-round overnight verification

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Dockerfile CHANGED
@@ -34,6 +34,11 @@ RUN (useradd -m -u 1000 user 2>/dev/null) || \
34
  ENV HOME=/home/user
35
  RUN mkdir -p /data && chown 1000:1000 /data
36
 
37
  # HuggingRun scripts (build context = repo root)
38
  COPY scripts /scripts
39
  COPY ubuntu-desktop/start-desktop.sh /opt/start-desktop.sh
 
34
  ENV HOME=/home/user
35
  RUN mkdir -p /data && chown 1000:1000 /data
36
 
37
+ # Pre-generate SSH host key so sshd can start without root
38
+ RUN mkdir -p /home/user/.ssh && \
39
+ ssh-keygen -t ed25519 -f /home/user/.ssh/ssh_host_ed25519_key -N "" -C "" && \
40
+ chown -R 1000:1000 /home/user/.ssh
41
+
42
  # HuggingRun scripts (build context = repo root)
43
  COPY scripts /scripts
44
  COPY ubuntu-desktop/start-desktop.sh /opt/start-desktop.sh
Dockerfile.ubuntu-desktop ADDED
@@ -0,0 +1,59 @@
1
+ # Ubuntu 24.04 Desktop on HuggingRun — noVNC on 7860, SSH on 2222, persistence via /data
2
+ FROM ubuntu:24.04
3
+
4
+ ENV DEBIAN_FRONTEND=noninteractive
5
+
6
+ # System + Python (for sync)
7
+ RUN apt-get update && apt-get install -y --no-install-recommends \
8
+ ca-certificates curl python3 python3-pip python3-venv \
9
+ && pip3 install --no-cache-dir --break-system-packages huggingface_hub \
10
+ && rm -rf /var/lib/apt/lists/*
11
+
12
+ # Desktop stack: Xvfb, XFCE, dbus, x11vnc, Firefox; OpenSSH for local/reverse SSH
13
+ RUN apt-get update && apt-get install -y --no-install-recommends \
14
+ xvfb \
15
+ xfce4 xfce4-goodies \
16
+ dbus-x11 \
17
+ x11vnc \
18
+ firefox \
19
+ procps \
20
+ openssh-server openssh-client \
21
+ && rm -rf /var/lib/apt/lists/*
22
+
23
+ # noVNC (web client on 7860)
24
+ RUN apt-get update && apt-get install -y --no-install-recommends git \
25
+ && git clone --depth 1 https://github.com/novnc/noVNC.git /opt/noVNC \
26
+ && git clone --depth 1 https://github.com/novnc/websockify /opt/noVNC/utils/websockify \
27
+ && rm -rf /var/lib/apt/lists/* /opt/noVNC/.git
28
+
29
+ # HF Spaces run as user 1000; UID 1000 may exist (e.g. ubuntu)
30
+ RUN (useradd -m -u 1000 user 2>/dev/null) || \
31
+ (EXISTING=$(getent passwd 1000 | cut -d: -f1); \
32
+ usermod -l user $EXISTING; usermod -d /home/user user; \
33
+ mkdir -p /home/user && chown 1000:1000 /home/user)
34
+ ENV HOME=/home/user
35
+ RUN mkdir -p /data && chown user:user /data
36
+
37
+ # Pre-generate SSH host key so sshd can start without root
38
+ RUN mkdir -p /home/user/.ssh && \
39
+ ssh-keygen -t ed25519 -f /home/user/.ssh/ssh_host_ed25519_key -N "" -C "" && \
40
+ chown -R 1000:1000 /home/user/.ssh
41
+
42
+ # HuggingRun scripts (build context = repo root)
43
+ COPY scripts /scripts
44
+ COPY ubuntu-desktop/start-desktop.sh /opt/start-desktop.sh
45
+ RUN chmod +x /scripts/entrypoint.sh /opt/start-desktop.sh
46
+
47
+ ENV PERSIST_PATH=/data
48
+ ENV RUN_CMD="/opt/start-desktop.sh"
49
+ ENV DESKTOP_HOME=/data/desktop-home
50
+ ENV DISPLAY=:99
51
+ ENV VNC_PORT=5901
52
+ ENV NOVNC_PORT=7860
53
+ # SSH_LISTEN: 0.0.0.0 for local Docker testing, 127.0.0.1 for HF (reverse SSH only)
54
+ ENV SSH_LISTEN=0.0.0.0
55
+ ENV SSH_PORT=2222
56
+
57
+ USER user
58
+ EXPOSE 7860 2222
59
+ ENTRYPOINT ["/scripts/entrypoint.sh"]
scripts/monitor_and_test.py CHANGED
@@ -5,11 +5,13 @@ HuggingRun: monitor the remote Space's status and run basic/stress/persistence verification (
5
 
6
  Usage:
7
  python3 scripts/monitor_and_test.py --test
8
- HF_TOKEN=xxx python3 scripts/monitor_and_test.py --until-ok --url https://xxx.hf.space --expect noVNC # poll the API until RUNNING, then test; on failure print the log tail
9
- HF_TOKEN=xxx python3 scripts/monitor_and_test.py --wait-running --test
10
- HF_TOKEN=xxx python3 scripts/monitor_and_test.py --logs run # stream the run logs (SSE)
11
- HF_TOKEN=xxx python3 scripts/monitor_and_test.py --logs build # stream the build logs (SSE)
12
- Equivalent curl:
13
  curl -N -H "Authorization: Bearer $HF_TOKEN" "https://huggingface.co/api/spaces/<SPACE_ID>/logs/run"
14
  curl -N -H "Authorization: Bearer $HF_TOKEN" "https://huggingface.co/api/spaces/<SPACE_ID>/logs/build"
15
  """
@@ -179,6 +181,95 @@ def test_persistence(url, rounds=3):
179
  return ok_rounds == rounds
180
 
181
 
182
  def _curl_logs_url(space_id: str, log_type: str) -> str:
183
  """Build the logs API URL (same as user's curl command)."""
184
  return f"https://huggingface.co/api/spaces/{space_id}/logs/{log_type}"
@@ -261,6 +352,18 @@ def main():
261
  help="Poll URL until 200 and body contains one of --expect (no HF_TOKEN needed)")
262
  p.add_argument("--until-ok", action="store_true",
263
  help="Poll API until RUNNING, then test; on any fail print log tail and exit 1. Loop until this exits 0.")
264
  args = p.parse_args()
265
  SPACE_ID = args.space_id
266
  APP_URL = args.url.rstrip("/")
@@ -270,6 +373,40 @@ def main():
270
  stream_logs(SPACE_ID, args.logs)
271
  return
272
 
273
  if args.until_ok:
274
  # Check the current state once right away; if it has already errored, fetch the logs with curl and exit immediately instead of waiting
275
  if not os.environ.get("HF_TOKEN"):
@@ -334,6 +471,35 @@ def main():
334
  if not ok:
335
  sys.exit(1)
336
 
337
  if args.test:
338
  print(f"[test] Target: {APP_URL}")
339
  if not test_basic(APP_URL, expect_substrings=expect_substrings):
 
5
 
6
  Usage:
7
  python3 scripts/monitor_and_test.py --test
8
+ python3 scripts/monitor_and_test.py --ssh-test --ssh-host localhost --ssh-port 2222 --ssh-user user
9
+ python3 scripts/monitor_and_test.py --ssh-test --ssh-stress-n 30 --ssh-host localhost
10
+ HF_TOKEN=xxx python3 scripts/monitor_and_test.py --watch
11
+ HF_TOKEN=xxx python3 scripts/monitor_and_test.py --until-ok --url https://xxx.hf.space --expect noVNC
12
+ HF_TOKEN=xxx python3 scripts/monitor_and_test.py --logs run
13
+ HF_TOKEN=xxx python3 scripts/monitor_and_test.py --logs build
14
+ Equivalent curl (requires a Bearer token):
15
  curl -N -H "Authorization: Bearer $HF_TOKEN" "https://huggingface.co/api/spaces/<SPACE_ID>/logs/run"
16
  curl -N -H "Authorization: Bearer $HF_TOKEN" "https://huggingface.co/api/spaces/<SPACE_ID>/logs/build"
17
  """
 
181
  return ok_rounds == rounds
182
 
183
 
184
+ # ── SSH Tests ────────────────────────────────────────────────────────────────
185
+
186
+ def _ssh_cmd(host, port, user, command, timeout=15, identity_file=None):
187
+ """Run a command over SSH. Returns (returncode, stdout, stderr)."""
188
+ import subprocess
189
+ cmd = [
190
+ "ssh", "-o", "StrictHostKeyChecking=no",
191
+ "-o", "UserKnownHostsFile=/dev/null",
192
+ "-o", f"ConnectTimeout={timeout}",
193
+ "-o", "LogLevel=ERROR",
194
+ "-p", str(port),
195
+ ]
196
+ if identity_file:
197
+ cmd += ["-i", identity_file]
198
+ cmd += [f"{user}@{host}", command]
199
+ try:
200
+ proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout + 5)
201
+ return proc.returncode, proc.stdout, proc.stderr
202
+ except subprocess.TimeoutExpired:
203
+ return -1, "", "SSH command timed out"
204
+ except Exception as e:
205
+ return -1, "", str(e)
206
+
207
+
208
+ def test_ssh_connect(host, port, user, identity_file=None):
209
+ """Test SSH connectivity: run 'echo ok' and verify output."""
210
+ rc, out, err = _ssh_cmd(host, port, user, "echo ok", identity_file=identity_file)
211
+ ok = rc == 0 and "ok" in out
212
+ print(f"[ssh-test] connect {user}@{host}:{port} -> rc={rc}, output={'ok' if ok else repr(out.strip())}")
213
+ if not ok and err:
214
+ print(f"[ssh-test] stderr: {err.strip()}")
215
+ return ok
216
+
217
+
218
+ def test_ssh_command(host, port, user, identity_file=None):
219
+ """Test SSH command execution: run several diagnostic commands."""
220
+ checks = [
221
+ ("whoami", lambda out: user in out),
222
+ ("uname -s", lambda out: "Linux" in out),
223
+ ("ls /opt/noVNC/vnc.html", lambda out: "vnc.html" in out),
224
+ ("pgrep -a Xvfb", lambda out: "Xvfb" in out),
225
+ ]
226
+ all_ok = True
227
+ for cmd, validate in checks:
228
+ rc, out, err = _ssh_cmd(host, port, user, cmd, identity_file=identity_file)
229
+ passed = rc == 0 and validate(out)
230
+ status = "PASS" if passed else "FAIL"
231
+ print(f"[ssh-test] cmd '{cmd}' -> {status} (rc={rc}, out={out.strip()[:80]})")
232
+ if not passed:
233
+ all_ok = False
234
+ return all_ok
235
+
236
+
237
+ def test_ssh_stress(host, port, user, n=30, concurrency=10, identity_file=None):
238
+ """SSH stress test: n concurrent SSH sessions each running a command."""
239
+ import concurrent.futures
240
+
241
+ def one_session(i):
242
+ rc, out, _ = _ssh_cmd(host, port, user, f"echo session-{i} && uptime",
243
+ timeout=20, identity_file=identity_file)
244
+ return rc == 0 and f"session-{i}" in out
245
+
246
+ with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as ex:
247
+ results = list(ex.map(one_session, range(n)))
248
+ passed = sum(results)
249
+ failed = n - passed
250
+ print(f"[ssh-stress] {n} sessions (concurrency={concurrency}): {passed} ok, {failed} failed")
251
+ return failed == 0
252
+
253
+
254
+ def test_ssh_bruteforce(host, port, user, rounds=3, ramp_up=None, identity_file=None):
255
+ """Multi-round SSH stress with increasing concurrency (brute-force style)."""
256
+ if ramp_up is None:
257
+ ramp_up = [(20, 5), (40, 10), (60, 20)]
258
+ all_ok = True
259
+ for r in range(rounds):
260
+ n, conc = ramp_up[r % len(ramp_up)]
261
+ print(f"[ssh-bruteforce] Round {r+1}/{rounds}: {n} sessions, concurrency={conc}")
262
+ ok = test_ssh_stress(host, port, user, n=n, concurrency=conc, identity_file=identity_file)
263
+ if not ok:
264
+ all_ok = False
265
+ print(f"[ssh-bruteforce] Round {r+1} FAILED")
266
+ break
267
+ time.sleep(1)
268
+ if all_ok:
269
+ print(f"[ssh-bruteforce] ALL {rounds} rounds PASSED")
270
+ return all_ok
271
+
272
+
273
  def _curl_logs_url(space_id: str, log_type: str) -> str:
274
  """Build the logs API URL (same as user's curl command)."""
275
  return f"https://huggingface.co/api/spaces/{space_id}/logs/{log_type}"
 
352
  help="Poll URL until 200 and body contains one of --expect (no HF_TOKEN needed)")
353
  p.add_argument("--until-ok", action="store_true",
354
  help="Poll API until RUNNING, then test; on any fail print log tail and exit 1. Loop until this exits 0.")
355
+ p.add_argument("--watch", action="store_true",
356
+ help="Use curl to poll run (and optional build) logs + app URL every N sec; don't stop (Ctrl+C to exit)")
357
+ p.add_argument("--watch-interval", type=int, default=20, help="Seconds between --watch polls (default 20)")
358
+ # SSH test options
359
+ p.add_argument("--ssh-test", action="store_true",
360
+ help="Run SSH tests: connect + command + stress + bruteforce")
361
+ p.add_argument("--ssh-host", default="localhost", help="SSH host (default: localhost)")
362
+ p.add_argument("--ssh-port", type=int, default=2222, help="SSH port (default: 2222)")
363
+ p.add_argument("--ssh-user", default="user", help="SSH user (default: user)")
364
+ p.add_argument("--ssh-key", default=None, help="Path to SSH private key (optional)")
365
+ p.add_argument("--ssh-stress-n", type=int, default=30, help="SSH stress: total sessions (default: 30)")
366
+ p.add_argument("--ssh-concurrency", type=int, default=10, help="SSH stress: concurrent sessions (default: 10)")
367
  args = p.parse_args()
368
  SPACE_ID = args.space_id
369
  APP_URL = args.url.rstrip("/")
 
373
  stream_logs(SPACE_ID, args.logs)
374
  return
375
 
376
+ if args.watch:
377
+ # Keep polling the remote state with curl + Bearer token; never exits on its own
378
+ if not os.environ.get("HF_TOKEN"):
379
+ print("HF_TOKEN required for --watch (use .env or export)", file=sys.stderr)
380
+ sys.exit(1)
381
+ import subprocess
382
+ interval = max(10, args.watch_interval)
383
+ run_url = _curl_logs_url(SPACE_ID, "run")
384
+ build_url = _curl_logs_url(SPACE_ID, "build")
385
+ token = os.environ.get("HF_TOKEN")
386
+ curl_h = ["-H", f"Authorization: Bearer {token}", "-N", "-sS"]  # --max-time is appended once at the call site
387
+ n = 0
388
+ while True:
389
+ n += 1
390
+ ts = time.strftime("%H:%M:%S", time.gmtime())
391
+ print(f"\n[watch #{n} {ts}] === runtime stage ===")
392
+ stage, _ = get_stage()
393
+ print(f"[watch] stage={stage}")
394
+ print(f"[watch] === GET {APP_URL} ===")
395
+ status, body = http_get(APP_URL, timeout=15)
396
+ print(f"[watch] HTTP {status}, body len={len(body)}, has noVNC={('noVNC' in body)}")
397
+ print(f"[watch] === run log (tail, curl --max-time {interval}) ===")
398
+ proc = subprocess.run(
399
+ ["curl"] + curl_h + ["--max-time", str(interval), run_url],
400
+ capture_output=True, text=True, timeout=interval + 10,
401
+ )
402
+ out = (proc.stdout or "") + (proc.stderr or "")
403
+ tail = out[-4000:] if len(out) > 4000 else out
404
+ for line in tail.strip().split("\n")[-25:]:
405
+ print(line)
406
+ print(f"[watch] next in {interval}s (Ctrl+C to stop)...")
407
+ time.sleep(interval)
408
+ return
409
+
410
  if args.until_ok:
411
  # Check the current state once right away; if it has already errored, fetch the logs with curl and exit immediately instead of waiting
412
  if not os.environ.get("HF_TOKEN"):
 
471
  if not ok:
472
  sys.exit(1)
473
 
474
+ if args.ssh_test:
475
+ print(f"[ssh-test] Target: {args.ssh_user}@{args.ssh_host}:{args.ssh_port}")
476
+ print("=" * 60)
477
+ print("[Phase 1] SSH Connect")
478
+ if not test_ssh_connect(args.ssh_host, args.ssh_port, args.ssh_user, identity_file=args.ssh_key):
479
+ print("[ssh-test] CONNECT FAILED")
480
+ sys.exit(1)
481
+ print()
482
+ print("[Phase 2] SSH Command Execution")
483
+ if not test_ssh_command(args.ssh_host, args.ssh_port, args.ssh_user, identity_file=args.ssh_key):
484
+ print("[ssh-test] COMMAND EXEC FAILED")
485
+ sys.exit(1)
486
+ print()
487
+ print("[Phase 3] SSH Stress Test")
488
+ if not test_ssh_stress(args.ssh_host, args.ssh_port, args.ssh_user,
489
+ n=args.ssh_stress_n, concurrency=args.ssh_concurrency,
490
+ identity_file=args.ssh_key):
491
+ print("[ssh-test] STRESS FAILED")
492
+ sys.exit(1)
493
+ print()
494
+ print("[Phase 4] SSH Brute-force Ramp-up")
495
+ if not test_ssh_bruteforce(args.ssh_host, args.ssh_port, args.ssh_user,
496
+ identity_file=args.ssh_key):
497
+ print("[ssh-test] BRUTEFORCE FAILED")
498
+ sys.exit(1)
499
+ print("=" * 60)
500
+ print("[ssh-test] ALL SSH TESTS PASSED")
501
+ return
502
+
503
  if args.test:
504
  print(f"[test] Target: {APP_URL}")
505
  if not test_basic(APP_URL, expect_substrings=expect_substrings):
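
The ramp-up in `test_ssh_bruteforce` picks each round's load with `ramp_up[r % len(ramp_up)]`, so asking for more rounds than there are schedule entries wraps back to the lightest step rather than continuing to grow. The following is a minimal sketch of that indexing only; `plan_rounds` and `DEFAULT_RAMP_UP` are hypothetical names used for illustration and are not part of the commit.

```python
from typing import List, Optional, Tuple

# (sessions, concurrency) pairs, copied from the default in the diff.
DEFAULT_RAMP_UP: List[Tuple[int, int]] = [(20, 5), (40, 10), (60, 20)]


def plan_rounds(
    rounds: int, ramp_up: Optional[List[Tuple[int, int]]] = None
) -> List[Tuple[int, int]]:
    """Return the (sessions, concurrency) pair each round would use.

    Hypothetical helper: it mirrors the modulo lookup in the diff but
    opens no SSH sessions.
    """
    if ramp_up is None:
        ramp_up = DEFAULT_RAMP_UP
    return [ramp_up[r % len(ramp_up)] for r in range(rounds)]


if __name__ == "__main__":
    for i, (n, conc) in enumerate(plan_rounds(5), start=1):
        print(f"round {i}: {n} sessions, concurrency={conc}")
```

With the default of three rounds every schedule entry is used exactly once; the wrap-around only matters if a caller passes a larger `rounds` value.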
scripts/test_local.sh ADDED
@@ -0,0 +1,143 @@
1
+ #!/usr/bin/env bash
2
+ # ─────────────────────────────────────────────────────────────────────
3
+ # HuggingRun: Local integration test for Ubuntu desktop
4
+ # Build Docker → run container → wait → test noVNC + SSH + stress → cleanup
5
+ # Exit 0 only when ALL tests pass. Iterative TDD style.
6
+ #
7
+ # Usage:
8
+ # bash scripts/test_local.sh # full run
9
+ # SKIP_BUILD=1 bash scripts/test_local.sh # reuse existing image
10
+ # ─────────────────────────────────────────────────────────────────────
11
+ set -euo pipefail
12
+
13
+ REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
14
+ cd "$REPO_ROOT"
15
+
16
+ IMAGE_NAME="huggingrun-ubuntu-desktop-test"
17
+ CONTAINER_NAME="huggingrun-test-$$"
18
+ NOVNC_PORT=7860
19
+ SSH_PORT=2222
20
+ HOST_NOVNC_PORT="${HOST_NOVNC_PORT:-17860}"
21
+ HOST_SSH_PORT="${HOST_SSH_PORT:-12222}"
22
+ MAX_WAIT=120 # seconds to wait for services to be ready
23
+ SSH_USER="user"
24
+ SSH_STRESS_N="${SSH_STRESS_N:-30}"
25
+ SSH_CONCURRENCY="${SSH_CONCURRENCY:-10}"
26
+
27
+ RED='\033[0;31m'
28
+ GREEN='\033[0;32m'
29
+ YELLOW='\033[1;33m'
30
+ NC='\033[0m'
31
+
32
+ cleanup() {
33
+ echo ""
34
+ echo -e "${YELLOW}[cleanup] Stopping and removing container ${CONTAINER_NAME}...${NC}"
35
+ docker stop "$CONTAINER_NAME" 2>/dev/null || true
36
+ docker rm -f "$CONTAINER_NAME" 2>/dev/null || true
37
+ }
38
+ trap cleanup EXIT
39
+
40
+ # ── Phase 0: Build ──────────────────────────────────────────────────
41
+ if [ "${SKIP_BUILD:-}" != "1" ]; then
42
+ echo -e "${YELLOW}[build] Building Docker image: ${IMAGE_NAME}${NC}"
43
+ docker build -f Dockerfile.ubuntu-desktop -t "$IMAGE_NAME" . 2>&1 | tail -20
44
+ echo -e "${GREEN}[build] Image built successfully${NC}"
45
+ else
46
+ echo -e "${YELLOW}[build] SKIP_BUILD=1, using existing image${NC}"
47
+ fi
48
+
49
+ # ── Phase 1: Run container ──────────────────────────────────────────
50
+ echo ""
51
+ echo -e "${YELLOW}[run] Starting container: ${CONTAINER_NAME}${NC}"
52
+ echo -e "${YELLOW}[run] noVNC: localhost:${HOST_NOVNC_PORT} → :${NOVNC_PORT}${NC}"
53
+ echo -e "${YELLOW}[run] SSH: localhost:${HOST_SSH_PORT} → :${SSH_PORT}${NC}"
54
+
55
+ docker run -d \
56
+ --name "$CONTAINER_NAME" \
57
+ -p "${HOST_NOVNC_PORT}:${NOVNC_PORT}" \
58
+ -p "${HOST_SSH_PORT}:${SSH_PORT}" \
59
+ -e SSH_LISTEN=0.0.0.0 \
60
+ -e SSH_PORT=${SSH_PORT} \
61
+ "$IMAGE_NAME"
62
+
63
+ echo -e "${GREEN}[run] Container started${NC}"
64
+
65
+ # ── Phase 2: Wait for noVNC ─────────────────────────────────────────
66
+ echo ""
67
+ echo -e "${YELLOW}[wait] Waiting for noVNC on localhost:${HOST_NOVNC_PORT} (max ${MAX_WAIT}s)...${NC}"
68
+ START=$(date +%s)
69
+ NOVNC_READY=false
70
+ while [ $(($(date +%s) - START)) -lt "$MAX_WAIT" ]; do
71
+ HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "http://localhost:${HOST_NOVNC_PORT}/vnc.html" 2>/dev/null || echo "000")
72
+ if [ "$HTTP_CODE" = "200" ]; then
73
+ NOVNC_READY=true
74
+ break
75
+ fi
76
+ echo -e " noVNC not ready (HTTP ${HTTP_CODE}), waiting 3s..."
77
+ sleep 3
78
+ done
79
+
80
+ if [ "$NOVNC_READY" = false ]; then
81
+ echo -e "${RED}[FAIL] noVNC did not become ready within ${MAX_WAIT}s${NC}"
82
+ echo ""
83
+ echo "=== Container logs (last 50 lines) ==="
84
+ docker logs --tail 50 "$CONTAINER_NAME" 2>&1
85
+ exit 1
86
+ fi
87
+ echo -e "${GREEN}[wait] noVNC is ready (HTTP 200)${NC}"
88
+
89
+ # ── Phase 3: Wait for SSH ───────────────────────────────────────────
90
+ echo ""
91
+ echo -e "${YELLOW}[wait] Waiting for SSH on localhost:${HOST_SSH_PORT} (max 60s)...${NC}"
92
+ START=$(date +%s)
93
+ SSH_READY=false
94
+ while [ $(($(date +%s) - START)) -lt 60 ]; do
95
+ if ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
96
+ -o ConnectTimeout=3 -o LogLevel=ERROR \
97
+ -p "$HOST_SSH_PORT" "${SSH_USER}@localhost" "echo ok" 2>/dev/null | grep -q "ok"; then
98
+ SSH_READY=true
99
+ break
100
+ fi
101
+ echo " SSH not ready, waiting 3s..."
102
+ sleep 3
103
+ done
104
+
105
+ if [ "$SSH_READY" = false ]; then
106
+ echo -e "${RED}[FAIL] SSH did not become ready within 60s${NC}"
107
+ echo ""
108
+ echo "=== Container logs (last 50 lines) ==="
109
+ docker logs --tail 50 "$CONTAINER_NAME" 2>&1
110
+ exit 1
111
+ fi
112
+ echo -e "${GREEN}[wait] SSH is ready${NC}"
113
+
114
+ # ── Phase 4: Run HTTP tests (noVNC) ─────────────────────────────────
115
+ echo ""
116
+ echo -e "${YELLOW}[test] Phase 4: HTTP basic + stress test on noVNC${NC}"
117
+ python3 scripts/monitor_and_test.py \
118
+ --test \
119
+ --url "http://localhost:${HOST_NOVNC_PORT}" \
120
+ --expect "noVNC" --expect "vnc" \
121
+ --stress-n 50
122
+ echo -e "${GREEN}[test] HTTP tests PASSED${NC}"
123
+
124
+ # ── Phase 5: Run SSH tests ──────────────────────────────────────────
125
+ echo ""
126
+ echo -e "${YELLOW}[test] Phase 5: SSH connect + command + stress + bruteforce${NC}"
127
+ python3 scripts/monitor_and_test.py \
128
+ --ssh-test \
129
+ --ssh-host localhost \
130
+ --ssh-port "$HOST_SSH_PORT" \
131
+ --ssh-user "$SSH_USER" \
132
+ --ssh-stress-n "$SSH_STRESS_N" \
133
+ --ssh-concurrency "$SSH_CONCURRENCY"
134
+ echo -e "${GREEN}[test] SSH tests PASSED${NC}"
135
+
136
+ # ── Summary ─────────────────────────────────────────────────────────
137
+ echo ""
138
+ echo "============================================================"
139
+ echo -e "${GREEN} ALL TESTS PASSED${NC}"
140
+ echo ""
141
+ echo " noVNC desktop: http://localhost:${HOST_NOVNC_PORT}/vnc.html"
142
+ echo " SSH access: ssh -p ${HOST_SSH_PORT} ${SSH_USER}@localhost"
143
+ echo "============================================================"
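
Phases 2 and 3 of `test_local.sh` share one pattern: probe the service, sleep 3 seconds, and give up once a deadline has passed. Here is a rough Python sketch of that loop, assuming an injectable clock and sleep function so it can be exercised without waiting; `wait_until` is a hypothetical helper, not something the commit adds.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    max_wait: float,
    interval: float = 3.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True or max_wait seconds have elapsed.

    Hypothetical helper mirroring the shell loops: the probe runs first,
    then the loop sleeps `interval` seconds before trying again.
    """
    deadline = clock() + max_wait
    while clock() < deadline:
        if probe():
            return True
        sleep(interval)
    return False
```

A caller could pass a probe that fetches `/vnc.html` and checks for HTTP 200, or one that runs `ssh ... "echo ok"`; on a `False` return the shell script prints the last 50 lines of container logs and exits with status 1.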
scripts/verify_overnight.sh ADDED
@@ -0,0 +1,38 @@
1
+ #!/usr/bin/env bash
2
+ # Overnight verification: 3 full --until-ok runs. Exit 0 only if all pass.
3
+ # Usage: from repo root, with .env containing HF_TOKEN:
4
+ # bash scripts/verify_overnight.sh
5
+ set -e
6
+ REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
7
+ cd "$REPO_ROOT"
8
+ LOG="$REPO_ROOT/docs/verification_run.log"
9
+ APP_URL="${APP_URL:-https://tao-shen-huggingrun.hf.space}"
10
+ EXPECT="${EXPECT:-Directory listing}"
11
+ ROUNDS="${ROUNDS:-3}"
12
+
13
+ if [ ! -f .env ]; then
14
+ echo "Missing .env (HF_TOKEN required)" >&2
15
+ exit 1
16
+ fi
17
+ export $(grep -v '^#' .env | xargs)
18
+
19
+ echo "=== Overnight verification started $(date -u +%Y-%m-%dT%H:%M:%SZ) ===" | tee -a "$LOG"
20
+ echo "APP_URL=$APP_URL EXPECT=$EXPECT ROUNDS=$ROUNDS" | tee -a "$LOG"
21
+
22
+ PASSED=0
23
+ for r in $(seq 1 "$ROUNDS"); do
24
+ echo "" | tee -a "$LOG"
25
+ echo "--- Round $r/$ROUNDS at $(date -u +%H:%M:%SZ) ---" | tee -a "$LOG"
26
+ if python3 scripts/monitor_and_test.py --until-ok --url "$APP_URL" --expect "$EXPECT" --stress-n 50 >> "$LOG" 2>&1; then
27
+ PASSED=$((PASSED+1))
28
+ echo "Round $r PASSED" | tee -a "$LOG"
29
+ else
30
+ echo "Round $r FAILED" | tee -a "$LOG"
31
+ exit 1
32
+ fi
33
+ [ "$r" -lt "$ROUNDS" ] && sleep 30
34
+ done
35
+
36
+ echo "" | tee -a "$LOG"
37
+ echo "=== ALL $ROUNDS ROUNDS PASSED at $(date -u +%Y-%m-%dT%H:%M:%SZ) ===" | tee -a "$LOG"
38
+ exit 0
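
`verify_overnight.sh` loads `.env` with `export $(grep -v '^#' .env | xargs)`, which relies on shell word splitting and so may mishandle values that contain spaces. As an illustration of the same idea with explicit parsing, here is a small Python sketch; `parse_env_lines` is a hypothetical helper, and it assumes plain `KEY=VALUE` lines with no `export` prefix and no variable interpolation.

```python
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Hypothetical helper: one layer of matching surrounding quotes is
    removed from the value; lines without '=' are ignored.
    """
    env: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key.strip()] = value
    return env
```

The result could be merged into `os.environ` before invoking `monitor_and_test.py`, which reads `HF_TOKEN` from the environment.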
ubuntu-desktop/start-desktop.sh CHANGED
@@ -36,19 +36,42 @@ echo "[start-desktop] XFCE started, starting x11vnc ..." >&2
36
  # x11vnc: share display :99 on port 5901 (do not exit on failure so noVNC can still start)
37
  x11vnc -display "$DISPLAY" -rfbport "$VNC_PORT" -forever -shared -noxdamage -nopw -bg || true
38
 
39
- # SSH (optional): do not let failures here stop noVNC
40
  set +e
41
  SSHD_PORT="${SSH_PORT:-2222}"
42
  mkdir -p "$HOME/.ssh"
43
  [ -n "${SSH_AUTHORIZED_KEYS-}" ] && echo "$SSH_AUTHORIZED_KEYS" > "$HOME/.ssh/authorized_keys" && chmod 600 "$HOME/.ssh/authorized_keys"
44
- [ ! -f "$HOME/.ssh/ssh_host_ed25519_key" ] && ssh-keygen -t ed25519 -f "$HOME/.ssh/ssh_host_ed25519_key" -N "" -C "" 2>/dev/null
45
- if [ -f "$HOME/.ssh/authorized_keys" ] && [ -f "$HOME/.ssh/ssh_host_ed25519_key" ]; then
46
- sshd -o "Port=$SSHD_PORT" -o "HostKey=$HOME/.ssh/ssh_host_ed25519_key" \
47
- -o "AuthorizedKeysFile=$HOME/.ssh/authorized_keys" \
48
- -o "PermitEmptyPasswords=no" -o "PasswordAuthentication=no" \
49
- -o "ListenAddress=127.0.0.1" -o "PidFile=$HOME/.ssh/sshd.pid" \
50
- -o "UsePAM=no" -o "PermitUserEnvironment=yes" -D -e &
51
  sleep 1
52
  [ -n "${SSH_REVERSE_TARGET-}" ] && ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=60 -R "0.0.0.0:${SSHD_PORT}:127.0.0.1:${SSHD_PORT}" $SSH_REVERSE_TARGET -N &
53
  fi
54
  set -e
 
36
  # x11vnc: share display :99 on port 5901 (do not exit on failure so noVNC can still start)
37
  x11vnc -display "$DISPLAY" -rfbport "$VNC_PORT" -forever -shared -noxdamage -nopw -bg || true
38
 
39
+ # SSH: always start sshd; do not let failures here stop noVNC
40
  set +e
41
  SSHD_PORT="${SSH_PORT:-2222}"
42
+ SSHD_LISTEN="${SSH_LISTEN:-0.0.0.0}"
43
  mkdir -p "$HOME/.ssh"
44
+
45
+ # If SSH_AUTHORIZED_KEYS is set, use key-based auth only; otherwise allow password auth for local testing
46
  [ -n "${SSH_AUTHORIZED_KEYS-}" ] && echo "$SSH_AUTHORIZED_KEYS" > "$HOME/.ssh/authorized_keys" && chmod 600 "$HOME/.ssh/authorized_keys"
47
+
48
+ # Use pre-generated host key from Docker build, or generate at runtime
49
+ HOST_KEY="$HOME/.ssh/ssh_host_ed25519_key"
50
+ [ ! -f "$HOST_KEY" ] && cp /home/user/.ssh/ssh_host_ed25519_key "$HOST_KEY" 2>/dev/null
51
+ [ ! -f "$HOST_KEY" ] && ssh-keygen -t ed25519 -f "$HOST_KEY" -N "" -C "" 2>/dev/null
52
+
53
+ if [ -f "$HOST_KEY" ]; then
54
+ if [ -f "$HOME/.ssh/authorized_keys" ]; then
55
+ # Key-based auth only (production / HF Spaces)
56
+ echo "[start-desktop] Starting sshd (key auth) on $SSHD_LISTEN:$SSHD_PORT ..." >&2
57
+ /usr/sbin/sshd -o "Port=$SSHD_PORT" -o "HostKey=$HOST_KEY" \
58
+ -o "AuthorizedKeysFile=$HOME/.ssh/authorized_keys" \
59
+ -o "PermitEmptyPasswords=no" -o "PasswordAuthentication=no" \
60
+ -o "ListenAddress=$SSHD_LISTEN" -o "PidFile=$HOME/.ssh/sshd.pid" \
61
+ -o "UsePAM=no" -o "PermitUserEnvironment=yes" -D -e &
62
+ else
63
+ # No keys configured: allow password-less login for local Docker testing
64
+ echo "[start-desktop] Starting sshd (no-password, local test) on $SSHD_LISTEN:$SSHD_PORT ..." >&2
65
+ /usr/sbin/sshd -o "Port=$SSHD_PORT" -o "HostKey=$HOST_KEY" \
66
+ -o "PermitEmptyPasswords=yes" -o "PasswordAuthentication=yes" \
67
+ -o "ListenAddress=$SSHD_LISTEN" -o "PidFile=$HOME/.ssh/sshd.pid" \
68
+ -o "UsePAM=no" -o "PermitRootLogin=no" -D -e &
69
+ fi
70
+ SSHD_PID=$!
71
  sleep 1
72
+ echo "[start-desktop] sshd PID=$SSHD_PID" >&2
73
+
74
+ # Reverse SSH tunnel (HF Spaces: outbound only on 80/443/8080)
75
  [ -n "${SSH_REVERSE_TARGET-}" ] && ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=60 -R "0.0.0.0:${SSHD_PORT}:127.0.0.1:${SSHD_PORT}" $SSH_REVERSE_TARGET -N &
76
  fi
77
  set -e
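
The new `start-desktop.sh` logic chooses between two sshd configurations depending on whether an `authorized_keys` file exists. The sketch below restates that branch in Python to make the difference between the two modes easy to compare. `sshd_options` is a hypothetical helper, and it deliberately leaves out the `PidFile` option and the `-D -e` flags that the script also passes.

```python
from typing import Dict, List


def sshd_options(
    host_key: str,
    listen: str = "0.0.0.0",
    port: int = 2222,
    authorized_keys: str = "",
) -> List[str]:
    """Build the `-o Key=Value` arguments for the two modes in the diff.

    Hypothetical helper: a non-empty authorized_keys path selects
    key-only auth; an empty path selects the password-less local-test mode.
    """
    opts: Dict[str, str] = {
        "Port": str(port),
        "HostKey": host_key,
        "ListenAddress": listen,
        "UsePAM": "no",
    }
    if authorized_keys:
        # Key-based auth only (production / HF Spaces).
        opts.update({
            "AuthorizedKeysFile": authorized_keys,
            "PermitEmptyPasswords": "no",
            "PasswordAuthentication": "no",
            "PermitUserEnvironment": "yes",
        })
    else:
        # No keys configured: password-less login for local Docker testing.
        opts.update({
            "PermitEmptyPasswords": "yes",
            "PasswordAuthentication": "yes",
            "PermitRootLogin": "no",
        })
    args: List[str] = []
    for key, value in opts.items():
        args += ["-o", f"{key}={value}"]
    return args
```

Note that `Dockerfile.ubuntu-desktop` defaults `SSH_LISTEN` to `0.0.0.0`, so the password-less mode is reachable from outside the container unless `SSH_AUTHORIZED_KEYS` is set or `SSH_LISTEN` is overridden to `127.0.0.1`.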