Baladithya Balamurugan
Wave 2: 4 new modules (kill-switch, EKS/SageMaker executors, DockerSandbox) + B4/B7 completion
7a55e1e
{
"area": "composer_replication/datagen/docker_sandbox.py + sandbox.py scrub_tree refactor",
"verdict": "clean",
"findings": [
{
"severity": "low",
"what": "The pass/fail parse in run_tests carries the order-dependent fallback clause `if f\"{t} PASSED\" in out or (returncode == 0 and not failed)` verbatim from LocalSubprocessSandbox. If a runner exits 0 but does not print '<nodeid> PASSED' for every node id, an un-printed node is marked passed solely on the exit code, and only while `not failed` still holds, i.e. until the first failure is recorded. This is a pre-existing pattern (identical on main's LocalSubprocessSandbox at sandbox.py:214) faithfully mirrored into DockerSandbox, NOT a new regression; it is flagged only for completeness.",
"where": "composer_replication/datagen/docker_sandbox.py:272-276 (and the source LocalSubprocessSandbox at sandbox.py:212-217)",
"recommendation": "No action required for this review. If the parse is ever hardened, require an explicit PASSED token per node id and stop trusting the bare exit code; make the change in both sandboxes together so they stay in lock-step."
}
],
"confirmed_good": [
"REFACTOR DID NOT BREAK LocalSubprocessSandbox: boot() still scrubs. boot() (sandbox.py:169-172) calls self._scrub_tree(), which delegates to the shared module-level scrub_tree() free function (sandbox.py:174-177). Smoke test confirmed __pycache__, .git, and *.pyc are removed on boot while real source (keep.py) survives.",
"No broken/dangling references to the old per-class _scrub_tree: the only remaining _scrub_tree occurrences are (a) the intentional back-compat delegating method + its self-call in boot, and (b) one descriptive comment in test_docker_substrate_e2e.py:161. grep for external callers of .SCRUB_NAMES/._SCRUB_NAMES/.SCRUB_SUFFIXES returned EMPTY.",
"Back-compat preserved: LocalSubprocessSandbox._SCRUB_NAMES / ._SCRUB_SUFFIXES class aliases still point at the module-level SCRUB_NAMES/SCRUB_SUFFIXES; the _scrub_tree() method is retained.",
"FeatureDeletionEnv unaffected: env.py uses the Sandbox Protocol generically (boot/exec/run_tests/trajectory at env.py:59,69,86,89) — agnostic to the scrub refactor.",
"SCRUB-BEFORE-MOUNT ORDERING IS CORRECT (no security bug): DockerSandbox.boot() runs scrub_tree(self.workdir) at line 190 BEFORE self._client.containers.run(**kwargs) at line 198. The container (and thus the RW bind mount) does not exist when the host-side scrub runs, so the scrub is provably pre-mount. The scrub-AFTER-mount security bug the audit asked to rule out is NOT present.",
"--network none: both network_disabled=True AND network_mode='none' are set (docker_sandbox.py:154-155); the live test_live_network_is_disabled actually ran on a real container and asserted egress BLOCKED / not CONNECTED.",
"Resource limits: mem_limit == memswap_limit (forbids swap), pids_limit (fork-bomb guard), nano_cpus (CPU quota); all present, configurable, and unit-asserted.",
"Ephemeral teardown: close() force-removes the container (idempotent, swallows errors), reap_leaked() sweeps label-filtered orphan containers at boot and shutdown, and __enter__/__exit__/__del__ are wired. Verified by test_close_removes_container_force, test_context_manager_closes, test_reap_leaked_sweeps_labelled_containers.",
"gVisor runtime option: runtime defaults to None (=> 'runtime' kwarg omitted, daemon-default runc); 'runsc' is only passed through when explicitly set (docker_sandbox.py:178-179) and gated by runsc_available(). test_live_runsc_runtime correctly SKIPPED (gVisor not installed on host).",
"Lazy docker import: _require_docker() imports `docker` inside the function with a clear RuntimeError on ImportError; docker SDK is never required by the FakeSandbox/pure-core path. Verified by test_require_docker_missing_sdk_raises.",
"Privilege lockdown: cap_drop=['ALL'], security_opt=['no-new-privileges:true'], user='1000:1000' (non-root), read_only root fs with tmpfs /tmp (noexec,nosuid), keep_root_writable escape hatch.",
"shlex.quote applied to every test node id in run_tests (shell-injection guard, matches LocalSubprocessSandbox); non-UTF-8 output decoded with errors='replace' (test_exec_decodes_non_utf8_bytes); exec wraps commands in coreutils `timeout`.",
"TEST SUITE: `.venv/bin/python -m pytest composer_replication/datagen -q` => 61 passed, 1 skipped (runsc only). The LIVE Docker E2E genuinely RAN (not skipped): test_live_four_inversion_gates_in_hardened_container, test_live_network_is_disabled, test_live_cache_scrub_removes_bytecode all PASSED on a real python:3.11-slim container. The long-blocked D1 substrate E2E (test_docker_substrate_e2e.py) is also GREEN (2/2). Broader regression datagen+safety+serverless => 137 passed, 18 skipped, no failures.",
"Public surface re-exports DockerSandbox and scrub_tree from composer_replication/datagen/__init__.py and __all__; package imports cleanly."
],
"new_backlog_items": []
}