Baladithya Balamurugan
Wave 2: 4 new modules (kill-switch, EKS/SageMaker executors, DockerSandbox) + B4/B7 completion
7a55e1e
{
  "area": "composer_replication/datagen/docker_sandbox.py + sandbox.py scrub_tree refactor",
  "verdict": "clean",
  "findings": [
    {
      "severity": "low",
      "what": "run_tests pass/fail parse carries the order-dependent fallback clause `if f\"{t} PASSED\" in out or (returncode == 0 and not failed)` verbatim from LocalSubprocessSandbox. If a runner exits 0 but does not print '<nodeid> PASSED' for every node id, the first un-printed node is marked passed solely on the exit code (and `not failed` is true only until the first failure is recorded). This is a pre-existing pattern (identical on main's LocalSubprocessSandbox at sandbox.py:214) faithfully mirrored into DockerSandbox, NOT a new regression; flagged only for completeness.",
      "where": "composer_replication/datagen/docker_sandbox.py:272-276 (and the source LocalSubprocessSandbox at sandbox.py:212-217)",
      "recommendation": "No action required for this review. If ever hardened, require an explicit PASSED token per node id and stop trusting the bare exit code; do it in both sandboxes together so they stay in lock-step."
    }
  ],
  "confirmed_good": [
    "REFACTOR DID NOT BREAK LocalSubprocessSandbox: boot() still scrubs: boot() (sandbox.py:169-172) calls self._scrub_tree() which delegates to the shared module-level scrub_tree() free function (sandbox.py:174-177). Smoke test confirmed __pycache__, .git, and *.pyc are removed on boot while real source (keep.py) survives.",
    "No broken/dangling references to the old per-class _scrub_tree: the only remaining _scrub_tree occurrences are (a) the intentional back-compat delegating method + its self-call in boot, and (b) one descriptive comment in test_docker_substrate_e2e.py:161. grep for external callers of .SCRUB_NAMES/._SCRUB_NAMES/.SCRUB_SUFFIXES returned EMPTY.",
    "Back-compat preserved: LocalSubprocessSandbox._SCRUB_NAMES / ._SCRUB_SUFFIXES class aliases still point at the module-level SCRUB_NAMES/SCRUB_SUFFIXES; the _scrub_tree() method is retained.",
    "FeatureDeletionEnv unaffected: env.py uses the Sandbox Protocol generically (boot/exec/run_tests/trajectory at env.py:59,69,86,89); it is agnostic to the scrub refactor.",
    "SCRUB-BEFORE-MOUNT ORDERING IS CORRECT (no security bug): DockerSandbox.boot() runs scrub_tree(self.workdir) at line 190 BEFORE self._client.containers.run(**kwargs) at line 198. The container (and thus the RW bind mount) does not exist when the host-side scrub runs, so the scrub is provably pre-mount. The scrub-AFTER-mount security bug the audit asked to rule out is NOT present.",
    "--network none: both network_disabled=True AND network_mode='none' set (docker_sandbox.py:154-155); live test_live_network_is_disabled actually ran on a real container and asserted egress BLOCKED / not CONNECTED.",
    "Resource limits: mem_limit == memswap_limit (forbids swap), pids_limit (fork-bomb guard), nano_cpus (CPU quota); all present, configurable, and unit-asserted.",
    "Ephemeral teardown: close() force-removes (idempotent, swallows errors), reap_leaked() sweeps label-filtered orphan containers at boot and shutdown, __enter__/__exit__/__del__ wired. Verified by test_close_removes_container_force, test_context_manager_closes, test_reap_leaked_sweeps_labelled_containers.",
    "gVisor runtime option: runtime defaults to None (=> 'runtime' kwarg omitted, daemon-default runc); 'runsc' is only passed through when explicitly set (docker_sandbox.py:178-179) and gated by runsc_available(). test_live_runsc_runtime correctly SKIPPED (gVisor not installed on host).",
    "Lazy docker import: _require_docker() imports `docker` inside the function with a clear RuntimeError on ImportError; docker SDK is never required by the FakeSandbox/pure-core path. Verified by test_require_docker_missing_sdk_raises.",
    "Privilege lockdown: cap_drop=['ALL'], security_opt=['no-new-privileges:true'], user='1000:1000' (non-root), read_only root fs with tmpfs /tmp (noexec,nosuid), keep_root_writable escape hatch.",
    "shlex.quote applied to every test node id in run_tests (shell-injection guard, matches LocalSubprocessSandbox); non-UTF-8 output decoded with errors='replace' (test_exec_decodes_non_utf8_bytes); exec wraps commands in coreutils `timeout`.",
    "TEST SUITE: `.venv/bin/python -m pytest composer_replication/datagen -q` => 61 passed, 1 skipped (runsc only). The LIVE Docker E2E genuinely RAN (not skipped): test_live_four_inversion_gates_in_hardened_container, test_live_network_is_disabled, test_live_cache_scrub_removes_bytecode all PASSED on a real python:3.11-slim container. The long-blocked D1 substrate E2E (test_docker_substrate_e2e.py) is also GREEN (2/2). Broader regression datagen+safety+serverless => 137 passed, 18 skipped, no failures.",
    "Public surface re-exports DockerSandbox and scrub_tree from composer_replication/datagen/__init__.py and __all__; package imports cleanly."
  ],
  "new_backlog_items": []
}
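The low-severity finding may be easier to follow with a small reproduction. The sketch below is not the code from docker_sandbox.py or sandbox.py, which this review does not show; only the quoted `if` clause comes from the finding. The surrounding loop, the function names `parse_legacy` and `parse_strict`, and the sample node ids and runner output are hypothetical, chosen to contrast the fallback clause with the stricter rule the recommendation describes (an explicit PASSED token per node id, no reliance on the exit code).

```python
from typing import List, Tuple


def parse_legacy(node_ids: List[str], out: str, returncode: int) -> Tuple[List[str], List[str]]:
    """Assumed loop around the clause quoted in the finding."""
    passed: List[str] = []
    failed: List[str] = []
    for t in node_ids:
        # Quoted clause: an explicit PASSED token, OR a zero exit code
        # while no failure has been recorded yet.
        if f"{t} PASSED" in out or (returncode == 0 and not failed):
            passed.append(t)
        else:
            failed.append(t)
    return passed, failed


def parse_strict(node_ids: List[str], out: str) -> Tuple[List[str], List[str]]:
    """Hardened variant: only an explicit PASSED token counts."""
    passed: List[str] = []
    failed: List[str] = []
    for t in node_ids:
        (passed if f"{t} PASSED" in out else failed).append(t)
    return passed, failed


# Hypothetical runner output: exit code 0, but only one node id is reported.
ids = ["t.py::test_a", "t.py::test_b", "t.py::test_c"]
output = "t.py::test_a PASSED\n"

legacy_passed, legacy_failed = parse_legacy(ids, output, returncode=0)
strict_passed, strict_failed = parse_strict(ids, output)
```

In this simplified loop the legacy parse reports unprinted node ids as passed on the strength of the exit code alone, while the strict parse reports them as failed. If the stricter rule is ever adopted, the finding's advice to change both sandboxes together still applies.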