Baladithya Balamurugan Claude Opus 4.8 (1M context) committed on
Commit · c59d939 · 1 Parent(s): ace4fac
Phase 8: B4-final (last 2 living-doc stale counts) + final verification disposition
- B4-final: USER_GUIDE.md:678 + INTEGRATION_RECIPES.md:926 '115-test suite' → 266/62
(the last current-framed stale counts the independent verifier found outside the
earlier grep scope; dated-historical + _archive mentions correctly left).
- Recorded the Phase-8 disposition: full suite 381 passed/65 skipped/0 failed;
independent verifier confirms B1-E3 resolved; submodule-export design note;
final backlog disposition (zero actionable-on-host items open; remainder is
user-gated GPU/token or tracked pre-existing lint debt).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- docs/BACKLOG_RESOLUTION_2026-06-09.md +16 -0
- docs/INTEGRATION_RECIPES.md +1 -1
- docs/USER_GUIDE.md +1 -1
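The stale-count sweep described in the commit message could be automated along the following lines. This is a minimal sketch, not the verifier used in this commit: the function name `find_stale_counts`, the `115-test` pattern, and the `_archive` skip list are assumptions drawn from the wording of the message, and dated-historical mentions would still need manual review because a regex cannot tell them apart from current-framed ones.

```python
import re
import tempfile
from pathlib import Path

# Assumed pattern: the stale phrase named in the commit message.
STALE = re.compile(r"\b115-test\b")


def find_stale_counts(root, skip_dirs=("_archive",)):
    """Return (relative_path, line_number, line) for each stale mention under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        # Archived copies are intentionally left untouched, so skip them.
        if any(part in skip_dirs for part in rel.parts):
            continue
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if STALE.search(line):
                hits.append((rel.as_posix(), lineno, line.strip()))
    return hits


def demo():
    """Build a throwaway docs tree and scan it (file names are illustrative)."""
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp)
        (docs / "_archive").mkdir()
        (docs / "USER_GUIDE.md").write_text(
            "intro\nThe framework's 115-test suite is structured\n",
            encoding="utf-8",
        )
        (docs / "_archive" / "OLD.md").write_text(
            "the 115-test suite\n", encoding="utf-8"
        )
        return find_stale_counts(docs)
```

Calling `demo()` reports only the mention outside `_archive`, which mirrors the "current-framed only" scope the message describes.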
docs/BACKLOG_RESOLUTION_2026-06-09.md
CHANGED
@@ -87,3 +87,19 @@ Sandbox refactor verdict: **clean** (no regression to LocalSubprocessSandbox/Fea
 - **Concurrent review team:** audits each wave's diff, feeds findings back.
 - **Wave 3+:** reconcile review findings, fix, repeat until zero open + tests green.
 - **Final:** full suite green, docs reconciled, everything committed.
+
+
+## Phase 8 — Final verification (2026-06-09)
+
+**Authoritative full suite (isolated): 381 passed / 65 skipped / 0 failed** (446 collected; skips = optional-dep/host gates: torchft Linux-only, prime-rl, data-juicer, monarch, /tmp upstream-parity clones, real-Claude-session). The R11 flaky test now passes deterministically.
+
+**Independent verifier (research/verify-bugs.json): all B1-B8, C1-C3, D1, E3 RESOLVED.** Residual nits closed post-verify: B4-final (USER_GUIDE:678 + INTEGRATION_RECIPES:926 stale "115-test" → 266/62).
+
+**Design note (R7-area):** EKSExecutor/SageMakerExecutor/DockerSandbox/HeldOutGuard are exported from their SUBMODULE paths (`composer_replication.diloco.serverless`, `.datagen`, `.safety`) — matching the existing convention (Modal/HFJobs executors are likewise not at package root) and keeping `import composer_replication` from force-loading every cloud-executor module. They are documented in API_REFERENCE §15-17.
+
+### Final disposition
+- **CLOSED (done + tested):** B1-B8, C1, C2, C3, D1, E3, R1-R11, R12.
+- **GATED-AS-DESIGNED (user-only, cannot execute here):** F1 (HF token rotation — audited clean, user rotates), F2/E1/E2 real 8B GPU runs (harness paths buildable; the spend is the user's go/no-go).
+- **TRACKED tech-debt (out of scope, filed):** R13 (pre-existing serverless ruff B904 debt — do not reformat unauthored code in this effort).
+
+**Backlog of actionable items on this host: ZERO open.** Everything executable here is done, tested, lint-clean (my files), and committed. The only remaining items are externally-gated (GPU budget / HF account) and explicitly the user's call.
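The design note in the hunk above relies on a general property of Python packages: importing a package root does not load its submodules unless the root `__init__.py` imports them. The sketch below illustrates that property with a throwaway package. The names `demo_pkg`, `demo_pkg.serverless`, and `DemoExecutor` are hypothetical stand-ins and say nothing about how `composer_replication` is actually laid out.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create a throwaway package whose root __init__ imports no submodules."""
    pkg = Path(root) / "demo_pkg"
    (pkg / "serverless").mkdir(parents=True)
    (pkg / "__init__.py").write_text("__version__ = '0.0'\n", encoding="utf-8")
    (pkg / "serverless" / "__init__.py").write_text(
        "class DemoExecutor:\n    pass\n", encoding="utf-8"
    )


def loaded_after_imports():
    """Return whether the submodule is loaded (after root import, after explicit import)."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(tmp)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_pkg")
            before = "demo_pkg.serverless" in sys.modules
            importlib.import_module("demo_pkg.serverless")
            after = "demo_pkg.serverless" in sys.modules
            return before, after
        finally:
            # Undo the path and module-cache changes made for the demo.
            sys.path.remove(tmp)
            stale = [n for n in sys.modules
                     if n == "demo_pkg" or n.startswith("demo_pkg.")]
            for name in stale:
                del sys.modules[name]
```

Under this layout the heavy submodule is only paid for by callers who import it by its full path, which is the trade-off the note describes: a longer import line in exchange for a cheap package-root import.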
docs/INTEGRATION_RECIPES.md
CHANGED
@@ -923,7 +923,7 @@ In Wave 14: $0 (skeleton fails fast; no compute used). Projected for v0.2+:
 ## Cross-recipe checklist
 
 Regardless of which recipe you pick, these invariants are tested across
-the
+the test suite (266 passing / 62 skipped; canonical count in docs/V1_V8_COVERAGE.md) and should be true of your wired-up system:
 
 - **`alpha_sdpo=0`** must reproduce the channel-1-only baseline
 bit-exact (`test_compose_loss_integration.py`).
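The `alpha_sdpo=0` invariant in the checklist above can be illustrated with a toy scalar loss. This is a sketch under stated assumptions, not the framework's loss or the contents of `test_compose_loss_integration.py`: it assumes the composed loss has the form channel-1 loss plus `alpha_sdpo` times a second term, and the names `compose_loss` and `same_bits` are hypothetical.

```python
import struct


def compose_loss(channel1_loss, sdpo_loss, alpha_sdpo):
    """Hypothetical two-channel loss: channel 1 plus a weighted second term."""
    if alpha_sdpo == 0:
        # Short-circuit so a disabled channel cannot perturb the baseline,
        # even when sdpo_loss is inf or nan (0 * inf is nan in IEEE 754).
        return channel1_loss
    return channel1_loss + alpha_sdpo * sdpo_loss


def same_bits(a, b):
    """Compare two floats by their raw 64-bit patterns rather than by ==."""
    return struct.pack("<d", a) == struct.pack("<d", b)
```

Comparing bit patterns is stricter than `==`: it distinguishes `0.0` from `-0.0`, which is why the sketch returns the baseline untouched instead of adding a zero-weighted term to it.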
docs/USER_GUIDE.md
CHANGED
@@ -675,7 +675,7 @@ and `docs/adrs/ADR-006-rl-frameworks.md`.
 
 ## Common pitfalls + what tests catch them
 
-The framework's
+The framework's test suite (266 passing / 62 skipped, canonical count in docs/V1_V8_COVERAGE.md) is structured so each pitfall has a
 specific test-file home. If you hit one of these in production, the
 corresponding test is your fastest reproducer.
 