Docs Refine 2026-06 — Change Summary
Branch: `docs/refine-2026-06` (off `master` HEAD `aae66fa`). Docs-only. MERGED into `main` as of `4e6e82e` (merge commit "Merge docs/refine-2026-06"), after the 3 documented waves (`20e3bd9`, `f00833d`, `e130879`) plus 3 reconciliation commits (`ace6dd4`, `5e64616`, `d7e4b4e`) that retired the now-resolved main-lags-master foot-gun; that makes 6 commits total in range `fb13ea3..4e6e82e`, not the 3 this summary originally listed. (This header was updated 2026-06-09 to reflect the merged reality.)
This engagement refined the documentation corpus to (1) enforce the ground-truth provenance
correction recorded in ADR-014, (2)
archive point-in-time historical artifacts without breaking references, and (3) add a single
honest newcomer overview. No .py, pyproject.toml, or any file under
composer_replication/ examples/ spikes/ tests/ was touched — proven by
git diff --name-only aae66fa..HEAD showing only .md paths (see end of this doc).
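The docs-only check described above can be sketched as a small script. This is a minimal illustration, not the script used in the engagement: the function names `changed_paths` and `docs_only_violations` are hypothetical, and the only things taken from this summary are the `git diff --name-only aae66fa..HEAD` command and the four off-limits trees.

```python
import subprocess
from pathlib import PurePosixPath

# Trees this summary lists as untouched (treated as off-limits even for .md files).
OFF_LIMITS = ("composer_replication", "examples", "spikes", "tests")


def changed_paths(base, head="HEAD"):
    """Return the paths reported by `git diff --name-only base..head`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}..{head}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def docs_only_violations(paths):
    """Return every path that is not a .md file or that sits in an off-limits tree."""
    violations = []
    for raw in paths:
        if not raw.strip():
            continue  # ignore blank lines in the git output
        path = PurePosixPath(raw.strip())
        if path.suffix != ".md" or path.parts[0] in OFF_LIMITS:
            violations.append(str(path))
    return violations
```

Inside a clone, `docs_only_violations(changed_paths("aae66fa"))` returning an empty list would correspond to the invariant stated above.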
Method
Plan → parallel read-only audit (4 agents over the living docs) → apply fixes in the main thread → two independent adversarial review passes (one over the post-wave-1 commit, one over the whole changeset) → iterate to convergence. Both adversaries + a deterministic link/invariant script signed off with zero blockers.
Commits
| SHA | Wave | Theme |
|---|---|---|
| `20e3bd9` | Wave 1 | Correctness: channel-3 provenance, gap honesty, dead links |
| `f00833d` | Wave 2 | Archive point-in-time wave reviews + dated review bundles (move + redirect stub) |
| `e130879` | Wave 3 | Add `docs/OVERVIEW.md`, index ADR-014, fold in adversarial-review corrections |
Files touched — what changed and why
Wave 1 — correctness (commit 20e3bd9)
| File | Change | Fact |
|---|---|---|
| `README.md` | v0.1 roadmap cell reframed: "Full Composer recipe" = channels 1 (Dr.GRPO) + 2 (SDPO); trace-replay-DPO labelled the framework's own addition with an ADR-014 link. | A |
| `docs/HF_REPO_LAYOUT.md` | v0 and v1 trained-variant rows: stop bundling trace-replay-DPO into "Composer recipe"; mark it additive. | A |
| `docs/VISION_VALIDATION.md` | Status banner: stale "210 passing tests" → "115 + 1 skip-marked" pointing to the canonical `V1_V8_COVERAGE.md`; note the PO-objective menu (default Dr.GRPO, ADR-014); keep both honest gaps OPEN (Docker e2e; A1-done / A2–A4-scaffold). | B, D |
| `docs/ALTERED_MINDS_TIE_IN.md` | Phase-3: only A1 has a real Modal runner; A2/A3/A4 scaffold + plan-builder only, blocked on dataset construction; real 8B run additionally user-gated. Added a `strip_thinking=False`-for-SDPO foot-gun note. | D, E |
| `docs/USER_GUIDE.md` | Clone+install block: add `git checkout master` + a branch foot-gun callout (HF `main` lags `master`; else `ImportError` on `make_dr_grpo_config`). | F |
| `BACKLOG.md` | Fixed two dead paths `examples/qwen3_05b_quickstart/` → `examples/qwen_05b_quickstart/`. | dead-link |
| `docs/INTEGRATION_RECIPES.md` | Fixed dead link `ADR-007-distillation-losses.md` → `ADR-007-self-distillation-losses.md`. | dead-link |
| `framework/composer-replication-framework.md` | Fixed 2 root-relative links that 404 from a subdirectory on HF Hub (`docs/…`, `spikes/…` → `../…`). | dead-link |
| `publications/HF_DISCUSSION_POST.md` | Fixed 7 root-relative links (same subdirectory-resolution issue → `../` / same-dir). | dead-link |
Wave 2 — archive historical artifacts (commit f00833d)
Moved (via git mv, history preserved) into archives, with a one-line redirect stub left
at every original path so prose references — including those baked into immutable accepted ADRs
(ADR-007/008/012) and off-limits spike verdicts that cannot be edited — keep resolving:
- → `docs/research/_archive/`: `WAVE_7_10_FINAL_REVIEW.md`, `WAVE_13/14/15_FINAL_REVIEW.md`.
- → `docs/_archive/`: `DEEP_WORK_LOOP_LOG.md`, `WAVE_COMPOSER_DATAGEN_RL_2026-05-29.md`.
- → `docs/_archive/reviews/`: the two dated review bundles (`cross-family-adr-008-009-010-2026-05-29/`, `final-verify-deep-work-2026-05-29/`). All 8 per-model review/verify files moved as renames; a `SYNTHESIS.md` redirect stub remains at each origin (the entry point ADRs cite by directory).
- Added `docs/_archive/README.md` and `docs/research/_archive/README.md` indexing what was archived and why ("point-in-time, superseded by current METHODOLOGY / BACKLOG / V1_V8_COVERAGE / ADRs"), extending the existing `_archive` convention (`WAVE_16_RECON_AUDIT.md` already lived in `docs/research/_archive/`).
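To illustrate the move-plus-redirect-stub pattern, here is a minimal sketch of how a one-line stub could be generated for an archived file. The helper name `redirect_stub` and the stub wording are hypothetical (the actual stub text is not reproduced in this summary), and the origin path used in the example is an assumption based on the archive directory names above.

```python
import posixpath


def redirect_stub(origin, archived):
    """Build one-line stub text to leave at `origin`, linking to `archived`.

    Both arguments are repo-root-relative POSIX paths. The link is written
    relative to the stub's own directory so it resolves from a subdirectory.
    """
    origin_dir = posixpath.dirname(origin) or "."
    rel = posixpath.relpath(archived, start=origin_dir)
    name = posixpath.basename(archived)
    # Hypothetical wording; only the relative link matters for resolution.
    return f"Archived: moved to [{name}]({rel}).\n"


# Assumed origin: the file lived directly under docs/ before the move.
stub = redirect_stub(
    "docs/DEEP_WORK_LOOP_LOG.md",
    "docs/_archive/DEEP_WORK_LOOP_LOG.md",
)
```

Writing the link relative to the stub's directory rather than the repo root avoids the subdirectory-resolution problem that Wave 1 repaired in `framework/` and `publications/`.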
Wave 3 — new overview + ADR index + review fixes (commit e130879)
| File | Change |
|---|---|
| `docs/OVERVIEW.md` | New. 5-minute newcomer tour: what it is, the three channels with honest provenance (1+2 = genuine Composer replication; 3 = framework's additive channel), what's proven (CPU SDPO-fires, A1 8B Modal run, GSM8K GRPO, $0.98/trace, 115 tests), what's gapped (Docker e2e, A2–A4 ladder), day-one foot-guns (main-lag, `strip_thinking`, k1/k3 delta, `compose_loss`-is-harness). Linked from README + both `_archive` READMEs. |
| `README.md` | Added a "🧭 New here? → OVERVIEW.md" pointer + clarified the intro that trace-replay is the framework's own addition (not Cursor's). Added the `master`-branch guard to the Install block (adversary finding). |
| `docs/adrs/README.md` | Added the missing ADR-014 row + a provenance note recording the channel-3 correction. |
| `docs/ALTERED_MINDS_TIE_IN.md`, `docs/VISION_VALIDATION.md` | Adversary corrections: dropped the parent-commit SHA mislabelled as "HEAD"; re-attributed the A2–A4 gap claim to cite ADR-014 only for "the A1 run used `dr_grpo`" (and ADR-013 for its sole user-gated box) instead of ADR-014's acceptance gate, which doesn't contain the dataset-construction detail; fixed one more stale `qwen3_05b_quickstart` path. |
Deliberately left alone — and why
- Accepted ADR bodies (ADR-001…014). Immutable once `accepted` (per the ADR index's own rule). Only the ADR index (`docs/adrs/README.md`) was updated. The provenance correction was propagated into the living docs that ADR-014 supersedes, not by editing older ADRs.
- `research/01..12`, `framework/`, `publications/PAPER_v0.md` deep-dive bodies. Preserved as point-in-time research snapshots (only the 9 dead links in `framework/` + `publications/HF_DISCUSSION_POST.md` were repaired).
- `docs/COMPOSER_RECIPE_MAPPING.md` and `docs/METHODOLOGY.md` were audited and found already correct on channel-3 provenance (they already frame channel 3 as "NOVEL — our addition / not in Composer"), so they were not rewritten.
- `docs/VISION_VALIDATION.md` dated update blocks (e.g. the "77 tests" Wave-12 line). These are explicitly-dated historical self-audit snapshots in the doc's house style; only the current top status banner was refreshed. Rewriting dated snapshots would falsify the audit trail.
- The two `qwen3_7b_*` proposals in VISION_VALIDATION (lines ~80, ~138). These accurately record example dirs that were proposed in the Wave-6 audit but never built under any name; "fixing" them to the 0.5B path would misrepresent history. Only the line describing the packaging deliverable that actually shipped (`qwen_05b_quickstart`) was corrected.
Notes for a maintainer (issues found but NOT fixable under docs-only / scope)
- Off-limits dead link. `examples/gsm8k_grpo_with_sdpo/README.md:66` links to `docs/adrs/ADR-002-channel2-sdpo.md`, which does not exist (ADR-002 is `ADR-002-trace-source.md`; the SDPO design decision is ADR-008). This file is under `examples/` (off-limits to this docs-only engagement), so it was left unchanged. Recommended fix: repoint that link to `docs/adrs/ADR-008-drgrpo-sdpo-live-channel.md`.
- API_REFERENCE freshness gap. `docs/API_REFERENCE.md` documents the `composer_replication` surface but has no section for the trainer config factories: neither `make_dr_grpo_config` (ADR-008) nor the new `make_po_config(objective=…)` / `PO_OBJECTIVES` menu (ADR-014). This is a missing doc, not a wrong one; it was not added here because the public signatures could not be verified against source without reading the `.py` (out of scope), and Invariant 7 forbids fabricating an API surface. Recommended fix: add a config-factory section to API_REFERENCE from the verified `composer_replication/trainer/composer_trainer.py` signatures.
Verification (proof, not claim)
- Docs-only invariant: `git diff --name-only aae66fa..HEAD` → every path ends in `.md` (no `.py`, no `pyproject.toml`, nothing under `composer_replication/ examples/ spikes/ tests/`).
- Link integrity: a scripted scan of every relative `[](…)` link across all in-scope `.md` (root + `docs/` + `framework/` + `publications/`, excluding the 4 off-limits trees) reports zero dead links in the changeset. The one known dead link (`ADR-002-channel2-sdpo.md`) is pre-existing and lives in off-limits `examples/`; see note 1.
- Archive integrity: every archived file has both a redirect stub at its origin and a full-content copy under `_archive/`; all 8 review-dir files preserved (renames, no content loss).
- Two adversarial reviews + a deterministic check all returned zero blockers.
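The link-integrity scan mentioned above could look roughly like the following. This is a sketch under stated assumptions, not the engagement's actual script: the function name `dead_links` and the regular expression are illustrative, every `.md` file outside the four off-limits trees is treated as in scope, and a link counts as dead when its target does not exist relative to the file that contains it.

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target). Reference-style links are not covered.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")
OFF_LIMITS = {"composer_replication", "examples", "spikes", "tests"}


def dead_links(repo_root):
    """Return (file, target) pairs for relative links whose target is missing."""
    root = Path(repo_root)
    dead = []
    for md in sorted(root.rglob("*.md")):
        rel = md.relative_to(root)
        if rel.parts[0] in OFF_LIMITS:
            continue  # skip the trees this engagement could not edit
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # external URL or same-page anchor
            path_part = target.split("#", 1)[0]
            # Resolve against the containing file's directory, which is how
            # the link behaves when viewed from a subdirectory.
            if not (md.parent / path_part).exists():
                dead.append((rel.as_posix(), target))
    return dead
```

Resolving against the containing file's directory is what surfaces root-relative links such as `docs/…` written inside `framework/`, the class of breakage repaired in Wave 1.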