Commit History

feat(trainer): ADR-008 Dr.GRPO config + SDPO strict-alignment guard
bde5c5e

Codeseys committed on

Wave 21c: verify PRIME-RL adapter parity against upstream source (byte-for-byte)
c98928e

Codeseys committed on

Wave 15: 4-angle multi-model self-critique caught 2 math BLOCKERs in primary loss kernels; fixed against upstream byte-for-byte + GSM8K example + ergonomics
e5add15

Codeseys committed on

Wave 14: close every Wave 13 review finding + 4 documentation files; Wave 14b: real PRIME-RL parity + multi-process DiLoCo convergence
d9dd3a5

Codeseys committed on

Wave 13: serverless DiLoCo + replaysim normalization + 3 distillation losses + PRIME-RL + Monarch
b266c31

Codeseys committed on