v1.5b: MIXES preset-aware (350m EN47/TR30 tc100b/code13/math10) + ShardStream mix param de5424b verified kdirgul committed about 7 hours ago
faz3_train: fix print line args.n_layers -> cfg[n_layers] (was None when a preset was used) 33ad0f9 verified kdirgul committed 1 day ago
faz3_train: --preset 350m (cfg parameterized; 177m=177.1M identical to v1, 350m=348.6M verified); --attn_every/--d_model etc. overrides 699d6f3 verified kdirgul committed 1 day ago
faz3_train: Muon optimizer integration (--muon: Muon for 2D Linear weights + AdamW for embed/norm; cklu-opt state + WSD multiplier; backward compatible) 563bb36 verified kdirgul committed 3 days ago
Phase 3 trainer (fork hybrid + WSD + resumable ckpt/HF push); arrow-col + resume fix 75beef3 verified kdirgul committed 23 days ago