3.79 MB

2 contributors

History: 21 commits

prometheus04

cleanup: remove botched-filename trash from repo root

df5ea74 verified 23 days ago

Name | Size | Last commit | Updated
artifacts | - | Upload artifacts/train.log with huggingface_hub | about 1 month ago
configs | - | GPU-session fixes (RNG cpu, shard filter, cu124, 3090 config) | about 1 month ago
logs | - | Add AdamW vs Muon optimizer ablation results (300M tokens/variant) | 30 days ago
notebooks | - | Matilda-Mini phases 1-5 + runbook | about 1 month ago
results | - | Add AdamW vs Muon optimizer ablation results (300M tokens/variant) | 30 days ago
scripts | - | Upload scripts including export_hf.py and ablate.py fixes | about 1 month ago
src | - | Fix RoPE dtype cast for bfloat16 inference | about 1 month ago
tests | - | second review fixes | about 1 month ago
.gitattributes | 1.52 kB | initial commit | about 1 month ago
.gitignore | 200 Bytes | add ablation harness | about 1 month ago
README.md | 5 kB | Muon optimizer + README | about 1 month ago
config.json | 488 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago
configuration_matilda.py | 703 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago
conftest.py | 92 Bytes | Matilda-Mini phases 1-5 + runbook | about 1 month ago
modeling_matilda.py | 2.09 kB | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago
pytest.ini | 85 Bytes | Matilda-Mini phases 1-5 + runbook | about 1 month ago
requirements.txt | 265 Bytes | second review fixes | about 1 month ago
run.py | 2.95 kB | Matilda-Mini phases 1-5 + runbook | about 1 month ago
tokenizer.json | 3.56 MB | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago
tokenizer_config.json | 315 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago