prometheus04/matilda-mini
Tags: matilda, custom_code
main / matilda-mini (3.79 MB)
2 contributors, History: 21 commits
Latest commit: prometheus04, "cleanup: remove botched-filename trash from repo root" (df5ea74, verified), 23 days ago
| Name | Scan | Size | Last commit | Updated |
|---|---|---|---|---|
| artifacts | - | - | Upload artifacts/train.log with huggingface_hub | about 1 month ago |
| configs | - | - | GPU-session fixes (RNG cpu, shard filter, cu124, 3090 config) | about 1 month ago |
| logs | - | - | Add AdamW vs Muon optimizer ablation results (300M tokens/variant) | 30 days ago |
| notebooks | - | - | Matilda-Mini phases 1-5 + runbook | about 1 month ago |
| results | - | - | Add AdamW vs Muon optimizer ablation results (300M tokens/variant) | 30 days ago |
| scripts | - | - | Upload scripts including export_hf.py and ablate.py fixes | about 1 month ago |
| src | - | - | Fix RoPE dtype cast for bfloat16 inference | about 1 month ago |
| tests | - | - | second review fixes | about 1 month ago |
| .gitattributes | Safe | 1.52 kB | initial commit | about 1 month ago |
| .gitignore | Safe | 200 Bytes | add ablation harness | about 1 month ago |
| README.md | Safe | 5 kB | Muon optimizer + README | about 1 month ago |
| config.json | Safe | 488 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago |
| configuration_matilda.py | Safe | 703 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago |
| conftest.py | Safe | 92 Bytes | Matilda-Mini phases 1-5 + runbook | about 1 month ago |
| modeling_matilda.py | Safe | 2.09 kB | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago |
| pytest.ini | Safe | 85 Bytes | Matilda-Mini phases 1-5 + runbook | about 1 month ago |
| requirements.txt | Safe | 265 Bytes | second review fixes | about 1 month ago |
| run.py | Safe | 2.95 kB | Matilda-Mini phases 1-5 + runbook | about 1 month ago |
| tokenizer.json | Safe | 3.56 MB | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago |
| tokenizer_config.json | Safe | 315 Bytes | Add trained checkpoint: 3B tokens, loss=3.16, MFU=31.5% | about 1 month ago |