| # Dingo d20 (All Intermediate Checkpoints) | |
| Repo: jayasuryajsk/Dingo | |
| Architecture: 20 layers, 10 heads, 10 KV heads, d_model=1280, seq_len=2048, vocab=65536 | |
| Each checkpoint is stored in: | |
| checkpoints/<step>/{model_<step>.pt, meta_<step>.json} | |
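The layout above can be walked to discover which steps are available locally. This is a minimal sketch, not part of the Dingo repo: `list_checkpoint_steps` is a hypothetical helper, and it assumes step folder names are zero-padded (as in `000650`) so that a plain string sort matches training order.

```python
from pathlib import Path


def list_checkpoint_steps(base="checkpoints"):
    """Return step names whose folder holds both expected files.

    Hypothetical helper: assumes the checkpoints/<step>/ layout described
    above, with model_<step>.pt and meta_<step>.json in each folder.
    """
    steps = []
    for step_dir in Path(base).iterdir():
        if not step_dir.is_dir():
            continue
        step = step_dir.name
        has_model = (step_dir / f"model_{step}.pt").is_file()
        has_meta = (step_dir / f"meta_{step}.json").is_file()
        if has_model and has_meta:
            steps.append(step)
    # Lexicographic sort; equals numeric order only if names are zero-padded.
    return sorted(steps)
```

Folders missing either file are skipped, so a partially downloaded checkpoint does not show up in the list.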
| Example eval (step 000650): | |
| - MMLU: 32.62 % | |
| - ARC-Easy: 44.82 % | |
| - ARC-Challenge: 31.14 % | |
| - GSM8K: 5.08 % | |
| - HumanEval: 6.71 % | |
| Load example (custom Dingo): | |
| import json | |
| import torch | |
| step = "000650" | |
| base = "checkpoints" | |
| # Load the checkpoint onto the CPU | |
| ckpt = torch.load(f"{base}/{step}/model_{step}.pt", map_location="cpu") | |
| # Read the metadata stored next to it | |
| with open(f"{base}/{step}/meta_{step}.json") as f: | |
|     meta = json.load(f) | |
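If several steps need to be loaded, the snippet above can be wrapped in a small helper. This is a sketch under stated assumptions, not an API of the Dingo repo: `checkpoint_paths` and `load_checkpoint` are hypothetical names, and the six-digit zero padding is inferred from the single example step `000650`.

```python
import json
from pathlib import Path


def checkpoint_paths(step, base="checkpoints"):
    """Build the model and meta paths for a step.

    Assumption: step folders are zero-padded to six digits, as in "000650".
    """
    step = f"{int(step):06d}"
    step_dir = Path(base) / step
    return step_dir / f"model_{step}.pt", step_dir / f"meta_{step}.json"


def load_checkpoint(step, base="checkpoints"):
    """Load one checkpoint and its metadata onto the CPU."""
    import torch  # imported here so the path helper works without PyTorch

    model_path, meta_path = checkpoint_paths(step, base)
    if not model_path.is_file() or not meta_path.is_file():
        raise FileNotFoundError(f"Incomplete checkpoint for step {step} in {base}")
    ckpt = torch.load(model_path, map_location="cpu")
    with open(meta_path) as f:
        meta = json.load(f)
    return ckpt, meta
```

For example, `ckpt, meta = load_checkpoint(650)` would read `checkpoints/000650/model_000650.pt` and `checkpoints/000650/meta_000650.json`. What `ckpt` contains (a bare state dict or a wrapper dict) is not stated here, so inspect it before passing it to a model.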