# Dingo d20 (All Intermediate Checkpoints)
Repo: jayasuryajsk/Dingo
Architecture: 20 layers, 10 heads, 10 KV heads, d_model=1280, seq_len=2048, vocab=65536
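As a rough sanity check on the architecture line above, the parameter count can be estimated from the listed dimensions. This is only a sketch: the 4x MLP expansion, the absence of biases and norm weights, and the untied input/output embeddings are assumptions, not details stated in this README, and `estimate_params` is a hypothetical helper.

```python
def estimate_params(n_layer=20, d_model=1280, vocab=65536,
                    mlp_ratio=4, tied_embeddings=False):
    """Rough transformer parameter count from the dimensions listed above.

    Assumptions (not stated in the README): no biases, no norm weights,
    an MLP with `mlp_ratio` x expansion, and separate input and output
    embedding matrices unless `tied_embeddings` is True.
    """
    # Q, K, V and output projections, each d_model x d_model
    # (10 heads and 10 KV heads, so K and V are full width).
    attn = 4 * d_model * d_model
    # Up and down projections of the MLP.
    mlp = 2 * mlp_ratio * d_model * d_model
    # Token embedding, plus an output head when the two are not tied.
    embed = vocab * d_model * (1 if tied_embeddings else 2)
    return n_layer * (attn + mlp) + embed


print(f"{estimate_params():,}")  # 560,988,160
```

Under these assumptions the estimate is about 561M parameters, with a head dimension of 1280 / 10 = 128.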
Each checkpoint is stored in:
`checkpoints/<step>/{model_<step>.pt, meta_<step>.json}`
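Given the layout above, the available steps can be discovered by scanning the directory. The following is a minimal sketch that assumes the repository has already been downloaded locally and that step directories are zero-padded as in `000650`; `list_checkpoint_steps` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path


def list_checkpoint_steps(base="checkpoints"):
    """Return the step names under `base` that contain both expected files."""
    root = Path(base)
    if not root.is_dir():
        return []
    steps = []
    # Zero-padded step names sort correctly as plain strings.
    for step_dir in sorted(root.iterdir()):
        if not step_dir.is_dir():
            continue
        step = step_dir.name
        has_model = (step_dir / f"model_{step}.pt").is_file()
        has_meta = (step_dir / f"meta_{step}.json").is_file()
        if has_model and has_meta:
            steps.append(step)
    return steps
```

Calling `list_checkpoint_steps("checkpoints")` returns a sorted list of step strings, and incomplete step directories are skipped.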
Example eval (step 000650):
- MMLU: 32.62 %
- ARC-Easy: 44.82 %
- ARC-Challenge: 31.14 %
- GSM8K: 5.08 %
- HumanEval: 6.71 %
Load example (custom Dingo):
```python
import json
import torch

step = "000650"
base = "checkpoints"

# Load the weights onto the CPU and read the matching metadata file.
ckpt = torch.load(f"{base}/{step}/model_{step}.pt", map_location="cpu")
with open(f"{base}/{step}/meta_{step}.json") as f:
    meta = json.load(f)
```
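Once a checkpoint is loaded, a quick way to inspect it is to count its parameters. The sketch below assumes `ckpt` is a flat mapping from parameter names to tensors, which this README does not confirm; if the weights are nested under another key, pass that inner mapping instead. `summarize_state_dict` is a hypothetical helper.

```python
def summarize_state_dict(state, top=5):
    """Return (total element count, the `top` largest entries by size).

    Assumes `state` maps names to tensor-like objects exposing `.numel()`;
    entries without that method (for example integers) are ignored.
    """
    sizes = {name: t.numel() for name, t in state.items() if hasattr(t, "numel")}
    total = sum(sizes.values())
    largest = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top]
    return total, largest
```

With the variables from the load example, `total, largest = summarize_state_dict(ckpt)` gives the overall parameter count and the names of the biggest tensors, and `meta.keys()` shows which metadata fields were saved.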