thomas-schweich/pawn-small

thomas-schweich committed 27 days ago

Commit ac274df · verified · 1 Parent(s): 1da8797

Upload folder using huggingface_hub

Files changed (2)

README.md +66 -0
model.pt +3 -0

README.md ADDED

	@@ -0,0 +1,66 @@

+---
+license: apache-2.0
+library_name: pytorch
+tags:
+  - chess
+  - transformer
+  - causal-lm
+  - world-model
+datasets:
+  - random-self-play
+model-index:
+  - name: pawn-small
+    results:
+      - task:
+          type: next-move-prediction
+        metrics:
+          - name: Val Loss
+            type: loss
+            value: 3.15
+          - name: Val Accuracy
+            type: accuracy
+            value: 6.7
+---
+# PAWN-SMALL
+A causal transformer trained on random chess games, designed as a testbed for finetuning and augmentation methods at small scales.
+## Model Details
+| | |
+|---|---|
+| **Parameters** | 9.5M |
+| **Architecture** | Decoder-only transformer (RMSNorm, SwiGLU, RoPE) |
+| **d_model** | 256 |
+| **Layers** | 8 |
+| **Heads** | 4 |
+| **Vocabulary** | 4,278 tokens (4,096 grid + 176 promotions + 5 outcomes + 1 PAD) |
+| **Sequence length** | 256 |
+| **Training steps** | 80K/100K |
+| **Best val loss** | 3.150 (step 80,000) |
+| **Best val accuracy** | 6.7% |
+## Usage
+```python
+import torch
+from pawn.config import CLMConfig
+from pawn.model import PAWNCLM
+cfg = CLMConfig.small()
+model = PAWNCLM(cfg)
+ckpt = torch.load("model.pt", map_location="cpu", weights_only=False)
+model.load_state_dict(ckpt["model_state_dict"])
+model.eval()
+```
+## Training
+Trained from scratch on random self-play games generated by a Rust chess engine (shakmaty).
+See the [PAWN repository](https://github.com/thomas-schweich/PAWN) for training code, data pipeline, and evaluation suite.
+## License
+Apache 2.0
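
The vocabulary row and the validation loss in the model card above can be sanity-checked with a little arithmetic. The sketch below is not taken from the PAWN repository. It assumes that the 4,096 "grid" tokens are the 64 × 64 (from-square, to-square) pairs and that the 176 promotion tokens are every pawn promotion move times four promotion pieces. Both readings are guesses that happen to match the stated counts, and `promotion_move_count` is a hypothetical helper name.

```python
import math

# Vocabulary breakdown exactly as stated in the model card.
GRID, PROMOTIONS, OUTCOMES, PAD = 4096, 176, 5, 1
VOCAB_SIZE = GRID + PROMOTIONS + OUTCOMES + PAD


def promotion_move_count() -> int:
    """Count promotion tokens under the ASSUMED (from, to, piece) encoding."""
    moves_per_color = 0
    for file in range(8):
        # A pawn about to promote can push straight ahead or capture
        # diagonally, as long as the target file stays on the board.
        targets = [f for f in (file - 1, file, file + 1) if 0 <= f < 8]
        moves_per_color += len(targets)
    # Two colors, four promotion pieces (queen, rook, bishop, knight).
    return moves_per_color * 2 * 4


# Cross-entropy of a uniform guess over the whole vocabulary, in nats.
uniform_loss = math.log(VOCAB_SIZE)

# Perplexity implied by the reported validation loss of 3.15 nats.
val_perplexity = math.exp(3.15)

if __name__ == "__main__":
    print(f"vocabulary size: {VOCAB_SIZE}")
    print(f"promotion tokens under the assumed encoding: {promotion_move_count()}")
    print(f"uniform-guess loss: {uniform_loss:.2f} nats")
    print(f"perplexity at val loss 3.15: {val_perplexity:.1f}")
```

Under these assumptions the parts sum to the stated 4,278 tokens, and a validation loss of 3.15 corresponds to a perplexity of roughly 23, against a uniform-guess loss of roughly 8.4 nats. Because the training games are random, a low top-1 accuracy such as 6.7% does not by itself indicate a weak model: there may be many equally likely legal moves in each position.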

model.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b6bc05855065923f2f8406834b6ad23c118fa63fb892e883b7641029761ac278
+size 114390171
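
The three lines above are a Git LFS pointer, not the checkpoint itself: they record the SHA-256 digest and byte size of the real `model.pt`. A minimal sketch of checking a downloaded file against that pointer follows. It uses only the Python standard library, and `parse_pointer` and `matches_pointer` are hypothetical helper names, not part of any Hugging Face or Git LFS tooling.

```python
import hashlib
import os

# Pointer text copied from the model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b6bc05855065923f2f8406834b6ad23c118fa63fb892e883b7641029761ac278
size 114390171
"""


def parse_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the pointer's byte size and SHA-256 digest."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    # Cheap check first: a size mismatch means the file cannot match.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so a large checkpoint is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_pointer(POINTER)
    if os.path.exists("model.pt"):
        print("model.pt matches pointer:", matches_pointer("model.pt", pointer))
```

Running a check like this before loading is worthwhile here because the README's usage snippet calls `torch.load` with `weights_only=False`, which unpickles arbitrary Python objects and should only be used on files from a trusted source.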