philipp-zettl/chessPT

philipp-zettl committed on Oct 1, 2024

Commit ed08b45 · verified · 1 Parent(s): 094f3cb

Update README.md

Files changed (1)

README.md +32 -3

@@ -1,3 +1,32 @@
----
-license: cc0-1.0
----

+---
+license: cc0-1.0
+datasets:
+- Lichess/standard-chess-games
+pipeline_tag: text2text-generation
+tags:
+- chess
+---
+# Model card for chessPT
+A pretrained decoder-only transformer model for chess move prediction.
+## Intended use
+Predict new moves in a chess game based on PGN tokens.
+## Implementation
+The model implementation is based on Andrej Karpathy's [nanoGPT](https://github.com/karpathy/nanoGPT), following the "Zero to Hero" video series on [YouTube](https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ).
+## Training
+You can find the training script in the repository's files under `train.py`.
+It also contains the hyperparameters used:
+```python
+context_size = 256
+batch_size = 128
+max_iters = 30_000
+learning_rate = 3e-5
+eval_interval = 100
+eval_iters = 20
+n_embed = 384
+n_layer = 6
+n_head = 6
+dropout = 0.2
+```
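
Note on the hyperparameters above: a minimal sketch of what they imply, for readers who want a feel for the model and training scale. It assumes a standard GPT-style block (four `n_embed x n_embed` attention projections plus an MLP with 4x expansion) and counts weight matrices only. Biases, layer norms, and the token and position embeddings are left out because the vocabulary size is not stated in the README. The derived names (`head_size`, `params_per_block`, and so on) are illustrative and are not taken from `train.py`.

```python
# Hyperparameters copied from the README diff above.
context_size = 256
batch_size = 128
max_iters = 30_000
n_embed = 384
n_layer = 6
n_head = 6

# Each attention head operates on an equal slice of the embedding.
assert n_embed % n_head == 0, "n_embed must be divisible by n_head"
head_size = n_embed // n_head

# Assumption: standard GPT block, weight matrices only.
#   attention: query, key, value, output projection -> 4 * n_embed^2
#   MLP with 4x expansion: up and down projection   -> 8 * n_embed^2
params_per_block = 12 * n_embed**2
block_params = n_layer * params_per_block

# Tokens seen per optimizer step and over the whole run
# (an upper bound that ignores any repetition in the data).
tokens_per_iter = batch_size * context_size
total_tokens = max_iters * tokens_per_iter

print(f"head size:              {head_size}")
print(f"params in all blocks:   {block_params:,}")
print(f"tokens per iteration:   {tokens_per_iter:,}")
print(f"tokens over full run:   {total_tokens:,}")
```

Under these assumptions the transformer blocks hold roughly 10.6 million weights, and the run processes at most about 983 million tokens. The real parameter count will be somewhat higher once embeddings, biases, and layer norms are included.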