---
library_name: chess-autocomplete
tags:
- chess
- pytorch
- safetensors
- onnx
license: apache-2.0
---
# Alfredvc/chess-autocomplete-v1
This repository contains one chess-autocomplete model variant staged for inference.
## Variant
- Repository: `Alfredvc/chess-autocomplete-v1`
- Architecture: `ChessTransformer`
- Dimensions: `768` hidden, `12` heads, `12` blocks
- Maximum half moves: `600`
- Input representation: `Discrete`
- Norm / MLP: `layernorm` / `swiglu`
- Native input tokenizer: `RealizableMoveTokenizer` with `4169` ids
- Native output tokenizer: `RealizableMoveTokenizer` with `4135` ids
- Metadata: Metadata tokens are part of the input token stream.
## Interface
This is a metadata-token model. Inputs must begin with the metadata prefix:
```text
[time_control_token, white_elo_token, black_elo_token, GAME_START, ...moves]
```
Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not
available.
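As a minimal sketch of the prefix layout above, the helper below assembles the raw input sequence and falls back to the missing-value words when metadata is absent. The function name `build_input_words` and the integer constants are hypothetical placeholders; real code should take the constants from `chess_autocomplete.protocol`. How a concrete time control or rating is turned into a token is not described here, so the helper only accepts values that are already encoded.

```python
from typing import List, Optional, Sequence

# Placeholder values for illustration only (assumption). In real use, import
# TIME_CONTROL_MISSING_WORD, RATING_MISSING_WORD and GAME_START from
# chess_autocomplete.protocol instead of hard-coding them.
TIME_CONTROL_MISSING_WORD = 1
RATING_MISSING_WORD = 2
GAME_START = 3


def build_input_words(
    move_words: Sequence[int],
    time_control: Optional[int] = None,
    white_elo: Optional[int] = None,
    black_elo: Optional[int] = None,
) -> List[int]:
    """Return [time_control, white_elo, black_elo, GAME_START, *moves].

    Metadata arguments are already-encoded token words; None means the
    value is unknown and the matching *_MISSING_WORD is used instead.
    """
    prefix = [
        TIME_CONTROL_MISSING_WORD if time_control is None else time_control,
        RATING_MISSING_WORD if white_elo is None else white_elo,
        RATING_MISSING_WORD if black_elo is None else black_elo,
        GAME_START,
    ]
    return prefix + list(move_words)
```

Calling `build_input_words([])` gives the four-word prefix used in the PyTorch and ONNX examples in this card.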
The native PyTorch model returns logits over the output tokenizer vocabulary
(`4135` ids). The ONNX artifacts wrap that model and return
`bin_logits` over raw 16-bit move words (`65536` ids). The two output
interfaces are not interchangeable: index PyTorch logits by output-tokenizer id
and ONNX logits by raw move word.
## PyTorch
```python
import torch

from chess_autocomplete import protocol
from chess_autocomplete.huggingface import load_model_repo

# Load the model and its tokenizers from the current directory.
loaded = load_model_repo(".")

# Metadata prefix with unknown time control and ratings, then GAME_START.
raw_input = torch.tensor(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=torch.long,
)

# Map raw words to native input-token ids, then run the model.
input_ids = loaded.input_tokenizer.batch_encode(raw_input)
logits, _ = loaded.model(input_ids)
```
The PyTorch weights are stored in `model.safetensors` and loaded strictly into
`chess_autocomplete.models.ChessTransformer`.
## ONNX Runtime
```python
import numpy as np
import onnxruntime as ort

from chess_autocomplete import protocol

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Metadata prefix with unknown time control and ratings, then GAME_START.
bin_moves = np.asarray(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=np.int32,
)

# bin_logits has shape [batch, 65536]: one logit per raw 16-bit move word.
bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
```
Two ONNX files are published:
- `model.onnx`: FP32 compatibility artifact.
- `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16
operator support.
Both ONNX artifacts use the `bin_logits_v1` interface: `bin_moves` input with
shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.
## Converting Logits To Moves
The model predicts move tokens, not SAN strings. Do not take an unconstrained
argmax over the full vocabulary. Score the legal moves in the current board
position and choose from that legal set.
For PyTorch, logits are over the native output tokenizer vocabulary:
```python
from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played (requires `import chess`):
# board.push(chess.Move.from_uci("e2e4"))

# Logits for the next move: last time step of the first batch item.
next_logits = logits[0, -1]

# Score only the legal moves, mapping each raw move word to its output token id.
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)
    token_id = loaded.output_tokenizer.encode(raw_bin_word)
    legal.append((float(next_logits[token_id]), move))

score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())
```
For ONNX `bin_logits_v1`, logits are already indexed by raw 16-bit move word:
```python
from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played (requires `import chess`):
# board.push(chess.Move.from_uci("e2e4"))

# Logits for the first batch item, indexed directly by raw move word.
next_logits = bin_logits[0]

# Score only the legal moves; no tokenizer lookup is needed here.
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)
    legal.append((float(next_logits[raw_bin_word]), move))

score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())
```
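Both loops above produce a list of `(score, move)` pairs. If a probability over the legal set is more useful than a greedy pick, the scores can be renormalised with a softmax restricted to those pairs. The sketch below is one possible way to do that; the helper names `legal_move_probabilities` and `sample_move` and the `temperature` parameter are illustrative assumptions, not part of `chess_autocomplete`.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

Move = TypeVar("Move")


def legal_move_probabilities(
    scored: Sequence[Tuple[float, Move]],
    temperature: float = 1.0,
) -> List[Tuple[float, Move]]:
    """Softmax over the legal-move logits only; returns (probability, move)."""
    if not scored:
        raise ValueError("no legal moves to score")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score, _ in scored]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [(w / total, move) for w, (_, move) in zip(weights, scored)]


def sample_move(
    scored: Sequence[Tuple[float, Move]],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Move:
    """Draw one legal move in proportion to its renormalised probability."""
    rng = rng or random.Random()
    probs = legal_move_probabilities(scored, temperature)
    moves = [move for _, move in probs]
    weights = [p for p, _ in probs]
    return rng.choices(moves, weights=weights, k=1)[0]
```

With the `legal` list from either example, `sample_move(legal)` returns a move object that can be used in place of `best_move`.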
Call `board.push(best_move)` after selecting a move so the next prediction is
decoded against the updated legal move set.
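The push-then-predict cycle described above can be sketched as a small greedy loop. This version is deliberately library-agnostic: `legal_words`, `apply_word`, and `next_logits` are hypothetical callables that you would back with `Board` and an inference call. It also assumes the raw-word interface, where the chosen move word can be appended directly to the input sequence; the PyTorch path would additionally need the input and output tokenizers.

```python
from typing import Callable, Iterable, List, Sequence


def autocomplete(
    prefix_words: Sequence[int],
    legal_words: Callable[[], Iterable[int]],
    apply_word: Callable[[int], None],
    next_logits: Callable[[Sequence[int]], Sequence[float]],
    max_new_moves: int,
) -> List[int]:
    """Greedily extend a game, re-reading the legal set after every move.

    prefix_words: metadata prefix plus any moves already played.
    legal_words:  returns raw move words legal in the current position.
    apply_word:   plays the chosen move on the caller's board state.
    next_logits:  returns next-move logits indexed by raw move word.
    """
    words = list(prefix_words)
    played: List[int] = []
    for _ in range(max_new_moves):
        candidates = list(legal_words())
        if not candidates:
            break  # no legal moves left, so the game is over
        logits = next_logits(words)
        best = max(candidates, key=lambda word: logits[word])
        apply_word(best)    # update the board before the next prediction
        words.append(best)  # feed the chosen move back as input
        played.append(best)
    return played
```

The sketch does not enforce the `600` half-move limit listed under Variant; callers would need to cap `max_new_moves` accordingly.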
## Validation
| Artifact | Validation | Status | Backend | Precision | Sample shape |
| --- | --- | --- | --- | --- | --- |
| model.safetensors | write | pass | safetensors.torch.save_file | | |
| model.safetensors | strict_load | pass | safetensors.torch.load_file | | |
| model.onnx | export | pass | torch.onnx | fp32 | [2, 2] |
| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] |
| model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] |
| model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 | |
## Known Limitations
This model is trained for chess move autocomplete and is not a general chess
engine. It does not include Transformers `AutoModel` or `trust_remote_code`
support. Metadata-aware variants encode metadata as input tokens; no separate
metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute
the BF16 MatMul graph; use `model.onnx` for broad compatibility or
`model-bf16.onnx` on a backend with BF16 operator support.
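One way to handle the BF16 caveat at load time is to try the artifacts in order of preference and keep the first one that actually works. The sketch below is a generic pattern under that assumption; `first_working_artifact` and `open_and_probe` are hypothetical names, and whether a BF16 failure surfaces at session creation or at the first run may depend on the runtime build.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")


def first_working_artifact(
    paths: Sequence[str],
    open_and_probe: Callable[[str], S],
) -> Tuple[str, S]:
    """Return (path, session) for the first artifact that opens and runs.

    open_and_probe is supplied by the caller: it should create a session
    for the given path, run one small inference, and raise on any failure.
    """
    errors: List[str] = []
    for path in paths:
        try:
            return path, open_and_probe(path)
        except Exception as exc:  # runtimes raise different error types
            errors.append(f"{path}: {exc}")
    raise RuntimeError("no ONNX artifact could be loaded: " + "; ".join(errors))
```

With ONNX Runtime, `open_and_probe` could create an `ort.InferenceSession` for the path and call `session.run` once on a small `bin_moves` batch, with `paths` set to `["model-bf16.onnx", "model.onnx"]`.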