---
library_name: chess-autocomplete
tags:
- chess
- pytorch
- safetensors
- onnx
license: apache-2.0
---

# Alfredvc/chess-autocomplete-v1

This repository contains one chess-autocomplete model variant staged for inference.

## Variant

- Repository: `Alfredvc/chess-autocomplete-v1`
- Architecture: `ChessTransformer`
- Dimensions: `768` hidden, `12` heads, `12` blocks
- Maximum half moves: `600`
- Input representation: `Discrete`
- Norm / MLP: `layernorm` / `swiglu`
- Native input tokenizer: `RealizableMoveTokenizer` with `4169` ids
- Native output tokenizer: `RealizableMoveTokenizer` with `4135` ids
- Metadata: encoded as tokens in the input token stream

## Interface

This is a metadata-token model. Inputs must begin with the metadata prefix:

```text
[time_control_token, white_elo_token, black_elo_token, GAME_START, ...moves]
```

Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not
available.
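
As a minimal sketch of the prefix layout above, the helper below assembles the raw input words and substitutes the missing-value words for any metadata that is not supplied. The function name `build_input_words` and its parameters are hypothetical and not part of `chess_autocomplete`; the special words are passed in by the caller so the sketch makes no assumption about their numeric values.

```python
from typing import List, Optional, Sequence


def build_input_words(
    moves: Sequence[int],
    *,
    game_start: int,
    time_control_missing: int,
    rating_missing: int,
    time_control: Optional[int] = None,
    white_elo: Optional[int] = None,
    black_elo: Optional[int] = None,
) -> List[int]:
    """Return [time_control, white_elo, black_elo, GAME_START, *moves].

    Hypothetical helper: any metadata word left as None is replaced by the
    corresponding "missing" word supplied by the caller.
    """

    def or_missing(word: Optional[int], missing: int) -> int:
        return missing if word is None else word

    return [
        or_missing(time_control, time_control_missing),
        or_missing(white_elo, rating_missing),
        or_missing(black_elo, rating_missing),
        game_start,
        *moves,
    ]
```

In practice the caller would pass `protocol.GAME_START`, `protocol.TIME_CONTROL_MISSING_WORD`, and `protocol.RATING_MISSING_WORD` for the three special words. How words for known time controls or ratings are obtained is not covered here.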

The native PyTorch model returns logits over the output tokenizer vocabulary
(`4135` ids). The ONNX artifacts wrap that model and return `bin_logits` over raw
16-bit move words (`65536` ids). The two outputs are not interchangeable: index
PyTorch logits by output-tokenizer id and ONNX `bin_logits` by raw move word.

## PyTorch

```python
import torch

from chess_autocomplete import protocol
from chess_autocomplete.huggingface import load_model_repo

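# "." is the repository root, e.g. a local clone of this repo.
loaded = load_model_repo(".")

# Raw protocol words for a game with no metadata and no moves played yet.
raw_input = torch.tensor(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=torch.long,
)

# Map raw words to the model's native input token ids.
input_ids = loaded.input_tokenizer.batch_encode(raw_input)

# Inference only, so skip autograd bookkeeping.
with torch.no_grad():
    logits, _ = loaded.model(input_ids)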
```

The PyTorch weights are stored in `model.safetensors` and loaded strictly into
`chess_autocomplete.models.ChessTransformer`.

## ONNX Runtime

```python
import numpy as np
import onnxruntime as ort

from chess_autocomplete import protocol

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
bin_moves = np.asarray(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=np.int32,
)
bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
```

Two ONNX files are published:

- `model.onnx`: FP32 compatibility artifact.
- `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16
  operator support.

Both ONNX artifacts use the `bin_logits_v1` interface: a `bin_moves` input with
shape `[batch, time]` and a `bin_logits` output with shape `[batch, 65536]`, that
is, one row of next-move logits per input sequence.

## Converting Logits To Moves

The model predicts move tokens, not SAN strings. Do not take an unconstrained
argmax over the full vocabulary. Score the legal moves in the current board
position and choose from that legal set.

For PyTorch, logits are over the native output tokenizer vocabulary:

```python
from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played (needs `import chess` from python-chess):
# board.push(chess.Move.from_uci("e2e4"))

next_logits = logits[0, -1]
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)
    token_id = loaded.output_tokenizer.encode(raw_bin_word)
    legal.append((float(next_logits[token_id]), move))

score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())
```

For ONNX `bin_logits_v1`, logits are already indexed by raw 16-bit move word:

```python
from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played (needs `import chess` from python-chess):
# board.push(chess.Move.from_uci("e2e4"))

next_logits = bin_logits[0]
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)
    legal.append((float(next_logits[raw_bin_word]), move))

score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())
```

Call `board.push(best_move)` after selecting a move so the next prediction is
decoded against the updated legal move set.
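
The examples above always pick the highest-scoring legal move. If a probability distribution over the legal moves is wanted instead, for example to sample a move, the `(score, move)` pairs can be renormalized with a softmax restricted to the legal set. The sketch below is one way to do that; `legal_move_probabilities`, `sample_move`, and the `temperature` parameter are hypothetical names and not part of `chess_autocomplete`.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

M = TypeVar("M")  # move type, e.g. a python-chess Move


def legal_move_probabilities(
    legal: Sequence[Tuple[float, M]], temperature: float = 1.0
) -> List[Tuple[float, M]]:
    """Softmax over the scores of legal moves only (hypothetical helper)."""
    if not legal:
        raise ValueError("no legal moves to score")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score, _ in legal]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [(w / total, move) for w, (_, move) in zip(weights, legal)]


def sample_move(
    legal: Sequence[Tuple[float, M]],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> M:
    """Draw one legal move in proportion to its softmax probability."""
    rng = rng or random.Random()
    probs = legal_move_probabilities(legal, temperature)
    moves = [move for _, move in probs]
    weights = [p for p, _ in probs]
    return rng.choices(moves, weights=weights, k=1)[0]
```

Either function accepts the `legal` list built in the snippets above unchanged. Lower temperatures concentrate probability on the top-scoring move, so sampling approaches the `max` selection as the temperature approaches zero.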

## Validation

| Artifact | Validation | Status | Backend | Precision | Sample shape |
| --- | --- | --- | --- | --- | --- |
| model.safetensors | write | pass | safetensors.torch.save_file |  |  |
| model.safetensors | strict_load | pass | safetensors.torch.load_file |  |  |
| model.onnx | export | pass | torch.onnx | fp32 | [2, 2] |
| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] |
| model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] |
| model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 |  |

## Known Limitations

This model is trained for chess move autocomplete and is not a general chess
engine. It does not include Transformers `AutoModel` or `trust_remote_code`
support. Metadata-aware variants encode metadata as input tokens; no separate
metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute
the BF16 MatMul graph; use `model.onnx` for broad compatibility or
`model-bf16.onnx` on a backend with BF16 operator support.
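
One way to handle the BF16 caveat is to try the artifacts in order of preference and keep the first one that both loads and executes. The sketch below is runtime-agnostic: `first_working_artifact`, `make_session`, and `probe` are hypothetical names, and the caller supplies the actual session constructor and a single test inference.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # session type returned by the caller's loader


def first_working_artifact(
    paths: Sequence[str],
    make_session: Callable[[str], S],
    probe: Callable[[S], object],
) -> Tuple[str, S]:
    """Return (path, session) for the first artifact that loads and runs.

    Hypothetical helper: `make_session` builds a session from a path and
    `probe` runs one inference on it. Any exception from either step moves
    on to the next candidate.
    """
    errors: List[str] = []
    for path in paths:
        try:
            session = make_session(path)
            probe(session)  # some failures only surface at execution time
        except Exception as exc:
            errors.append(f"{path}: {exc}")
            continue
        return path, session
    raise RuntimeError("no artifact could be executed:\n" + "\n".join(errors))
```

With ONNX Runtime, `make_session` could wrap `ort.InferenceSession(path, providers=["CPUExecutionProvider"])` and `probe` could issue the single `session.run(["bin_logits"], {"bin_moves": bin_moves})` call from the ONNX Runtime section, with `paths` set to `["model-bf16.onnx", "model.onnx"]`.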