Alfredvc/chess-autocomplete-v1

This repository contains one chess-autocomplete model variant staged for inference.

Variant

  • Repository: Alfredvc/chess-autocomplete-v1
  • Architecture: ChessTransformer
  • Dimensions: 768 hidden, 12 heads, 12 blocks
  • Maximum half moves: 600
  • Input representation: Discrete
  • Norm / MLP: layernorm / swiglu
  • Native input tokenizer: RealizableMoveTokenizer with 4169 ids
  • Native output tokenizer: RealizableMoveTokenizer with 4135 ids
  • Metadata: Metadata tokens are part of the input token stream.

Interface

This is a metadata-token model. Inputs must begin with the metadata prefix:

[time_control_token, white_elo_token, black_elo_token, GAME_START, ...moves]

Use TIME_CONTROL_MISSING_WORD and RATING_MISSING_WORD when metadata is not available.
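
A minimal sketch of assembling that sequence. The helper name build_input_words is hypothetical and not part of chess_autocomplete. The metadata and GAME_START words are passed in as arguments (for example from protocol), and moves is assumed to hold raw move words such as those returned by board.encode(move).

```python
from typing import Iterable, List


def build_input_words(
    time_control: int,
    white_elo: int,
    black_elo: int,
    game_start: int,
    moves: Iterable[int] = (),
) -> List[List[int]]:
    """Return a batch of one sequence: metadata prefix, GAME_START, then moves."""
    # The prefix order follows the layout stated above.
    sequence = [time_control, white_elo, black_elo, game_start]
    # Moves already played are appended after GAME_START, oldest first.
    sequence.extend(int(word) for word in moves)
    return [sequence]
```

The nested list can be wrapped with torch.tensor(..., dtype=torch.long) for PyTorch or np.asarray(..., dtype=np.int32) for ONNX Runtime.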

The native PyTorch model returns logits over the output tokenizer vocabulary (4135 ids). The ONNX artifacts wrap that model and return bin_logits over raw 16-bit move words (65536 ids). The two interfaces are not interchangeable: index PyTorch logits by output token id and ONNX bin_logits by raw move word.
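
To make the difference concrete, the sketch below shows one way logits over a compact token vocabulary could be scattered into a table indexed by raw 16-bit move word. It is an illustration only, not the ONNX wrapper's actual implementation: scatter_to_bins and the token_to_word mapping are hypothetical, and filling unmapped words with negative infinity is an assumption.

```python
import math
from typing import List, Sequence

NUM_BINS = 65536  # number of distinct raw 16-bit move words


def scatter_to_bins(
    token_logits: Sequence[float],
    token_to_word: Sequence[int],
) -> List[float]:
    """Place each token logit at the index of its raw move word.

    token_to_word[i] is the (hypothetical) raw word for output token id i.
    """
    if len(token_logits) != len(token_to_word):
        raise ValueError("token_logits and token_to_word must have equal length")
    # Words with no corresponding token get -inf so they can never win a max.
    bins = [-math.inf] * NUM_BINS
    for token_id, word in enumerate(token_to_word):
        bins[word] = float(token_logits[token_id])
    return bins
```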

PyTorch

import torch

from chess_autocomplete import protocol
from chess_autocomplete.huggingface import load_model_repo

loaded = load_model_repo(".")
raw_input = torch.tensor(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=torch.long,
)
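# Map raw words to native input token ids, then run one forward pass.
input_ids = loaded.input_tokenizer.batch_encode(raw_input)
with torch.no_grad():  # inference only, no gradients needed
    logits, _ = loaded.model(input_ids)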

The PyTorch weights are stored in model.safetensors and loaded strictly into chess_autocomplete.models.ChessTransformer.

ONNX Runtime

import numpy as np
import onnxruntime as ort

from chess_autocomplete import protocol

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
bin_moves = np.asarray(
    [[
        protocol.TIME_CONTROL_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.RATING_MISSING_WORD,
        protocol.GAME_START,
    ]],
    dtype=np.int32,
)
bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]

Two ONNX files are published:

  • model.onnx: FP32 compatibility artifact.
  • model-bf16.onnx: BF16 floating-weight artifact for runtimes with BF16 operator support.

Both ONNX artifacts use the bin_logits_v1 interface: bin_moves input with shape [batch, time] and bin_logits output with shape [batch, 65536].
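
Because the BF16 artifact needs runtime support for BF16 operators, a caller may want to try it first and fall back to the FP32 file. The following is a sketch under that assumption. first_usable_model and open_cpu_session are hypothetical helpers, not part of this repository, and the optional probe stands in for one small session.run call.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def first_usable_model(
    paths: Sequence[str],
    open_session: Callable[[str], Any],
    probe: Optional[Callable[[Any], Any]] = None,
) -> Tuple[str, Any]:
    """Return (path, session) for the first artifact that loads and passes the probe."""
    errors = []
    for path in paths:
        try:
            session = open_session(path)
            if probe is not None:
                probe(session)  # e.g. a single tiny inference call
            return path, session
        except Exception as exc:  # runtimes raise backend-specific error types
            errors.append(f"{path}: {exc}")
    raise RuntimeError("no usable ONNX artifact; " + "; ".join(errors))


def open_cpu_session(path: str) -> Any:
    """Open an ONNX Runtime CPU session (imported lazily, only when called)."""
    import onnxruntime as ort

    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])
```

For example, first_usable_model(["model-bf16.onnx", "model.onnx"], open_cpu_session) prefers the BF16 file and falls back to FP32 if the session cannot be created.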

Converting Logits To Moves

The model predicts move tokens, not SAN strings. Do not take an unconstrained argmax over the full vocabulary. Score the legal moves in the current board position and choose from that legal set.

For PyTorch, logits are over the native output tokenizer vocabulary:

import chess

from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played:
# board.push(chess.Move.from_uci("e2e4"))

# Logits for the next move: batch item 0, last time step.
next_logits = logits[0, -1]
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)  # raw 16-bit move word
    token_id = loaded.output_tokenizer.encode(raw_bin_word)  # id in the output vocabulary
    legal.append((float(next_logits[token_id]), move))

# Pick the highest-scoring legal move.
score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())

For ONNX bin_logits_v1, logits are already indexed by raw 16-bit move word:

import chess

from chess_autocomplete.chess_utils import Board

board = Board()
# Apply any moves already played:
# board.push(chess.Move.from_uci("e2e4"))

# bin_logits has shape [batch, 65536]; take batch item 0.
next_logits = bin_logits[0]
legal = []
for move in board.board.legal_moves:
    raw_bin_word = board.encode(move)  # raw 16-bit move word, used directly as the index
    legal.append((float(next_logits[raw_bin_word]), move))

# Pick the highest-scoring legal move.
score, best_move = max(legal, key=lambda item: item[0])
print(best_move.uci())

Call board.push(best_move) after selecting a move so the next prediction is decoded against the updated legal move set.
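
The examples above pick the single highest-scoring legal move. As an alternative, the (score, move) pairs in legal can be turned into a probability distribution restricted to legal moves and sampled from. This is a sketch, not part of chess_autocomplete: both function names and the temperature parameter are illustrative assumptions.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

M = TypeVar("M")


def legal_move_probabilities(
    scored: Sequence[Tuple[float, M]],
    temperature: float = 1.0,
) -> List[Tuple[float, M]]:
    """Softmax over the logits of legal moves only."""
    if not scored:
        raise ValueError("no legal moves to score")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score, _ in scored]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [(weight / total, move) for weight, (_, move) in zip(weights, scored)]


def sample_move(
    scored: Sequence[Tuple[float, M]],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> M:
    """Draw one move in proportion to its legal-set probability."""
    rng = rng or random.Random()
    probabilities = legal_move_probabilities(scored, temperature)
    moves = [move for _, move in probabilities]
    weights = [probability for probability, _ in probabilities]
    return rng.choices(moves, weights=weights, k=1)[0]
```

Lower temperatures concentrate probability on the top-scoring move. Calling sample_move(legal) returns a move that can be passed to board.push in place of best_move.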

Validation

Artifact           Validation                          Status  Backend                           Precision  Sample shape
model.safetensors  write                               pass    safetensors.torch.save_file       -          -
model.safetensors  strict_load                         pass    safetensors.torch.load_file       -          -
model.onnx         export                              pass    torch.onnx                        fp32       [2, 2]
model.onnx         runtime                             pass    onnxruntime.CPUExecutionProvider  fp32       [2, 2]
model-bf16.onnx    export                              pass    torch.onnx                        bf16       [2, 2]
model-bf16.onnx    onnx_checker_and_initializer_dtype  pass    onnx.checker                      bf16       -

Known Limitations

This model is trained for chess move autocomplete and is not a general chess engine.

  • It does not include Transformers AutoModel or trust_remote_code support.
  • Metadata-aware variants encode metadata as input tokens; no separate metadata tensor path is supported.
  • Some ONNX Runtime CPU builds do not execute the BF16 MatMul graph; use model.onnx for broad compatibility or model-bf16.onnx on a backend with BF16 operator support.

Model size: 91.4M params (Safetensors, tensor type F32)