--- library_name: chess-autocomplete tags: - chess - pytorch - safetensors - onnx license: apache-2.0 --- # Alfredvc/chess-autocomplete-v1 This repository contains one chess-autocomplete model variant staged for inference. ## Variant - Repository: `Alfredvc/chess-autocomplete-v1` - Architecture: `ChessTransformer` - Dimensions: `768` hidden, `12` heads, `12` blocks - Maximum half moves: `600` - Input representation: `Discrete` - Norm / MLP: `layernorm` / `swiglu` - Native input tokenizer: `RealizableMoveTokenizer` with `4169` ids - Native output tokenizer: `RealizableMoveTokenizer` with `4135` ids - Metadata: Metadata tokens are part of the input token stream. ## Interface This is a metadata-token model. Inputs must begin with the metadata prefix: ```text [time_control_token, white_elo_token, black_elo_token, GAME_START, ...moves] ``` Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not available. The native PyTorch model returns logits over the output tokenizer vocabulary (`4135` ids). The ONNX artifacts wrap that model and return `bin_logits` over raw 16-bit move words (`65536` ids). These are different output interfaces. ## PyTorch ```python import torch from chess_autocomplete import protocol from chess_autocomplete.huggingface import load_model_repo loaded = load_model_repo(".") raw_input = torch.tensor( [[ protocol.TIME_CONTROL_MISSING_WORD, protocol.RATING_MISSING_WORD, protocol.RATING_MISSING_WORD, protocol.GAME_START, ]], dtype=torch.long, ) input_ids = loaded.input_tokenizer.batch_encode(raw_input) logits, _ = loaded.model(input_ids) ``` The PyTorch weights are stored in `model.safetensors` and loaded strictly into `chess_autocomplete.models.ChessTransformer`. ## ONNX Runtime ```python import numpy as np import onnxruntime as ort from chess_autocomplete import protocol session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"]) bin_moves = np.asarray( [[ protocol.TIME_CONTROL_MISSING_WORD, protocol.RATING_MISSING_WORD, protocol.RATING_MISSING_WORD, protocol.GAME_START, ]], dtype=np.int32, ) bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0] ``` Two ONNX files are published: - `model.onnx`: FP32 compatibility artifact. - `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16 operator support. Both ONNX artifacts use the `bin_logits_v1` interface: `bin_moves` input with shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`. ## Converting Logits To Moves The model predicts move tokens, not SAN strings. Do not take an unconstrained argmax over the full vocabulary. Score the legal moves in the current board position and choose from that legal set. For PyTorch, logits are over the native output tokenizer vocabulary: ```python from chess_autocomplete.chess_utils import Board board = Board() # Apply any moves already played: # board.push(chess.Move.from_uci("e2e4")) next_logits = logits[0, -1] legal = [] for move in board.board.legal_moves: raw_bin_word = board.encode(move) token_id = loaded.output_tokenizer.encode(raw_bin_word) legal.append((float(next_logits[token_id]), move)) score, best_move = max(legal, key=lambda item: item[0]) print(best_move.uci()) ``` For ONNX `bin_logits_v1`, logits are already indexed by raw 16-bit move word: ```python from chess_autocomplete.chess_utils import Board board = Board() # Apply any moves already played: # board.push(chess.Move.from_uci("e2e4")) next_logits = bin_logits[0] legal = [] for move in board.board.legal_moves: raw_bin_word = board.encode(move) legal.append((float(next_logits[raw_bin_word]), move)) score, best_move = max(legal, key=lambda item: item[0]) print(best_move.uci()) ``` Call `board.push(best_move)` after selecting a move so the next prediction is decoded against the updated legal move set. ## Validation | Artifact | Validation | Status | Backend | Precision | Sample shape | | --- | --- | --- | --- | --- | --- | | model.safetensors | write | pass | safetensors.torch.save_file | | | | model.safetensors | strict_load | pass | safetensors.torch.load_file | | | | model.onnx | export | pass | torch.onnx | fp32 | [2, 2] | | model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] | | model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] | | model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 | | ## Known Limitations This model is trained for chess move autocomplete and is not a general chess engine. It does not include Transformers `AutoModel` or `trust_remote_code` support. Metadata-aware variants encode metadata as input tokens; no separate metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute the BF16 MatMul graph; use `model.onnx` for broad compatibility or `model-bf16.onnx` on a backend with BF16 operator support.