Add BF16 ONNX artifact

Files changed (4)

README.md +21 -10
config.json +9 -1
model-bf16.onnx +3 -0
model.onnx +2 -2

README.md CHANGED

@@ -36,8 +36,9 @@ Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not
 available.
 The native PyTorch model returns logits over the output tokenizer vocabulary
-(`4135` ids). The ONNX artifact wraps that model and returns `bin_logits` over
-raw 16-bit move words (`65536` ids). These are different output interfaces.
 ## PyTorch
@@ -85,7 +86,13 @@ bin_moves = np.asarray(
 bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
 ```
-The ONNX artifact uses the `bin_logits_v1` interface: `bin_moves` input with
 shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.
 ## Converting Logits To Moves
@@ -138,16 +145,20 @@ decoded against the updated legal move set.
 ## Validation
-| Artifact | Validation | Status | Backend | Sample shape |
-| --- | --- | --- | --- | --- |
-| model.safetensors | write | pass | safetensors.torch.save_file |  |
-| model.safetensors | strict_load | pass | safetensors.torch.load_file |  |
-| model.onnx | export | pass | torch.onnx | [2, 2] |
-| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | [2, 2] |
 ## Known Limitations
 This model is trained for chess move autocomplete and is not a general chess
 engine. It does not include Transformers `AutoModel` or `trust_remote_code`
 support. Metadata-aware variants encode metadata as input tokens; no separate
-metadata tensor path is supported.

 available.
 The native PyTorch model returns logits over the output tokenizer vocabulary
+(`4135` ids). The ONNX artifacts wrap that model and return
+`bin_logits` over raw 16-bit move words (`65536` ids). These are different output
+interfaces.
 ## PyTorch
 bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
 ```
+Two ONNX files are published:
+- `model.onnx`: FP32 compatibility artifact.
+- `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16
+  operator support.
+Both ONNX artifacts use the `bin_logits_v1` interface: `bin_moves` input with
 shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.
 ## Converting Logits To Moves
 ## Validation
+| Artifact | Validation | Status | Backend | Precision | Sample shape |
+| --- | --- | --- | --- | --- | --- |
+| model.safetensors | write | pass | safetensors.torch.save_file |  |  |
+| model.safetensors | strict_load | pass | safetensors.torch.load_file |  |  |
+| model.onnx | export | pass | torch.onnx | fp32 | [2, 2] |
+| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] |
+| model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] |
+| model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 |  |
 ## Known Limitations
 This model is trained for chess move autocomplete and is not a general chess
 engine. It does not include Transformers `AutoModel` or `trust_remote_code`
 support. Metadata-aware variants encode metadata as input tokens; no separate
+metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute
+the BF16 MatMul graph; use `model.onnx` for broad compatibility or
+`model-bf16.onnx` on a backend with BF16 operator support.
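
The compatibility note added here suggests a load-time fallback from the BF16 file to the FP32 file. The following is a minimal sketch, assuming an unsupported BF16 graph surfaces as an exception when the session is created (this may differ between ONNX Runtime builds). `open_first_working` and its `make_session` parameter are hypothetical names for illustration and are not part of this repository.

```python
def open_first_working(paths, make_session=None):
    """Return (path, session) for the first artifact that loads."""
    if make_session is None:
        # Imported lazily so the helper can be exercised without ONNX Runtime.
        import onnxruntime as ort

        def make_session(path):
            return ort.InferenceSession(path, providers=["CPUExecutionProvider"])

    failures = []
    for path in paths:
        try:
            return path, make_session(path)
        except Exception as exc:  # assumption: unsupported operators raise here
            failures.append(f"{path}: {exc}")
    raise RuntimeError("no ONNX artifact could be loaded: " + "; ".join(failures))
```

A caller that prefers BF16 but needs broad compatibility could try `open_first_working(["model-bf16.onnx", "model.onnx"])` and then run the returned session with the same `bin_moves` input either way, since both files share the `bin_logits_v1` interface.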

config.json CHANGED

@@ -27,7 +27,15 @@
       "path": "model.onnx",
       "interface": "bin_logits_v1",
       "input_name": "bin_moves",
-      "output_name": "bin_logits"
     }
   },
   "source": {

       "path": "model.onnx",
       "interface": "bin_logits_v1",
       "input_name": "bin_moves",
+      "output_name": "bin_logits",
+      "precision": "fp32"
+    },
+    "onnx_bf16": {
+      "path": "model-bf16.onnx",
+      "interface": "bin_logits_v1",
+      "input_name": "bin_moves",
+      "output_name": "bin_logits",
+      "precision": "bf16"
     }
   },
   "source": {
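
With the new `precision` field, a consumer can select an artifact by precision instead of hard-coding a file name. The sketch below is one possible way to do that. Because the diff does not show the enclosing keys, the helper searches the whole document for entries that carry both `path` and `precision`; the `artifacts` and `onnx` keys in the sample, and the helper names, are assumptions made for illustration.

```python
import json

# Hypothetical excerpt: the enclosing "artifacts" and "onnx" keys are assumed.
# Only the inner fields and the "onnx_bf16" key appear in the diff.
SAMPLE_CONFIG = json.loads("""
{
  "artifacts": {
    "onnx": {
      "path": "model.onnx",
      "interface": "bin_logits_v1",
      "input_name": "bin_moves",
      "output_name": "bin_logits",
      "precision": "fp32"
    },
    "onnx_bf16": {
      "path": "model-bf16.onnx",
      "interface": "bin_logits_v1",
      "input_name": "bin_moves",
      "output_name": "bin_logits",
      "precision": "bf16"
    }
  }
}
""")


def find_artifacts(node):
    """Yield every dict that has both a "path" and a "precision" key."""
    if isinstance(node, dict):
        if "path" in node and "precision" in node:
            yield node
        for value in node.values():
            yield from find_artifacts(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_artifacts(item)


def pick_artifact(config, precision):
    """Return the first artifact entry with the requested precision."""
    for entry in find_artifacts(config):
        if entry["precision"] == precision:
            return entry
    raise KeyError(f"no artifact with precision {precision!r}")
```

Searching by field rather than by key path keeps the sketch independent of the parts of `config.json` that this diff does not show.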

model-bf16.onnx ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2355ce5f6188aa7c23e0ccf5448918a642308c90fc5b8418fec48b8c2380c394
+size 185557324
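
A Git LFS pointer records the SHA-256 digest and byte size of the real file, so a downloaded copy of `model-bf16.onnx` can be checked against it. The following is a rough sketch of such a check; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path passed in is up to the caller.

```python
import hashlib
import os

# Pointer contents as added in this change.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2355ce5f6188aa7c23e0ccf5448918a642308c90fc5b8418fec48b8c2380c394
size 185557324
"""


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so a large model file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("model-bf16.onnx", parse_lfs_pointer(POINTER_TEXT))` would confirm that a local copy matches the pointer shown in this hunk.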

model.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:41aa9c18eb3c82d9bf53ef9aa0cfbebd38fca673b39a2bac6d1377b5c8c33f42
-size 368328341

 version https://git-lfs.github.com/spec/v1
+oid sha256:92398c400eb26bf9b5ab07e6d01baab501983d9da14c908d76faab4f6af332db
+size 368328379