Add BF16 ONNX artifact
- README.md +21 -10
- config.json +9 -1
- model-bf16.onnx +3 -0
- model.onnx +2 -2
README.md
CHANGED
|
@@ -36,8 +36,9 @@ Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not
|
|
| 36 |
available.
|
| 37 |
|
| 38 |
The native PyTorch model returns logits over the output tokenizer vocabulary
|
| 39 |
-
(`4135` ids). The ONNX
|
| 40 |
-
raw 16-bit move words (`65536` ids). These are different output
|
| 41 |
|
| 42 |
## PyTorch
|
| 43 |
|
|
@@ -85,7 +86,13 @@ bin_moves = np.asarray(
|
|
| 85 |
bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
|
| 86 |
```
|
| 87 |
|
| 88 |
-
|
| 89 |
shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.
|
| 90 |
|
| 91 |
## Converting Logits To Moves
|
|
@@ -138,16 +145,20 @@ decoded against the updated legal move set.
|
|
| 138 |
|
| 139 |
## Validation
|
| 140 |
|
| 141 |
-
| Artifact | Validation | Status | Backend | Sample shape |
|
| 142 |
-
| --- | --- | --- | --- | --- |
|
| 143 |
-
| model.safetensors | write | pass | safetensors.torch.save_file | |
|
| 144 |
-
| model.safetensors | strict_load | pass | safetensors.torch.load_file | |
|
| 145 |
-
| model.onnx | export | pass | torch.onnx | [2, 2] |
|
| 146 |
-
| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | [2, 2] |
|
| 147 |
|
| 148 |
## Known Limitations
|
| 149 |
|
| 150 |
This model is trained for chess move autocomplete and is not a general chess
|
| 151 |
engine. It does not include Transformers `AutoModel` or `trust_remote_code`
|
| 152 |
support. Metadata-aware variants encode metadata as input tokens; no separate
|
| 153 |
-
metadata tensor path is supported.
|
|
|
| 36 |
available.
|
| 37 |
|
| 38 |
The native PyTorch model returns logits over the output tokenizer vocabulary
|
| 39 |
+
(`4135` ids). The ONNX artifacts wrap that model and return
|
| 40 |
+
`bin_logits` over raw 16-bit move words (`65536` ids). These are different output
|
| 41 |
+
interfaces.
|
| 42 |
|
| 43 |
## PyTorch
|
| 44 |
|
|
|
|
| 86 |
bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
|
| 87 |
```
|
| 88 |
|
| 89 |
+
Two ONNX files are published:
|
| 90 |
+
|
| 91 |
+
- `model.onnx`: FP32 compatibility artifact.
|
| 92 |
+
- `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16
|
| 93 |
+
operator support.
|
| 94 |
+
|
| 95 |
+
Both ONNX artifacts use the `bin_logits_v1` interface: `bin_moves` input with
|
| 96 |
shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.
|
| 97 |
|
| 98 |
## Converting Logits To Moves
|
|
|
|
| 145 |
|
| 146 |
## Validation
|
| 147 |
|
| 148 |
+
| Artifact | Validation | Status | Backend | Precision | Sample shape |
|
| 149 |
+
| --- | --- | --- | --- | --- | --- |
|
| 150 |
+
| model.safetensors | write | pass | safetensors.torch.save_file | | |
|
| 151 |
+
| model.safetensors | strict_load | pass | safetensors.torch.load_file | | |
|
| 152 |
+
| model.onnx | export | pass | torch.onnx | fp32 | [2, 2] |
|
| 153 |
+
| model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] |
|
| 154 |
+
| model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] |
|
| 155 |
+
| model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 | |
|
| 156 |
|
| 157 |
## Known Limitations
|
| 158 |
|
| 159 |
This model is trained for chess move autocomplete and is not a general chess
|
| 160 |
engine. It does not include Transformers `AutoModel` or `trust_remote_code`
|
| 161 |
support. Metadata-aware variants encode metadata as input tokens; no separate
|
| 162 |
+
metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute
|
| 163 |
+
the BF16 MatMul graph; use `model.onnx` for broad compatibility or
|
| 164 |
+
`model-bf16.onnx` on a backend with BF16 operator support.
|
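The Known Limitations text above suggests using `model-bf16.onnx` only where the backend can run it and falling back to `model.onnx` otherwise. A minimal sketch of that fallback follows. It assumes, which this commit does not confirm, that an unsupported BF16 graph surfaces as an exception when the session is created. `first_loadable` and `make_session` are hypothetical names, and the session factory is passed in so the helper does not depend on a particular runtime.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

S = TypeVar("S")


def first_loadable(
    paths: Iterable[str],
    make_session: Callable[[str], S],
) -> Tuple[str, S]:
    """Return (path, session) for the first artifact that loads.

    Hypothetical helper: `make_session` is any callable that builds a
    session from a file path and raises if the backend cannot run it.
    """
    errors: List[str] = []
    for path in paths:
        try:
            return path, make_session(path)
        except Exception as exc:  # error types differ between backends
            errors.append(f"{path}: {exc}")
    raise RuntimeError("no ONNX artifact could be loaded:\n" + "\n".join(errors))


# Example wiring with ONNX Runtime (left commented so the sketch has no
# hard dependency on it):
#
#   import onnxruntime as ort
#   path, session = first_loadable(
#       ["model-bf16.onnx", "model.onnx"],  # preferred artifact first
#       lambda p: ort.InferenceSession(p, providers=["CPUExecutionProvider"]),
#   )
```

Listing the BF16 file first keeps the smaller artifact as the default wherever it loads, while `model.onnx` remains the compatibility path.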
config.json
CHANGED
|
@@ -27,7 +27,15 @@
|
|
| 27 |
"path": "model.onnx",
|
| 28 |
"interface": "bin_logits_v1",
|
| 29 |
"input_name": "bin_moves",
|
| 30 |
-
"output_name": "bin_logits"
|
| 31 |
}
|
| 32 |
},
|
| 33 |
"source": {
|
|
|
|
| 27 |
"path": "model.onnx",
|
| 28 |
"interface": "bin_logits_v1",
|
| 29 |
"input_name": "bin_moves",
|
| 30 |
+
"output_name": "bin_logits",
|
| 31 |
+
"precision": "fp32"
|
| 32 |
+
},
|
| 33 |
+
"onnx_bf16": {
|
| 34 |
+
"path": "model-bf16.onnx",
|
| 35 |
+
"interface": "bin_logits_v1",
|
| 36 |
+
"input_name": "bin_moves",
|
| 37 |
+
"output_name": "bin_logits",
|
| 38 |
+
"precision": "bf16"
|
| 39 |
}
|
| 40 |
},
|
| 41 |
"source": {
|
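With both entries tagged by `precision`, a loader could pick an artifact from `config.json` instead of hard-coding file names. A rough sketch follows. The hunk does not show the parent key or the name of the first entry, so `artifacts` and `onnx` in the sample are placeholders, and the helper walks the whole document rather than assuming a layout. `iter_artifacts` and `artifacts_by_precision` are hypothetical names.

```python
import json
from typing import Any, Dict, Iterator

# Sample mirroring the fields visible in the diff. The keys "artifacts"
# and "onnx" are placeholders: the hunk does not show the real names.
SAMPLE_CONFIG = json.loads("""
{
  "artifacts": {
    "onnx": {
      "path": "model.onnx",
      "interface": "bin_logits_v1",
      "input_name": "bin_moves",
      "output_name": "bin_logits",
      "precision": "fp32"
    },
    "onnx_bf16": {
      "path": "model-bf16.onnx",
      "interface": "bin_logits_v1",
      "input_name": "bin_moves",
      "output_name": "bin_logits",
      "precision": "bf16"
    }
  }
}
""")


def iter_artifacts(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every nested object that has both "path" and "interface"."""
    if isinstance(node, dict):
        if "path" in node and "interface" in node:
            yield node
        for value in node.values():
            yield from iter_artifacts(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_artifacts(item)


def artifacts_by_precision(
    config: Dict[str, Any], interface: str = "bin_logits_v1"
) -> Dict[str, str]:
    """Map precision -> file path for artifacts exposing `interface`."""
    return {
        entry.get("precision", "unspecified"): entry["path"]
        for entry in iter_artifacts(config)
        if entry["interface"] == interface
    }


print(artifacts_by_precision(SAMPLE_CONFIG))
```

Entries without a `precision` field, such as the pre-commit `model.onnx` entry, are reported under `unspecified` rather than guessed.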
model-bf16.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2355ce5f6188aa7c23e0ccf5448918a642308c90fc5b8418fec48b8c2380c394
|
| 3 |
+
size 185557324
|
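The three added lines are a Git LFS pointer, not the model itself: `oid` is the SHA-256 of the real file and `size` is its length in bytes. A small sketch of checking a downloaded `model-bf16.onnx` against that pointer, using only the standard library, might look like this. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib
from typing import Dict

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2355ce5f6188aa7c23e0ccf5448918a642308c90fc5b8418fec48b8c2380c394
size 185557324
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each non-empty "key value" line at the first space."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


def matches_pointer(
    path: str, pointer: Dict[str, str], chunk_size: int = 1 << 20
) -> bool:
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    algo, _, expected_hex = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    digest = hashlib.sha256()
    total = 0
    with open(path, "rb") as handle:
        # Hash in chunks so a large model file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            total += len(chunk)
    return total == int(pointer["size"]) and digest.hexdigest() == expected_hex


pointer = parse_lfs_pointer(POINTER_TEXT)
# matches_pointer("model-bf16.onnx", pointer)  # run after downloading the file
```

The same check applies to `model.onnx` with the values from its own pointer.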
model.onnx
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:92398c400eb26bf9b5ab07e6d01baab501983d9da14c908d76faab4f6af332db
|
| 3 |
+
size 368328379
|