Commit 81753e2 by Alfredvc (verified) · 1 Parent(s): f96cfc0

Add BF16 ONNX artifact

Files changed (4)
  1. README.md +21 -10
  2. config.json +9 -1
  3. model-bf16.onnx +3 -0
  4. model.onnx +2 -2
README.md CHANGED
    @@ -36,8 +36,9 @@ Use `TIME_CONTROL_MISSING_WORD` and `RATING_MISSING_WORD` when metadata is not
      available.

      The native PyTorch model returns logits over the output tokenizer vocabulary
    - (`4135` ids). The ONNX artifact wraps that model and returns `bin_logits` over
    - raw 16-bit move words (`65536` ids). These are different output interfaces.
    + (`4135` ids). The ONNX artifacts wrap that model and return
    + `bin_logits` over raw 16-bit move words (`65536` ids). These are different output
    + interfaces.

      ## PyTorch

    @@ -85,7 +86,13 @@ bin_moves = np.asarray(
      bin_logits = session.run(["bin_logits"], {"bin_moves": bin_moves})[0]
      ```

    - The ONNX artifact uses the `bin_logits_v1` interface: `bin_moves` input with
    + Two ONNX files are published:
    +
    + - `model.onnx`: FP32 compatibility artifact.
    + - `model-bf16.onnx`: BF16 floating-weight artifact for runtimes with BF16
    + operator support.
    +
    + Both ONNX artifacts use the `bin_logits_v1` interface: `bin_moves` input with
      shape `[batch, time]` and `bin_logits` output with shape `[batch, 65536]`.

      ## Converting Logits To Moves
    @@ -138,16 +145,20 @@ decoded against the updated legal move set.

      ## Validation

    - | Artifact | Validation | Status | Backend | Sample shape |
    - | --- | --- | --- | --- | --- |
    - | model.safetensors | write | pass | safetensors.torch.save_file | |
    - | model.safetensors | strict_load | pass | safetensors.torch.load_file | |
    - | model.onnx | export | pass | torch.onnx | [2, 2] |
    - | model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | [2, 2] |
    + | Artifact | Validation | Status | Backend | Precision | Sample shape |
    + | --- | --- | --- | --- | --- | --- |
    + | model.safetensors | write | pass | safetensors.torch.save_file | | |
    + | model.safetensors | strict_load | pass | safetensors.torch.load_file | | |
    + | model.onnx | export | pass | torch.onnx | fp32 | [2, 2] |
    + | model.onnx | runtime | pass | onnxruntime.CPUExecutionProvider | fp32 | [2, 2] |
    + | model-bf16.onnx | export | pass | torch.onnx | bf16 | [2, 2] |
    + | model-bf16.onnx | onnx_checker_and_initializer_dtype | pass | onnx.checker | bf16 | |

      ## Known Limitations

      This model is trained for chess move autocomplete and is not a general chess
      engine. It does not include Transformers `AutoModel` or `trust_remote_code`
      support. Metadata-aware variants encode metadata as input tokens; no separate
    - metadata tensor path is supported.
    + metadata tensor path is supported. Some ONNX Runtime CPU builds do not execute
    + the BF16 MatMul graph; use `model.onnx` for broad compatibility or
    + `model-bf16.onnx` on a backend with BF16 operator support.
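The compatibility note in the updated Known Limitations text suggests a try-BF16-then-FP32 loading order. The following is a minimal sketch of that idea and is not part of the commit: `open_session_with_fallback` is a hypothetical helper, and it assumes that the session factory passed in (for example `onnxruntime.InferenceSession`) raises an exception when the backend cannot load the BF16 graph.

```python
from typing import Any, Callable, Sequence, Tuple


def open_session_with_fallback(
    make_session: Callable[[str], Any],
    paths: Sequence[str] = ("model-bf16.onnx", "model.onnx"),
) -> Tuple[str, Any]:
    """Return (path, session) for the first ONNX file that loads.

    Hypothetical helper: `make_session` is any callable that builds a
    session from a file path and raises on failure.
    """
    errors = []
    for path in paths:
        try:
            return path, make_session(path)
        except Exception as exc:
            # Assumption: an unsupported BF16 operator surfaces as an exception here.
            errors.append(f"{path}: {exc}")
    raise RuntimeError("no ONNX artifact could be loaded: " + "; ".join(errors))


# Possible use with ONNX Runtime (requires the onnxruntime package):
#   import onnxruntime as ort
#   path, session = open_session_with_fallback(
#       lambda p: ort.InferenceSession(p, providers=["CPUExecutionProvider"])
#   )
```

Whether an unsupported BF16 kernel is reported at session creation or only at the first `run` call may depend on the runtime build, so a real loader might also need to guard the first inference call.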
config.json CHANGED
    @@ -27,7 +27,15 @@
      "path": "model.onnx",
      "interface": "bin_logits_v1",
      "input_name": "bin_moves",
    - "output_name": "bin_logits"
    + "output_name": "bin_logits",
    + "precision": "fp32"
    + },
    + "onnx_bf16": {
    + "path": "model-bf16.onnx",
    + "interface": "bin_logits_v1",
    + "input_name": "bin_moves",
    + "output_name": "bin_logits",
    + "precision": "bf16"
      }
      },
      "source": {
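With the new `precision` field, a consumer could pick an ONNX entry by precision instead of hard-coding a file name. The sketch below is illustrative only: the hunk does not show the enclosing keys, so the `"artifacts"` and `"onnx"` names in `EXAMPLE_CONFIG` are placeholders, and the hypothetical `pick_artifact` helper searches the whole document so that it does not depend on them.

```python
from typing import Any, Dict, Iterator

# Placeholder structure: only "onnx_bf16" and the per-entry fields appear in the
# diff; the "artifacts" and "onnx" key names are assumptions for illustration.
EXAMPLE_CONFIG = {
    "artifacts": {
        "onnx": {
            "path": "model.onnx",
            "interface": "bin_logits_v1",
            "input_name": "bin_moves",
            "output_name": "bin_logits",
            "precision": "fp32",
        },
        "onnx_bf16": {
            "path": "model-bf16.onnx",
            "interface": "bin_logits_v1",
            "input_name": "bin_moves",
            "output_name": "bin_logits",
            "precision": "bf16",
        },
    }
}


def iter_artifacts(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every nested dict that has both "path" and "precision" keys."""
    if isinstance(node, dict):
        if "path" in node and "precision" in node:
            yield node
        for value in node.values():
            yield from iter_artifacts(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_artifacts(item)


def pick_artifact(config: Dict[str, Any], precision: str) -> Dict[str, Any]:
    """Return the first artifact entry whose "precision" matches."""
    for entry in iter_artifacts(config):
        if entry["precision"] == precision:
            return entry
    raise KeyError(f"no artifact with precision {precision!r}")
```

For a real file, `json.load` on `config.json` would replace `EXAMPLE_CONFIG`; the returned entry carries the `input_name` and `output_name` needed for the `session.run` call.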
model-bf16.onnx ADDED
    @@ -0,0 +1,3 @@
    + version https://git-lfs.github.com/spec/v1
    + oid sha256:2355ce5f6188aa7c23e0ccf5448918a642308c90fc5b8418fec48b8c2380c394
    + size 185557324
model.onnx CHANGED
    @@ -1,3 +1,3 @@
      version https://git-lfs.github.com/spec/v1
    - oid sha256:41aa9c18eb3c82d9bf53ef9aa0cfbebd38fca673b39a2bac6d1377b5c8c33f42
    - size 368328341
    + oid sha256:92398c400eb26bf9b5ab07e6d01baab501983d9da14c908d76faab4f6af332db
    + size 368328379
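The two `.onnx` entries above are Git LFS pointer files: each records a SHA-256 object id and a byte size rather than the model itself (185557324 bytes for the BF16 file, roughly half of the 368328379-byte FP32 file). A downloaded artifact can be checked against its pointer along these lines. This is a sketch with hypothetical helper names (`parse_lfs_pointer`, `matches_pointer`), not tooling from the repository.

```python
import hashlib
from pathlib import Path
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each non-empty "key value" line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


def matches_pointer(file_path: str, pointer_text: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    path = Path(file_path)
    # Cheap size check first, so a mismatched file is rejected without hashing.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so a file of several hundred megabytes is not read at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest
```

The size comparison runs before hashing because it is far cheaper and already catches truncated downloads.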