sasha-smirnov committed on
Commit 98cfe54 · verified · 1 Parent(s): 143abf9

Initial publish via teradata-opus-translate

Files changed (1)
  1. README.md +2 -6
README.md CHANGED
@@ -33,7 +33,6 @@ packaged for use with the Teradata `mldb.ONNXSeq2Seq` BYOM function.
  **This repository does not redistribute the original model weights.** It contains only:
 
  - `onnx/model-fp32.onnx` — full-precision ONNX graph
- - `onnx/model-int8.onnx` — dynamically quantized ONNX graph
  - `tokenizer.json` — repacked Marian tokenizer suitable for BYOM
  - `config.json` — model architecture metadata, copied unchanged from the upstream repo
  - `generation_config.json` — generation defaults, copied unchanged from the upstream repo
@@ -50,7 +49,7 @@ For the original PyTorch weights and training details, see the upstream model:
  | Architecture | MarianMT (encoder-decoder) |
  | Max input tokens | 256 |
  | Max output tokens | 512 |
- | ONNX file sizes | fp32 (177 MB), int8 (94 MB) |
+ | ONNX file size | 177 MB |
  | ONNX opset | 14 |
  | ONNX IR version | 8 (BYOM 7.0+ compatible) |
  | License | Apache-2.0 (from upstream) |
@@ -121,15 +120,12 @@ FROM mldb.ONNXSeq2Seq(
  print(tdml.DataFrame.from_query(query))
  ```
 
- An int8-quantized variant is also published as `onnx/model-int8.onnx`. The int8 variant does not accept `num_beams` (configured internally).
-
  ## How this model was converted
 
  This model was produced with the open-source
  [`teradata-opus-translate`](https://pypi.org/project/teradata-opus-translate/)
  package, which exports the encoder/decoder, stitches in the BeamSearch op,
- applies dynamic int8 quantization, and verifies parity against PyTorch on a
- small sample set.
+ and verifies parity against PyTorch on a small sample set.
 
  > **Note:** the same package can convert *any* Helsinki-NLP MarianMT model
  > (including ones not in this collection) to a BYOM-ready ONNX bundle. If