Commit 0cdbca6 (verified) by sasha-smirnov
1 Parent(s): 726a8b3

Initial publish via teradata-opus-translate

Files changed (2)
  1. README.md +5 -2
  2. onnx/model-int8.onnx +3 -0
README.md CHANGED
@@ -37,6 +37,8 @@ packaged for use with the Teradata `mldb.ONNXSeq2Seq` BYOM function.
  - `config.json` — model architecture metadata, copied unchanged from the upstream repo
  - `generation_config.json` — generation defaults, copied unchanged from the upstream repo
 
+ A weight-only int8-quantized variant is also published as `onnx/model-int8.onnx`. Use `model-fp32.onnx` unless deployment size is a constraint.
+
  For the original PyTorch weights and training details, see the upstream model:
  **[Helsinki-NLP/opus-mt_tiny_eng-tur](https://huggingface.co/Helsinki-NLP/opus-mt_tiny_eng-tur)**.
 
@@ -49,7 +51,7 @@ For the original PyTorch weights and training details, see the upstream model:
  | Architecture | MarianMT (encoder-decoder) |
  | Max input tokens | 256 |
  | Max output tokens | 512 |
- | ONNX file size | 177 MB |
+ | ONNX file size | 177 MB (fp32) / 95 MB (int8) |
  | ONNX opset | 14 |
  | ONNX IR version | 8 (BYOM 7.0+ compatible) |
  | License | Apache-2.0 (from upstream) |
@@ -125,7 +127,8 @@ print(tdml.DataFrame.from_query(query))
  This model was produced with the open-source
  [`teradata-opus-translate`](https://pypi.org/project/teradata-opus-translate/)
  package, which exports the encoder/decoder, stitches in the BeamSearch op,
- and verifies parity against PyTorch on a small sample set.
+ applies weight-only int8 quantization, and verifies parity against PyTorch on a
+ small sample set.
 
  > **Note:** the same package can convert *any* Helsinki-NLP MarianMT model
  > (including ones not in this collection) to a BYOM-ready ONNX bundle. If
onnx/model-int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a517d628f907e5261cacf2d797460088303fd485955598872a1741491a78e7d8
+ size 94727718
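
The three added lines are a Git LFS pointer, not the model itself: they record the SHA-256 digest and byte size of the real `model-int8.onnx`. As a minimal sketch (not part of this commit), a downloaded copy could be checked against the pointer roughly as follows. The helper names `parse_lfs_pointer` and `matches_pointer` and the local path are hypothetical; only the pointer text is taken from the diff above.

```python
import hashlib
import os

# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a517d628f907e5261cacf2d797460088303fd485955598872a1741491a78e7d8
size 94727718
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so a ~95 MB model is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(round(pointer["size"] / 1e6), "MB")  # prints "95 MB"

local_path = "onnx/model-int8.onnx"  # assumed local download location
if os.path.exists(local_path):
    print("matches pointer:", matches_pointer(local_path, pointer))
```

The pointer size of 94727718 bytes rounds to the 95 MB (int8) figure quoted in the README table.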