Teradata/opus-mt_tiny_eng-tur

text2text-generation

Helsinki-NLP/tatoeba

openlanguagedata/flores_plus


sasha-smirnov committed 3 days ago

Commit 0cdbca6 · verified · 1 Parent(s): 726a8b3

Initial publish via teradata-opus-translate

Files changed (2)

README.md +5 -2
onnx/model-int8.onnx +3 -0

README.md CHANGED

@@ -37,6 +37,8 @@ packaged for use with the Teradata `mldb.ONNXSeq2Seq` BYOM function.
 - `config.json` — model architecture metadata, copied unchanged from the upstream repo
 - `generation_config.json` — generation defaults, copied unchanged from the upstream repo
+A weight-only int8-quantized variant is also published as `onnx/model-int8.onnx`. Use `model-fp32.onnx` unless deployment size is a constraint.
 For the original PyTorch weights and training details, see the upstream model:
 **[Helsinki-NLP/opus-mt_tiny_eng-tur](https://huggingface.co/Helsinki-NLP/opus-mt_tiny_eng-tur)**.
@@ -49,7 +51,7 @@ For the original PyTorch weights and training details, see the upstream model:
 | Architecture | MarianMT (encoder-decoder) |
 | Max input tokens | 256 |
 | Max output tokens | 512 |
-| ONNX file size | 177 MB |
+| ONNX file size | 177 MB (fp32) / 95 MB (int8) |
 | ONNX opset | 14 |
 | ONNX IR version | 8 (BYOM 7.0+ compatible) |
 | License | Apache-2.0 (from upstream) |
@@ -125,7 +127,8 @@ print(tdml.DataFrame.from_query(query))
 This model was produced with the open-source
 [`teradata-opus-translate`](https://pypi.org/project/teradata-opus-translate/)
 package, which exports the encoder/decoder, stitches in the BeamSearch op,
-and verifies parity against PyTorch on a small sample set.
+applies weight-only int8 quantization, and verifies parity against PyTorch on a
+small sample set.
 > **Note:** the same package can convert *any* Helsinki-NLP MarianMT model
 > (including ones not in this collection) to a BYOM-ready ONNX bundle. If
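
The README change mentions weight-only int8 quantization and a size drop from 177 MB to 95 MB. The diff does not say which scheme `teradata-opus-translate` uses, so the sketch below is only an illustration of the general idea: a symmetric, per-tensor mapping from floats to int8 with a single scale. The helper names (`quantize_int8`, `dequantize_int8`, `quantized_fraction`) are hypothetical and not part of the package. The last helper does a rough size estimate under the simplifying assumption that quantized tensors shrink exactly 4x and everything else is unchanged.

```python
# Illustrative sketch only: symmetric per-tensor int8 weight quantization.
# The real scheme used by teradata-opus-translate is not stated in the diff.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one floating-point scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]


def quantized_fraction(fp32_mb, int8_mb):
    """Fraction f of the fp32 bytes that would need to shrink 4x to explain
    the smaller file, ignoring scales and graph overhead (an assumption).

    Solves fp32_mb * (1 - f) + fp32_mb * f / 4 == int8_mb for f.
    """
    return (fp32_mb - int8_mb) / (0.75 * fp32_mb)


weights = [0.50, -1.27, 0.003, 0.9]          # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
fraction = quantized_fraction(177, 95)        # sizes from the README table
```

For these made-up weights the round-trip error stays within half a quantization step. With the sizes from the table, `fraction` comes out at roughly 0.62, which under the stated assumption would suggest that not every tensor is stored as int8; the diff does not say which ones are.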

onnx/model-int8.onnx ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a517d628f907e5261cacf2d797460088303fd485955598872a1741491a78e7d8
+size 94727718
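
The three added lines are a Git LFS pointer, not the model itself: the `oid` is the SHA-256 of the real file and `size` is its length in bytes. As a minimal sketch, a downloaded `model-int8.onnx` could be checked against the pointer as follows. The function names are hypothetical, and the pointer text is copied from the diff above.

```python
import hashlib

# Pointer contents as shown in the diff (without the leading "+").
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a517d628f907e5261cacf2d797460088303fd485955598872a1741491a78e7d8
size 94727718
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file's byte length and SHA-256 match the pointer."""
    hasher = hashlib.sha256()
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
size_mb = pointer["size"] / 1_000_000  # decimal megabytes
```

Here `size_mb` is about 94.7, which is consistent with the "95 MB (int8)" figure in the README table. Calling `verify_file("onnx/model-int8.onnx", pointer)` on a local copy (the path is an assumption) would confirm that the download is complete and unmodified.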