Translation
Transformers
ONNX
Safetensors
Turkish
English
marian
text2text-generation
LiteRT
Helsinki-NLP/tatoeba
openlanguagedata/flores_plus
teradata
How to use Teradata/opus-mt_tiny_tur-eng with Transformers:

```python
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("translation", model="Teradata/opus-mt_tiny_tur-eng")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Teradata/opus-mt_tiny_tur-eng")
model = AutoModelForSeq2SeqLM.from_pretrained("Teradata/opus-mt_tiny_tur-eng")
```
Initial publish via teradata-opus-translate
README.md
CHANGED
````diff
@@ -33,7 +33,6 @@ packaged for use with the Teradata `mldb.ONNXSeq2Seq` BYOM function.
 **This repository does not redistribute the original model weights.** It contains only:
 
 - `onnx/model-fp32.onnx` — full-precision ONNX graph
-- `onnx/model-int8.onnx` — dynamically quantized ONNX graph
 - `tokenizer.json` — repacked Marian tokenizer suitable for BYOM
 - `config.json` — model architecture metadata, copied unchanged from the upstream repo
 - `generation_config.json` — generation defaults, copied unchanged from the upstream repo
@@ -50,7 +49,7 @@ For the original PyTorch weights and training details, see the upstream model:
 | Architecture | MarianMT (encoder-decoder) |
 | Max input tokens | 256 |
 | Max output tokens | 512 |
-| ONNX file
+| ONNX file size | 177 MB |
 | ONNX opset | 14 |
 | ONNX IR version | 8 (BYOM 7.0+ compatible) |
 | License | Apache-2.0 (from upstream) |
@@ -121,15 +120,12 @@ FROM mldb.ONNXSeq2Seq(
 print(tdml.DataFrame.from_query(query))
 ```
 
-An int8-quantized variant is also published as `onnx/model-int8.onnx`. The int8 variant does not accept `num_beams` (configured internally).
-
 ## How this model was converted
 
 This model was produced with the open-source
 [`teradata-opus-translate`](https://pypi.org/project/teradata-opus-translate/)
 package, which exports the encoder/decoder, stitches in the BeamSearch op,
-
-small sample set.
+and verifies parity against PyTorch on a small sample set.
 
 > **Note:** the same package can convert *any* Helsinki-NLP MarianMT model
 > (including ones not in this collection) to a BYOM-ready ONNX bundle. If
````
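The README text in this diff says the conversion "verifies parity against PyTorch on a small sample set". The sketch below illustrates what such a check could look like. It is not the `teradata-opus-translate` implementation: `ParityReport`, `parity_report`, and the sample strings are hypothetical, and it assumes parity means exact string equality (ignoring surrounding whitespace) between the two sets of translations, which may differ from what the package actually compares.

```python
from dataclasses import dataclass, field


@dataclass
class ParityReport:
    """Summary of how often two backends produced the same translation."""
    total: int
    matches: int
    mismatches: list = field(default_factory=list)  # (index, reference, candidate)

    @property
    def match_rate(self) -> float:
        # Treat an empty sample set as fully matching (nothing disagreed).
        return self.matches / self.total if self.total else 1.0


def parity_report(reference: list[str], candidate: list[str]) -> ParityReport:
    """Compare reference (e.g. PyTorch) and candidate (e.g. ONNX) outputs pairwise."""
    if len(reference) != len(candidate):
        raise ValueError("both backends must translate the same sample set")
    mismatches = [
        (i, ref, cand)
        for i, (ref, cand) in enumerate(zip(reference, candidate))
        if ref.strip() != cand.strip()
    ]
    return ParityReport(len(reference), len(reference) - len(mismatches), mismatches)


# Hypothetical outputs for three sentences. In a real check these lists would
# come from the PyTorch model and from the exported ONNX graph respectively.
pytorch_out = ["Hello, world!", "How are you?", "I am going home."]
onnx_out = ["Hello, world!", "How are you?", "I'm going home."]

report = parity_report(pytorch_out, onnx_out)
print(f"{report.matches}/{report.total} identical")
for index, ref, cand in report.mismatches:
    print(f"sample {index}: {ref!r} != {cand!r}")
```

Exact string comparison is the strictest reasonable criterion. A looser check (for example, comparing token IDs, or tolerating small score differences between beams) might be more appropriate for a quantized graph, but that choice is outside what the README states.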