---
pipeline_tag: translation
---

# cometoid22-wmt21

A referenceless quality-estimation metric for machine translation evaluation. The metric was created by knowledge distillation from [wmt22-comet-da](https://huggingface.co/Unbabel/wmt22-comet-da) (a reference-based teacher). Refer to [the publication](https://aclanthology.org/2023.wmt-1.62) for technical details.

## Setup

Option 1: Install `pymarian`, the Python bindings for Marian

```bash
pip install pymarian
```

Option 2: Build the Marian binary; see https://marian-nmt.github.io/quickstart/

## Usage

**Pymarian**

```bash
pymarian-eval -m checkpoints/marian.model.bin -v vocab.spm --like comet-qe -s src.txt -t mt.out.txt
```

**Marian**

```bash
paste src.txt mt.out.txt | marian evaluate --quiet --model checkpoints/marian.model.bin --vocabs vocab.spm vocab.spm --width 4 --like comet-qe \
  --mini-batch 16 --maxi-batch 256 --max-length 512 --max-length-crop true --workspace 8000
```

More info at https://github.com/marian-nmt/wmt23-metrics

## Reference

```
@inproceedings{gowda-etal-2023-cometoid,
    title = "Cometoid: Distilling Strong Reference-based Machine Translation Metrics into {E}ven Stronger Quality Estimation Metrics",
    author = "Gowda, Thamme and Kocmi, Tom and Junczys-Dowmunt, Marcin",
    editor = "Koehn, Philipp and Haddow, Barry and Kocmi, Tom and Monz, Christof",
    booktitle = "Proceedings of the Eighth Conference on Machine Translation",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.wmt-1.62",
    pages = "751--755",
}
```
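
A note on inputs: the usage commands above pair `src.txt` and `mt.out.txt` line by line, so the two files presumably need the same number of lines. The following is a minimal sketch of a pre-flight check. `check_parallel` is a hypothetical helper written for illustration; it is not part of Marian or `pymarian`.

```bash
# Hypothetical helper (not part of Marian or pymarian): succeed only when the
# source file ($1) and the MT output file ($2) have the same number of lines.
check_parallel() {
  # $(( ... )) normalizes the whitespace some wc implementations emit.
  src_lines=$(( $(wc -l < "$1") ))
  mt_lines=$(( $(wc -l < "$2") ))
  if [ "$src_lines" -ne "$mt_lines" ]; then
    echo "Line count mismatch: $1 has $src_lines, $2 has $mt_lines" >&2
    return 1
  fi
}
```

It could be chained in front of either scoring command, for example `check_parallel src.txt mt.out.txt && pymarian-eval ...`, so that scoring only starts when the files line up.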