
	---
	pipeline_tag: translation
	---

	# cometoid22-wmt21

	A referenceless (quality-estimation) metric for machine translation evaluation.
	It was created by knowledge distillation from [wmt22-comet-da](https://huggingface.co/Unbabel/wmt22-comet-da), a reference-based teacher metric.
	Refer to [the publication](https://aclanthology.org/2023.wmt-1.62) for technical details.


	## Setup

	Option 1: Install `pymarian`, the Python bindings for Marian:

	```bash
	pip install pymarian
	```

	Option 2: Build the `marian` binary by following the quickstart guide: https://marian-nmt.github.io/quickstart/
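
Either option should leave a scorer executable on your `PATH`. As a rough check, the sketch below reports which one is available. The helper name `find_scorer` is hypothetical and not part of Marian or pymarian; it relies only on the shell's `command -v` builtin and the two executable names used in this card.

```bash
# Hypothetical helper: print the first scorer found on PATH,
# preferring pymarian-eval over the marian binary.
find_scorer() {
  local tool
  for tool in pymarian-eval marian; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  # Neither executable was found.
  return 1
}
```

Calling `find_scorer || echo "no scorer installed"` after setup gives a quick indication of whether the installation is visible to the shell.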


	## Usage

	Pymarian
	```bash
	pymarian-eval -m checkpoints/marian.model.bin -v vocab.spm --like comet-qe -s src.txt -t mt.out.txt
	```

	Marian

	```bash
	paste src.txt mt.out.txt | marian evaluate --quiet --model checkpoints/marian.model.bin --vocabs vocab.spm vocab.spm --width 4 --like comet-qe \
	--mini-batch 16 --maxi-batch 256 --max-length 512 --max-length-crop true --workspace 8000
	```
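
Both commands read the source and the translation line by line, so the two files need the same number of lines. The sketch below adds a line-count check and a system-level average on top of the segment scores. It assumes the scorer prints one numeric score per line on standard output, which this card does not confirm; `same_line_count` and `mean_score` are hypothetical helper names built only from standard `wc`, `tr`, and `awk`.

```bash
# Hypothetical helper: succeed only if both files have the same number of lines.
same_line_count() {
  local n_src n_mt
  n_src=$(wc -l < "$1" | tr -d ' ')
  n_mt=$(wc -l < "$2" | tr -d ' ')
  [ "$n_src" -eq "$n_mt" ]
}

# Hypothetical helper: read one score per line on stdin (assumed output format)
# and print the mean to four decimals. Blank lines are ignored.
mean_score() {
  awk 'NF { sum += $1; n++ } END { if (n) printf "%.4f\n", sum / n }'
}

# Example (commented out because it needs the model files from this card):
# same_line_count src.txt mt.out.txt &&
#   pymarian-eval -m checkpoints/marian.model.bin -v vocab.spm --like comet-qe \
#     -s src.txt -t mt.out.txt | mean_score
```

If the scorer prints anything other than bare scores, filter its output before piping it into `mean_score`.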


	More info at https://github.com/marian-nmt/wmt23-metrics



	## Reference
	```
	@inproceedings{gowda-etal-2023-cometoid,
	title = "Cometoid: Distilling Strong Reference-based Machine Translation Metrics into {E}ven Stronger Quality Estimation Metrics",
	author = "Gowda, Thamme and
	Kocmi, Tom and
	Junczys-Dowmunt, Marcin",
	editor = "Koehn, Philipp and
	Haddow, Barry and
	Kocmi, Tom and
	Monz, Christof",
	booktitle = "Proceedings of the Eighth Conference on Machine Translation",
	month = dec,
	year = "2023",
	address = "Singapore",
	publisher = "Association for Computational Linguistics",
	url = "https://aclanthology.org/2023.wmt-1.62",
	pages = "751--755",
	}
	```