thammegowda committed on
Commit
8f41cc4
·
verified ·
1 Parent(s): 3128a4a

Create README.md

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ pipeline_tag: translation
+ ---
+
+ # cometoid22-wmt23
+
+ A referenceless/quality-estimation metric for machine translation evaluation.
+ This metric was created by knowledge distillation from [wmt22-comet-da](https://huggingface.co/Unbabel/wmt22-comet-da), a reference-based teacher metric.
+ Refer to [the publication](https://aclanthology.org/2023.wmt-1.62) for technical details.
+
+
+ ## Setup
+
+ Option 1: Install `pymarian`, the Python bindings for Marian
+
+ ```bash
+ pip install pymarian
+ ```
+
+ Option 2: Build the Marian binary; see https://marian-nmt.github.io/quickstart/
+
+
+ ## Usage
+
+ **Pymarian**
+ ```bash
+ pymarian-eval -m checkpoints/marian.model.bin -v vocab.spm --like comet-qe -s src.txt -t mt.out.txt
+ ```
+
+ **Marian**
+
+ ```bash
+ paste src.txt mt.out.txt | marian evaluate --quiet --model checkpoints/marian.model.bin --vocabs vocab.spm vocab.spm --width 4 --like comet-qe \
+   --mini-batch 16 --maxi-batch 256 --max-length 512 --max-length-crop true --workspace 8000
+ ```
+
+
+ More info at https://github.com/marian-nmt/wmt23-metrics
+
+
+
+ ## Reference
+ ```
+ @inproceedings{gowda-etal-2023-cometoid,
+     title = "Cometoid: Distilling Strong Reference-based Machine Translation Metrics into {E}ven Stronger Quality Estimation Metrics",
+     author = "Gowda, Thamme and
+       Kocmi, Tom and
+       Junczys-Dowmunt, Marcin",
+     editor = "Koehn, Philipp and
+       Haddow, Barry and
+       Kocmi, Tom and
+       Monz, Christof",
+     booktitle = "Proceedings of the Eighth Conference on Machine Translation",
+     month = dec,
+     year = "2023",
+     address = "Singapore",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2023.wmt-1.62",
+     pages = "751--755",
+ }
+ ```
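
Note on the Marian command in the README's Usage section: `paste` joins line i of `src.txt` with line i of `mt.out.txt` using a tab, so the two files should have the same number of lines. The sketch below shows one way to check that before scoring. The file names `demo.src.txt`, `demo.mt.txt`, and `demo.pairs.tsv` and the sample sentences are assumptions made up for illustration; they are not part of the commit.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical sample data, for illustration only (not from the README).
printf 'Hallo Welt\nGuten Morgen\n' > demo.src.txt
printf 'Hello world\nGood morning\n' > demo.mt.txt

# paste pairs the files line by line, so their line counts must match.
src_lines=$(( $(wc -l < demo.src.txt) ))
mt_lines=$(( $(wc -l < demo.mt.txt) ))
if [ "$src_lines" -ne "$mt_lines" ]; then
  echo "line count mismatch: $src_lines vs $mt_lines" >&2
  exit 1
fi

# One tab-separated "source<TAB>translation" pair per line.
paste demo.src.txt demo.mt.txt > demo.pairs.tsv
```

The resulting `demo.pairs.tsv` holds the same bytes that `paste` would write to the pipe, so it could be fed to `marian evaluate` on standard input in place of the `paste src.txt mt.out.txt` step.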