may-ohta/iwslt14
This is a JoeyNMT model for multilingual machine translation (MT) with language tags, built for demonstration purposes. The model was trained on IWSLT14 de-en and en-fr parallel data using DistributedDataParallel (DDP).
Install JoeyNMT v2.3:
$ pip install git+https://github.com/joeynmt/joeynmt.git
Torch hub interface:
import torch

iwslt14 = torch.hub.load("joeynmt/joeynmt", "iwslt14_prompt")
translation = iwslt14.translate(
    src=["Hello world!"],  # src sentence
    src_prompt=["<en>"],   # src language code
    trg_prompt=["<de>"],   # trg language code
    beam_size=1,
)
print(translation)  # ["Hallo Welt!"]
(See jupyter notebook for details)
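The call above takes one list entry per sentence, with a language tag for each side. As a minimal sketch, the helper below turns (sentence, source language, target language) triples into those three lists. `build_translate_kwargs` is a hypothetical name and not part of JoeyNMT; the sketch assumes the lists are aligned by position and that tags follow the `<xx>` pattern shown above.

```python
from typing import Dict, Iterable, List, Tuple


def build_translate_kwargs(
    requests: Iterable[Tuple[str, str, str]],
) -> Dict[str, List[str]]:
    """Hypothetical helper: turn (sentence, src_lang, trg_lang) triples into
    the parallel lists used by the translate() call above."""
    kwargs: Dict[str, List[str]] = {"src": [], "src_prompt": [], "trg_prompt": []}
    for sentence, src_lang, trg_lang in requests:
        kwargs["src"].append(sentence)
        # Assumption: language tags are written as <xx>, e.g. <en> or <de>.
        kwargs["src_prompt"].append(f"<{src_lang}>")
        kwargs["trg_prompt"].append(f"<{trg_lang}>")
    return kwargs


batch = build_translate_kwargs([
    ("Hello world!", "en", "de"),
    ("Vielen Dank!", "de", "en"),
])
print(batch["src_prompt"])  # ['<en>', '<de>']
```

The result could then be passed as `iwslt14.translate(**batch, beam_size=1)`, provided the model accepts mixed directions in a single batch, which this card does not state.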
$ python -m joeynmt train iwslt14_prompt/config.yaml --use-ddp --skip-test
(See train.log for details)
$ git clone https://huggingface.co/may-ohta/iwslt14_prompt
$ python -m joeynmt test iwslt14_prompt/config.yaml --output-path iwslt14_prompt/hyp
| direction | bleu |
|---|---|
| en->de | 28.88 |
| de->en | 35.28 |
| en->fr | 38.86 |
| fr->en | 40.35 |
nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0
(See test.log for details)
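The metric signature reported with the scores above has the form of a sacreBLEU signature: `|`-separated `key:value` fields recording how the scores were computed (here: one reference, lowercased text, the `13a` tokenizer, exponential smoothing). A small sketch that splits it into a dictionary, for example to compare settings across runs, could look like this; the variable names are illustrative only.

```python
signature = "nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0"

# Each field is a key:value pair; fields are separated by "|".
fields = dict(item.split(":", 1) for item in signature.split("|"))

print(fields["tok"])      # 13a
print(fields["version"])  # 2.4.0
```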
We downloaded IWSLT14 de-en and en-fr from https://wit3.fbk.eu/2014-01 and created {train|dev|test}.tsv files in the following format:
| src_prompt | src | trg_prompt | trg |
|---|---|---|---|
| <en> | Hello. | <de> | Hallo. |
| <de> | Vielen Dank! | <en> | Thank you! |
(See test.ref.de-en.tsv)
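As a rough illustration of this four-column layout, the sketch below writes the two example rows as tab-separated text using Python's standard `csv` module. It is not the script used to build the released files. The header row and the default quoting behaviour are assumptions, and the output goes to an in-memory buffer rather than a real `train.tsv`.

```python
import csv
import io

FIELDS = ["src_prompt", "src", "trg_prompt", "trg"]

# The two example rows from the table above.
rows = [
    {"src_prompt": "<en>", "src": "Hello.", "trg_prompt": "<de>", "trg": "Hallo."},
    {"src_prompt": "<de>", "src": "Vielen Dank!", "trg_prompt": "<en>", "trg": "Thank you!"},
]

# Write to an in-memory buffer; replace it with open("train.tsv", "w", newline="")
# to write an actual file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
writer.writeheader()  # Assumption: the files start with a header row.
writer.writerows(rows)

tsv_text = buffer.getvalue()
print(tsv_text)
```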