Datasets used to train the supervised NMT tasks?

#25

by OrianeN - opened Mar 11, 2024

If I understood the paper correctly, T5-small was trained on supervised tasks, including NMT in three directions: EN>FR, EN>RO, and EN>DE.

However, I can't find the datasets used to train on these supervised tasks. Could you please add them to the model card?

If I'm not mistaken, the paper mentions the following training datasets for the NMT tasks:

EN>DE: News Commentary v13, Common Crawl, Europarl v7 (+ newstest2013 for validation)
EN>FR: WMT 2015 (+ newstest2014 for validation)
EN>RO: WMT 2016
