Papers
arxiv:2401.12413

How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

Published on Jan 22, 2024
Authors:
,
,
,
,

Abstract

Zero-shot translation in multilingual machine translation achieves significant improvements through minimal multi-parallel data fine-tuning while maintaining English-centric translation quality.

AI-generated summary

Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem. A common, albeit resource-consuming, solution is to add as many related translation directions as possible to the training corpus. In this paper, we show that for an English-centric model, surprisingly large zero-shot improvements can be achieved by simply fine-tuning with a very small amount of multi-parallel data. For example, on the EC30 dataset, we obtain up to +21.7 ChrF non-English overall improvements (870 directions) by using only 100 multi-parallel samples while preserving English-centric translation quality. When investigating the size effect of fine-tuning data and its transfer capabilities, we found that already a small, randomly sampled set of fine-tuning directions is sufficient to achieve comparable improvements. The resulting non-English performance is close to the complete translation upper bound. Even in a minimal setting -- fine-tuning with only one single sample -- the well-known off-target issue is almost completely resolved, explaining parts -- but not all -- of the observed improvements in translation quality.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2401.12413
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2401.12413 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2401.12413 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2401.12413 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.