---
language:
- eo
- en
- es
- ca
tags:
- translation
- machine-translation
- marian
- opus-mt
- multilingual
license: cc-by-4.0
pipeline_tag: translation
metrics:
- bleu
- chrf
---

# Esperanto -> Catalan, English, Spanish MT Model

## Model description

This repository contains a **multilingual MarianMT** model for **Esperanto → (English, Spanish, Catalan)** translation. It uses a tiny architecture and selects the target language through language tags.

This model is **not intended for direct inference through the Hugging Face `transformers` library**. Use [**Marian**](https://marian-nmt.github.io/docs/) for inference instead.

The repository includes the following files:

- `model.npz.best-chrf.npz`: trained Marian model checkpoint
- `tiny.decoder.yml`: decoder configuration
- `vocab.spm`: SentencePiece vocabulary
- `run_model.sh`: example script showing how to run the model

### Supported target languages (via tags)

You control the target language by prefixing each source sentence with one of the following tags:

* `>>eng<<` → English
* `>>spa<<` → Spanish
* `>>cat<<` → Catalan

## Training data

The model was trained on **Tatoeba** parallel data, with **FLORES-200** used as the development set.

Training sentence-pair counts:

* **ca-eo**: 672,931
* **es-eo**: 4,677,945
* **eo-en**: 5,000,000

## Inference

Run decoding from inside the model directory. The command below translates `input.epo` into Catalan; change the tag and the output file name for another target language:

```bash
# Prefix every source line with the target-language tag, then decode.
cat input.epo | sed "s/^/>>cat<< /" | \
marian-decoder \
  -c tiny.decoder.yml \
  --output output.cat \
  --normalize \
  -m model.npz.best-chrf.npz \
  --vocabs vocab.spm vocab.spm \
  --log decode.log \
  --devices 0
```
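
The tag prefixing described under "Supported target languages" can be sketched as a small shell helper. This is only an illustration: `add_tag` is a hypothetical function name, not something shipped in this repository, and the sample sentence is made up. It assumes the three tags listed above are the only valid ones and rejects anything else.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): prefix every line read
# from stdin with the target-language tag for the given language code.
add_tag() {
  local lang="$1"
  case "$lang" in
    eng|spa|cat) ;;  # tags documented in this model card
    *) echo "unsupported target language: $lang" >&2; return 1 ;;
  esac
  sed "s/^/>>${lang}<< /"
}

# Example: tag the same Esperanto sentence once per supported target.
for lang in eng spa cat; do
  printf 'Saluton, mondo!\n' | add_tag "$lang"
done
```

The tagged lines can then be piped into `marian-decoder` in the same way as the Inference command, with one output file per target language.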