| # Model Overview | |
| This model card covers Helsinki-NLP OPUS-MT models fine-tuned on the Tatoeba dataset for the following language pairs: | |
| English to Marathi (en-mr) | |
| Esperanto to Dutch (eo-nl) | |
| Spanish to Portuguese (es-pt) | |
| French to Russian (fr-ru) | |
| Spanish to Galician (es-gl) | |
| The models perform sequence-to-sequence translation and have been converted to FP16 (half precision) to reduce memory use and speed up inference. | |
| # Model Details | |
| ``` | |
| Base Model: Helsinki-NLP/opus-mt-en-roa | |
| Training Dataset: Tatoeba dataset | |
| Fine-tuned Language Pairs: en-mr, eo-nl, es-pt, fr-ru, es-gl | |
| Evaluation Metric: BLEU Score (using sacreBLEU) | |
| Training Framework: Hugging Face Transformers | |
| Training Configuration | |
| Optimizer: AdamW | |
| Learning Rate: 2e-5 | |
| Batch Size: 16 (per device) | |
| Weight Decay: 0.01 | |
| Epochs: 3 | |
| Precision: FP32 (initial training), converted to FP16 for inference | |
| ``` | |
| # Quantization and FP16 Conversion | |
| To improve inference efficiency, the models were trained in FP32 and then converted to FP16 (half precision). | |
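The conversion step itself is not shown in this card. One way it could be done is sketched below, assuming the FP32 checkpoint was saved with `save_pretrained`; the `convert_to_fp16` helper and the directory names in the comment are hypothetical.

```python
import torch
from transformers import AutoModelForSeq2SeqLM


def convert_to_fp16(src_dir, dst_dir):
    """Load a seq2seq checkpoint, cast it to FP16, and save it to dst_dir."""
    model = AutoModelForSeq2SeqLM.from_pretrained(src_dir)
    # .half() casts all floating-point parameters and buffers to torch.float16.
    model = model.half()
    model.save_pretrained(dst_dir)
    return model


# Hypothetical usage (the source path is an assumption, not from this card):
# convert_to_fp16("fine_tuned_models/en-mr/final/", "fine_tuned_models_fp16/en-mr/final/")
```

The tokenizer is unaffected by the cast and would need to be saved to the same output directory separately so that both can be loaded from one path.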
| # Inference Example | |
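| ```python | |
| import torch | |
| from transformers import AutoModelForSeq2SeqLM, AutoTokenizer | |
| # Local path to the FP16 checkpoint for the en-mr pair | |
| model_dir = "fine_tuned_models_fp16/en-mr/final/" | |
| # Load the weights in half precision and move the model to the GPU | |
| model = AutoModelForSeq2SeqLM.from_pretrained(model_dir, torch_dtype=torch.float16).to("cuda") | |
| tokenizer = AutoTokenizer.from_pretrained(model_dir) | |
| # Tokenize the source sentence and move the tensors to the same device | |
| inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda") | |
| # Generate the translation and decode it back to text | |
| outputs = model.generate(**inputs) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |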
| ``` | |
| # Usage | |
| The models can be used for translation tasks in various NLP applications, including chatbots, document translation, and real-time communication. | |
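For document translation, sentences are usually translated in batches rather than one at a time. The helper below is a minimal sketch of that pattern, assuming a model and tokenizer loaded as in the inference example; `translate_batch` and its default values are hypothetical and not part of the released models.

```python
import torch


def translate_batch(model, tokenizer, sentences, batch_size=16, max_new_tokens=128):
    """Translate a list of sentences in fixed-size batches."""
    translations = []
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        # Pad to the longest sentence in the chunk so it forms one tensor.
        inputs = tokenizer(
            chunk, return_tensors="pt", padding=True, truncation=True
        ).to(model.device)
        # Gradients are not needed at inference time.
        with torch.no_grad():
            outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return translations


# Hypothetical usage:
# print(translate_batch(model, tokenizer, ["Hello, how are you?", "Thank you."]))
```

The batch size trades throughput against GPU memory; 16 is only a starting point.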
| # Limitations | |
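| The models may not generalize well to domain-specific text, since Tatoeba consists mostly of short, everyday sentences. | |
| FP16 conversion may cause a minor loss of numerical precision compared with the FP32 weights. | |
| Translation accuracy depends on the quality of the training data for each language pair. | |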
| # Citation | |
| If you use this model, please cite the original OPUS-MT paper and acknowledge the fine-tuning process conducted using the Tatoeba dataset. | |