mbart-de-es
This model is a fine-tuned version of facebook/mbart-large-50 on Helsinki-NLP/opus_books (https://huggingface.co/datasets/Helsinki-NLP/opus_books). It achieves the following results on the evaluation set:
- Loss: 2.4208
- Bleu: 8.1685
- Gen Len: 28.2661
Model description
This model has been delivered as part of a class project.
Intended uses & limitations
The model is intended for translating from German to Spanish. As the low BLEU score shows, its translation quality is very low, a consequence of the limited computing resources used to train it.
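For readers who want to try the model, the following is a minimal inference sketch rather than an official usage recipe. It assumes the checkpoint is published on the Hub as ntr2026/mbart-de-es, that the repository includes tokenizer files (if it does not, the tokenizer of facebook/mbart-large-50 could be loaded instead), and that fine-tuning used the standard mBART-50 language codes de_DE and es_XX. The helper names `load` and `translate` are illustrative.

```python
MODEL_ID = "ntr2026/mbart-de-es"  # assumed Hub repository name
SRC_LANG = "de_DE"                # mBART-50 code for German (assumed to match training)
TGT_LANG = "es_XX"                # mBART-50 code for Spanish (assumed to match training)


def load(model_id=MODEL_ID):
    """Load the tokenizer and model; transformers is imported lazily."""
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(
        model_id, src_lang=SRC_LANG, tgt_lang=TGT_LANG
    )
    model = MBartForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, max_new_tokens=64):
    """Translate one German sentence to Spanish."""
    inputs = tokenizer(text, return_tensors="pt")
    # mBART-50 selects the output language by forcing the first generated
    # token to be the target language code.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example usage (downloads the checkpoint on first call):
# tokenizer, model = load()
# print(translate("Der Himmel ist heute klar.", tokenizer, model))
```

Given the BLEU score reported above, outputs should be treated as rough drafts and checked by a Spanish speaker.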
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
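To make the linear scheduler concrete, here is a small sketch of how the learning rate would decay over this run. It assumes no warmup, since no warmup setting is listed, and takes the total of 1016 optimizer steps from the training results table. The function `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule; it is not code from the training script.

```python
BASE_LR = 5.6e-05    # learning_rate from the hyperparameter list
TOTAL_STEPS = 1016   # 2 epochs x 508 steps, per the training results table
WARMUP_STEPS = 0     # assumption: no warmup is listed, so none is modeled


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, after epoch 1, and at the end of training.
for step in (0, 508, 1016):
    print(step, linear_lr(step))
```

Under these assumptions the rate has halved by the end of the first epoch and reaches zero at the final step.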
Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 3.0574 | 1.0 | 508 | 2.4566 | 7.5927 | 27.4213 |
| 1.8147 | 2.0 | 1016 | 2.4208 | 8.1685 | 28.2661 |
Framework versions
- Transformers 4.51.2
- Pytorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4