# mbart-neutralization

This model is a fine-tuned version of facebook/mbart-large-50 on the hackathon-pln-es/neutral-es dataset. It achieves the following results on the evaluation set:
- Loss: 0.0138
- Bleu: 98.4772
- Gen Len: 18.5104
## Model description
This model is a fine-tuned version of facebook/mbart-large-50, a multilingual sequence-to-sequence Transformer, adapted for Spanish gender neutralization: it transforms gender-marked Spanish sentences into gender-neutral reformulations, preserving meaning while reducing grammatical gender marking. The task can be framed as a monolingual translation problem (Spanish → neutral Spanish).

The model was trained with the Hugging Face Transformers library and follows a standard encoder–decoder architecture, using transfer learning from the pretrained mBART checkpoint. The resulting system performs controlled rewriting rather than translation between languages, making it suitable for experiments in:
- inclusive language generation
- stylistic rewriting
- bias reduction in text
- controlled text transformation
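As a minimal sketch of how the resulting checkpoint could be queried (the repo id `CarolGuga/mbart-neutralization` is taken from this card; the generation settings and the example sentence are illustrative assumptions, not documented here):

```python
def neutralize(text, model, tokenizer, max_length=64):
    """Rewrite a gendered Spanish sentence into a gender-neutral form."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # num_beams / max_length are assumed decoding settings, not from the card
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Loading downloads the full mBART checkpoint, so it is kept out of import.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("CarolGuga/mbart-neutralization")
    model = AutoModelForSeq2SeqLM.from_pretrained("CarolGuga/mbart-neutralization")
    print(neutralize("Los trabajadores llegaron tarde.", model, tokenizer))
```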
## Intended uses & limitations
This model is intended for:
- Research experiments in NLP and inclusive language
- Educational purposes in courses on Machine Translation or Text Generation
- Demonstrations of transfer learning using multilingual seq2seq models
- Automatic rewriting of short Spanish sentences into gender-neutral forms
## Training and evaluation data
The model was trained on the Spanish Gender Neutralization dataset available on Hugging Face: hackathon-pln-es/neutral-es. The dataset contains pairs of aligned sentences:
- gendered: original sentence with grammatical gender marking
- neutral: reformulated gender-neutral version
The dataset already includes a predefined split:
- Training set
- Test set
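A short sketch of loading this dataset (the `gendered`/`neutral` column names are as described above; inspecting one row is purely illustrative):

```python
DATASET_ID = "hackathon-pln-es/neutral-es"

def load_pairs():
    """Load the gendered/neutral sentence pairs with their predefined split."""
    from datasets import load_dataset  # lazy import; triggers a download
    return load_dataset(DATASET_ID)

if __name__ == "__main__":
    ds = load_pairs()
    print(ds)              # DatasetDict with the predefined train/test splits
    print(ds["train"][0])  # one aligned gendered/neutral pair
```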
The dataset is relatively small and designed mainly for educational and experimental purposes, not for large-scale production systems.

Before training, the data was:
- tokenized using the mBART tokenizer
- truncated/padded to model limits
- converted into input/label format for seq2seq training
Evaluation was performed using the BLEU score (sacrebleu), a standard metric in machine translation.
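The preprocessing and evaluation plumbing described above can be sketched as follows. The helper names are ours, not from the original training code; the max length is an assumption; `-100` is the standard label ignore index used by the Transformers Trainer, and sacrebleu expects references wrapped as a list of reference lists:

```python
MAX_LENGTH = 128  # assumed truncation limit, not stated in the card

def preprocess(batch, tokenizer):
    """Tokenize gendered sources and neutral targets for seq2seq training."""
    return tokenizer(
        batch["gendered"],
        text_target=batch["neutral"],
        max_length=MAX_LENGTH,
        truncation=True,
    )

def replace_ignored_labels(label_ids, pad_token_id):
    """The Trainer masks label padding with -100; restore a real pad id so
    the label sequences can be decoded before BLEU scoring."""
    return [[t if t != -100 else pad_token_id for t in seq] for seq in label_ids]

def postprocess_text(preds, refs):
    """sacrebleu expects stripped hypotheses and references given as a
    list of reference lists (one list per hypothesis)."""
    return [p.strip() for p in preds], [[r.strip()] for r in refs]
```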
## Training procedure
The model was trained using the Hugging Face Trainer API for sequence-to-sequence learning. Training steps:
- The pretrained model facebook/mbart-large-50 was loaded.
- The dataset was tokenized using the corresponding mBART tokenizer.
- Inputs were formatted as:
- source: gendered sentence
- target: neutral sentence
- The model was fine-tuned using transfer learning.
- Training was performed on GPU in Google Colab.
- Evaluation during training used the sacrebleu metric.
- The final model was uploaded to the Hugging Face Hub.
The model therefore learns to perform monolingual rewriting with a multilingual translation architecture.
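The steps above can be sketched end to end. Names such as `output_dir`, the data collator choice, and the `Seq2SeqTrainingArguments` left at defaults are assumptions; the heavy calls sit behind the `__main__` guard:

```python
def to_seq2seq_pair(example):
    """Map one dataset row to the source/target format described above."""
    return {"source": example["gendered"], "target": example["neutral"]}

if __name__ == "__main__":
    from datasets import load_dataset
    from transformers import (
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    # 1. Load the pretrained model and its tokenizer.
    model_id = "facebook/mbart-large-50"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # 2. Tokenize the dataset (gendered as source, neutral as target).
    ds = load_dataset("hackathon-pln-es/neutral-es")
    tokenized = ds.map(
        lambda b: tokenizer(b["gendered"], text_target=b["neutral"], truncation=True),
        batched=True,
    )

    # 3. Fine-tune with the Trainer API, then upload to the Hub.
    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="mbart-neutralization"),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    trainer.push_to_hub()
```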
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
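For reference, the listed hyperparameters map onto `Seq2SeqTrainingArguments` keywords roughly as below. `output_dir` and `predict_with_generate` are assumptions not stated in the card; AdamW with these betas and epsilon is the Trainer default, so it needs no explicit argument:

```python
# Hypothetical mapping of the card's hyperparameters to keyword names.
TRAINING_KWARGS = dict(
    output_dir="mbart-neutralization",   # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    predict_with_generate=True,          # assumed: enables BLEU during eval
)

if __name__ == "__main__":
    from transformers import Seq2SeqTrainingArguments

    args = Seq2SeqTrainingArguments(**TRAINING_KWARGS)
    print(args.learning_rate, args.num_train_epochs)
```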
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 440 | 0.0246 | 98.2861 | 18.5729 |
| 0.2226 | 2.0 | 880 | 0.0138 | 98.4772 | 18.5104 |
### Framework versions
- Transformers 4.51.2
- PyTorch 2.10.0+cu128
- Datasets 4.6.0
- Tokenizers 0.21.4