mbart-neutralization

This model is a fine-tuned version of facebook/mbart-large-50 on the hackathon-pln-es/neutral-es dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0138
  • Bleu: 98.4772
  • Gen Len: 18.5104

Model description

This model is a fine-tuned version of facebook/mbart-large-50, a multilingual sequence-to-sequence Transformer, adapted for the task of Spanish gender neutralization: transforming gender-marked Spanish sentences into gender-neutral reformulations that preserve meaning while reducing grammatical gender marking. The task can be framed as a monolingual translation problem (Spanish → neutral Spanish).

The model was trained using the Hugging Face Transformers library and follows a standard encoder–decoder architecture with transfer learning from the pretrained mBART model. The resulting system performs controlled rewriting rather than translation between languages, making it suitable for experiments in:

  • inclusive language generation
  • stylistic rewriting
  • bias reduction in text
  • controlled text transformation
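
A minimal inference sketch for the rewriting task described above. The Hub id `CarolGuga/mbart-neutralization` is taken from this card; the `es_XX` language code and the `forced_bos_token_id` call follow the usual mBART-50 convention and are assumptions, since the card does not spell out the decoding setup.

```python
# Sketch only: loads the fine-tuned model and rewrites one sentence.
# Assumes mBART-50 conventions (src_lang / forced BOS = "es_XX").
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "CarolGuga/mbart-neutralization"

def neutralize(text: str) -> str:
    """Rewrite a gender-marked Spanish sentence into a neutral form."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang="es_XX")
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    # Force Spanish as the target language, as mBART-50 expects.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("es_XX"),
        max_length=64,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `neutralize("Los alumnos llegaron temprano")` should return a gender-neutral reformulation of the input sentence.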

Intended uses & limitations

This model is intended for:

  • Research experiments in NLP and inclusive language
  • Educational purposes in courses on Machine Translation or Text Generation
  • Demonstrations of transfer learning using multilingual seq2seq models
  • Automatic rewriting of short Spanish sentences into gender-neutral forms

Training and evaluation data

The model was trained on the Spanish Gender Neutralization dataset available on Hugging Face: hackathon-pln-es/neutral-es. The dataset contains pairs of aligned sentences:

  • gendered: original sentence with grammatical gender marking
  • neutral: reformulated gender-neutral version

The dataset already includes a predefined split:

  • Training set
  • Test set

The dataset is relatively small and designed mainly for educational and experimental purposes, not for large-scale production systems. Before training, the data was:

  • tokenized using the mBART tokenizer
  • truncated/padded to model limits
  • converted into input/label format for seq2seq training
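
The preprocessing steps above can be sketched as a batch-mapping function for `dataset.map(..., batched=True)`. The column names `gendered` and `neutral` come from the dataset; `max_length=128` is an assumption, since the card only says "truncated/padded to model limits":

```python
# Hedged sketch of the tokenization / input-label formatting described above.
def make_preprocess_fn(tokenizer, max_length=128):
    """Build a batch-mapping function for seq2seq training."""
    def preprocess(batch):
        # text_target tokenizes the neutral sentences as labels in one call.
        return tokenizer(
            batch["gendered"],
            text_target=batch["neutral"],
            max_length=max_length,
            truncation=True,
            padding="max_length",
        )
    return preprocess
```

With the mBART tokenizer this would be used as `dataset.map(make_preprocess_fn(tokenizer), batched=True)`.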

Evaluation was performed using the BLEU score (sacrebleu), a standard metric in machine translation.

Training procedure

The model was trained using the Hugging Face Trainer API for sequence-to-sequence learning. Training steps:

  1. The pretrained model facebook/mbart-large-50 was loaded.
  2. The dataset was tokenized using the corresponding mBART tokenizer.
  3. Inputs were formatted as:
    • source: gendered sentence
    • target: neutral sentence
  4. The model was fine-tuned using transfer learning.
  5. Training was performed on GPU in Google Colab.
  6. Evaluation during training used the sacrebleu metric.
  7. The final model was uploaded to the Hugging Face Hub.

The model therefore learns to perform monolingual rewriting through a multilingual translation architecture.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 440  | 0.0246          | 98.2861 | 18.5729 |
| 0.2226        | 2.0   | 880  | 0.0138          | 98.4772 | 18.5104 |

Framework versions

  • Transformers 4.51.2
  • PyTorch 2.10.0+cu128
  • Datasets 4.6.0
  • Tokenizers 0.21.4