mbart-neutralization-tarea-adicional

This model is a fine-tuned version of facebook/mbart-large-50 on a parallel Spanish gender-neutralization dataset (described under "Training and evaluation data" below). It achieves the following results on the evaluation set:

  • Loss: 0.2519
  • Bleu: 87.0261
  • Gen Len: 18.4048

Model description

This model is a fine-tuned version of mBART-large-50 adapted for the task of gender neutralization in Spanish. It reformulates gender-marked Spanish sentences into gender-neutral alternatives while preserving the original semantic content.

The model follows a sequence-to-sequence (text-to-text) architecture and was trained to transform biased or explicitly gendered constructions (e.g., “los alumnos y las alumnas”) into neutralized forms (e.g., “el alumnado”). The objective is to reduce explicit binary gender marking while maintaining grammatical correctness and meaning.

The base architecture is facebook/mbart-large-50-many-to-many-mmt, a multilingual encoder-decoder Transformer model pretrained for multilingual translation and generation tasks.

Intended uses & limitations

Intended uses

  • Gender-neutral reformulation of Spanish text
  • Bias reduction experiments in NLP
  • Academic research in inclusive language generation
  • Educational purposes (demonstrating fine-tuning of multilingual models)

The model is intended for controlled text rewriting tasks where gender-neutral reformulation is required.

Limitations

  • The model may produce grammatically awkward or unnatural reformulations.
  • Neutralization may sometimes alter stylistic nuance.
  • It may fail in highly context-dependent or idiomatic expressions.
  • The model does not perform semantic bias detection — it only rewrites based on learned patterns.
  • Performance depends heavily on similarity between training data and inference data.

This model should not be used as a definitive bias mitigation tool without human review.

Training and evaluation data

The model was trained on a parallel dataset consisting of Spanish sentences paired with their gender-neutral reformulations. The dataset includes examples of explicitly gender-marked constructions and their neutral alternatives.

The data focuses on:

  • Plural masculine generics
  • Coordinated gender forms (“los niños y las niñas”)
  • Profession and role nouns with gender marking
  • Common educational and social contexts

The dataset was split into training and validation subsets. Evaluation was performed using automatic metrics such as BLEU score to measure similarity between generated neutralized outputs and reference neutral sentences.
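
BLEU rewards n-gram overlap between the generated neutralization and the reference sentence. A toy corpus-level implementation (clipped n-gram precisions up to 4-grams plus a brevity penalty; real evaluations typically rely on a library such as sacrebleu rather than this sketch) looks like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with counts."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty. Illustrative sketch, not sacrebleu."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            # Clip each hypothesis n-gram count by its reference count
            matches[n - 1] += sum(min(c, r[g]) for g, c in h.items())
            totals[n - 1] += max(len(hyp) - n + 1, 0)
    if 0 in matches or 0 in totals:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the reference
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

A perfect match scores 100 (the same scale as the 87.0261 reported above), and a hypothesis sharing no unigrams with the reference scores 0.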

No personally identifiable information (PII) was intentionally included in the dataset.

Training procedure

The model was fine-tuned using the Hugging Face Transformers library with a sequence-to-sequence training setup.

Base model

facebook/mbart-large-50-many-to-many-mmt

Framework

  • PyTorch
  • Hugging Face Transformers

Training setup

  • Task: Text-to-text generation
  • Objective: Cross-entropy loss
  • Optimizer: AdamW
  • Evaluation strategy: Periodic evaluation on validation set
  • Metric: BLEU
  • Beam search decoding during evaluation
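
The cross-entropy objective can be illustrated with a minimal, framework-free sketch: at each decoder position, the loss is the negative log-probability that the model's softmaxed logits assign to the reference token. Real training computes this over the full vocabulary with the framework's fused implementation; the function names here are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits_per_step, target_ids):
    """Mean negative log-likelihood of the reference tokens:
    the seq2seq training objective named above, in miniature."""
    losses = []
    for logits, tgt in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[tgt]))
    return sum(losses) / len(losses)
```

With uniform logits over a 3-token vocabulary the per-token loss is ln 3 ≈ 1.099; as the model grows confident in the correct token, the loss approaches 0.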


Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
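
The batch-size and scheduler settings above can be sketched numerically. The effective batch size is the per-device batch size times the gradient-accumulation steps (1 × 8 = 8, the total_train_batch_size reported above), and the linear scheduler warms up to the base learning rate over the first 10% of steps, then decays linearly to zero. The function below is a hand-rolled approximation of Transformers' linear scheduler, not the library call itself; the 944 total optimizer steps come from the results table.

```python
def linear_warmup_lr(step, total_steps, base_lr=5.6e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to 0 --
    mirrors lr_scheduler_type=linear with lr_scheduler_warmup_ratio=0.1."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(warmup_steps, 1)
    return base_lr * max(total_steps - step, 0) / max(total_steps - warmup_steps, 1)

# Effective batch size: per-device batch of 1 accumulated over 8 steps
effective_batch = 1 * 8  # = total_train_batch_size reported above

# 944 optimizer steps over 2 epochs (472 per epoch, from the results table)
total_steps = 944
```

So the learning rate peaks at 5.6e-5 around step 94 (10% of 944) and reaches zero at the final step.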

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 472  | 0.2977          | 85.4471 | 18.2429 |
| 0.9756        | 2.0   | 944  | 0.2519          | 87.0261 | 18.4048 |

Framework versions

  • Transformers 4.51.2
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Safetensors

  • Model size: 0.6B params
  • Tensor type: F32