mbart-neutralization-tarea-adicional

This model is a fine-tuned version of facebook/mbart-large-50 on a parallel Spanish gender-neutralization dataset (described under "Training and evaluation data" below). It achieves the following results on the evaluation set:

  • Loss: 0.2519
  • Bleu: 87.0261
  • Gen Len: 18.4048

Model description

This model is a fine-tuned version of mBART-large-50 adapted for the task of gender neutralization in Spanish. It reformulates gender-marked Spanish sentences into gender-neutral alternatives while preserving the original semantic content.

The model follows a sequence-to-sequence (text-to-text) architecture and was trained to transform biased or explicitly gendered constructions (e.g., “los alumnos y las alumnas”) into neutralized forms (e.g., “el alumnado”). The objective is to reduce explicit binary gender marking while maintaining grammatical correctness and meaning.

The base architecture is facebook/mbart-large-50-many-to-many-mmt, a multilingual encoder-decoder Transformer model pretrained for multilingual translation and generation tasks.

Intended uses & limitations

Intended uses

  • Gender-neutral reformulation of Spanish text
  • Bias reduction experiments in NLP
  • Academic research in inclusive language generation
  • Educational purposes (demonstrating fine-tuning of multilingual models)

The model is intended for controlled text rewriting tasks where gender-neutral reformulation is required.

Limitations

  • The model may produce grammatically awkward or unnatural reformulations.
  • Neutralization may sometimes alter stylistic nuance.
  • It may fail in highly context-dependent or idiomatic expressions.
  • The model does not perform semantic bias detection — it only rewrites based on learned patterns.
  • Performance depends heavily on similarity between training data and inference data.

This model should not be used as a definitive bias mitigation tool without human review.

Training and evaluation data

The model was trained on a parallel dataset consisting of Spanish sentences paired with their gender-neutral reformulations. The dataset includes examples of explicitly gender-marked constructions and their neutral alternatives.

The data focuses on:

  • Plural masculine generics
  • Coordinated gender forms (“los niños y las niñas”)
  • Profession and role nouns with gender marking
  • Common educational and social contexts

The dataset was split into training and validation subsets. Evaluation was performed using automatic metrics such as BLEU score to measure similarity between generated neutralized outputs and reference neutral sentences.
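
BLEU rewards n-gram overlap between the generated neutralization and the reference sentence. A toy corpus-level implementation (clipped n-gram precisions up to 4-grams plus a brevity penalty; real evaluations typically rely on a library such as sacrebleu rather than this sketch) looks like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with counts."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty. Illustrative sketch, not sacrebleu."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            # Clip each hypothesis n-gram count by its reference count
            matches[n - 1] += sum(min(c, r[g]) for g, c in h.items())
            totals[n - 1] += max(len(hyp) - n + 1, 0)
    if 0 in matches or 0 in totals:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the reference
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

A perfect match scores 100 (the same scale as the 87.0261 reported above), and a hypothesis sharing no unigrams with the reference scores 0.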

No personally identifiable information (PII) was intentionally included in the dataset.

Training procedure

The model was fine-tuned using the Hugging Face Transformers library with a sequence-to-sequence training setup.

Base model

facebook/mbart-large-50-many-to-many-mmt

Framework

  • PyTorch
  • Hugging Face Transformers

Training setup

  • Task: Text-to-text generation
  • Objective: Cross-entropy loss
  • Optimizer: AdamW
  • Evaluation strategy: Periodic evaluation on validation set
  • Metric: BLEU
  • Beam search decoding during evaluation
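
The cross-entropy objective can be illustrated with a minimal, framework-free sketch: at each decoder position, the loss is the negative log-probability that the model's softmaxed logits assign to the reference token. Real training computes this over the full vocabulary with the framework's fused implementation; the function names here are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits_per_step, target_ids):
    """Mean negative log-likelihood of the reference tokens:
    the seq2seq training objective named above, in miniature."""
    losses = []
    for logits, tgt in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[tgt]))
    return sum(losses) / len(losses)
```

With uniform logits over a 3-token vocabulary the per-token loss is ln 3 ≈ 1.099; as the model grows confident in the correct token, the loss approaches 0.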


Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
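
The batch-size and scheduler settings above can be sketched numerically. The effective batch size is the per-device batch size times the gradient-accumulation steps (1 × 8 = 8, the total_train_batch_size reported above), and the linear scheduler warms up to the base learning rate over the first 10% of steps, then decays linearly to zero. The function below is a hand-rolled approximation of Transformers' linear scheduler, not the library call itself; the 944 total optimizer steps come from the results table.

```python
def linear_warmup_lr(step, total_steps, base_lr=5.6e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to 0 --
    mirrors lr_scheduler_type=linear with lr_scheduler_warmup_ratio=0.1."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(warmup_steps, 1)
    return base_lr * max(total_steps - step, 0) / max(total_steps - warmup_steps, 1)

# Effective batch size: per-device batch of 1 accumulated over 8 steps
effective_batch = 1 * 8  # = total_train_batch_size reported above

# 944 optimizer steps over 2 epochs (472 per epoch, from the results table)
total_steps = 944
```

So the learning rate peaks at 5.6e-5 around step 94 (10% of 944) and reaches zero at the final step.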

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 472  | 0.2977          | 85.4471 | 18.2429 |
| 0.9756        | 2.0   | 944  | 0.2519          | 87.0261 | 18.4048 |

Framework versions

  • Transformers 4.51.2
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Safetensors

  • Model size: 0.6B params
  • Tensor type: F32