Citation:
Cite this model as:
@misc{himel_ghosh_2025,
author = { Himel Ghosh },
title = { bias-neutralizer-t5s (Revision 081d451) },
year = 2025,
url = { https://huggingface.co/himel7/bias-neutralizer-t5s },
doi = { 10.57967/hf/5539 },
publisher = { Hugging Face }
}
- Developed by: Himel Ghosh
- Language(s) (NLP): English
- Finetuned from model: t5-small
Uses
Intended for bias neutralisation of news-media text and for use by NLP researchers.
Direct Use
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model
model_name = "himel7/bias-neutralizer-t5s"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define inference function
def neutralize_bias(sentence):
    # Prepend the "neutralize: " task prefix the model was fine-tuned with
    input_text = "neutralize: " + sentence
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True).to(device)
    # Beam search with 4 beams, output capped at 128 tokens
    output_ids = model.generate(**inputs, max_length=128, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example
biased = "The brilliant leader saved the country with his unmatched wisdom."
neutralized = neutralize_bias(biased)
print(neutralized)
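For more than one sentence at a time, the same steps can be batched. The following is a minimal sketch rather than part of the released code: `add_task_prefix` and `neutralize_batch` are hypothetical helper names, and the function assumes the caller passes in the `tokenizer`, `model`, and `device` loaded as in the example above.

```python
from typing import List

# Task prefix taken from the single-sentence example in this card
TASK_PREFIX = "neutralize: "


def add_task_prefix(sentences: List[str]) -> List[str]:
    """Prepend the task prefix to every input sentence."""
    return [TASK_PREFIX + s for s in sentences]


def neutralize_batch(sentences, tokenizer, model, device="cpu",
                     max_length=128, num_beams=4):
    """Neutralize several sentences in one generate() call.

    `tokenizer` and `model` are assumed to be the objects loaded with
    AutoTokenizer / AutoModelForSeq2SeqLM; they are passed in explicitly
    so that this helper does not load anything itself.
    """
    # padding=True pads the batch to its longest sentence
    inputs = tokenizer(
        add_task_prefix(sentences),
        return_tensors="pt",
        truncation=True,
        padding=True,
    ).to(device)
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    # batch_decode returns one string per generated sequence
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Usage would look like `neutralize_batch([biased, another_sentence], tokenizer, model, device)`, which returns a list of strings in the same order as the input.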
Training Details
Training Data
The base model t5-small is fine-tuned on the Wiki Neutrality Corpus (WNC), introduced by Pryzant et al. (https://arxiv.org/abs/1911.09709). Cite their data as:
@misc{pryzant2019automaticallyneutralizingsubjectivebias,
title = { Automatically Neutralizing Subjective Bias in Text },
author = { Reid Pryzant and Richard Diehl Martinez and Nathan Dass and Sadao Kurohashi and Dan Jurafsky and Diyi Yang },
year = 2019,
eprint = { 1911.09709 },
archivePrefix = { arXiv },
primaryClass = { cs.CL },
url = { https://arxiv.org/abs/1911.09709 }
}
Training Procedure
This model, himel7/bias-neutralizer-t5s, is a fine-tuned version of t5-small trained on the Wiki Neutrality Corpus (WNC). It was trained for the task of bias neutralization: transforming biased sentences into neutral versions while preserving meaning.
Training configuration:
Base model: t5-small
Dataset: biased.word.train split from WNC (single-word edits subset)
Task format: "neutralize: <biased sentence>" → "<neutral sentence>"
Epochs: 5
Batch size: 8 per device
Learning rate: 2e-5
Optimizer: AdamW (default Hugging Face setup)
Loss function: Cross-entropy with teacher forcing
Training time: ~30 minutes
Hardware: NVIDIA RTX A6000 (48 GB VRAM)
The model was trained with Hugging Face's Seq2SeqTrainer; beam search (beam width = 4) is used at inference time. It scores highly on automatic overlap metrics (ROUGE and BLEU, reported under Results).
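The configuration above could be written down roughly as follows. This is a sketch under assumptions, not the author's training script: the dictionary keys are real `Seq2SeqTrainingArguments` parameter names, but mapping the beam width to `generation_num_beams` and enabling `predict_with_generate` are guesses about how evaluation was set up, and `to_seq2seq_example`, `make_training_args`, the `source`/`target` field names, and the output directory are hypothetical.

```python
TASK_PREFIX = "neutralize: "

# Hyperparameters listed in this card, keyed by Seq2SeqTrainingArguments names.
# AdamW is the Trainer default, so no optimizer entry is needed.
TRAINING_KWARGS = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,
    "predict_with_generate": True,  # assumption: generate during evaluation
    "generation_num_beams": 4,      # assumption: beam width applied here
}


def to_seq2seq_example(biased: str, neutral: str) -> dict:
    """Build one training pair in the card's task format.

    The "source"/"target" keys are illustrative, not the actual column names.
    """
    return {"source": TASK_PREFIX + biased, "target": neutral}


def make_training_args(output_dir: str = "bias-neutralizer-t5s-out"):
    """Create the training arguments; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

The resulting arguments object would then be passed to `Seq2SeqTrainer` together with the tokenized WNC pairs; cross-entropy with teacher forcing is what the T5 model computes when `labels` are supplied.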
Results
ROUGE scores: rouge1: 0.9654, rouge2: 0.9302, rougeL: 0.9653, rougeLsum: 0.9653
BLEU score: 0.9301
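When reading these scores, it may help to remember that the training subset consists of single-word edits, so a source sentence and its neutral target already share almost every token. The sketch below illustrates this with a simplified unigram F1, which is not the official ROUGE implementation behind the numbers above; the sentence pair is a hypothetical example, not taken from WNC.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1 over lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared token at most min(count) times
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical single-word-edit pair (illustrative only)
source = "the senator made a shocking statement on tuesday"
reference = "the senator made a statement on tuesday"

# Score obtained by returning the biased input unchanged
copy_score = unigram_f1(source, reference)
print(f"copy baseline unigram F1: {copy_score:.4f}")
```

For this pair the unchanged input already reaches 14/15 (about 0.93), so comparing the model's scores against such a copy baseline on the same test split would show how much of the overlap comes from the edit itself.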