metadata
datasets:
- LabHC/bias_in_bios
language:
- en
base_model:
- FacebookAI/roberta-base
pipeline_tag: text-classification
RoBERTa-Bios
RoBERTa-Bios is a roberta-base model fine-tuned for profession classification on the LabHC/bias_in_bios dataset.
It takes biography text as input and predicts the corresponding profession label. The model was trained on the original BIOS training split.
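A minimal inference sketch, assuming the fine-tuned checkpoint is available locally or on the Hub. `MODEL_ID` is a placeholder rather than the real repository name, and `top_label` and `predict_professions` are illustrative helpers, not part of any library. Whether `id2label` contains profession names or generic `LABEL_n` entries depends on how the checkpoint was saved.

```python
MODEL_ID = "path/to/RoBERTa-Bios"  # placeholder: replace with the real checkpoint path or Hub ID


def top_label(logits, id2label):
    """Return the label name for the highest-scoring logit in one row."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]


def predict_professions(texts, model_id=MODEL_ID, max_length=256):
    """Classify a list of biography strings; returns one label per text."""
    # Imported lazily so top_label can be used without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate to the maximum length used during fine-tuning (256 tokens).
    inputs = tokenizer(
        texts,
        truncation=True,
        max_length=max_length,
        padding=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    return [top_label(row, model.config.id2label) for row in logits.tolist()]
```

Calling `predict_professions([...])` with a list of biography strings returns one predicted label per input, in the same order.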
Model details
- Base model: roberta-base
- Dataset: LabHC/bias_in_bios
- Input column: hard_text
- Label column: profession
- Task: profession classification
- Language: English
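The column names above suggest a preprocessing step along the following lines. This is a sketch under assumptions, not the author's script: `make_tokenize_fn` and `load_and_prepare` are hypothetical helper names, the 256-token limit is taken from the training hyperparameters, and renaming `profession` to `labels` is one common way to expose targets to the Trainer.

```python
def make_tokenize_fn(tokenizer, text_column="hard_text", max_length=256):
    """Build a batch tokenization function suitable for Dataset.map(batched=True)."""

    def tokenize(batch):
        # Padding is deferred to a data collator at batch time.
        return tokenizer(batch[text_column], truncation=True, max_length=max_length)

    return tokenize


def load_and_prepare(dataset_name="LabHC/bias_in_bios", base_model="roberta-base"):
    """Load the dataset, tokenize the text column, and rename the label column."""
    # Imported lazily so make_tokenize_fn stays usable without these packages.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    raw = load_dataset(dataset_name)
    tokenized = raw.map(make_tokenize_fn(tokenizer), batched=True)
    # The Trainer passes a column named "labels" to the model as targets.
    tokenized = tokenized.rename_column("profession", "labels")
    return tokenizer, tokenized
```

Deferring padding to a collator such as `DataCollatorWithPadding` keeps each batch only as long as its longest example, which is usually cheaper than padding everything to 256 tokens.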
Training procedure
The model was fine-tuned with the Hugging Face Trainer API.
Main hyperparameters:
BASE_MODEL = "roberta-base"
MAX_LENGTH = 256
NUM_EPOCHS = 3
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 32
EVAL_BATCH_SIZE = 128
SEED = 42
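One way these settings might map onto `TrainingArguments` is sketched below. The per-epoch evaluation and save strategy, the `macro_f1` metric key, the single-device batch-size interpretation, and the `output_dir` value are all assumptions; the original training script is not shown. `training_kwargs` and `make_training_arguments` are hypothetical helper names.

```python
MAX_LENGTH = 256  # applied at tokenization time, not by TrainingArguments
NUM_EPOCHS = 3
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 32
EVAL_BATCH_SIZE = 128
SEED = 42


def training_kwargs():
    """Map the hyperparameters above onto TrainingArguments keyword names."""
    return {
        "num_train_epochs": NUM_EPOCHS,
        "learning_rate": LEARNING_RATE,
        # Assumes a single device, so per-device size equals the total batch size.
        "per_device_train_batch_size": TRAIN_BATCH_SIZE,
        "per_device_eval_batch_size": EVAL_BATCH_SIZE,
        "seed": SEED,
        # Assumed: evaluate and save once per epoch so the best checkpoint can be restored.
        "eval_strategy": "epoch",  # called evaluation_strategy in older transformers releases
        "save_strategy": "epoch",
        "load_best_model_at_end": True,
        "metric_for_best_model": "macro_f1",  # hypothetical key returned by compute_metrics
        "greater_is_better": True,
    }


def make_training_arguments(output_dir="roberta-bios-checkpoints"):
    """Build TrainingArguments; output_dir is a hypothetical path."""
    from transformers import TrainingArguments  # lazy import keeps training_kwargs dependency-free

    return TrainingArguments(output_dir=output_dir, **training_kwargs())
```

`load_best_model_at_end` requires the save and evaluation strategies to match, which is why both are set to `"epoch"` here.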
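The model was initialized from the pretrained checkpoint, with a classification head sized to the number of profession labels:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=num_labels,  # number of distinct profession labels in the dataset
)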
The best checkpoint was selected according to macro-F1 on the development split.
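For reference, macro-F1 is the unweighted mean of per-class F1 scores, so rare professions count as much as frequent ones. The sketch below computes it in plain Python and wraps it in a Trainer-style `compute_metrics` hook. The metric key `macro_f1` and both function names are assumptions; the original run may well have used a library implementation instead.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in labels or predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def compute_metrics(eval_pred):
    """Trainer-style metric hook: takes (logits, labels), returns a metrics dict."""
    logits, labels = eval_pred
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    labels = [int(label) for label in labels]
    accuracy = sum(int(p == t) for p, t in zip(preds, labels)) / len(labels)
    return {"accuracy": accuracy, "macro_f1": macro_f1(labels, preds)}
```

The Trainer prefixes returned keys with `eval_`, so a key of `macro_f1` is logged as `eval_macro_f1`.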
Evaluation
Performance on the original BIOS test set:
| Evaluation set | Accuracy |
|---|---|
| Original BIOS test set | 0.8689 |