metadata
datasets:
  - LabHC/bias_in_bios
language:
  - en
base_model:
  - FacebookAI/roberta-base
pipeline_tag: text-classification

RoBERTa-Bios

RoBERTa-Bios is a roberta-base model fine-tuned for profession classification on the LabHC/bias_in_bios dataset.

It takes biography text as input and predicts the corresponding profession label. The model was trained on the original BIOS training split.
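A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and can be loaded with the `transformers` `pipeline` helper. The repository ID is not stated in this card, so `MODEL_ID` is a hypothetical placeholder, and `load_classifier` and `predict_profession` are illustrative helper names. Depending on how the label mapping was saved with the checkpoint, the returned label may be a generic `LABEL_<k>` string rather than a profession name.

```python
# Hypothetical placeholder: substitute the actual Hub repository ID of this model.
MODEL_ID = "<namespace>/roberta-bios"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    from transformers import pipeline  # lazy import: requires `transformers`

    return pipeline("text-classification", model=model_id)


def predict_profession(classifier, biography, max_length=256):
    """Return (label, score) for the top prediction on one biography."""
    # Truncate long biographies; 256 is assumed to match the fine-tuning MAX_LENGTH.
    top = classifier(biography, truncation=True, max_length=max_length)[0]
    return top["label"], top["score"]
```

Typical use would be `classifier = load_classifier()` followed by `predict_profession(classifier, "She is a registered nurse ...")`.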

Model details

  • Base model: roberta-base
  • Dataset: LabHC/bias_in_bios
  • Input column: hard_text
  • Label column: profession
  • Task: profession classification
  • Language: English
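The column names above suggest a preprocessing step along these lines. This is a sketch, not the author's script: `preprocess_batch` is a hypothetical helper intended for `datasets.Dataset.map(..., batched=True)`, `tokenizer` is assumed to be the `roberta-base` tokenizer loaded with `AutoTokenizer.from_pretrained`, and the default `max_length` assumes the 256-token limit listed under Training procedure.

```python
def preprocess_batch(batch, tokenizer, max_length=256):
    """Tokenize the `hard_text` column and expose `profession` as `labels`."""
    encoded = tokenizer(batch["hard_text"], truncation=True, max_length=max_length)
    # Trainer reads the classification target from the "labels" key.
    encoded["labels"] = list(batch["profession"])
    return encoded
```

With a loaded dataset this could be applied as `dataset.map(preprocess_batch, batched=True, fn_kwargs={"tokenizer": tokenizer})`.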

Training procedure

The model was fine-tuned with the Hugging Face Trainer API.

Main hyperparameters:

BASE_MODEL = "roberta-base"
MAX_LENGTH = 256
NUM_EPOCHS = 3
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 32
EVAL_BATCH_SIZE = 128
SEED = 42
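
One way these values could map onto `transformers.TrainingArguments` is sketched below. This is an assumption rather than the author's configuration: the card does not say whether the batch sizes are per device or global, `output_dir` is a hypothetical path, and `build_training_args` is an illustrative helper. `MAX_LENGTH` is applied at tokenization time, so it does not appear here.

```python
# Hyperparameters from this card, keyed by TrainingArguments field names.
TRAINING_KWARGS = {
    "output_dir": "roberta-bios-checkpoints",  # hypothetical path
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes the card's batch size is per device
    "per_device_eval_batch_size": 128,  # same assumption for evaluation
    "seed": 42,
}


def build_training_args(**overrides):
    """Create TrainingArguments from the values above, with optional overrides."""
    from transformers import TrainingArguments  # lazy import: requires `transformers`

    return TrainingArguments(**{**TRAINING_KWARGS, **overrides})
```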

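The model was instantiated with a sequence-classification head:

from transformers import AutoModelForSequenceClassification

# num_labels is the number of distinct profession classes in the dataset.
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=num_labels,
)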

The best checkpoint was selected according to macro-F1 on the development split.
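
Macro-F1 is the unweighted mean of the per-class F1 scores, so rare professions count as much as frequent ones. The pure-Python sketch below illustrates the metric. `macro_f1` is an illustrative helper; the training script may well have used a library implementation such as scikit-learn's `f1_score(average="macro")`. The mean here is taken over the labels that appear in either list.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over labels present in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted, but the true label was different
            fn[t] += 1  # the true label t was missed
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN)
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `macro_f1([0, 0, 1, 1, 2], [0, 1, 1, 1, 2])` averages per-class scores of 2/3, 0.8, and 1.0, giving about 0.822.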

Evaluation

Performance on the original BIOS test set:

| Evaluation set | Accuracy |
|---|---|
| Original BIOS test set | 0.8689 |