Jim Crow law classifier

This model is a distilbert-base-uncased sequence classifier fine-tuned on biglam/on_the_books to identify statutory sections labeled as Jim Crow laws.

Each training input concatenates a section's metadata, chapter text, and section text into a single string. The labels are:

  • 0: no_jim_crow
  • 1: jim_crow
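As a minimal sketch of how the input format and label ids above might be used at inference time: this card does not document the exact field order or separator, so `build_input` and its `sep` parameter are hypothetical. The model ID is the one shown on this page, and the label names are hard-coded from the list above because the saved config may not carry them.

```python
# Label ids as listed in this card.
ID2LABEL = {0: "no_jim_crow", 1: "jim_crow"}

# Model ID as shown on this page.
MODEL_ID = "evalstate/jim-crow-test-gpt-55"


def build_input(metadata: str, chapter_text: str, section_text: str, sep: str = " ") -> str:
    """Join the three fields into one string.

    Hypothetical helper: the real training script may use a different
    separator or field order.
    """
    parts = [metadata, chapter_text, section_text]
    return sep.join(p.strip() for p in parts if p and p.strip())


def classify(text: str) -> str:
    """Return the predicted label name for one input string."""
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # DistilBERT accepts at most 512 tokens, so longer sections are truncated.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return ID2LABEL[int(logits.argmax(dim=-1).item())]
```

Calling `classify(build_input(metadata, chapter_text, section_text))` would download the weights on first use. Predictions on inputs formatted differently from the training data may be less reliable than the metrics reported here.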

Validation uses a held-out stratified split: 20% of the dataset, seed 55.
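To illustrate what a stratified 20% split with a fixed seed does, here is a standard-library sketch. It is not the training script's code: `stratified_split` is a hypothetical helper, the toy labels are made up, and a library routine such as scikit-learn's `train_test_split` with `stratify=` and `random_state=55` would be a more common choice and would select different rows.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=55):
    """Return (train_idx, val_idx) with roughly the same label ratio in both parts."""
    rng = random.Random(seed)

    # Group example indices by their label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_val = round(len(indices) * test_size)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels only, not the real dataset: 70 negatives and 30 positives.
toy_labels = [0] * 70 + [1] * 30
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))
```

Because each class is sampled separately, the minority class keeps the same share in the validation set as in the full data, which keeps per-class metrics comparable across runs.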

Evaluation metrics

{
  "epoch": 5.0,
  "eval_accuracy": 0.9859943977591037,
  "eval_f1_jim_crow": 0.975609756097561,
  "eval_loss": 0.09546805918216705,
  "eval_macro_f1": 0.9828932866931812,
  "eval_precision_jim_crow": 0.970873786407767,
  "eval_recall_jim_crow": 0.9803921568627451,
  "eval_roc_auc": 0.9928681276432142,
  "eval_runtime": 1.2417,
  "eval_samples_per_second": 287.505,
  "eval_steps_per_second": 9.664
}
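The reported figures can be cross-checked against each other. The sketch below recomputes the jim_crow F1 from the reported precision and recall, and estimates the validation set size from runtime and throughput. The variable names are illustrative, and the size estimate is only an approximation derived from rounded timing values.

```python
import math

precision = 0.970873786407767    # eval_precision_jim_crow
recall = 0.9803921568627451      # eval_recall_jim_crow
reported_f1 = 0.975609756097561  # eval_f1_jim_crow

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
assert math.isclose(f1, reported_f1, rel_tol=1e-9)

# eval_runtime (seconds) times eval_samples_per_second approximates
# the number of validation examples.
approx_examples = 1.2417 * 287.505
print(f"f1={f1:.6f}, approx. validation examples={approx_examples:.0f}")
```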

The training script used class-weighted cross-entropy to account for label imbalance.
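A minimal sketch of class-weighted cross-entropy follows, in plain Python so the arithmetic is visible. The weights the training script actually used are not stated here. The "balanced" heuristic below (n_samples / (n_classes * class_count)) is an assumption, as are the function names and the toy data. In PyTorch the same loss is usually obtained by passing a `weight` tensor to `torch.nn.CrossEntropyLoss`.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of per-example cross-entropy, normalised by the sum of weights."""
    total, weight_sum = 0.0, 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += weights[y] * (log_z - row[y])
        weight_sum += weights[y]
    return total / weight_sum


# Toy data, not from the real dataset: three negatives, one positive.
toy_labels = [0, 0, 0, 1]
toy_logits = [[2.0, -1.0], [1.5, 0.0], [0.5, 0.2], [0.0, 1.0]]
weights = balanced_class_weights(toy_labels)
loss = weighted_cross_entropy(toy_logits, toy_labels, weights)
print(weights, round(loss, 4))
```

With these weights, a mistake on the rarer class costs more than a mistake on the common class, which discourages the model from defaulting to the majority label.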

Model size: 67M params (Safetensors, tensor type F32)

Model ID: evalstate/jim-crow-test-gpt-55