Jim Crow law classifier
This model is a distilbert-base-uncased sequence classifier fine-tuned on biglam/on_the_books to identify statutory sections labeled as Jim Crow laws.
Each training input is a single string formed by concatenating metadata, chapter text, and section text. Labels are:
- 0: no_jim_crow
- 1: jim_crow
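The field order is given above, but the exact separators and field names are not, so the following is only a sketch of how such an input might be assembled and how label ids map to names. The function `build_input`, its parameter names, and the newline separator are assumptions for illustration, not the training script's actual format.

```python
# Label mapping as listed in this card.
ID2LABEL = {0: "no_jim_crow", 1: "jim_crow"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_input(metadata: str, chapter_text: str, section_text: str) -> str:
    """Join the three fields into one string (hypothetical format).

    Empty or whitespace-only fields are skipped so the result has no
    blank segments. The newline separator is an assumption.
    """
    parts = [metadata, chapter_text, section_text]
    return "\n".join(p.strip() for p in parts if p and p.strip())


# Placeholder strings, not a real record from the dataset.
example = build_input(
    "example metadata",
    "example chapter text",
    "example section text",
)
print(example)
print(LABEL2ID["jim_crow"])
```

For inference, the same concatenation used in training would need to be reproduced, since a classifier fine-tuned on one input layout may behave differently on another.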
Held-out stratified validation split: 20% of the dataset, seed 55.
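A stratified split keeps the class proportions roughly equal in the training and validation parts. The sketch below shows the idea in plain Python; the function `stratified_split` and the toy labels are hypothetical, and the actual script most likely used a library routine, so running this with seed 55 would not reproduce the same partition.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=55):
    """Return (train_indices, val_indices) with per-class proportions preserved."""
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels, made up for illustration: 70 negatives and 30 positives.
toy_labels = [0] * 70 + [1] * 30
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))
```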
Evaluation metrics
{
"epoch": 5.0,
"eval_accuracy": 0.9859943977591037,
"eval_f1_jim_crow": 0.975609756097561,
"eval_loss": 0.09546805918216705,
"eval_macro_f1": 0.9828932866931812,
"eval_precision_jim_crow": 0.970873786407767,
"eval_recall_jim_crow": 0.9803921568627451,
"eval_roc_auc": 0.9928681276432142,
"eval_runtime": 1.2417,
"eval_samples_per_second": 287.505,
"eval_steps_per_second": 9.664
}
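As a quick consistency check, the reported F1 for the jim_crow class can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The second calculation assumes that `eval_macro_f1` is the unweighted mean of the two per-class F1 scores, which is the usual definition but is not stated in this card.

```python
# Values copied from the metrics in this card.
precision = 0.970873786407767
recall = 0.9803921568627451
reported_f1 = 0.975609756097561
reported_macro_f1 = 0.9828932866931812

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Assumption: macro F1 is the unweighted mean of the two per-class F1
# scores, so the no_jim_crow F1 follows from the two reported numbers.
implied_f1_no_jim_crow = 2 * reported_macro_f1 - reported_f1

print(abs(f1 - reported_f1) < 1e-9)
print(round(implied_f1_no_jim_crow, 4))
```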
The training script used class-weighted cross-entropy to account for label imbalance.
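Class-weighted cross-entropy scales each example's loss by a weight for its true class, so errors on the rarer class cost more. The card does not state which weights were used; the sketch below assumes the common "balanced" heuristic, n_samples / (n_classes * class_count), and the function names and toy batch are hypothetical. The loss is normalized by the total weight of the batch, which is the convention PyTorch's `CrossEntropyLoss` follows when given class weights with mean reduction.

```python
import math


def balanced_class_weights(labels, num_classes=2):
    """Weight each class by n_samples / (num_classes * class_count).

    Assumes every class appears at least once in `labels`.
    """
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) for c in counts]


def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of per-example cross-entropy.

    Each example's loss is scaled by the weight of its true class, and
    the sum is divided by the total weight of the batch.
    """
    total_loss, total_weight = 0.0, 0.0
    for row, y in zip(logits, labels):
        # Numerically stable log-sum-exp over the logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[y]  # negative log-probability of the true class
        total_loss += weights[y] * nll
        total_weight += weights[y]
    return total_loss / total_weight


# Toy batch, made up for illustration: three negatives and one positive.
labels = [0, 0, 0, 1]
logits = [[2.0, -1.0], [1.5, 0.0], [0.5, 0.2], [0.3, 0.1]]
weights = balanced_class_weights(labels)
print(weights)
print(weighted_cross_entropy(logits, labels, weights))
```

With these weights the single positive example contributes as much to the loss as the three negatives combined, which is the intended effect when one label is much rarer than the other.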
Model tree for evalstate/jim-crow-test-gpt-55
Base model
distilbert/distilbert-base-uncased