---
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- text-classification
- modernbert
- legal
- glam
- jim-crow
- north-carolina
- history
- generated_from_trainer
datasets:
- biglam/on_the_books
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
- roc_auc
model-index:
- name: jim-crow-laws-claude-code
  results:
  - task:
      type: text-classification
      name: Binary text classification
    dataset:
      name: biglam/on_the_books
      type: biglam/on_the_books
      split: train (held-out 20% stratified)
    metrics:
      - type: accuracy
        value: 0.9776
      - type: f1
        value: 0.9619
        name: F1 (jim_crow class)
      - type: precision
        value: 0.9352
        name: Precision (jim_crow class)
      - type: recall
        value: 0.9902
        name: Recall (jim_crow class)
      - type: roc_auc
        value: 0.9965
---

# jim-crow-laws-claude-code

A binary text classifier that flags whether a North Carolina session-law section
(1866–1967) is a **Jim Crow law**. Fine-tuned from
[`answerdotai/ModernBERT-base`](https://huggingface.co/answerdotai/ModernBERT-base)
on [`biglam/on_the_books`](https://huggingface.co/datasets/biglam/on_the_books),
the labeled training set from UNC Chapel Hill Libraries' *On the Books: Jim Crow
and Algorithms of Resistance* project.

## Intended use

- Surface candidate Jim Crow laws within historical NC session-law corpora to
  support archival, library, and digital-humanities work.
- Reproduce or extend the *On the Books* methodology on related corpora.
- Teaching: ML-for-cultural-heritage, computational legal history, OCR-tolerant
  text classification.

The original *On the Books* project trained a classifier on this data and ran it
over the **full corpus**, roughly a century of session laws. This model re-trains
that idea with a modern long-context encoder (ModernBERT) and is intended to be
applied the same way: as a *retrieval / triage* tool whose flagged outputs are
then reviewed by domain experts.

## Out-of-scope / limitations

- **Jurisdiction:** trained on **North Carolina** session laws only. Patterns
  will not transfer cleanly to other states without adaptation.
- **Period:** 1866–1967 legal language. Modern statutes differ substantially.
- **OCR noise:** training texts contain OCR errors typical of scanned period
  documents; performance may degrade on cleaner text or on text produced by a
  different OCR pipeline.
- **Label scope:** the negative class means *"not flagged by the project's
  labeling process"*, so laws with discriminatory effect that the source
  compilations did not catalogue may be present in the negatives. Treat model
  predictions as candidates for review, not ground truth.
- **Class imbalance:** the data is ~29% positive; training used
  inverse-frequency class weights to compensate.

Per the dataset's authors, the texts include slurs and dehumanising language
present in the historical record. Downstream users should preserve the
project's framing and not strip the historical context.

## How to use

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="davanstrien/jim-crow-laws-claude-code",
)

text = "..."  # replace with the OCR text of a single law section
print(clf(text))
# Output has this shape (label and score depend on the input):
# [{'label': 'jim_crow', 'score': 0.99}]
```

Labels: `no_jim_crow` (0) and `jim_crow` (1).
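
Because the model is meant for triage rather than final judgement, it can help to rank sections by their `jim_crow` score and hand only the top of the list to reviewers. The sketch below is one possible way to do that, not part of the original project. It assumes the pipeline was called with `top_k=None`, so that each input yields a list with one `{'label', 'score'}` dict per class. The names `jim_crow_score`, `triage`, `REVIEW_THRESHOLD`, and the mock scores are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical cut-off for illustration only; tune it against reviewed examples.
REVIEW_THRESHOLD = 0.5


def jim_crow_score(scores: List[Dict]) -> float:
    """Return the score assigned to the `jim_crow` label for one section."""
    for entry in scores:
        if entry["label"] == "jim_crow":
            return float(entry["score"])
    raise ValueError("no 'jim_crow' label in pipeline output")


def triage(
    section_ids: Sequence[str],
    all_scores: Sequence[List[Dict]],
    threshold: float = REVIEW_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Keep sections at or above the threshold, highest score first."""
    ranked = [(sid, jim_crow_score(s)) for sid, s in zip(section_ids, all_scores)]
    flagged = [item for item in ranked if item[1] >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Mock data in the shape `clf(list_of_texts, top_k=None)` is assumed to return.
mock_scores = [
    [{"label": "jim_crow", "score": 0.81}, {"label": "no_jim_crow", "score": 0.19}],
    [{"label": "no_jim_crow", "score": 0.96}, {"label": "jim_crow", "score": 0.04}],
    [{"label": "jim_crow", "score": 0.97}, {"label": "no_jim_crow", "score": 0.03}],
]
review_queue = triage(["sec-a", "sec-b", "sec-c"], mock_scores)
print(review_queue)
```

Lowering the threshold trades reviewer time for recall, which is usually the right direction for an archival search task where missed laws are costlier than extra reading.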

## Training data

- **Dataset:** [`biglam/on_the_books`](https://huggingface.co/datasets/biglam/on_the_books)
  (1,785 rows; single `train` split).
- **Input field used:** `section_text` (the OCR text of the labeled section).
  `chapter_text` and `source` were ignored. `source` in particular would leak
  the label (`paschal` is 100% positive, `murray` is 92% positive).
- **Split:** stratified 80/20 train/eval split (seed 42), giving 1,428 train /
  357 eval rows and preserving the ~29% positive rate in both.
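
A stratified split of this kind can be sketched with the standard library alone. This is an illustration of the technique, not the training script: it will not reproduce the exact row membership of the split used for this model, and the label column below is hypothetical (512 positives out of 1,785 rows, chosen to be consistent with the ~29% positive rate). `stratified_split` is an assumed helper name.

```python
import random
from collections import defaultdict


def stratified_split(labels, eval_fraction=0.2, seed=42):
    """Split row indices into train/eval, keeping each class's share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, eval_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # shuffle within the class only
        n_eval = round(len(indices) * eval_fraction)
        eval_idx.extend(indices[:n_eval])
        train_idx.extend(indices[n_eval:])
    return sorted(train_idx), sorted(eval_idx)


# Hypothetical label column: 1,785 rows, 512 of them positive (~29%).
labels = [1] * 512 + [0] * (1785 - 512)
train_idx, eval_idx = stratified_split(labels)

print(len(train_idx), len(eval_idx))
print(sum(labels[i] for i in eval_idx) / len(eval_idx))
```

Sampling per class is what keeps the positive rate nearly identical in both partitions; a plain random split of a dataset this small could drift by a few percentage points.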

## Training procedure

- **Base model:** `answerdotai/ModernBERT-base` (~150M params, 8K context).
- **Max sequence length:** 1024 tokens (covers roughly the 95th percentile of
  `section_text` token lengths; longer sections are truncated).
- **Loss:** cross-entropy with **inverse-frequency class weights** computed
  from the training split (`[0.701, 1.741]`) to handle class imbalance.
- **Hardware:** trained on a single L4 GPU via `hf jobs uv run`.
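
The class weights quoted above are consistent with the common "balanced" heuristic, `n_samples / (n_classes * class_count)`. The sketch below shows that arithmetic. The per-class counts (1,018 negative and 410 positive in the training split) are an assumption inferred from the reported weights and split size, not figures stated by the training script, and `inverse_frequency_weights` is a hypothetical helper name.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[c]) for c in sorted(counts)]


# Assumed training-split counts: 1,018 `no_jim_crow` (0) and 410 `jim_crow` (1).
train_labels = [0] * 1018 + [1] * 410
weights = inverse_frequency_weights(train_labels)

print([round(w, 3) for w in weights])  # [0.701, 1.741]
```

Weights like these are typically passed to the loss, for example as the `weight` argument of `torch.nn.CrossEntropyLoss`, so that errors on the minority class cost proportionally more.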

### Hyperparameters

| Hyperparameter | Value |
|---|---|
| Optimizer | AdamW (fused), β=(0.9, 0.999), ε=1e-8 |
| Learning rate | 3e-5 |
| LR schedule | Linear with 10% warmup |
| Weight decay | 0.01 |
| Train batch size | 16 |
| Eval batch size | 32 |
| Epochs | 5 |
| Precision | bf16 |
| Seed | 42 |
| Best-model selection | F1 on `jim_crow` class |
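
To make the schedule concrete, the arithmetic behind "linear with 10% warmup" can be worked through from the table. The sketch assumes no gradient accumulation, a single device, and that the last partial batch of each epoch is kept; under those assumptions 1,428 training rows at batch size 16 give 90 steps per epoch. `linear_lr` is a hypothetical helper written for illustration, not a library call.

```python
import math

TRAIN_ROWS = 1428
BATCH_SIZE = 16
EPOCHS = 5
PEAK_LR = 3e-5
WARMUP_RATIO = 0.10

steps_per_epoch = math.ceil(TRAIN_ROWS / BATCH_SIZE)   # 90
total_steps = steps_per_epoch * EPOCHS                 # 450
warmup_steps = int(round(WARMUP_RATIO * total_steps))  # 45


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear ramp up, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


for step in (0, 45, 90, 270, 450):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks halfway through the first epoch, so the epoch-3 checkpoint is reached when the rate has already decayed to well under half of its peak.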

### Training results

Best checkpoint selected by `f1_jim_crow` on the held-out eval split (epoch 3; epoch 4 ties on F1):

| Metric | Value |
|---|---|
| Accuracy | 0.9776 |
| Precision (jim_crow) | 0.9352 |
| Recall (jim_crow) | 0.9902 |
| F1 (jim_crow) | 0.9619 |
| F1 (macro) | 0.9730 |
| ROC AUC | 0.9965 |

Per-epoch eval:

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision (jim_crow) | Recall (jim_crow) | F1 (jim_crow) | F1 (macro) | ROC AUC |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.2893 | 1 | 90  | 0.1920 | 0.9524 | 0.8972 | 0.9412 | 0.9187 | 0.9425 | 0.9913 |
| 0.0716 | 2 | 180 | 0.0793 | 0.9776 | 0.9519 | 0.9706 | 0.9612 | 0.9727 | 0.9971 |
| 0.1101 | 3 | 270 | 0.1205 | 0.9776 | 0.9352 | 0.9902 | 0.9619 | 0.9730 | 0.9965 |
| 0.0027 | 4 | 360 | 0.1251 | 0.9776 | 0.9352 | 0.9902 | 0.9619 | 0.9730 | 0.9958 |
| 0.0001 | 5 | 450 | 0.1231 | 0.9748 | 0.9346 | 0.9804 | 0.9569 | 0.9696 | 0.9960 |

Held-out eval is small (357 rows; 102 positive): a single example shifts
accuracy by about 0.003 and `jim_crow` recall by about 0.01, so treat
differences of that size between epochs as noise.
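
The reported epoch-3 metrics can be checked against a confusion matrix. The counts below are not published with the model; they are inferred by back-solving from the reported precision, recall, and the 357 / 102 eval sizes, so treat them as an assumption. `precision_recall_f1` is a hypothetical helper.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Inferred (not reported) counts for the `jim_crow` class at epoch 3.
tp, fp, fn, tn = 101, 7, 1, 248
total = tp + fp + fn + tn  # 357 eval rows, of which tp + fn = 102 are positive

accuracy = (tp + tn) / total
pos_precision, pos_recall, pos_f1 = precision_recall_f1(tp, fp, fn)
# For the negative class the roles swap: true negatives act as its "hits".
_, _, neg_f1 = precision_recall_f1(tn, fn, fp)
macro_f1 = (pos_f1 + neg_f1) / 2

print(round(accuracy, 4), round(pos_precision, 4), round(pos_recall, 4))
print(round(pos_f1, 4), round(macro_f1, 4))
print("one-example shift:", round(1 / total, 4), round(1 / (tp + fn), 4))
```

Read this way, the selected checkpoint misses one positive section and raises seven false alarms, which fits a triage tool that favours recall over precision.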

## Citation

Please cite the original *On the Books* project for the data and methodology:

```
On the Books: Jim Crow and Algorithms of Resistance.
University of North Carolina at Chapel Hill Libraries.
https://onthebooks.lib.unc.edu
DOI: https://doi.org/10.17615/5c4g-sd44
```

## Framework versions

- Transformers 5.7.0
- PyTorch 2.11.0+cu130
- Datasets 4.8.5
- Tokenizers 0.22.2