---
base_model: answerdotai/ModernBERT-base
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- legal
- locus
- modernbert
license: apache-2.0
datasets:
- LocalLaws/LOCUS-v1.0
---

# LocalLaws/LOCUS-Substantive

A ModernBERT classifier for the **Substantive (binary)** axis of the LOCUS
(Local Ordinances Corpus, United States) dataset.

The model is fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on
[LocalLaws/LOCUS-v1.0](https://huggingface.co/datasets/LocalLaws/LOCUS-v1.0).

## Labels

- `not_substantive`
- `substantive`

## Training

| | |
|---|---|
| Base model | `answerdotai/ModernBERT-base` |
| Max length | 1024 tokens |
| Classifier pooling | `mean` |
| Train / val / test examples | 79106 / 10447 / 10447 |

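
The `mean` pooling setting above most likely corresponds to ModernBERT's `classifier_pooling` option, where the per-token hidden states are averaged into one vector before the classification head. The sketch below illustrates the general idea of masked mean pooling in plain Python. It is not the library's implementation, and the function name `masked_mean_pool` and the toy vectors are hypothetical.

```python
from typing import List


def masked_mean_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Average token vectors where mask == 1, ignoring padded positions."""
    if len(hidden) != len(mask):
        raise ValueError("hidden and mask must have the same length")
    n_real = sum(mask)
    if n_real == 0:
        raise ValueError("mask selects no tokens")
    pooled = [0.0] * len(hidden[0])
    for vec, keep in zip(hidden, mask):
        if keep:
            for i, value in enumerate(vec):
                pooled[i] += value
    return [total / n_real for total in pooled]


# Toy sequence (hypothetical values): three real tokens plus one padding position.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]
pooled = masked_mean_pool(hidden_states, attention_mask)
print(pooled)  # [3.0, 4.0]
```

The padded position is excluded from both the sum and the divisor, so padding a short input does not change its pooled vector.
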
## Evaluation

| | |
|---|---|
| Metric | binary-F1 |
| Validation binary-F1 | 0.9402 |
| Test binary-F1 | 0.9422 |
| Test accuracy | 0.9328 |

```
              precision    recall  f1-score   support

           0     0.9517    0.8898    0.9197      4519
           1     0.9200    0.9656    0.9422      5928

    accuracy                         0.9328     10447
   macro avg     0.9358    0.9277    0.9310     10447
weighted avg     0.9337    0.9328    0.9325     10447
```

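
The figures in the report can be cross-checked with a little arithmetic. The sketch below reconstructs approximate confusion-matrix counts by rounding recall × support for each class, then recomputes accuracy and F1. The counts are inferred from the printed numbers, not reported by the card. The F1 of class `1` matches the reported test binary-F1, which suggests class `1` is treated as the positive class; the card does not state how `0` and `1` map to the label names.

```python
# Per-class recall and support as printed in the test report.
support = {0: 4519, 1: 5928}
recall = {0: 0.8898, 1: 0.9656}

# Reconstructed (not reported) counts, with class 1 as the positive class.
tn = round(recall[0] * support[0])  # class 0 predicted as 0
tp = round(recall[1] * support[1])  # class 1 predicted as 1
fp = support[0] - tn                # class 0 predicted as 1
fn = support[1] - tp                # class 1 predicted as 0

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision_1 = tp / (tp + fp)
f1_1 = 2 * tp / (2 * tp + fp + fn)  # binary-F1 for class 1
f1_0 = 2 * tn / (2 * tn + fp + fn)  # F1 for class 0
macro_f1 = (f1_0 + f1_1) / 2

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
# TN=4021 FP=498 FN=204 TP=5724
print(f"accuracy={accuracy:.4f} binary-F1={f1_1:.4f} macro-F1={macro_f1:.4f}")
# accuracy=0.9328 binary-F1=0.9422 macro-F1=0.9310
```

Under this reconstruction, false positives for class `1` (498) outnumber false negatives (204), which is consistent with class `1` having higher recall than precision.
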
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hub.
tok = AutoTokenizer.from_pretrained("LocalLaws/LOCUS-Substantive")
model = AutoModelForSequenceClassification.from_pretrained("LocalLaws/LOCUS-Substantive")
model.eval()  # disable dropout for inference

text = "No person shall keep any swine within the city limits."
# Truncate to the 1024-token maximum listed under Training.
enc = tok(text, return_tensors="pt", truncation=True, max_length=1024)
with torch.no_grad():
    logits = model(**enc).logits
pred = logits.argmax(-1).item()
print(model.config.id2label[pred])
```

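
The usage snippet prints only the top label. If a confidence score is also needed, the logits can be passed through a softmax. The sketch below shows this in plain Python with made-up logit values. The `id2label` mapping here is an assumption that follows the order of the Labels list; in practice, read it from `model.config.id2label`.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# ASSUMPTION: index order follows the Labels list; check model.config.id2label.
id2label: Dict[int, str] = {0: "not_substantive", 1: "substantive"}

# Hypothetical logits, standing in for `logits[0].tolist()` from the usage snippet.
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)
scores = {id2label[i]: p for i, p in enumerate(probs)}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

When working directly with PyTorch tensors, `torch.softmax(logits, dim=-1)` gives the same probabilities without leaving the tensor API.
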