---
base_model: answerdotai/ModernBERT-base
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- legal
- locus
- modernbert
license: apache-2.0
datasets:
- LocalLaws/LOCUS-v1.0
---

# LocalLaws/LOCUS-Substantive

A ModernBERT classifier for the **Substantive (binary)** axis of the LOCUS (Local Ordinances Corpus, United States) dataset. Fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on [LocalLaws/LOCUS-v1.0](https://huggingface.co/datasets/LocalLaws/LOCUS-v1.0).

## Labels

- `not_substantive`
- `substantive`

## Training

| | |
|---|---|
| Base model | `answerdotai/ModernBERT-base` |
| Max length | 1024 |
| Classifier pooling | `mean` |
| Train / val / test | 79106 / 10447 / 10447 |

## Evaluation

| | |
|---|---|
| Metric | binary-F1 |
| Validation binary-F1 | 0.9402 |
| Test binary-F1 | 0.9422 |
| Test accuracy | 0.9328 |

```
              precision    recall  f1-score   support

           0     0.9517    0.8898    0.9197      4519
           1     0.9200    0.9656    0.9422      5928

    accuracy                         0.9328     10447
   macro avg     0.9358    0.9277    0.9310     10447
weighted avg     0.9337    0.9328    0.9325     10447
```

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tok = AutoTokenizer.from_pretrained("LocalLaws/LOCUS-Substantive")
model = AutoModelForSequenceClassification.from_pretrained("LocalLaws/LOCUS-Substantive")
model.eval()

text = "No person shall keep any swine within the city limits."
enc = tok(text, return_tensors="pt", truncation=True, max_length=1024)
with torch.no_grad():
    logits = model(**enc).logits
pred = logits.argmax(-1).item()
print(model.config.id2label[pred])
```
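
The Usage snippet returns only the top label. When a confidence score or a custom decision threshold is useful, the two logits can be converted to probabilities first. The following is a minimal sketch, not part of the released code: the `ID2LABEL` mapping is an assumption inferred from the label list and the `0`/`1` rows of the classification report, `classify` and its `threshold` parameter are hypothetical helpers, and the example logits are made up.

```python
import math

# Assumed mapping, inferred from the label list and the 0/1 rows of the
# classification report; check model.config.id2label before relying on it.
ID2LABEL = {0: "not_substantive", 1: "substantive"}


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, P(substantive)) for one pair of logits.

    `threshold` is a hypothetical knob: raising it trades recall on
    `substantive` for precision.
    """
    p_substantive = softmax(logits)[1]
    label_id = 1 if p_substantive >= threshold else 0
    return ID2LABEL[label_id], p_substantive


# Made-up logits for illustration, not real model output.
label, prob = classify([-1.2, 2.3])
print(label, round(prob, 4))
```

With the Usage snippet, `logits[0].tolist()` yields a two-element list that can be passed to `classify` directly.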