---
base_model: answerdotai/ModernBERT-base
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- legal
- locus
- modernbert
license: apache-2.0
datasets:
- LocalLaws/LOCUS-v1.0
---
# LocalLaws/LOCUS-Topic
A ModernBERT classifier for the **Topic** axis of the LOCUS
(Local Ordinances Corpus, United States) dataset.
Fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on
[LocalLaws/LOCUS-v1.0](https://huggingface.co/datasets/LocalLaws/LOCUS-v1.0).
## Labels
- `Buildings`
- `Business`
- `Nuisance`
- `Other`
- `Zoning`
## Training
| | |
|---|---|
| Base model | `answerdotai/ModernBERT-base` |
| Max length | 1024 |
| Classifier pooling | `mean` |
| Train / val / test | 45183 / 5848 / 5928 |
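
With `mean` pooling, the classification head receives an average of the token embeddings rather than a single token's embedding. The sketch below illustrates masked mean pooling in plain Python. The helper name `masked_mean_pool` and the two-dimensional toy vectors are hypothetical, and this shows the idea only, not the model's actual implementation.

```python
def masked_mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose attention-mask entry is 1.

    hidden_states: list of per-token vectors (lists of floats).
    attention_mask: list of 0/1 flags, one per token (0 marks padding).
    """
    dim = len(hidden_states[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy input (made-up values): two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = masked_mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]; the padding vector is ignored
```

Masking matters because padded positions would otherwise pull the average toward whatever values the padding tokens carry.
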
## Evaluation
| | |
|---|---|
| Metric | macro-F1 |
| Validation macro-F1 | 0.8127 |
| Test macro-F1 | 0.8173 |
| Test accuracy | 0.8190 |
```
              precision    recall  f1-score   support

   Buildings     0.7438    0.8506    0.7936       877
    Business     0.8273    0.8381    0.8326       846
    Nuisance     0.7617    0.8419    0.7998       930
       Other     0.8916    0.7657    0.8239      2083
      Zoning     0.8169    0.8574    0.8367      1192

    accuracy                         0.8190      5928
   macro avg     0.8083    0.8307    0.8173      5928
weighted avg     0.8251    0.8190    0.8194      5928
```
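
The summary rows follow from the per-class rows: the macro average is the unweighted mean over the five classes, and the weighted average weights each class by its support. The sketch below recomputes both from the rounded values in the report. The helper names `macro_avg` and `weighted_avg` are illustrative, and because the inputs are already rounded to four decimals, the results may differ from the reported figures in the last digit.

```python
# Per-class (precision, recall, f1, support), copied from the test-set report.
report = {
    "Buildings": (0.7438, 0.8506, 0.7936, 877),
    "Business": (0.8273, 0.8381, 0.8326, 846),
    "Nuisance": (0.7617, 0.8419, 0.7998, 930),
    "Other": (0.8916, 0.7657, 0.8239, 2083),
    "Zoning": (0.8169, 0.8574, 0.8367, 1192),
}

PRECISION, RECALL, F1, SUPPORT = range(4)


def macro_avg(rows, col):
    """Unweighted mean of one metric column across classes."""
    return sum(row[col] for row in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric column, weighted by class support."""
    total = sum(row[SUPPORT] for row in rows.values())
    return sum(row[col] * row[SUPPORT] for row in rows.values()) / total


total_support = sum(row[SUPPORT] for row in report.values())
macro_f1 = macro_avg(report, F1)
weighted_f1 = weighted_avg(report, F1)
# Support-weighted recall equals overall accuracy.
accuracy = weighted_avg(report, RECALL)

print(f"support={total_support} macro_f1={macro_f1:.4f} "
      f"weighted_f1={weighted_f1:.4f} accuracy={accuracy:.4f}")
```

The gap between macro and weighted precision reflects the `Other` class, which has the highest precision and by far the largest support.
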
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned classifier from the Hub.
tok = AutoTokenizer.from_pretrained("LocalLaws/LOCUS-Topic")
model = AutoModelForSequenceClassification.from_pretrained("LocalLaws/LOCUS-Topic")
model.eval()

text = "No person shall keep any swine within the city limits."
# Truncate to the 1024-token maximum length used in training.
enc = tok(text, return_tensors="pt", truncation=True, max_length=1024)

# Inference only, so skip gradient tracking.
with torch.no_grad():
    logits = model(**enc).logits

# Map the highest-scoring class index to its label name.
pred = logits.argmax(-1).item()
print(model.config.id2label[pred])
```
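
To get a score for every label rather than only the top prediction, the logits can be passed through a softmax. The sketch below does this in plain Python on made-up logits. The `LABELS` order is an assumption for illustration (in practice, read the mapping from `model.config.id2label`), and `softmax` and `rank_labels` are hypothetical helper names.

```python
import math

# Assumed label order for illustration only; the real mapping is
# stored in model.config.id2label.
LABELS = ["Buildings", "Business", "Nuisance", "Other", "Zoning"]


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, labels=LABELS):
    """Return (label, probability) pairs, most probable first."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits standing in for model(**enc).logits[0].tolist().
example_logits = [0.2, -1.0, 3.1, 0.5, -0.3]
ranked = rank_labels(example_logits)

for label, prob in ranked:
    print(f"{label:<10} {prob:.4f}")
```

Softmax outputs are not guaranteed to be calibrated, so treat them as relative scores unless calibration has been checked separately.
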