---
language: en
license: mit
tags:
- text-classification
- roberta
- normativity
- deontic-logic
- social-norms
base_model:
- FacebookAI/roberta-base
- FacebookAI/roberta-large
datasets:
- SALT-NLP/CultureBank
---
# Normative Statement Classifier — RoBERTa Fine-tunes
A collection of fine-tuned RoBERTa models for detecting **normative statements** in text — sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. *"people should remove their shoes before entering"*).
> Full project on GitHub: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)
---
## Models in this repository
| Subfolder | Base | Description |
|---|---|---|
| `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune on norm classification |
| `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint |
| `roberta-large-classifier-v01` | `roberta-large` | Larger model fine-tune for higher capacity |
| `roberta-tapt-classifier-v01` | `roberta-base-tapt` | Fine-tuned on top of the TAPT checkpoint |
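
The usage section below downloads only `roberta-base-classifier-v01`. As a minimal sketch, the same download pattern could be parameterised over the subfolders in the table. This assumes every subfolder follows the same `<subfolder>/*` layout, which the card does not state; the helper `download_kwargs` and the `CHECKPOINTS` mapping are hypothetical names introduced for illustration.

```python
# Hypothetical helper: builds snapshot_download() arguments for one subfolder.
# Assumes each subfolder is laid out like roberta-base-classifier-v01.
REPO_ID = "anik-owl/roberta_norm_classifier"

# Subfolder -> base checkpoint, copied from the table above
CHECKPOINTS = {
    "roberta-base-classifier-v01": "roberta-base",
    "roberta-base-tapt": "roberta-base",
    "roberta-large-classifier-v01": "roberta-large",
    "roberta-tapt-classifier-v01": "roberta-base-tapt",
}


def download_kwargs(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Return keyword arguments for huggingface_hub.snapshot_download."""
    if subfolder not in CHECKPOINTS:
        raise ValueError(f"unknown subfolder: {subfolder!r}")
    return {
        "repo_id": REPO_ID,
        "allow_patterns": f"{subfolder}/*",
        "local_dir": local_dir,
    }


kwargs = download_kwargs("roberta-large-classifier-v01")
print(kwargs["allow_patterns"])  # roberta-large-classifier-v01/*
# To fetch the files: snapshot_download(**kwargs)
```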
---
## Usage — `roberta-base-classifier-v01`
### Load the model
```python
from huggingface_hub import snapshot_download
from transformers import RobertaForSequenceClassification, RobertaTokenizer
import torch

# Download from HF Hub (only the roberta-base-classifier-v01 subfolder)
snapshot_download(
    repo_id="anik-owl/roberta_norm_classifier",
    allow_patterns="roberta-base-classifier-v01/*",
    local_dir="./artifacts",
)

# Load model + tokenizer
model = RobertaForSequenceClassification.from_pretrained(
    "./artifacts/roberta-base-classifier-v01",
    num_labels=2,
)
tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")
model.eval()  # inference mode: disables dropout
```
### Inference
```python
def predict(text: str, model, tokenizer, threshold: float = 0.5):
    # Tokenize a single text, truncating to 256 tokens
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=256,
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    # Probability of label 1 (NORMATIVE) for the first (only) input
    prob_norm = probs[0][1].item()
    return {
        "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
        "score": round(prob_norm, 4),
    }


# Example
text = "People should always greet elders with respect."
result = predict(text, model, tokenizer)
print(result)
# {'label': 'NORMATIVE', 'score': 0.9341}
```
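To make the role of `threshold` concrete, here is a small pure-Python sketch of the same decision rule applied to a pair of raw logits. For two classes, the softmax probability of label 1 equals a sigmoid of the logit difference, so `threshold=0.5` coincides with picking the larger logit. The function name `label_from_logits` and the example logits are illustrative assumptions, not output from the model.

```python
import math


def label_from_logits(logits, threshold=0.5):
    """Apply the predict() decision rule to [logit_not_normative, logit_normative]."""
    logit_0, logit_1 = logits
    # Two-class softmax for label 1 reduces to sigmoid(logit_1 - logit_0)
    prob_norm = 1.0 / (1.0 + math.exp(-(logit_1 - logit_0)))
    label = "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE"
    return {"label": label, "score": round(prob_norm, 4)}


# Made-up logits for illustration only
print(label_from_logits([-1.0, 1.0]))
# {'label': 'NORMATIVE', 'score': 0.8808}
print(label_from_logits([-1.0, 1.0], threshold=0.9))
# {'label': 'NOT NORMATIVE', 'score': 0.8808}
```

The score is unchanged by the threshold; only the label flips. A stricter threshold such as 0.9 will generally mark fewer texts as `NORMATIVE`, which may suit use cases where false positives are costly.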
### Labels
| ID | Label |
|---|---|
| 0 | NOT NORMATIVE |
| 1 | NORMATIVE |
---
## Intended use
These models are intended for research in computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.

**Not intended for** high-stakes automated decision-making without human review.

---
## Limitations
- Trained on a specific dataset of normative statements — may not generalise to all domains or languages
- Short, context-free sentences may be harder to classify accurately
- Models may reflect biases present in the training data
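
One way to check the first limitation in practice is to score a small hand-labelled sample from the target domain before relying on a model there. The sketch below computes precision, recall, and F1 for the NORMATIVE class from gold and predicted label IDs. The function name `norm_metrics` and the toy labels are hypothetical; in real use the predicted IDs would come from the `predict` function in the usage section.

```python
def norm_metrics(gold, pred, positive=1):
    """Precision, recall and F1 for the positive (NORMATIVE = 1) class."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not real evaluation results)
gold = [1, 1, 1, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0]
print(norm_metrics(gold, pred))
```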
---
## Citation
If you use these models in your work, please cite this repository:
```bibtex
@misc{anik-owl-normclsf,
author = {anik-owl},
title = {Normative Statement Classifier — RoBERTa Fine-tunes},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
}
```