---
language: en
license: mit
tags:
- text-classification
- roberta
- normativity
- deontic-logic
- social-norms
base_model:
- FacebookAI/roberta-base
- FacebookAI/roberta-large
datasets:
- SALT-NLP/CultureBank
---
# Normative Statement Classifier — RoBERTa Fine-tunes
A collection of fine-tuned RoBERTa models for detecting **normative statements** in text — sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. *"people should remove their shoes before entering"*).
> GitHub repository for the full project: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)
---
## Models in this repository
| Subfolder | Base | Description |
|---|---|---|
| `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune on norm classification |
| `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint |
| `roberta-large-classifier-v01` | `roberta-large` | Larger model fine-tune for higher capacity |
| `roberta-tapt-classifier-v01` | `roberta-base-tapt` | Fine-tuned on top of the TAPT checkpoint |
---
## Usage — `roberta-base-classifier-v01`
### Load the model
```python
from huggingface_hub import snapshot_download
from transformers import RobertaForSequenceClassification, RobertaTokenizer
import torch

# Download from HF Hub
snapshot_download(
    repo_id="anik-owl/roberta_norm_classifier",
    allow_patterns="roberta-base-classifier-v01/*",
    local_dir="./artifacts",
)

# Load model + tokenizer
model = RobertaForSequenceClassification.from_pretrained(
    "./artifacts/roberta-base-classifier-v01",
    num_labels=2,
)
tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")
model.eval()
```
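
The other classifier subfolders in the table above can presumably be loaded with the same pattern. The sketch below is a hypothetical helper (`checkpoint_spec` is not part of this repository) that derives the download pattern, local path, and tokenizer ID for a given subfolder. It assumes every classifier subfolder follows the same layout as `roberta-base-classifier-v01` and that each one uses the tokenizer of its original base model. `roberta-base-tapt` is left out on the assumption that it is a pre-training checkpoint without a classification head.

```python
from pathlib import Path

# Subfolder names are copied from the "Models in this repository" table.
# ASSUMPTION: each classifier pairs with the tokenizer of its base model.
CLASSIFIER_TOKENIZERS = {
    "roberta-base-classifier-v01": "FacebookAI/roberta-base",
    "roberta-large-classifier-v01": "FacebookAI/roberta-large",
    "roberta-tapt-classifier-v01": "FacebookAI/roberta-base",
}


def checkpoint_spec(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Return download and load settings for one classifier subfolder (hypothetical helper)."""
    if subfolder not in CLASSIFIER_TOKENIZERS:
        raise ValueError(f"Unknown classifier subfolder: {subfolder!r}")
    return {
        # Pattern to pass to snapshot_download(allow_patterns=...)
        "allow_patterns": f"{subfolder}/*",
        # Directory to pass to RobertaForSequenceClassification.from_pretrained(...)
        "model_path": str(Path(local_dir) / subfolder),
        # Model ID to pass to RobertaTokenizer.from_pretrained(...)
        "tokenizer_id": CLASSIFIER_TOKENIZERS[subfolder],
    }


spec = checkpoint_spec("roberta-large-classifier-v01")
print(spec["allow_patterns"])
print(spec["tokenizer_id"])
```

The returned values slot into the `snapshot_download` and `from_pretrained` calls shown above in place of the hard-coded `roberta-base-classifier-v01` strings.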
### Inference
```python
def predict(text: str, model, tokenizer, threshold: float = 0.5):
    """Classify one text; `score` is the softmax probability of label ID 1 (NORMATIVE)."""
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=256,
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    prob_norm = probs[0][1].item()  # probability of the NORMATIVE class
    return {
        "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
        "score": round(prob_norm, 4),
    }


# Example
text = "People should always greet elders with respect."
result = predict(text, model, tokenizer)
print(result)
# {'label': 'NORMATIVE', 'score': 0.9341}
```
### Labels
| ID | Label |
|---|---|
| 0 | NOT NORMATIVE |
| 1 | NORMATIVE |
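
The ID-to-label mapping and the threshold used in `predict` can be illustrated without loading a model. The following is a minimal, torch-free sketch: `ID2LABEL` and `label_from_logits` are illustrative names rather than part of this repository, and the logits passed in are made-up values, not model outputs.

```python
import math

# Mapping copied from the Labels table.
ID2LABEL = {0: "NOT NORMATIVE", 1: "NORMATIVE"}


def label_from_logits(logits, threshold=0.5):
    """Map a pair of logits [not_normative, normative] to (label, score).

    Mirrors the thresholding in `predict`: the score is the softmax
    probability of label ID 1.
    """
    # Numerically stable two-class softmax
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    prob_norm = exps[1] / sum(exps)
    label_id = 1 if prob_norm >= threshold else 0
    return ID2LABEL[label_id], round(prob_norm, 4)


# Illustrative logits only (not real model outputs)
print(label_from_logits([0.0, 0.0]))
print(label_from_logits([0.0, 0.0], threshold=0.6))
```

With a threshold other than 0.5, the returned label can differ from the argmax class, so raising the threshold trades recall on normative statements for precision.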
---
## Intended use
These models are intended for research in computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.
**Not intended for** high-stakes automated decision-making without human review.
---
## Limitations
- Trained on a specific dataset of normative statements — may not generalise to all domains or languages
- Short, context-free sentences may be harder to classify accurately
- Models may reflect biases present in the training data
---
## Citation
If you use these models in your work, please cite this repository:
```bibtex
@misc{anik-owl-normclsf,
author = {anik-owl},
title = {Normative Statement Classifier — RoBERTa Fine-tunes},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
}
```