---
language: en
license: mit
tags:
- text-classification
- roberta
- normativity
- deontic-logic
- social-norms
base_model:
- FacebookAI/roberta-base
- FacebookAI/roberta-large
datasets:
- SALT-NLP/CultureBank
---

# Normative Statement Classifier — RoBERTa Fine-tunes

A collection of fine-tuned RoBERTa models for detecting **normative statements** in text, that is, sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. *"people should remove their shoes before entering"*).

> Full project on GitHub: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)

---

## Models in this repository

| Subfolder | Base model | Description |
|---|---|---|
| `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune for norm classification |
| `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint; the base for `roberta-tapt-classifier-v01` |
| `roberta-large-classifier-v01` | `roberta-large` | Fine-tune of the larger model, for higher capacity |
| `roberta-tapt-classifier-v01` | `roberta-base-tapt` | Classifier fine-tuned on top of the TAPT checkpoint |

---

## Usage — `roberta-base-classifier-v01`

### Load the model

```python
from huggingface_hub import snapshot_download
from transformers import RobertaForSequenceClassification, RobertaTokenizer
import torch

# Download only the roberta-base-classifier-v01 subfolder from the HF Hub into ./artifacts
snapshot_download(
    repo_id="anik-owl/roberta_norm_classifier",
    allow_patterns="roberta-base-classifier-v01/*",
    local_dir="./artifacts",
)

# Load model + tokenizer
model = RobertaForSequenceClassification.from_pretrained(
    "./artifacts/roberta-base-classifier-v01",
    num_labels=2,
)
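# The tokenizer is loaded from the base model on the Hub
tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")

# Switch to evaluation mode (disables dropout) for inference
model.eval()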
```
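The same loading pattern should carry over to the other classifier subfolders listed under "Models in this repository". A minimal sketch follows. The `CHECKPOINTS` mapping and the `download_kwargs` and `local_model_path` helpers are hypothetical names, and the tokenizer IDs are assumptions inferred from the base model of each checkpoint (TAPT normally keeps the base vocabulary); this card does not state them.

```python
# Hypothetical helpers: build the arguments used in the snippet above
# for any classifier subfolder in this repository.
REPO_ID = "anik-owl/roberta_norm_classifier"

# Subfolder -> tokenizer repo. Assumption: each classifier reuses the
# tokenizer of the base model it was fine-tuned from.
CHECKPOINTS = {
    "roberta-base-classifier-v01": "FacebookAI/roberta-base",
    "roberta-large-classifier-v01": "FacebookAI/roberta-large",
    "roberta-tapt-classifier-v01": "FacebookAI/roberta-base",
}


def download_kwargs(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Return keyword arguments for huggingface_hub.snapshot_download."""
    if subfolder not in CHECKPOINTS:
        raise ValueError(f"Unknown subfolder: {subfolder!r}")
    return {
        "repo_id": REPO_ID,
        "allow_patterns": f"{subfolder}/*",
        "local_dir": local_dir,
    }


def local_model_path(subfolder: str, local_dir: str = "./artifacts") -> str:
    """Path to pass to from_pretrained once the download has finished."""
    return f"{local_dir.rstrip('/')}/{subfolder}"


name = "roberta-large-classifier-v01"
print(download_kwargs(name)["allow_patterns"])
print(local_model_path(name))
print(CHECKPOINTS[name])
```

With these in hand, `snapshot_download(**download_kwargs(name))` followed by `RobertaForSequenceClassification.from_pretrained(local_model_path(name), num_labels=2)` mirrors the snippet above. `roberta-base-tapt` is left out of the mapping because the table describes it as a pre-training checkpoint, not a classifier.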

### Inference

```python
def predict(text: str, model, tokenizer, threshold: float = 0.5):
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=256,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Convert logits to probabilities and take class 1 (NORMATIVE)
    probs = torch.softmax(logits, dim=-1)
    prob_norm = probs[0][1].item()

    return {
        "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
        "score": round(prob_norm, 4),
    }


# Example
text = "People should always greet elders with respect."
result = predict(text, model, tokenizer)
print(result)
# Example output (the exact score may vary):
# {'label': 'NORMATIVE', 'score': 0.9341}
```
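`predict` truncates its input at 256 tokens, so a long document is only partly scored. One possible workaround, sketched here and not part of the released code, is to split the text into sentences, score each one, and report the highest score. `split_sentences` and `predict_document` are hypothetical helpers, the regex splitter is a rough stand-in for a proper sentence segmenter, and `score_fn` is assumed to wrap `predict` and return its `score`.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def predict_document(
    text: str,
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> dict:
    """Score each sentence and aggregate with max (an assumed choice)."""
    sentences = split_sentences(text)
    if not sentences:
        return {"label": "NOT NORMATIVE", "score": 0.0, "sentence": None}
    scores = [score_fn(s) for s in sentences]
    # Index of the highest-scoring sentence
    best = max(range(len(sentences)), key=scores.__getitem__)
    return {
        "label": "NORMATIVE" if scores[best] >= threshold else "NOT NORMATIVE",
        "score": round(scores[best], 4),
        "sentence": sentences[best],
    }
```

With the model loaded, `score_fn` could be `lambda s: predict(s, model, tokenizer)["score"]`. Max aggregation flags a document as soon as one sentence looks normative; a mean, or a count of sentences above the threshold, may suit other uses better.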

### Labels

| ID | Label |
|---|---|
| 0 | NOT NORMATIVE |
| 1 | NORMATIVE |

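To make the mapping concrete, here is a small pure-Python sketch of how a pair of logits becomes one of the two labels in the table. It mirrors the softmax-and-threshold step in `predict`; `ID2LABEL` and `label_from_logits` are illustrative names, not part of the released code, and the logits in the example are arbitrary.

```python
import math
from typing import Sequence

# Same ID-to-label mapping as the table above
ID2LABEL = {0: "NOT NORMATIVE", 1: "NORMATIVE"}


def label_from_logits(logits: Sequence[float], threshold: float = 0.5) -> dict:
    """Softmax over two logits, then threshold the probability of class 1."""
    if len(logits) != 2:
        raise ValueError("expected exactly two logits")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    prob_norm = exps[1] / sum(exps)
    label_id = 1 if prob_norm >= threshold else 0
    return {"label": ID2LABEL[label_id], "score": round(prob_norm, 4)}


# Arbitrary logits, chosen only to show the effect of the threshold
print(label_from_logits([-1.0, 2.0]))
print(label_from_logits([-1.0, 2.0], threshold=0.99))
```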
---

## Intended use

These models are intended for research in computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.

**Not intended for** high-stakes automated decision-making without human review.

---

## Limitations

- The models were trained on a specific dataset of normative statements, so they may not generalise to all domains or languages.
- Short, context-free sentences may be harder to classify accurately.
- The models may reflect biases present in the training data.

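Since the models may not generalise to every domain, it may be worth checking them on a small hand-labelled sample of your own text before relying on the predictions. The sketch below computes precision, recall, and F1 for the NORMATIVE class (label 1). `binary_metrics` is a hypothetical helper, and the labels in the example are made up purely to show the call; they are not results from these models.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 for the positive class (1 = NORMATIVE)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    # Fall back to 0.0 when a denominator is zero
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only
gold = [1, 1, 0, 0, 1]
pred = [1, 0, 0, 1, 1]
print(binary_metrics(gold, pred))
```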
---

## Citation

If you use these models in your work, please cite this repository:

```bibtex
@misc{anik-owl-normclsf,
  author       = {anik-owl},
  title        = {Normative Statement Classifier — RoBERTa Fine-tunes},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
}
```