---
language: en
license: mit
tags:
- text-classification
- roberta
- normativity
- deontic-logic
- social-norms
base_model:
- FacebookAI/roberta-base
- FacebookAI/roberta-large
datasets:
- SALT-NLP/CultureBank
---

# Normative Statement Classifier — RoBERTa Fine-tunes

A collection of fine-tuned RoBERTa models for detecting **normative statements** in text: sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. *"people should remove their shoes before entering"*).

> Full project on GitHub: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)

---

## Models in this repository

| Subfolder | Base | Description |
|---|---|---|
| `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune for norm classification |
| `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint |
| `roberta-large-classifier-v01` | `roberta-large` | Fine-tune of the larger model for higher capacity |
| `roberta-tapt-classifier-v01` | `roberta-base-tapt` | Classifier fine-tuned on top of the TAPT checkpoint |

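
Each row of the table corresponds to a subfolder of the same Hub repository, so selecting a model mostly comes down to choosing the download pattern and the local path. The sketch below shows one way to build those values. The helper names `download_kwargs` and `local_model_dir` are hypothetical and not part of the project, and the mapping is copied from the table rather than read from the Hub. It assumes the files land under `<local_dir>/<subfolder>`, as in the usage example of this card.

```python
# Sketch only: helper names are illustrative, not part of the repository.
REPO_ID = "anik-owl/roberta_norm_classifier"

# Subfolder -> base model, copied from the table above.
SUBFOLDER_BASE = {
    "roberta-base-classifier-v01": "roberta-base",
    "roberta-base-tapt": "roberta-base",
    "roberta-large-classifier-v01": "roberta-large",
    "roberta-tapt-classifier-v01": "roberta-base-tapt",
}


def download_kwargs(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    if subfolder not in SUBFOLDER_BASE:
        raise ValueError(f"Unknown subfolder: {subfolder!r}")
    return {
        "repo_id": REPO_ID,
        "allow_patterns": f"{subfolder}/*",  # fetch only this subfolder
        "local_dir": local_dir,
    }


def local_model_dir(subfolder: str, local_dir: str = "./artifacts") -> str:
    """Directory to pass to from_pretrained after the download."""
    return f"{local_dir.rstrip('/')}/{subfolder}"


kwargs = download_kwargs("roberta-large-classifier-v01")
print(kwargs["allow_patterns"])  # roberta-large-classifier-v01/*
print(local_model_dir("roberta-large-classifier-v01"))
# ./artifacts/roberta-large-classifier-v01

# To download for real (requires network access):
#   from huggingface_hub import snapshot_download
#   snapshot_download(**kwargs)
```
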

---

## Usage: `roberta-base-classifier-v01`

### Load the model

```python
from huggingface_hub import snapshot_download
from transformers import RobertaForSequenceClassification, RobertaTokenizer
import torch

# Download only the classifier subfolder from the HF Hub
snapshot_download(
    repo_id="anik-owl/roberta_norm_classifier",
    allow_patterns="roberta-base-classifier-v01/*",
    local_dir="./artifacts",
)

# Load the fine-tuned model; the tokenizer comes from the base model
model = RobertaForSequenceClassification.from_pretrained(
    "./artifacts/roberta-base-classifier-v01",
    num_labels=2,
)
tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")

model.eval()  # inference mode: disables dropout
```

### Inference

```python
def predict(text: str, model, tokenizer, threshold: float = 0.5) -> dict:
    """Classify one text; `score` is the probability of the NORMATIVE class."""
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=256,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Softmax over the two classes; index 1 is NORMATIVE (see Labels)
    probs = torch.softmax(logits, dim=-1)
    prob_norm = probs[0][1].item()

    return {
        "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
        "score": round(prob_norm, 4),
    }


# Example
text = "People should always greet elders with respect."
result = predict(text, model, tokenizer)
print(result)
# Example output: {'label': 'NORMATIVE', 'score': 0.9341}
```
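
To make the role of `threshold` more concrete, the following sketch reproduces the scoring step in plain Python. With two classes, the softmax probability of class 1 equals the sigmoid of the logit difference `z1 - z0`, so a probability threshold corresponds to a minimum logit margin. The function names and the logit values are made up for illustration; they are not outputs of the actual model.

```python
import math


def prob_normative(logits):
    """Two-class softmax probability of class 1 (NORMATIVE).

    softmax([z0, z1])[1] == 1 / (1 + exp(z0 - z1))
    """
    z0, z1 = logits
    return 1.0 / (1.0 + math.exp(z0 - z1))


def label_for(logits, threshold=0.5):
    """Apply the same decision rule as predict() to a pair of logits."""
    return "NORMATIVE" if prob_normative(logits) >= threshold else "NOT NORMATIVE"


def logit_margin(threshold):
    """Smallest value of z1 - z0 whose probability reaches the threshold."""
    return math.log(threshold / (1.0 - threshold))


example_logits = [-1.0, 1.5]  # hypothetical values, not real model output

print(label_for(example_logits))                  # NORMATIVE
print(label_for(example_logits, threshold=0.95))  # NOT NORMATIVE
print(logit_margin(0.5))                          # 0.0
```

Raising the threshold trades recall for precision: at 0.5 the classifier only needs `z1 > z0`, while at 0.95 the NORMATIVE logit has to exceed the other by roughly 2.94. Which operating point suits a given application is not stated in this card and would need to be checked on held-out data.
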

### Labels

| ID | Label |
|---|---|
| 0 | NOT NORMATIVE |
| 1 | NORMATIVE |

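
The label table can be kept in code as a pair of dictionaries, which avoids hard-coding index `1` throughout downstream scripts. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `decode` are illustrative, and whether the saved model config already carries these label names is not stated in this card. The commented lines show how the mapping could be passed to `from_pretrained` through the standard `id2label` and `label2id` config arguments.

```python
# Mapping copied from the Labels table.
ID2LABEL = {0: "NOT NORMATIVE", 1: "NORMATIVE"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode(class_id: int) -> str:
    """Translate a predicted class index into its label name."""
    return ID2LABEL[class_id]


print(decode(1))              # NORMATIVE
print(LABEL2ID["NORMATIVE"])  # 1

# Optional: attach the names to the model config when loading, so that
# model.config.id2label can be used instead of a separate lookup.
#   model = RobertaForSequenceClassification.from_pretrained(
#       "./artifacts/roberta-base-classifier-v01",
#       num_labels=2,
#       id2label=ID2LABEL,
#       label2id=LABEL2ID,
#   )
```
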

---

## Intended use

These models are intended for research in computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.

**Not intended for** high-stakes automated decision-making without human review.
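
One possible way to keep a human in the loop is to accept only confident predictions automatically and queue the rest for manual review. The sketch below works on the dictionary format returned by `predict`. The function name `route` and the 0.3 to 0.7 uncertainty band are assumptions chosen for illustration, not values recommended or validated by the project.

```python
def route(result: dict, low: float = 0.3, high: float = 0.7) -> str:
    """Return 'human_review' for scores inside the uncertainty band.

    `result` is expected to have the shape produced by predict():
    {"label": ..., "score": <probability of NORMATIVE>}.
    The band limits are illustrative defaults, not tuned values.
    """
    score = result["score"]
    if low < score < high:
        return "human_review"
    return "auto"


print(route({"label": "NORMATIVE", "score": 0.9341}))      # auto
print(route({"label": "NORMATIVE", "score": 0.55}))        # human_review
print(route({"label": "NOT NORMATIVE", "score": 0.05}))    # auto
```
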

---

## Limitations

- Trained on a specific dataset of normative statements, so the models may not generalise to other domains or languages
- Short, context-free sentences may be harder to classify accurately
- Models may reflect biases present in the training data


---

## Citation

If you use these models in your work, please cite this repository:

```bibtex
@misc{anik-owl-normclsf,
  author       = {anik-owl},
  title        = {Normative Statement Classifier — RoBERTa Fine-tunes},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
}
```