anik-owl committed on
Commit
b13d804
·
verified ·
1 Parent(s): 67dea7d

Update README.md

Files changed (1)
  1. README.md +130 -3
README.md CHANGED
@@ -1,3 +1,130 @@
- ---
- license: mit
- ---
+ ---
+ language: en
+ license: mit
+ tags:
+ - text-classification
+ - roberta
+ - normativity
+ - deontic-logic
+ - social-norms
+ base_model:
+ - FacebookAI/roberta-base
+ - FacebookAI/roberta-large
+ datasets:
+ - SALT-NLP/CultureBank
+ ---
+
+ # Normative Statement Classifier — RoBERTa Fine-tunes
+
+ A collection of fine-tuned RoBERTa models for detecting **normative statements** in text: sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. *"people should remove their shoes before entering"*).
+
+ > GitHub repository for the full project: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)
+
+ ---
+
+ ## Models in this repository
+
+ | Subfolder | Base | Description |
+ |---|---|---|
+ | `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune on norm classification |
+ | `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint |
+ | `roberta-large-classifier-v01` | `roberta-large` | Larger model fine-tune for higher capacity |
+ | `roberta-tapt-classifier-v01` | TAPT checkpoint | Fine-tuned on top of the TAPT checkpoint |
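
The download step shown for `roberta-base-classifier-v01` can in principle be pointed at any row of the table above by changing the subfolder name. The following is a minimal sketch under that assumption. `download_kwargs` is a hypothetical helper, not part of the project, and it only builds the arguments without touching the network. It assumes every subfolder is laid out the same way in the repository. The `roberta-base-tapt` checkpoint may also need a different model class than the classifiers.

```python
REPO_ID = "anik-owl/roberta_norm_classifier"

# Subfolder names copied from the table above.
SUBFOLDERS = (
    "roberta-base-classifier-v01",
    "roberta-base-tapt",
    "roberta-large-classifier-v01",
    "roberta-tapt-classifier-v01",
)


def download_kwargs(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: restricts the download to a single subfolder.
    """
    if subfolder not in SUBFOLDERS:
        raise ValueError(f"unknown subfolder: {subfolder!r}")
    return {
        "repo_id": REPO_ID,
        "allow_patterns": f"{subfolder}/*",
        "local_dir": local_dir,
    }


kwargs = download_kwargs("roberta-large-classifier-v01")
print(kwargs["allow_patterns"])  # roberta-large-classifier-v01/*
```

The result can be passed on as `snapshot_download(**kwargs)`, after which the model directory would be `./artifacts/<subfolder>`. Which tokenizer to pair with each checkpoint is not stated in the README and should be checked against the base model listed in the table.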
+
+ ---
+
+ ## Usage: `roberta-base-classifier-v01`
+
+ ### Load the model
+
+ ```python
+ from huggingface_hub import snapshot_download
+ from transformers import RobertaForSequenceClassification, RobertaTokenizer
+ import torch
+
+ # Download only the roberta-base-classifier-v01 subfolder from the HF Hub
+ snapshot_download(
+     repo_id="anik-owl/roberta_norm_classifier",
+     allow_patterns="roberta-base-classifier-v01/*",
+     local_dir="./artifacts",
+ )
+
+ # Load the fine-tuned classifier and the base RoBERTa tokenizer
+ model = RobertaForSequenceClassification.from_pretrained(
+     "./artifacts/roberta-base-classifier-v01",
+     num_labels=2,
+ )
+ tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")
+
+ # Switch to evaluation mode (disables dropout)
+ model.eval()
+ ```
+
+ ### Inference
+
+ ```python
+ def predict(text: str, model, tokenizer, threshold: float = 0.5):
+     """Classify one text; `score` is the probability of class 1 (NORMATIVE)."""
+     inputs = tokenizer(
+         text,
+         return_tensors="pt",
+         truncation=True,  # inputs longer than max_length tokens are cut off
+         padding=True,
+         max_length=256,
+     )
+
+     # Forward pass without gradient tracking
+     with torch.no_grad():
+         logits = model(**inputs).logits
+
+     # Softmax over the two classes; index 1 is NORMATIVE
+     probs = torch.softmax(logits, dim=-1)
+     prob_norm = probs[0][1].item()
+
+     return {
+         "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
+         "score": round(prob_norm, 4),
+     }
+
+
+ # Example
+ text = "People should always greet elders with respect."
+ result = predict(text, model, tokenizer)
+ print(result)
+ # {'label': 'NORMATIVE', 'score': 0.9341}
+ ```
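
The decision rule inside `predict` can be illustrated without loading a model. The sketch below re-implements the softmax and the threshold comparison in plain Python. `softmax` and `decide` are illustrative names, and the logits are made-up numbers rather than model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.5):
    """Mirror the decision rule in `predict` for one pair of logits."""
    prob_norm = softmax(logits)[1]  # index 1 = NORMATIVE
    label = "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE"
    return {"label": label, "score": round(prob_norm, 4)}


# Made-up logits, not model output.
example_logits = [0.0, 1.0]
print(decide(example_logits))                 # labelled NORMATIVE at the default threshold
print(decide(example_logits, threshold=0.9))  # same score, stricter threshold
```

Raising `threshold` trades recall for precision on the NORMATIVE class. The score itself does not change, only the label derived from it.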
+
+ ### Labels
+
+ | ID | Label |
+ |---|---|
+ | 0 | NOT NORMATIVE |
+ | 1 | NORMATIVE |
+
+ ---
+
+ ## Intended use
+
+ These models are intended for research in computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.
+
+ **Not intended for** high-stakes automated decision-making without human review.
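
One way to keep a human in the loop is to route uncertain scores to manual review instead of assigning a label automatically. The sketch below is only an illustration of that idea. `triage` is a hypothetical helper, and the 0.3 / 0.7 band is an arbitrary assumption, not a value tuned on these models.

```python
def triage(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Route a NORMATIVE-probability score to a label or to human review.

    The (low, high) band is an illustrative assumption, not a tuned value.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score >= high:
        return "NORMATIVE"
    if score <= low:
        return "NOT NORMATIVE"
    return "NEEDS REVIEW"


# Made-up scores, not model output.
for s in (0.93, 0.55, 0.08):
    print(s, triage(s))
```

The `score` field returned by `predict` could be passed to such a function directly. Suitable band limits would have to be chosen on held-out data.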
+
+ ---
+
+ ## Limitations
+
+ - Trained on a specific dataset of normative statements, so the models may not generalise to other domains or languages
+ - Short, context-free sentences may be harder to classify accurately
+ - Models may reflect biases present in the training data
+
+ ---
+
+ ## Citation
+
+ If you use these models in your work, please cite this repository:
+
+ ```bibtex
+ @misc{anik-owl-normclsf,
+   author = {anik-owl},
+   title = {Normative Statement Classifier --- RoBERTa Fine-tunes},
+   year = {2026},
+   publisher = {Hugging Face},
+   howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
+ }
+ ```