
	---
	language: en
	license: mit
	tags:
	- text-classification
	- roberta
	- normativity
	- deontic-logic
	- social-norms
	base_model:
	- FacebookAI/roberta-base
	- FacebookAI/roberta-large
	datasets:
	- SALT-NLP/CultureBank
	---

	# Normative Statement Classifier — RoBERTa Fine-tunes

	A collection of fine-tuned RoBERTa models for detecting normative statements in text — sentences and documents that express social norms, obligations, prohibitions, or moral judgments (e.g. "people should remove their shoes before entering").

	> GitHub repository for the full project: [AnikMallick/norm-classifier](https://github.com/AnikMallick/norm-classifier)

	---

	## Models in this repository

	| Subfolder | Base | Description |
	|---|---|---|
	| `roberta-base-classifier-v01` | `roberta-base` | Baseline fine-tune on norm classification |
	| `roberta-base-tapt` | `roberta-base` | Task-Adaptive Pre-Training (TAPT) checkpoint |
	| `roberta-large-classifier-v01` | `roberta-large` | Larger model fine-tune for higher capacity |
	| `roberta-tapt-classifier-v01` | `roberta-base-tapt` | Fine-tuned on top of the TAPT checkpoint |

	---

	## Usage — `roberta-base-classifier-v01`

	### Load the model

	```python
	from huggingface_hub import snapshot_download
	from transformers import RobertaForSequenceClassification, RobertaTokenizer
	import torch

	# Download only the roberta-base-classifier-v01 subfolder from the HF Hub
	snapshot_download(
	    repo_id="anik-owl/roberta_norm_classifier",
	    allow_patterns="roberta-base-classifier-v01/*",
	    local_dir="./artifacts",
	)

	# Load model + tokenizer
	model = RobertaForSequenceClassification.from_pretrained(
	    "./artifacts/roberta-base-classifier-v01",
	    num_labels=2,
	)
	tokenizer = RobertaTokenizer.from_pretrained("FacebookAI/roberta-base")

	# Switch to evaluation mode (disables dropout)
	model.eval()
	```
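
The same steps should carry over to the other classifier subfolders listed in the table above. The sketch below is not part of the original project: `variant_spec` and `load_variant` are hypothetical helpers, and the tokenizer mapping is an assumption derived from the "Base" column (the TAPT classifier is assumed to reuse the `roberta-base` tokenizer). `roberta-base-tapt` is left out because it is described as a pre-training checkpoint rather than a classifier.

```python
REPO_ID = "anik-owl/roberta_norm_classifier"

# ASSUMPTION: tokenizer checkpoints inferred from the "Base" column of the
# model table; only the roberta-base pairing appears in the original card.
TOKENIZER_FOR = {
    "roberta-base-classifier-v01": "FacebookAI/roberta-base",
    "roberta-large-classifier-v01": "FacebookAI/roberta-large",
    "roberta-tapt-classifier-v01": "FacebookAI/roberta-base",
}


def variant_spec(subfolder: str, local_dir: str = "./artifacts") -> dict:
    """Build the download pattern, local model path, and tokenizer id."""
    if subfolder not in TOKENIZER_FOR:
        raise ValueError(f"Unknown classifier subfolder: {subfolder!r}")
    return {
        "allow_patterns": f"{subfolder}/*",
        "model_path": f"{local_dir.rstrip('/')}/{subfolder}",
        "tokenizer_id": TOKENIZER_FOR[subfolder],
    }


def load_variant(subfolder: str, local_dir: str = "./artifacts"):
    """Download one subfolder and load it the same way as the snippet above."""
    # Imported lazily so variant_spec can be used without these packages.
    from huggingface_hub import snapshot_download
    from transformers import RobertaForSequenceClassification, RobertaTokenizer

    spec = variant_spec(subfolder, local_dir)
    snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=spec["allow_patterns"],
        local_dir=local_dir,
    )
    model = RobertaForSequenceClassification.from_pretrained(
        spec["model_path"], num_labels=2
    )
    tokenizer = RobertaTokenizer.from_pretrained(spec["tokenizer_id"])
    model.eval()
    return model, tokenizer
```

For example, `load_variant("roberta-large-classifier-v01")` would fetch and load the larger model, provided that subfolder follows the same layout as the base one.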

	### Inference

	```python
	def predict(text: str, model, tokenizer, threshold: float = 0.5):
	    # Tokenize a single text, truncating to 256 tokens
	    inputs = tokenizer(
	        text,
	        return_tensors="pt",
	        truncation=True,
	        padding=True,
	        max_length=256,
	    )

	    # Forward pass without gradient tracking
	    with torch.no_grad():
	        logits = model(**inputs).logits

	    # Probability of class 1 (NORMATIVE)
	    probs = torch.softmax(logits, dim=-1)
	    prob_norm = probs[0][1].item()

	    return {
	        "label": "NORMATIVE" if prob_norm >= threshold else "NOT NORMATIVE",
	        "score": round(prob_norm, 4),
	    }


	# Example
	text = "People should always greet elders with respect."
	result = predict(text, model, tokenizer)
	print(result)
	# {'label': 'NORMATIVE', 'score': 0.9341}
	```
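
To make the role of `threshold` concrete, here is a dependency-free sketch of the decision rule used in `predict`. With two classes, the softmax probability of class 1 equals a sigmoid of the logit difference, so the threshold is effectively a cut-off on that difference. The function names `normative_probability` and `decide` are illustrative and do not come from the project.

```python
import math


def normative_probability(logit_not_norm: float, logit_norm: float) -> float:
    """Two-class softmax probability of class 1 (NORMATIVE)."""
    diff = logit_norm - logit_not_norm
    # Numerically stable sigmoid of the logit difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def decide(logit_not_norm: float, logit_norm: float, threshold: float = 0.5) -> dict:
    """Apply the same thresholding rule as predict() to raw logits."""
    prob = normative_probability(logit_not_norm, logit_norm)
    return {
        "label": "NORMATIVE" if prob >= threshold else "NOT NORMATIVE",
        "score": round(prob, 4),
    }
```

Raising `threshold` above 0.5 trades recall for precision on the NORMATIVE class; a suitable value would need to be chosen on held-out data.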

	### Labels

	| ID | Label |
	|---|---|
	| 0 | NOT NORMATIVE |
	| 1 | NORMATIVE |
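
The table above can be kept in code as a pair of lookup dictionaries, which avoids hard-coding label strings at each call site. This is a small sketch; `ID2LABEL`, `LABEL2ID`, and `label_for` are names chosen here for illustration.

```python
# Mapping taken directly from the label table.
ID2LABEL = {0: "NOT NORMATIVE", 1: "NORMATIVE"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Translate a predicted class id into its human-readable label."""
    try:
        return ID2LABEL[class_id]
    except KeyError:
        raise ValueError(f"Unknown class id: {class_id!r}") from None
```

If the saved config does not already carry these names, passing `id2label=ID2LABEL, label2id=LABEL2ID` alongside `num_labels=2` to `from_pretrained` should store them on `model.config`.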

	---

	## Intended use

	These models are intended for research on computational social science, normative reasoning, and deontic language detection. They were developed as part of a thesis project on identifying normative statements in natural language.

	They are not intended for high-stakes automated decision-making without human review.
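
One possible way to keep a human in the loop is to route borderline scores to manual review instead of auto-labelling them. The sketch below is an illustration only: `triage` is a hypothetical helper, and the `margin` of 0.15 is an arbitrary assumption, not a value validated for these models.

```python
def triage(score: float, threshold: float = 0.5, margin: float = 0.15) -> str:
    """Return a label, or 'REVIEW' when the score is close to the threshold.

    `score` is the NORMATIVE probability, e.g. the "score" field returned
    by the inference example. `margin` is an assumed, untuned value.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score!r}")
    # Scores inside the band around the threshold go to a human reviewer.
    if abs(score - threshold) < margin:
        return "REVIEW"
    return "NORMATIVE" if score >= threshold else "NOT NORMATIVE"
```

With the defaults, a score of 0.9341 is labelled automatically, while a score of 0.55 is flagged for review.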

	---

	## Limitations

	- Trained on a specific dataset of normative statements — may not generalise to all domains or languages
	- Short, context-free sentences may be harder to classify accurately
	- Models may reflect biases present in the training data

	---

	## Citation

	If you use these models in your work, please cite this repository:

	```bibtex
	@misc{anik-owl-normclsf,
	  author       = {anik-owl},
	  title        = {Normative Statement Classifier — RoBERTa Fine-tunes},
	  year         = {2026},
	  publisher    = {Hugging Face},
	  howpublished = {\url{https://huggingface.co/anik-owl/roberta_norm_classifier}},
	}
	```