	---
	base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
	datasets:
	- acidtib/reddit-mood
	language:
	- en
	library_name: transformers.js
	license: cc-by-4.0
	pipeline_tag: text-classification
	tags:
	- sentiment
	- reddit
	- mood
	- onnx
	- text-classification
	---
	# Reddit mood classifier

	A 3-class sentiment classifier for Reddit comments, fine-tuned from
	[`cardiffnlp/twitter-roberta-base-sentiment-latest`](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest).

	Output classes: `negative` / `neutral` / `positive`.

	Trained on the [`acidtib/reddit-mood`](https://huggingface.co/datasets/acidtib/reddit-mood)
	dataset. Evaluate on your own corpus before relying on it outside the
	training domain.

	## Labels

	| Label | Numeric score | Meaning |
	|---|---:|---|
	| `negative` | 25 | Anywhere on the negative spectrum: complaints, sarcasm, disappointment, balance gripes, bug-report annoyance, scorched-earth rage, personal attacks on devs, quit threats |
	| `neutral` | 60 | Factual, banter, parody/hyperbole, in-domain references without strong real-world emotion |
	| `positive` | 90 | Genuine positive, hype, love, excitement |

	The numeric scores are arbitrary anchors that let you average labels
	into a single 0-100 mood score per group of comments. Pick your own
	mapping if these don't fit.
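As a minimal sketch of that averaging, the snippet below maps predicted labels to the anchors from the table and takes their mean. The names `LABEL_SCORES` and `mood_score` are illustrative and not part of this repo.

```python
from statistics import mean

# Anchors from the Labels table; swap in your own mapping if these don't fit.
LABEL_SCORES = {"negative": 25, "neutral": 60, "positive": 90}


def mood_score(labels, scores=LABEL_SCORES):
    """Average the numeric anchors of predicted labels into one 0-100 score."""
    if not labels:
        raise ValueError("need at least one label to compute a mood score")
    return mean(scores[label] for label in labels)


# Hypothetical predictions for one thread of four comments.
thread = ["negative", "neutral", "neutral", "positive"]
print(mood_score(thread))  # (25 + 60 + 60 + 90) / 4 = 58.75
```

Passing a different `scores` dict is enough to try another mapping without touching the classifier output.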

	## Usage

	### transformers.js (Node / browser)

	```js
	import { pipeline } from "@huggingface/transformers";

	const classify = await pipeline(
	  "text-classification",
	  "acidtib/reddit-mood-classifier",
	  { dtype: "q8" } // load model_quantized.onnx (~120MB, CPU-friendly)
	);

	const out = await classify("they nerfed it again, it's over");
	// [{ label: "negative", score: 0.81 }]
	```

	### Python (transformers + onnxruntime)

	```python
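	import numpy as np
	import onnxruntime as ort
	from huggingface_hub import hf_hub_download
	from transformers import AutoTokenizer

	repo = "acidtib/reddit-mood-classifier"
	tokenizer = AutoTokenizer.from_pretrained(repo)
	session = ort.InferenceSession(
	    hf_hub_download(repo, "onnx/model_quantized.onnx"),
	    providers=["CPUExecutionProvider"],
	)
	# Tokenize, run the graph, then softmax for confidence and argmax for the class id.
	enc = tokenizer("they nerfed it again, it's over", return_tensors="np")
	feed = {i.name: enc[i.name] for i in session.get_inputs()}
	logits = session.run(None, feed)[0][0]
	probs = np.exp(logits - logits.max())
	probs /= probs.sum()
	print(int(probs.argmax()), float(probs.max()))  # map the id via id2label in config.json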
	```

	## Files

	```
	config.json                        HF model config (id2label, label2id)
	tokenizer.json + vocab.json + ...  HF tokenizer files (RoBERTa BPE)
	onnx/model.onnx                    full-precision ONNX (~500MB)
	onnx/model_quantized.onnx          int8 dynamic quantized ONNX (~120MB);
	                                   this is what production inference loads
	ort_config.json                    ONNX Runtime quantization metadata
	```

	## Evaluation

	Evaluated on the held-out test set (962 rows, never seen by the trainer) at 2026-05-04T03:53:59.880132+00:00.

	Macro-F1: `0.7259` on the 962-row test split of the 9612-row corpus.

	| Label | Test F1 |
	|---|---:|
	| negative | 0.672 |
	| neutral | 0.836 |
	| positive | 0.669 |


	Metrics are recomputed from the quantized ONNX file in this repo, not
	from the unquantized PyTorch checkpoint, so the numbers reflect what
	production inference will see.
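For reference, macro-F1 is the unweighted mean of the per-class F1 scores; averaging the three rounded values in the table gives about 0.7257, which matches the reported `0.7259` up to rounding. The sketch below shows one way to compute both numbers on your own holdout. The function names and the toy labels are illustrative assumptions, not the evaluation script used for this model.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, computed one-vs-rest for each label."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


LABELS = ["negative", "neutral", "positive"]
# Toy, made-up labels purely to exercise the functions.
y_true = ["negative", "neutral", "neutral", "positive"]
y_pred = ["negative", "neutral", "positive", "positive"]
print(per_class_f1(y_true, y_pred, LABELS))
print(macro_f1(y_true, y_pred, LABELS))
```

Because macro-F1 weights every class equally, a weak minority class pulls the score down more than it would pull down plain accuracy.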

	## Training

	- Base: `cardiffnlp/twitter-roberta-base-sentiment-latest` (RoBERTa-base, 124M params)
	- Head: warm-started from the base model's existing 3-class sentiment head (label names + id order match)
	- Loss: Class-weighted cross-entropy with sqrt-inverse-frequency weights and label smoothing 0.1
	- Optimizer: AdamW with layer-wise LR decay (0.9), lr=2e-5, weight_decay=0.01
	- Schedule: Up to 4 epochs with `EarlyStoppingCallback(patience=2)` on val macro-F1
	- Split: Stratified 80/10/10 train/val/test, seed=42
	- Quantization: int8 dynamic (AVX2 CPU), via `optimum.onnxruntime`
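
The loss weights and the layer-wise learning rates listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the training script: rescaling the weights to a mean of 1, giving the top encoder layer the base learning rate, and the function names are all choices made here. The 12-layer count is that of RoBERTa-base.

```python
import math
from collections import Counter


def sqrt_inverse_frequency_weights(labels):
    """Weight each class by 1 / sqrt(relative frequency).

    Assumption: weights are rescaled so they average to 1 across classes.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    raw = {label: math.sqrt(total / n) for label, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {label: weight * scale for label, weight in raw.items()}


def layerwise_lrs(base_lr=2e-5, decay=0.9, num_layers=12):
    """Learning rate per encoder layer, ordered bottom to top.

    Assumption: the top layer uses base_lr and each layer below it is
    multiplied by `decay` once more.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Toy class counts: "neutral" is four times as frequent as "negative".
weights = sqrt_inverse_frequency_weights(["neutral"] * 4 + ["negative"])
print(weights)  # the rarer class gets sqrt(4) = 2x the weight of the common one
print(layerwise_lrs()[0], layerwise_lrs()[-1])  # smallest (bottom) and largest (top) LR
```

Taking the square root softens the correction relative to plain inverse-frequency weighting, so rare classes are boosted without dominating the loss.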

	## Limitations

	- Labels reflect English-language Reddit conversation conventions.
	  Sarcasm, in-domain aggression, and parody are inherently ambiguous
	  and contribute most of the model's errors.
	- Out-of-domain performance is unevaluated; run your own holdout
	  evaluation before relying on the model in a different community.