---
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
datasets:
- acidtib/reddit-mood
language:
- en
library_name: transformers.js
license: cc-by-4.0
pipeline_tag: text-classification
tags:
- sentiment
- reddit
- mood
- onnx
- text-classification
---
# Reddit mood classifier
A 3-class sentiment classifier for Reddit comments, fine-tuned from
[`cardiffnlp/twitter-roberta-base-sentiment-latest`](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest).
Output classes: `negative` / `neutral` / `positive`.
Trained on the [`acidtib/reddit-mood`](https://huggingface.co/datasets/acidtib/reddit-mood)
dataset. Evaluate on your own corpus before relying on it outside the
training domain.
## Labels
| Label | Numeric score | Meaning |
|---|---:|---|
| `negative` | 25 | Anywhere on the negative spectrum: complaints, sarcasm, disappointment, balance gripes, bug-report annoyance, scorched-earth rage, personal attacks on devs, quit threats |
| `neutral` | 60 | Factual, banter, parody/hyperbole, in-domain references without strong real-world emotion |
| `positive` | 90 | Genuine positive, hype, love, excitement |
The numeric scores are arbitrary anchors that let you average labels
into a single 0-100 mood score per group of comments. Pick your own
mapping if these don't fit.
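As a minimal sketch of that averaging, the snippet below maps predicted labels to the anchor scores from the table and takes their mean. The `mood_score` helper and the sample `thread` list are illustrative names, not part of this repo.

```python
# Anchor scores from the Labels table; swap in your own mapping if needed.
LABEL_SCORES = {"negative": 25, "neutral": 60, "positive": 90}


def mood_score(labels):
    """Average the anchor scores of predicted labels into one 0-100 value."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(LABEL_SCORES[label] for label in labels) / len(labels)


# Hypothetical predictions for the comments in one thread.
thread = ["negative", "negative", "neutral", "positive"]
print(mood_score(thread))  # (25 + 25 + 60 + 90) / 4 = 50.0
```

A plain mean treats every comment equally; weighting by model confidence or comment score is a possible variation, but it is not something this card describes.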
## Usage
### transformers.js (Node / browser)
```js
import { pipeline } from "@huggingface/transformers";
const classify = await pipeline(
  "text-classification",
  "acidtib/reddit-mood-classifier",
  { dtype: "q8" } // load onnx/model_quantized.onnx (~120MB int8, CPU-friendly)
);
const out = await classify("they nerfed it again, it's over");
// e.g. [{ label: "negative", score: 0.81 }]
```
### Python (transformers + onnxruntime)
```python
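import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo = "acidtib/reddit-mood-classifier"
tokenizer = AutoTokenizer.from_pretrained(repo)
session = ort.InferenceSession(
    hf_hub_download(repo, "onnx/model_quantized.onnx"),  # fetch the int8 model from the Hub
    providers=["CPUExecutionProvider"],
)
# Tokenize, run the model, then softmax the logits for a confidence score.
enc = tokenizer("they nerfed it again, it's over", return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
logits = session.run(None, {k: v for k, v in enc.items() if k in input_names})[0][0]
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(int(probs.argmax()), float(probs.max()))  # class index (see id2label in config.json), confidence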
```
## Files
```
config.json                        HF model config (id2label, label2id)
tokenizer.json + vocab.json + ...  HF tokenizer files (RoBERTa BPE)
onnx/model.onnx                    full-precision ONNX (~500MB)
onnx/model_quantized.onnx          int8 dynamic quantized ONNX (~120MB) -
                                   this is what production inference loads
ort_config.json                    ONNX Runtime quantization metadata
```
## Evaluation
Evaluated on the held-out test set (962 rows, never seen by the trainer) at 2026-05-04T03:53:59.880132+00:00.
**Macro-F1:** `0.7259` on the test split of the 9612-row corpus.
| Label | Test F1 |
|---|---:|
| negative | 0.672 |
| neutral | 0.836 |
| positive | 0.669 |
Metrics are computed from the quantized ONNX file in this repo, not from
the unquantized PyTorch checkpoint, so the numbers reflect what
production inference will see.
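Macro-F1 is the unweighted mean of the per-label F1 scores, so the headline number can be sanity-checked against the table. The sketch below uses only the rounded table values; the small gap to the reported `0.7259` is presumably rounding in the per-label figures.

```python
# Per-label test F1, copied from the table above (rounded to three decimals).
per_label_f1 = {"negative": 0.672, "neutral": 0.836, "positive": 0.669}

# Macro-F1 gives every label equal weight, regardless of how many rows it has.
macro_f1 = sum(per_label_f1.values()) / len(per_label_f1)
print(round(macro_f1, 4))  # 0.7257
```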
## Training
- **Base**: `cardiffnlp/twitter-roberta-base-sentiment-latest` (RoBERTa-base, 124M params)
- **Head**: warm-started from the base model's existing 3-class sentiment head (label names + id order match)
- **Loss**: Class-weighted cross-entropy with sqrt-inverse-frequency weights and label smoothing 0.1
- **Optimizer**: AdamW with layer-wise LR decay (0.9), lr=2e-5, weight_decay=0.01
- **Schedule**: Up to 4 epochs with `EarlyStoppingCallback(patience=2)` on val macro-F1
- **Split**: Stratified 80/10/10 train/val/test, seed=42
- **Quantization**: int8 dynamic (AVX2 CPU), via `optimum.onnxruntime`
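To make two of the settings above concrete, here is a rough sketch of sqrt-inverse-frequency class weights and a layer-wise learning-rate schedule. The training script itself is not shown in this card, so the details are assumptions: the weights are rescaled to average 1.0, the top encoder layer is assumed to receive the base learning rate, and the class counts are made up for illustration.

```python
import math


def sqrt_inverse_frequency_weights(class_counts):
    """Weight each class by 1/sqrt(frequency), rescaled to average 1.0.

    The rescaling is an assumption; the card does not state a normalization.
    """
    total = sum(class_counts.values())
    raw = {c: 1.0 / math.sqrt(n / total) for c, n in class_counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}


def layerwise_learning_rates(base_lr=2e-5, decay=0.9, num_layers=12):
    """Per-layer LRs, bottom to top: the top layer gets base_lr and each
    layer below it is scaled down by another factor of `decay`."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Hypothetical class counts, for illustration only (not the real distribution).
counts = {"negative": 400, "neutral": 900, "positive": 300}
print(sqrt_inverse_frequency_weights(counts))

lrs = layerwise_learning_rates()
print(lrs[0], lrs[-1])  # lowest encoder layer vs. top encoder layer
```

Taking the square root softens plain inverse-frequency weighting, so rare classes are boosted without dominating the loss. The 12 layers correspond to RoBERTa-base's encoder depth.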
## Limitations
- Labels reflect English-language Reddit conversation conventions.
Sarcasm, in-domain aggression, and parody are inherently ambiguous
and contribute most of the model's errors.
- Out-of-domain performance is unevaluated - run your own holdout
before relying on it for a different community.
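One way to run such a holdout check, assuming you have hand-labelled a sample from the target community and collected the model's predictions for it, is to compute macro-F1 over the paired labels. The `macro_f1` helper and the sample lists below are illustrative, not part of this repo.

```python
def macro_f1(gold, pred, labels=("negative", "neutral", "positive")):
    """Unweighted mean of per-label F1 over paired gold/predicted labels."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A label that never appears in gold or pred scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical hand-labelled holdout and the model's predictions for it.
gold = ["negative", "neutral", "positive", "neutral"]
pred = ["negative", "neutral", "neutral", "neutral"]
print(macro_f1(gold, pred))
```

Comparing the result with the in-domain figure reported under Evaluation gives a rough sense of how much accuracy is lost on the new community.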