---
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
datasets:
- acidtib/reddit-mood
language:
- en
library_name: transformers.js
license: cc-by-4.0
pipeline_tag: text-classification
tags:
- sentiment
- reddit
- mood
- onnx
- text-classification
---
# Reddit mood classifier

A 3-class sentiment classifier for Reddit comments, fine-tuned from
[`cardiffnlp/twitter-roberta-base-sentiment-latest`](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest).

Output classes: `negative` / `neutral` / `positive`.

Trained on the [`acidtib/reddit-mood`](https://huggingface.co/datasets/acidtib/reddit-mood)
dataset. Evaluate on your own corpus before relying on it outside the
training domain.

## Labels

| Label | Numeric score | Meaning |
|---|---:|---|
| `negative` | 25 | Anywhere on the negative spectrum: complaints, sarcasm, disappointment, balance gripes, bug-report annoyance, scorched-earth rage, personal attacks on devs, quit threats |
| `neutral` | 60 | Factual, banter, parody/hyperbole, in-domain references without strong real-world emotion |
| `positive` | 90 | Genuine positive, hype, love, excitement |

The numeric scores are arbitrary anchors that let you average labels
into a single 0-100 mood score per group of comments. Pick your own
mapping if these don't fit.
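
The averaging described above can be sketched as follows. This is a minimal illustration, assuming the predicted labels are already available as plain strings; `LABEL_SCORES` mirrors the table, while `mood_score` and the sample thread are hypothetical names and data, not part of this repo.

```python
from statistics import mean

# Anchor values copied from the Labels table; swap in your own mapping if needed.
LABEL_SCORES = {"negative": 25, "neutral": 60, "positive": 90}


def mood_score(labels, scores=LABEL_SCORES):
    """Average the numeric anchors of predicted labels into one 0-100 score.

    Returns None for an empty group so callers can tell "no data" apart
    from a real score.
    """
    if not labels:
        return None
    return mean(scores[label] for label in labels)


# Hypothetical predictions for one thread of four comments.
thread = ["negative", "negative", "neutral", "positive"]
print(mood_score(thread))  # (25 + 25 + 60 + 90) / 4 = 50
```

Returning `None` for an empty group is one possible design choice; a dashboard might prefer to skip such groups entirely.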

## Usage

### transformers.js (Node / browser)

```js
import { pipeline } from "@huggingface/transformers";

const classify = await pipeline(
  "text-classification",
  "acidtib/reddit-mood-classifier",
  { dtype: "q8" } // load onnx/model_quantized.onnx (~120MB, CPU-friendly)
);

const out = await classify("they nerfed it again, it's over");
// [{ label: "negative", score: 0.81 }]
```

### Python (transformers + onnxruntime)

```python
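import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("acidtib/reddit-mood-classifier")
session = ort.InferenceSession(
    "onnx/model_quantized.onnx",  # path inside a local copy of this repo
    providers=["CPUExecutionProvider"],
)

# Tokenize to NumPy arrays; pass only the inputs the ONNX graph declares.
enc = tokenizer("they nerfed it again, it's over", return_tensors="np")
names = {i.name for i in session.get_inputs()}
feeds = {k: v.astype(np.int64) for k, v in enc.items() if k in names}
logits = session.run(None, feeds)[0][0]
# Softmax for confidence, argmax for the class id (see id2label in config.json).
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred_id = int(probs.argmax())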
```

## Files

```
config.json                       HF model config (id2label, label2id)
tokenizer.json + vocab.json + ... HF tokenizer files (RoBERTa BPE)
onnx/model.onnx                   full-precision ONNX (~500MB)
onnx/model_quantized.onnx         int8 dynamic quantized ONNX (~120MB) -
                                  this is what production inference loads
ort_config.json                   ONNX Runtime quantization metadata
```

## Evaluation

Held-out test set (962 rows, never seen during training), evaluated at 2026-05-04T03:53:59.880132+00:00.

**Macro-F1:** `0.7259` on the test set, which is the 10% test split of the 9612-row corpus.

| Label | Test F1 |
|---|---:|
| negative | 0.672 |
| neutral | 0.836 |
| positive | 0.669 |


Metrics are recomputed from the quantized ONNX file shipped in this
repo, not the unquantized PyTorch checkpoint, so the numbers reflect
what production inference will see.
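
Macro-F1 is the unweighted mean of the per-label F1 scores, so the headline number can be sanity-checked against the table. A small sketch, with illustrative variable names; because the table values are rounded to three decimals, the result agrees with the reported `0.7259` only to within rounding.

```python
# Per-label test F1, copied from the table (rounded to three decimals).
per_label_f1 = {"negative": 0.672, "neutral": 0.836, "positive": 0.669}

# Macro-F1 is the plain average over labels, ignoring class sizes.
macro_f1 = sum(per_label_f1.values()) / len(per_label_f1)

reported = 0.7259
# Each table entry carries up to 0.0005 of rounding error, and so does their mean.
assert abs(macro_f1 - reported) < 0.0005
print(f"{macro_f1:.4f}")  # 0.7257
```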

## Training

- **Base**: `cardiffnlp/twitter-roberta-base-sentiment-latest` (RoBERTa-base, 124M params)
- **Head**: warm-started from the base model's existing 3-class sentiment head (label names + id order match)
- **Loss**: Class-weighted cross-entropy with sqrt-inverse-frequency weights and label smoothing 0.1
- **Optimizer**: AdamW with layer-wise LR decay (0.9), lr=2e-5, weight_decay=0.01
- **Schedule**: Up to 4 epochs with `EarlyStoppingCallback(patience=2)` on val macro-F1
- **Split**: Stratified 80/10/10 train/val/test, seed=42
- **Quantization**: int8 dynamic (AVX2 CPU), via `optimum.onnxruntime`
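
Two of the settings above, the sqrt-inverse-frequency class weights and the layer-wise LR decay, can be sketched in a few lines. This is an illustration under stated assumptions, not the training script: the function names and class counts are made up, the mean-1 rescaling of the weights is an assumption, and so is the convention that the top encoder layer receives the full base learning rate.

```python
import math


def sqrt_inverse_frequency_weights(counts):
    """Weight each class by sqrt(total / count), rescaled so the mean weight is 1.

    The rescaling is an assumption; the card does not state a normalization.
    """
    total = sum(counts.values())
    raw = {label: math.sqrt(total / n) for label, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {label: w * scale for label, w in raw.items()}


def layerwise_learning_rates(base_lr=2e-5, decay=0.9, num_layers=12):
    """Per-encoder-layer learning rates, index 0 being the bottom layer.

    Assumed convention: the top layer trains at base_lr and each layer
    below it is scaled by a further factor of `decay`.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Hypothetical class counts, chosen only to show the effect of the weighting.
weights = sqrt_inverse_frequency_weights(
    {"negative": 100, "neutral": 400, "positive": 100}
)
lrs = layerwise_learning_rates()
print(weights)
print(lrs[0], lrs[-1])
```

In PyTorch, weights like these would typically be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=..., label_smoothing=0.1)`, and the per-layer rates as separate optimizer parameter groups.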

## Limitations

- Labels reflect English-language Reddit conversation conventions.
  Sarcasm, in-domain aggression, and parody are inherently ambiguous
  and contribute most of the model's errors.
- Out-of-domain performance is unevaluated. Score a labelled holdout
  from your own community before relying on the model there.
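
A holdout check of the kind suggested above needs only gold labels and model predictions. The sketch below computes macro-F1 in plain Python; `macro_f1` and the four-comment sample are hypothetical, and in practice `y_pred` would come from running the classifier over your own labelled comments.

```python
LABELS = ("negative", "neutral", "positive")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-label F1; a label with no hits at all scores 0."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and predictions for four comments.
y_true = ["negative", "negative", "neutral", "positive"]
y_pred = ["negative", "neutral", "neutral", "positive"]
print(round(macro_f1(y_true, y_pred), 4))
```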