+---
+license: other
+library_name: transformers
+base_model: distilbert-base-uncased
+tags:
+  - text-classification
+  - affect
+  - emotion
+  - distilbert
+language:
+  - en
+---
+# how-affect-v1 — Bridge-Grounded Affect Detector
+A DistilBERT-based affect-valence classifier fine-tuned on **non-circular**
+author-narrated affect labels (mined from public-domain novel narration via
+BookNLP), rather than on LLM-generated personality scores.
+## Why this exists
+Production personality / emotion classifiers in companion AI are commonly trained
+**on LLM labels** (e.g. Claude/GPT scores). Evaluating against those same LLM
+labels is circular: a good score only shows that the model imitates the
+labeling LLM. We needed a HOW (affect) detector grounded in **independent
+human-written signal**
+about how characters speak. Solution: harvest dialogue-tag adverbs + WordNet
+emotion supersenses from BookNLP-processed novels (~1000 books, ~30k labeled
+quotes), bind them to the speaker's actual utterances, and train a probe.
+## Metrics
+Held-out test set (5,971 quotes, balanced neg/pos author affect):
+| Model | Held-out AUC |
+|---|---|
+| Existing circular "emotion" dim (177-dim model trained on Claude scores) | 0.557 |
+| Frozen-embedding probe (sentence-transformer + linear head) | 0.637 |
+| **This model — DistilBERT end-to-end on bridge labels** | **0.678** |
+Honest ceiling: an AUC of ~0.68 is a real but modest signal. Narrated affect
+("said bitterly") often lives in prosody rather than lexical content, so
+text-only affect detection has a structural ceiling. A voice/prosody channel
+is the likely path to higher AUC.
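For readers less used to AUC: it can be read as the probability that a randomly chosen negative-affect quote receives a higher score than a randomly chosen positive-affect one, with ties counted as half. The sketch below computes that pairwise definition in plain Python on made-up scores. The function name `pairwise_auc` and the toy data are illustrative only and are not the evaluation script behind the table above; it assumes label 1 marks negative affect, matching the sigmoid output described under Usage.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (label-1, label-0) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = negative author affect, 0 = positive.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 7 of 9 pairs are ordered correctly -> 0.778
```

The quadratic-time loop is for illustration only; on a test set of several thousand quotes, a rank-based routine such as scikit-learn's `roc_auc_score` gives the same value much faster.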
+## Files
+- `model.pt` — full state-dict: DistilBERT encoder + mean-pool + Linear(hidden→1) head.
+- `metrics.json` — final held-out AUC + baseline comparison.
+## Usage
+The head is custom (DistilBERT + mean-pool + 1-logit), so you can't use
+`AutoModelForSequenceClassification.from_pretrained` directly. Load like this:
+```python
+import torch
+from transformers import AutoTokenizer, AutoModel
+
+
+class AffectNet(torch.nn.Module):
+    """DistilBERT encoder + masked mean-pool + single-logit linear head."""
+
+    def __init__(self):
+        super().__init__()
+        self.enc = AutoModel.from_pretrained("distilbert-base-uncased")
+        self.head = torch.nn.Linear(self.enc.config.hidden_size, 1)
+
+    def forward(self, ids, mask):
+        h = self.enc(input_ids=ids, attention_mask=mask).last_hidden_state
+        # Mean-pool over real tokens only; padding positions are masked out.
+        m = mask.unsqueeze(-1).float()
+        pooled = (h * m).sum(1) / m.sum(1).clamp(min=1e-6)
+        return self.head(pooled).squeeze(1)  # one logit per input
+
+
+tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
+model = AffectNet()
+# model.pt holds the full state dict (encoder + head).
+model.load_state_dict(torch.load("model.pt", map_location="cpu"))
+model.eval()
+
+text = "I can't bear this any longer."
+# max_length=48 matches the model's maximum input length.
+enc = tok(text, padding="max_length", truncation=True, max_length=48, return_tensors="pt")
+with torch.no_grad():
+    valence = torch.sigmoid(model(enc["input_ids"], enc["attention_mask"]))[0].item()
+print(valence)  # ~1.0 = negative/distressed affect, ~0.0 = positive
+```
+## Training data (non-circular)
+Bridge corpus from ~1000 BookNLP-processed novels (`corpus/booknlp_output/`):
+for each character quote, the narration window (±7 tokens around the quote) was
+scanned for emotion supersense spans (`verb.emotion`, `noun.feeling`) and
+manner adverbs anchored to a speech verb ("said *bitterly*"). Quotes were mapped
+to net-negative vs. net-positive author affect, yielding 17,749 negative and
+16,375 positive labels (roughly balanced); 29,852 quotes were used in total
+(23,881 train / 5,971 test).
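The labeling idea can be sketched roughly as follows. This is a simplified illustration, not the actual mining code: it works on a plain token list rather than BookNLP output, it covers only the speech-verb adverb signal (supersense spans are omitted), and the word lists, the function names `narration_windows` and `label_quote`, and the "count and take the sign" rule are all hypothetical. Label 1 is assumed to mean net-negative affect, in line with the Usage section.

```python
WINDOW = 7  # narration tokens inspected on each side of the quote

# Hypothetical mini-lexicons, for illustration only.
SPEECH_VERBS = {"said", "replied", "answered", "cried", "whispered"}
NEG_ADVERBS = {"bitterly", "angrily", "sadly", "coldly"}
POS_ADVERBS = {"cheerfully", "warmly", "gladly", "kindly"}


def narration_windows(tokens, quote_start, quote_end, window=WINDOW):
    """Return the narration tokens just before and just after the quote.

    quote_start is inclusive, quote_end is exclusive.
    """
    before = tokens[max(0, quote_start - window):quote_start]
    after = tokens[quote_end:quote_end + window]
    return before, after


def label_quote(tokens, quote_start, quote_end):
    """Return 1 (net-negative), 0 (net-positive), or None (no net signal)."""
    net = 0
    for side in narration_windows(tokens, quote_start, quote_end):
        words = [t.lower() for t in side]
        # An adverb counts only when it directly follows a speech verb.
        for verb, adverb in zip(words, words[1:]):
            if verb in SPEECH_VERBS:
                if adverb in NEG_ADVERBS:
                    net -= 1
                elif adverb in POS_ADVERBS:
                    net += 1
    if net < 0:
        return 1
    if net > 0:
        return 0
    return None


tokens = ['"', "I", "can't", "bear", "this", '"', "she", "said", "bitterly", "."]
print(label_quote(tokens, quote_start=0, quote_end=6))  # 1
```

Quotes with no net signal return `None` in this sketch and would simply be left out of the labeled set.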
+## Architecture
+- Base encoder: `distilbert-base-uncased` (~66M params).
+- Head: dropout-free `Linear(hidden_size, 1)` over mean-pooled token embeddings.
+- Loss: `BCEWithLogitsLoss` on binary affect-valence.
+- Trained 1-2 epochs on CPU (best epoch saved by held-out AUC; early-stopped when AUC stopped improving).
+- Max input length: 48 tokens (quotes are short).
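To make the head and loss concrete, here is a minimal plain-Python sketch of the two pieces of arithmetic listed above: masked mean pooling and binary cross-entropy on a raw logit. It uses toy numbers and hypothetical helper names (`masked_mean_pool`, `bce_with_logits`). The model itself does this with PyTorch tensors and `BCEWithLogitsLoss`, which averages the per-example loss over the batch by default.

```python
import math


def masked_mean_pool(token_vectors, mask):
    """Average the token vectors whose mask entry is 1; padding (mask 0) is ignored."""
    dim = len(token_vectors[0])
    kept = max(sum(mask), 1e-6)  # guard against an all-padding row
    return [
        sum(vec[d] * m for vec, m in zip(token_vectors, mask)) / kept
        for d in range(dim)
    ]


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw logit.

    Equals -[target*log(sigmoid(logit)) + (1-target)*log(1-sigmoid(logit))].
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


# Toy example: two real tokens and one padding token, hidden size 2.
vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
pooled = masked_mean_pool(vectors, [1, 1, 0])
print(pooled)  # [2.0, 3.0]

# A logit of 0 means sigmoid = 0.5, so the loss is log(2) for either target.
print(round(bce_with_logits(0.0, 1.0), 4))  # 0.6931
```

Working on the logit rather than on the sigmoid output avoids taking the log of a value that has rounded to 0 or 1 for large-magnitude logits.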
+## License
+Trained on derivatives of public-domain (Project Gutenberg) novels processed
+via BookNLP. The model weights are released for research use; please consult
+your jurisdiction's rules around derivative works for production deployment.