jimnoneill committed · Commit d6e8792 · verified · 1 Parent(s): c033ee3

Upload README.md with huggingface_hub

  1. README.md +92 -20
README.md CHANGED
@@ -6,43 +6,115 @@ tags:
6
  - political-tweets
7
  - bayesian-classifier
8
  - digital-phenotyping
 
9
  pipeline_tag: text-classification
10
  ---
11
 
12
- # X-Box Compulsion Classifier
13
 
14
- Bayesian classifier for detecting compulsive social media usage patterns
15
- in political Twitter/X accounts.
 
16
 
17
  ## Architecture
18
 
19
- - **12 classification heads**: 6 CardiffNLP (sentiment, emotion, offensive, irony, hate, toxicity) + 6 custom SetFit (ragebait, tribal signal, performative outrage, epistemic manipulation, engagement bait, agency language)
20
- - **Compulsion signatures**: Burstiness (Goh-Barabasi), time-of-day entropy, Hawkes self-excitation, night intensity, weekend ratio
21
- - **Bayesian posterior**: Calibrated P(compulsive | features) with 95% credible intervals
22
- - **Disorder baseline**: DSM-5-adjacent criteria mapping with clinical thresholds
 
 
 
23
 
24
  ## Validation
25
 
26
- - LOO cross-validation: F1=1.000, AUC=1.000 on 16-account ground truth cohort
27
- - Ground truth: 8 known-compulsive accounts (Trump Android, Mike Lee, Cruz, Hawley, Blackburn, Rubio, Murphy) + 8 known-strategic accounts (Feinstein, Risch, Tester, etc.)
28
 
29
- ## Feature Importance
30
 
31
- | Feature | Mean \|LLR\| |
32
- |---------|---------|
33
- | Night intensity (00-05 UTC) | 28.1 |
34
- | Time-of-day entropy | 8.0 |
35
- | Burstiness B parameter | 4.8 |
36
- | Hawkes self-excitation n* | 4.6 |
37
- | Weekend ratio | 0.05 |
38
 
39
  ## Theoretical Framework
40
 
41
- Inspired by Recovery Viability Theory (Kepner, White, O'Neill):
42
- - Logit-bounded state space
43
  - Cusp catastrophe dynamics for sudden behavioral transitions
44
  - Critical slowing down as early warning signals
45
46
  ## Citation
47
 
48
- Research by O'Neill Lab. Not for clinical diagnosis.
 
6
  - political-tweets
7
  - bayesian-classifier
8
  - digital-phenotyping
9
+ - toxicity-index
10
  pipeline_tag: text-classification
11
  ---
12
 
13
+ # X-Box Compulsion & Toxicity Index Classifier
14
 
15
+ Bayesian temporal phenotyping + 12-head text classification pipeline for detecting
16
+ compulsive social media usage patterns and computing the Toxicity Index (TI) for
17
+ political Twitter/X accounts.
18
 
19
  ## Architecture
20
 
21
+ **Temporal Model**: Calibrated logistic regression on 5 compulsion signatures
22
+ (burstiness, time-of-day entropy, Hawkes self-excitation, night intensity, weekend ratio).
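
A minimal sketch of how a logistic model over the five signatures could turn features into a probability. The coefficients are copied from the "Compulsion Signature Features" table in this README; the zero intercept, the assumption that inputs are standardized z-scores, and all function and key names are hypothetical, since the README does not state them.

```python
import math

# Coefficients copied from the "Compulsion Signature Features" table.
COEFFICIENTS = {
    "tod_entropy": 1.258,
    "hawkes_n_star": 0.922,
    "burstiness": 0.837,
    "night_intensity": 0.584,
    "weekend_ratio": 0.204,
}

# ASSUMPTION: the README reports no intercept; 0.0 is a placeholder.
INTERCEPT = 0.0


def compulsion_probability(z_features: dict[str, float]) -> float:
    """Logistic score for one account.

    ASSUMPTION: each input is a standardized feature (z-score).
    """
    logit = INTERCEPT + sum(COEFFICIENTS[k] * z_features[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-logit))


# An account at the cohort mean on every feature, and one that is
# two standard deviations above the mean on time-of-day entropy only.
average_account = {name: 0.0 for name in COEFFICIENTS}
high_entropy_account = dict(average_account, tod_entropy=2.0)

p_avg = compulsion_probability(average_account)
p_high = compulsion_probability(high_entropy_account)
```

With the placeholder intercept, the average account scores exactly 0.5, and raising any single feature moves the score upward because all five coefficients are positive.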
23
+
24
+ **Text Classification**: 12 heads, 8 of which feed the per-tweet Toxicity Index.
25
+
26
+ **Toxicity Index**: TI = mean of 8 binary negative-behavior flags per tweet, bounded [0,1].
27
+ TI=0 means a clean informational tweet; TI=1 means every negative flag is active.
28
 
29
  ## Validation
30
 
31
+ **Compulsion Model** (n=32, independent ground truth):
32
+ - Spearman r = 0.912 (permutation p=0.001, bootstrap 95% CI [0.845, 0.965])
33
+ - AUC = 0.933 (permutation p=0.003, bootstrap 95% CI [0.928, 1.000])
34
+ - Repeated 5-fold (x20): AUC = 0.953 +/- 0.076
35
+ - Brier score: 0.101
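
For readers unfamiliar with the metrics listed above, the following standard-library sketch shows one common way to compute AUC, the Brier score, and a percentile bootstrap interval for AUC. The toy labels and probabilities are invented for illustration and are not the cohort data; the function names and resampling settings are assumptions, not the authors' validation code.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling accounts with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has only one class
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical): 4 negatives followed by 4 positives.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
probs = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
ci_low, ci_high = bootstrap_auc_ci(labels, probs)
```

On this toy set, 15 of the 16 positive-negative pairs are ranked correctly, so the AUC is 15/16, and the Brier score is 0.125.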
36
+
37
+ **Text Classification Label Reliability** (test-retest, n=75):
38
+ - Ragebait: Pearson r=0.889, Cohen kappa=0.479
39
+ - Tribal signal: Pearson r=0.862, Cohen kappa=0.730
40
+ - Performative outrage: Pearson r=0.777, Cohen kappa=0.525
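
As a reminder of what Cohen's kappa measures, here is a small standard-library sketch for two labeling passes over the same tweets. It assumes hard (binary) labels from each pass; the README does not say how the labels were thresholded, and the example runs below are invented.

```python
def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length.

    Raises ZeroDivisionError when both passes use one identical single label,
    because expected agreement is then 1.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)


# Hypothetical test and retest labels for ten tweets (1 = flagged).
run_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
run_2 = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

kappa = cohen_kappa(run_1, run_2)
```

Here the two passes agree on 8 of 10 tweets, but 58% agreement is expected by chance given the label frequencies, so kappa is (0.80 - 0.58) / (1 - 0.58), about 0.52. This illustrates how raw agreement can look high while kappa stays moderate when one class dominates.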
41
+
42
+ ## Per-Class Performance (12 Classification Heads)
43
+
44
+ ### Off-the-Shelf (RoBERTa-based, ~125M params each: 5 CardiffNLP Twitter-RoBERTa heads + 1 s-nlp toxicity head)
45
+
46
+ | Head | Model ID | Classes | Training Data |
47
+ |------|----------|---------|--------------|
48
+ | Sentiment | cardiffnlp/twitter-roberta-base-sentiment-latest | negative, neutral, positive | TweetEval benchmark |
49
+ | Emotion | cardiffnlp/twitter-roberta-base-emotion | anger, joy, optimism, sadness | TweetEval |
50
+ | Offensive | cardiffnlp/twitter-roberta-base-offensive | not-offensive, offensive | TweetEval |
51
+ | Irony | cardiffnlp/twitter-roberta-base-irony | non-irony, irony | TweetEval |
52
+ | Hate | cardiffnlp/twitter-roberta-base-hate-multiclass-latest | not-hate + 6 subtypes | 13 hate-speech datasets |
53
+ | Toxicity | s-nlp/roberta_toxicity_classifier | neutral, toxic | 3 Jigsaw competitions (AUC 0.98) |
54
+
55
+ CardiffNLP models are pre-trained on 124M tweets. See the TweetEval benchmark
56
+ (Barbieri et al., 2020) for per-class F1/P/R on the standard evaluation sets.
57
+
58
+ ### Custom-Trained (SetFit, all-mpnet-base-v2 backbone, ~109M params each)
59
+
60
+ Trained on 4,121 LLM-labeled tweets from 14 accounts (7 Democrat, 7 Republican).
61
+ Evaluated on a 20% held-out test set.
62
 
63
+ | Head | F1 | Precision | Recall | Training Examples | Description |
64
+ |------|----|-----------|--------|-------------------|-------------|
65
+ | Ragebait | 0.800 | 0.82 | 0.78 | 300 (150+150) | Content designed to provoke outrage |
66
+ | Tribal signal | 0.825 | 0.84 | 0.81 | 400 (200+200) | Us-vs-them, in-group/out-group framing |
67
+ | Performative outrage | 0.850 | 0.87 | 0.83 | 400 (200+200) | Theatrical outrage vs genuine concern |
68
+ | Epistemic manipulation | 0.800 | 0.81 | 0.79 | 300 (150+150) | Cherry-picking, straw-manning, false equiv. |
69
+ | Engagement bait | 0.800 | 0.83 | 0.77 | 400 (200+200) | Polls, CTAs, rhetorical questions |
70
+ | Agency language | 0.838 | 0.85 | 0.83 | 400 (200+200) | Active/agentic (1) vs passive/victimhood (0) |
71
 
72
+ ### Toxicity Index Components
73
+
74
+ The per-tweet Toxicity Index is computed as:
75
+
76
+ ```
77
+ TI = mean(flag_offensive, flag_toxic, flag_negative_sentiment,
78
+ flag_anger, flag_irony, flag_ragebait, flag_tribal,
79
+ flag_performative)
80
+ ```
81
+
82
+ Where each flag is binary (0 or 1) based on the corresponding classifier threshold.
83
+ TI_senator = mean(TI) across all tweets in the archive.
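
The two-level averaging described above can be sketched as follows. The flag names mirror the formula in this section; the dictionary layout and function names are assumptions for illustration, and each flag is taken as already thresholded to 0 or 1.

```python
from statistics import mean

# The eight flags named in the TI formula.
TI_FLAGS = (
    "offensive", "toxic", "negative_sentiment", "anger",
    "irony", "ragebait", "tribal", "performative",
)


def tweet_ti(flags: dict[str, int]) -> float:
    """Per-tweet Toxicity Index: the fraction of the eight flags that are active."""
    return sum(flags[name] for name in TI_FLAGS) / len(TI_FLAGS)


def account_ti(tweets: list[dict[str, int]]) -> float:
    """Account-level TI: mean of per-tweet TI over the whole archive."""
    return mean(tweet_ti(t) for t in tweets)


# Two hypothetical tweets: one with no flags, one with anger and ragebait set.
clean = {name: 0 for name in TI_FLAGS}
angry = dict(clean, anger=1, ragebait=1)

ti_clean = tweet_ti(clean)
ti_angry = tweet_ti(angry)
ti_account = account_ti([clean, angry])
```

The clean tweet scores 0.0, the second tweet scores 2/8 = 0.25, and the two-tweet archive averages to 0.125.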
84
+
85
+ ## Compulsion Signature Features
86
+
87
+ | Feature | Coefficient | Description |
88
+ |---------|------------|-------------|
89
+ | Time-of-day entropy | +1.258 | Shannon entropy of hourly posting distribution (bits) |
90
+ | Hawkes n* | +0.922 | Self-excitation branching ratio |
91
+ | Burstiness B | +0.837 | Goh-Barabasi inter-event time parameter |
92
+ | Night intensity | +0.584 | Share of posts 00:00-05:59 UTC |
93
+ | Weekend ratio | +0.204 | Weekend/weekday posting rate ratio |
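
Three of the features in the table can be computed directly from posting timestamps. The sketch below uses the standard definition of the Goh-Barabasi burstiness parameter, B = (sigma - mu) / (sigma + mu) over inter-event times, plus Shannon entropy of the hourly distribution and the 00:00-05:59 UTC share. Function names are hypothetical, timestamps are assumed to be timezone-aware UTC datetimes, and the Hawkes branching ratio and weekend ratio are left out because the README does not specify how they are estimated.

```python
import math
from collections import Counter
from datetime import datetime, timezone
from statistics import mean, pstdev


def burstiness(timestamps):
    """Goh-Barabasi B over inter-event times: -1 is perfectly regular, near +1 is bursty.

    Undefined (division by zero) if all posts share one timestamp.
    """
    ts = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    mu, sigma = mean(gaps), pstdev(gaps)
    return (sigma - mu) / (sigma + mu)


def hourly_entropy_bits(timestamps):
    """Shannon entropy (bits) of the distribution of posts over hours of the day."""
    counts = Counter(t.hour for t in timestamps)
    n = len(timestamps)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def night_share(timestamps):
    """Share of posts made between 00:00 and 05:59 (hours 0 through 5)."""
    return sum(t.hour < 6 for t in timestamps) / len(timestamps)


# Hypothetical account posting once per hour at 00:00, 01:00, 02:00, 03:00 UTC.
posts = [datetime(2024, 1, 1, h, tzinfo=timezone.utc) for h in range(4)]

b = burstiness(posts)
h_bits = hourly_entropy_bits(posts)
night = night_share(posts)
```

For this perfectly regular toy schedule, B is -1.0, the hourly entropy is 2 bits (four equally used hours), and the night share is 1.0.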
94
 
95
  ## Theoretical Framework
96
 
97
+ Inspired by Recovery Viability Theory (Kepner, White, & O'Neill, 2026):
98
+ - Logit-bounded state space for natural [0,1] constraints
99
  - Cusp catastrophe dynamics for sudden behavioral transitions
100
  - Critical slowing down as early warning signals
101
 
102
+ ## Files
103
+
104
+ - `bayesian_model_results.json` - Fitted model parameters
105
+ - `calibrated_model_v2.json` - V2 validation with independent ground truth
106
+ - `cohort_v2_results.csv` - 32-account ground truth cohort
107
+ - `cohort_signatures.csv` - Ground truth compulsion signatures
108
+ - `setfit_*/` - Trained SetFit classifier checkpoints (6 models)
109
+ - `xbox/` - Pipeline source code
110
+
111
  ## Citation
112
 
113
+ O'Neill, J., Brookes, J., et al. (2026). Detecting Compulsive Social Media Usage
114
+ Patterns in US Congressional Accounts: A Bayesian Temporal Phenotyping Approach.
115
+ Manuscript in preparation for the International Journal of Drug Policy.
116
+
117
+ ## Ethics
118
+
119
+ This methodology cannot and should not be used for clinical diagnosis.
120
+ The Toxicity Index and compulsion probability are research instruments, not clinical assessments.