prem79 committed on
Commit
dfb2f37
·
verified ·
1 Parent(s): eb828ec

Update README.md

Files changed (1)
  1. README.md +296 -30
README.md CHANGED
@@ -1,30 +1,296 @@
1
- # SENTRIX // Neural Sentiment Engine
2
-
3
- **SENTRIX** is a high-performance, mobile-first sentiment analysis dashboard. It utilizes a distributed hybrid architecture, combining a globally accessible Progressive Web App (PWA) with a dynamically hardware-accelerated local inference node.
4
-
5
- ![Status](https://img.shields.io/badge/Status-Active-brightgreen)
6
- ![Frontend](https://img.shields.io/badge/Frontend-GitHub_Pages-blue)
7
- ![Backend](https://img.shields.io/badge/Backend-Flask-orange)
8
- ![Model](https://img.shields.io/badge/Model-Hugging_Face-yellow)
9
-
10
- ## 🏗️ System Architecture
11
-
12
- SENTRIX operates across three decoupled layers. The model weights are hosted in the cloud, the UI is served globally, and the mathematical inference runs locally on the host machine.
13
-
14
- ```text
15
- [ Layer 1: Storage ] [ Layer 2: Frontend ] [ Layer 3: Backend ]
16
- Hugging Face Hub GitHub Pages Local Host (PC/Mac)
17
- (Model Weights) (Web Interface) (Flask Inference Node)
18
- │ │ │
19
- 1. Downloads weights │ 3. Sends POST /analyze │
20
- │ on first startup │ via local network IP │
21
- ▼ ▼ │
22
- ┌────────────┐ ┌────────────┐ ┌───────▼──────┐
23
- │ prem79/ │ │ sentrix_ │ │ app.py │
24
- sentrix_ ├──────────────►│ ML_IA ├──────────────►│ (RoBERTa V2) │
25
- roberta_V2 │ │ (UI) │ │ │
26
- └────────────┘ └────────────┘ └──────────────┘
27
-
28
- 2. Caches model
29
- │ in memory
30
-
1
+ ---
2
+ language:
3
+ - en
4
+ - fr
5
+ - es
6
+ - de
7
+ - pt
8
+ license: mit
9
+ tags:
10
+ - sentiment-analysis
11
+ - text-classification
12
+ - roberta
13
+ - twitter
14
+ - nlp
15
+ - fine-tuned
16
+ datasets:
17
+ - tweet_eval
18
+ metrics:
19
+ - accuracy
20
+ - f1
21
+ model-index:
22
+ - name: sentrix_roberta_V2
23
+ results:
24
+ - task:
25
+ type: text-classification
26
+ name: Sentiment Analysis
27
+ metrics:
28
+ - type: accuracy
29
+ value: 0.8821
30
+ - type: f1
31
+ value: 0.8821
32
+ ---
33
+
34
+ # sentrix_roberta_V2
35
+
36
+ A fine-tuned RoBERTa model for binary sentiment classification on social media text. Trained on a balanced Twitter sentiment dataset, it reaches 88.2% accuracy on a held-out test set of 40,000 samples.
37
+
38
+ ---
39
+
40
+ ## Model Summary
41
+
42
+ | Property | Value |
43
+ |---|---|
44
+ | Base model | `cardiffnlp/twitter-roberta-base-sentiment-latest` |
45
+ | Architecture | RoBERTa-base |
46
+ | Task | Binary Sentiment Classification |
47
+ | Labels | `NEGATIVE` (0), `POSITIVE` (1) |
48
+ | Test Accuracy | **88.21%** |
49
+ | Test F1 | **88.21%** |
50
+ | Training samples | ~80,000 |
51
+ | Test samples | 40,000 (balanced) |
52
+ | Max sequence length | 128 tokens |
53
+ | Framework | PyTorch + HuggingFace Transformers |
54
+
55
+ ---
56
+
57
+ ## Intended Use
58
+
59
+ This model is designed to classify the sentiment of short-form social media text — primarily tweets and product reviews — as either positive or negative.
60
+
61
+ **Suitable for:**
62
+ - Customer review sentiment classification
63
+ - Social media monitoring
64
+ - Product feedback analysis
65
+ - Multilingual sentiment detection (EN, FR, ES, DE, PT), with lower accuracy expected outside English (see Limitations)
66
+
67
+ **Not suitable for:**
68
+ - Long-form documents (truncated at 128 tokens)
69
+ - Fine-grained emotion classification (joy, anger, fear, etc.)
70
+ - Neutral/mixed sentiment detection (binary output only)
71
+
72
+ ---
73
+
74
+ ## Training Details
75
+
76
+ ### Base Model
77
+
78
+ Fine-tuned from `cardiffnlp/twitter-roberta-base-sentiment-latest`, which was itself pre-trained on roughly 124M tweets (the earlier `twitter-roberta-base-sentiment` checkpoint used about 58M). This domain-specific pretraining gives the model strong priors for informal language, slang, abbreviations, and emoji context.
79
+
80
+ ### Dataset
81
+
82
+ A balanced Twitter sentiment dataset sourced from Kaggle, split as follows:
83
+
84
+ | Split | Samples | NEGATIVE | POSITIVE |
85
+ |---|---|---|---|
86
+ | Train | ~80,000 | 50% | 50% |
87
+ | Validation | 20,000 | 50% | 50% |
88
+ | Test | 40,000 | 50% (20,000) | 50% (20,000) |
89
+
90
+ ### Preprocessing
91
+
92
+ Standard RoBERTa tweet preprocessing was applied:
93
+
94
+ - URLs replaced with the token `http`
95
+ - User mentions replaced with the token `@user`
96
+ - Text truncated to 128 tokens maximum
97
+
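The first two normalization steps can be sketched with the standard library alone. The function name `normalize_tweet` is an assumption made for illustration; the regular expressions mirror the ones in the Manual Inference example of this card, and truncation to 128 tokens is left to the tokenizer.

```python
import re


def normalize_tweet(text: str) -> str:
    """Replace URLs with 'http' and user mentions with '@user'."""
    text = re.sub(r"http\S+", "http", text)  # any run starting with "http"
    text = re.sub(r"@\w+", "@user", text)    # any @handle
    return text


print(normalize_tweet("Thanks @acme_support, fixed! https://example.com/t/42"))
# Thanks @user, fixed! http
```

Applying the same function at inference time keeps inputs consistent with what the model saw during fine-tuning.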
98
+ ### Hyperparameters
99
+
100
+ | Parameter | Value |
101
+ |---|---|
102
+ | Optimizer | AdamW |
103
+ | Learning rate | Trainer default (5e-5 with linear decay) |
103
+ | Batch size | Trainer default (8 per device) |
105
+ | Max epochs | 10 |
106
+ | Early stopping | Best checkpoint saved on validation loss |
107
+ | Evaluation strategy | Per 500 steps |
108
+ | Metric for best model | Accuracy + F1 |
109
+ | Training platform | Kaggle (GPU) |
110
+
111
+ ### Training Progress
112
+
113
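+ The model was evaluated every 500 steps; the table lists selected checkpoints. Training loss fell steadily, while validation loss reached its minimum of 0.7944 at step 6000 and fluctuated between 0.79 and 0.83 afterward as accuracy plateaued near 87.7%: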
114
+
115
+ | Step | Train Loss | Val Loss | Accuracy | F1 |
116
+ |---|---|---|---|---|
117
+ | 500 | 0.8806 | 0.8685 | 85.00% | 85.00% |
118
+ | 1000 | 0.8451 | 0.8348 | 86.25% | 86.25% |
119
+ | 2000 | 0.8291 | 0.8075 | 86.84% | 86.83% |
120
+ | 3000 | 0.7788 | 0.7987 | 87.32% | 87.31% |
121
+ | 4000 | 0.7754 | 0.8005 | 87.53% | 87.53% |
122
+ | 5000 | 0.7676 | 0.8098 | 87.59% | 87.58% |
123
+ | 6000 | 0.7356 | 0.7944 | 87.72% | 87.72% |
124
+ | 7000 | 0.7310 | 0.7979 | 87.68% | 87.68% |
125
+ | 8000 | 0.6885 | 0.8235 | 87.74% | 87.74% |
126
+ | 8500 | 0.6905 | 0.8104 | 87.72% | 87.72% |
127
+
128
+ The best checkpoint was saved and used for final evaluation.
129
+
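Which checkpoint counts as "best" depends on the selection criterion. The sketch below copies the rows of the table above and compares two plausible criteria, lowest validation loss and highest accuracy. It is only an illustration: the table lists selected steps, so the checkpoint actually kept may sit at a step that is not shown, and the variable names are assumptions.

```python
# (step, train_loss, val_loss, accuracy) rows copied from the table above
rows = [
    (500, 0.8806, 0.8685, 0.8500),
    (1000, 0.8451, 0.8348, 0.8625),
    (2000, 0.8291, 0.8075, 0.8684),
    (3000, 0.7788, 0.7987, 0.8732),
    (4000, 0.7754, 0.8005, 0.8753),
    (5000, 0.7676, 0.8098, 0.8759),
    (6000, 0.7356, 0.7944, 0.8772),
    (7000, 0.7310, 0.7979, 0.8768),
    (8000, 0.6885, 0.8235, 0.8774),
    (8500, 0.6905, 0.8104, 0.8772),
]

best_by_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_by_acc = max(rows, key=lambda r: r[3])   # highest validation accuracy

print(best_by_loss[0], best_by_acc[0])  # 6000 8000
```

The two criteria pick different steps here, so the selection metric should be reported alongside the checkpoint.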
130
+ ---
131
+
132
+ ## Evaluation Results
133
+
134
+ Evaluated on the held-out test set of 40,000 samples (20,000 per class).
135
+
136
+ ### Test Set Metrics
137
+
138
+ | Metric | Value |
139
+ |---|---|
140
+ | Accuracy | **0.8821** |
141
+ | F1 (macro) | **0.8821** |
142
+ | Eval loss | 0.8102 |
143
+ | Samples/second | 287.63 |
144
+
145
+ ### Classification Report
146
+
147
+ ```
148
+ precision recall f1-score support
149
+
150
+ Negative 0.88 0.88 0.88 20,000
151
+ Positive 0.88 0.88 0.88 20,000
152
+
153
+ accuracy 0.88 40,000
154
+ macro avg 0.88 0.88 0.88 40,000
155
+ weighted avg 0.88 0.88 0.88 40,000
156
+ ```
157
+
158
+ Precision and recall are identical for both classes, which suggests the balanced training set did not introduce a bias toward either label.
159
+
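Accuracy and macro F1 coincide in the report because the test set is balanced and the errors are symmetric. A minimal sketch of that relationship follows, using toy confusion-matrix counts that are hypothetical and are not the real test-set matrix; the helper name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tn, fp, fn, tp):
    """Return (accuracy, macro-averaged F1) for a 2x2 confusion matrix."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total

    def f1(true_pos, false_pos, false_neg):
        precision = true_pos / (true_pos + false_pos)
        recall = true_pos / (true_pos + false_neg)
        return 2 * precision * recall / (precision + recall)

    f1_pos = f1(tp, fp, fn)  # POSITIVE treated as the target class
    f1_neg = f1(tn, fn, fp)  # NEGATIVE treated as the target class
    return accuracy, (f1_pos + f1_neg) / 2


# Toy counts (hypothetical): 100 samples per class, 12 errors on each side
acc, macro_f1 = binary_metrics(tn=88, fp=12, fn=12, tp=88)
print(round(acc, 4), round(macro_f1, 4))  # 0.88 0.88
```

With asymmetric errors, for example `binary_metrics(90, 10, 30, 70)`, the two numbers diverge, which is why matching values point to balanced behavior across classes.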
160
+ ---
161
+
162
+ ## Usage
163
+
164
+ ### Direct Inference with Pipeline
165
+
166
+ ```python
167
+ from transformers import pipeline
168
+
169
+ classifier = pipeline(
170
+     "text-classification",
171
+     model="prem79/sentrix_roberta_V2",
172
+ )
173
+
174
+ result = classifier("The camera quality on this phone is absolutely stunning")
175
+ print(result)
176
+ # [{'label': 'POSITIVE', 'score': 0.9505}]
177
+ ```
178
+
179
+ ### Manual Inference
180
+
181
+ ```python
182
+ import torch
183
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
184
+ import torch.nn.functional as F
185
+
186
+ model_id = "prem79/sentrix_roberta_V2"
187
+
188
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
189
+ model = AutoModelForSequenceClassification.from_pretrained(model_id)
190
+ model.eval()
191
+
192
+ def predict(text):
193
+     # Preprocess (standard RoBERTa tweet normalization)
194
+     import re
195
+     text = re.sub(r'http\S+', 'http', text)
196
+     text = re.sub(r'@\w+', '@user', text)
197
+
198
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
199
+     with torch.no_grad():
200
+         logits = model(**inputs).logits
201
+     probs = F.softmax(logits, dim=-1)[0]
202
+
203
+     labels = ["NEGATIVE", "POSITIVE"]
204
+     sentiment = labels[probs.argmax().item()]
205
+     return {
206
+         "sentiment": sentiment,
207
+         "negative": round(probs[0].item() * 100, 2),
208
+         "positive": round(probs[1].item() * 100, 2),
209
+     }
210
+
211
+ # Examples
212
+ print(predict("The new phone camera is absolutely stunning at night"))
213
+ # {'sentiment': 'POSITIVE', 'negative': 4.95, 'positive': 95.05}
214
+
215
+ print(predict("Battery is terrible, drains in 2 hours, not worth the price"))
216
+ # {'sentiment': 'NEGATIVE', 'negative': 94.72, 'positive': 5.28}
217
+
218
+ print(predict("Ce produit est incroyable! Très satisfait de la qualité."))
219
+ # {'sentiment': 'POSITIVE', 'negative': 7.18, 'positive': 92.82}
220
+ ```
221
+
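The percentages returned by `predict` come from a softmax over the two output logits. The dependency-free sketch below shows that step in isolation; the logit values are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-1.5, 1.5]  # hypothetical [NEGATIVE, POSITIVE] logits
negative, positive = (round(p * 100, 2) for p in softmax(logits))
print({"negative": negative, "positive": positive})
```

Because the two probabilities sum to one, a wide gap between the logits produces a confident prediction, while logits that are close together give scores near 50%.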
222
+ ### Batch Inference
223
+
224
+ ```python
225
+ texts = [  # reuses tokenizer, model, torch, and F from the Manual Inference example
226
+     "Absolutely love this product!",
227
+     "Worst experience I have ever had",
228
+     "This product is okay I guess, nothing special",
229
+ ]
230
+
231
+ inputs = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
232
+ with torch.no_grad():
233
+     logits = model(**inputs).logits
234
+ probs = F.softmax(logits, dim=-1)
235
+
236
+ for text, prob in zip(texts, probs):
237
+     label = "POSITIVE" if prob[1] > prob[0] else "NEGATIVE"
238
+     print(f"{label} ({prob[1].item():.2%} pos) | {text}")
239
+ ```
240
+
241
+ ---
242
+
243
+ ## Live Demo
244
+
245
+ This model powers the SENTRIX sentiment analysis web application:
246
+
247
+ - Frontend: https://prem-479.github.io/sentrix_ML_IA/
248
+ - Source: https://github.com/prem-479/sentrix_ML_IA
249
+
250
+ The application demonstrates:
251
+ - Real-time sentiment classification
252
+ - Aspect extraction from product reviews
253
+ - Multilingual input handling (EN, FR, ES, DE, PT)
254
+ - Emoji signal detection
255
+ - Confidence score visualization
256
+
257
+ ---
258
+
259
+ ## Limitations
260
+
261
+ - **Binary only** — outputs NEGATIVE or POSITIVE only. Sarcasm and neutral/mixed sentiment are classified as one or the other based on dominant signal.
262
+ - **Short text optimized** — trained on tweets (short text). Performance may degrade on long documents due to the 128-token truncation limit.
263
+ - **Sarcasm** — the model does not detect sarcasm. "Oh great, another broken product" will likely be classified as POSITIVE.
264
+ - **Multilingual** — the base model has some cross-lingual capability from Twitter pretraining, but was fine-tuned primarily on English data. Non-English accuracy is lower than English accuracy.
265
+ - **Domain shift** — trained on Twitter/product review data. Performance on other domains (news, medical, legal) has not been evaluated.
266
+
267
+ ---
268
+
269
+ ## Citation
270
+
271
+ If you use this model, please cite the base model:
272
+
273
+ ```bibtex
274
+ @inproceedings{barbieri-etal-2020-tweeteval,
275
+ title = "{T}weet{E}val: Unified Benchmark and Comparative Evaluation for Tweet Classification",
276
+ author = "Barbieri, Francesco and Camacho-Collados, Jose and Espinosa Anke, Luis and Neves, Leonardo",
277
+ booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
278
+ year = "2020",
279
+ publisher = "Association for Computational Linguistics",
280
+ }
281
+ ```
282
+
283
+ ---
284
+
285
+ ## Model Files
286
+
287
+ | File | Description |
288
+ |---|---|
289
+ | `config.json` | Model architecture and label mapping |
290
+ | `model.safetensors` | Model weights (499 MB) |
291
+ | `tokenizer.json` | Tokenizer vocabulary |
292
+ | `tokenizer_config.json` | Tokenizer configuration |
293
+
294
+ ---
295
+
296
+ *Fine-tuned on Kaggle using GPU acceleration. Trained with HuggingFace Transformers and PyTorch.*