Ace-119 committed on
Commit c376b67 · 1 Parent(s): 0304d75

Add Docker deployment config for HuggingFace Spaces

- Dockerfile: python:3.11-slim, downloads model.pt at build time
- supervisord.conf: FastAPI (:8000) + Streamlit (:7860) in one container
- requirements.txt: runtime-only, dropped sklearn/matplotlib/seaborn
- requirements-train.txt: extends with training + test deps
- README.md: HF Spaces metadata block + architecture diagram

Files changed (5)
  1. Dockerfile +54 -0
  2. README.md +57 -557
  3. requirements-train.txt +17 -0
  4. requirements.txt +16 -24
  5. supervisord.conf +40 -0
Dockerfile ADDED
@@ -0,0 +1,54 @@
+ # ── Base image ──────────────────────────────────────────────────────────────
+ # Python 3.11 slim keeps the image lean while matching the dev environment.
+ FROM python:3.11-slim
+
+ # ── System dependencies ──────────────────────────────────────────────────────
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     build-essential \
+     curl \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+
+ # ── Working directory ────────────────────────────────────────────────────────
+ WORKDIR /app
+
+ # ── Python dependencies ──────────────────────────────────────────────────────
+ # Copy requirements first so Docker caches this layer unless deps change.
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir --upgrade pip \
+     && pip install --no-cache-dir -r requirements.txt
+
+ # ── Application code ─────────────────────────────────────────────────────────
+ COPY . .
+
+ # ── Download model checkpoint from HuggingFace Hub ───────────────────────────
+ # Runs at build time so the container starts without a cold-download delay.
+ # For private repos, HF_TOKEN must be available to this step (for example, as a build secret).
+ RUN python scripts/download_model.py
+
+ # ── Environment ──────────────────────────────────────────────────────────────
+ # Tell the Streamlit UI where to find the FastAPI backend (same container).
+ ENV API_URL=http://localhost:8000
+ # Streamlit must listen on 0.0.0.0:7860 for Spaces to route traffic correctly.
+ ENV STREAMLIT_SERVER_PORT=7860
+ ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
+ # Disable Streamlit's usage-statistics collection.
+ ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false
+ # Keep Python output unbuffered so logs appear in real time.
+ ENV PYTHONUNBUFFERED=1
+ # Store the SQLite database at /app/stress_detection.db (writable inside the container).
+ ENV STRESS_DB_PATH=/app/stress_detection.db
+
+ # ── Expose Streamlit port ────────────────────────────────────────────────────
+ # HuggingFace Spaces routes external traffic to port 7860 by default.
+ EXPOSE 7860
+
+ # ── Process manager ──────────────────────────────────────────────────────────
+ # supervisord runs FastAPI (port 8000) and Streamlit (port 7860) as two
+ # supervised processes inside one container, a common pattern for
+ # HuggingFace Spaces Docker deployments that need a backend + frontend.
+ RUN pip install --no-cache-dir supervisor
+
+ COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
+
+ CMD ["/usr/local/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
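The Dockerfile runs `scripts/download_model.py`, which is not included in this diff. The block below is a minimal sketch of what such a script might look like, not the author's implementation. It assumes the checkpoint is the `model.pt` file in the `Ace-119/stress-detection-cnn` repo linked from the new README, and that the API loads `checkpoints/model.pt` (the default named in the removed README). The function names, the injectable `fetch` parameter, and the `/run/secrets/HF_TOKEN` fallback (Docker's default mount path for a build secret with that id) are illustrative assumptions.

```python
"""Hypothetical sketch of scripts/download_model.py (the real script is not in this diff)."""
import os
import shutil
from pathlib import Path
from typing import Callable, Optional

REPO_ID = "Ace-119/stress-detection-cnn"  # checkpoint repo linked in the new README
FILENAME = "model.pt"                      # file name taken from the commit message
DEST = Path("checkpoints") / FILENAME      # assumed: default path the API loads


def resolve_token(secret_file: Path = Path("/run/secrets/HF_TOKEN")) -> Optional[str]:
    """Return HF_TOKEN from the environment, else from a mounted secret file, else None."""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    if secret_file.is_file():
        return secret_file.read_text().strip() or None
    return None


def hub_fetch(repo_id: str, filename: str, token: Optional[str]) -> str:
    """Download one file from the Hub and return its path in the local cache."""
    # Imported lazily so the rest of the module works without huggingface-hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, token=token)


def download_model(
    dest: Path = DEST,
    fetch: Callable[[str, str, Optional[str]], str] = hub_fetch,
    token: Optional[str] = None,
) -> Path:
    """Copy the checkpoint to `dest`, skipping the download if it is already there."""
    dest = Path(dest)
    if dest.is_file():
        return dest
    cached = fetch(REPO_ID, FILENAME, token or resolve_token())
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(cached, dest)
    return dest
```

Run as a script, this would simply call `download_model()`. For a private repo the `RUN` step itself would also have to expose the token, for example through a BuildKit secret mount (`RUN --mount=type=secret,id=HF_TOKEN ...`); a plain `RUN` does not see build secrets.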
README.md CHANGED
@@ -1,578 +1,78 @@
- # StressDetection
-
- Resilience-First AI System for Cross-Platform Stress Detection.
-
- ## Project Structure
-
- ```
- StressDetection/
- ├── data/
- │   ├── raw/                # Place raw dataset files here
- │   ├── eval/               # Evaluation datasets (happy_neutral_eval.csv)
- │   └── processed/          # Preprocessed unified CSV output
- ├── models/                 # Model architecture definitions
- │   └── saved_models/       # Trained model checkpoints
- ├── notebooks/              # Google Colab training notebook
- ├── training/               # Training scripts and configs
- ├── api/                    # Inference server
- ├── ui/                     # Streamlit UI application
- ├── database/               # SQLite database manager
- ├── security/               # Auth, JWT, encryption modules
- ├── intervention/           # Recommendation engine, temporal model
- ├── utils/                  # Shared utilities
- ├── tests/                  # Unit and integration tests
- ├── data_preprocessing.py   # Multi-dataset merge script
- ├── requirements.txt        # Python dependencies
- ├── setup_environment.sh    # Environment setup script (Linux/macOS)
- └── run_windows.bat         # One-click launcher for Windows 11
- ```
-
  ---
-
- ## Google Colab - Complete Step-by-Step Guide
-
- > **What you'll have at the end:** a trained stress-detection model running as a
- > live web application accessible via a public URL, all from inside Colab,
- > no local machine required.
-
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/anant-925/StressDetection/blob/main/notebooks/stress_detection_colab.ipynb)
-
- ### Before You Start
-
- 1. **Switch to a GPU runtime**: a free T4 GPU cuts training time from hours to minutes.
-    In Colab: **Runtime → Change runtime type → T4 GPU → Save**.
-
- 2. **Get a free ngrok account**: needed only if you want to run the live web UI
-    from Colab (optional, Cells 8–10).
-    Sign up free at <https://dashboard.ngrok.com/signup>, then copy your
-    **Authtoken** from <https://dashboard.ngrok.com/get-started/your-authtoken>.
-
  ---

- ### Cell 1 - Mount Google Drive & Clone the Repository
-
- > Everything is saved to Google Drive so your work survives session disconnects.
-
- ```python
- # ── Cell 1 ──────────────────────────────────────────────────────────────────
- from google.colab import drive
- drive.mount('/content/drive')
-
- import os
-
- # Persistent directories on Google Drive
- DRIVE_BASE = '/content/drive/MyDrive/StressDetection'
- for d in ['data/raw', 'data/processed', 'checkpoints', 'logs']:
-     os.makedirs(os.path.join(DRIVE_BASE, d), exist_ok=True)
-
- # Clone the repository (skips if already cloned)
- if not os.path.isdir('/content/StressDetection'):
-     !git clone https://github.com/anant-925/StressDetection.git /content/StressDetection
-
- %cd /content/StressDetection
- print("Ready - working directory:", os.getcwd())
- ```

- ---

- ### Cell 2 - Install Dependencies

- ```python
- # ── Cell 2 ──────────────────────────────────────────────────────────────────
- !pip install -q -r requirements.txt
- print("All dependencies installed.")
  ```
-
- > This takes about 2 minutes on the first run. Subsequent runs are faster
- > because Colab caches packages within the same session.
-
- ---
-
- ### Cell 3 - Upload Raw Datasets (saved to Google Drive)
-
- Upload your dataset files **once**. They stay on Google Drive forever.
-
- ```python
- # ── Cell 3 ──────────────────────────────────────────────────────────────────
- import shutil, os
- from google.colab import files
-
- DRIVE_RAW = '/content/drive/MyDrive/StressDetection/data/raw'
- LOCAL_RAW = '/content/StressDetection/data/raw'
- os.makedirs(LOCAL_RAW, exist_ok=True)
-
- # ── Upload files from your computer ──
- print("Select one or more dataset files to upload:")
- uploaded = files.upload()  # Opens a file-picker dialog
- for filename in uploaded:
-     # Save to Google Drive (persists across sessions)
-     shutil.move(filename, os.path.join(DRIVE_RAW, filename))
-     print(f" Saved to Drive: {filename}")
-
- # ── Sync Drive → local workspace ──
- for f in os.listdir(DRIVE_RAW):
-     shutil.copy2(os.path.join(DRIVE_RAW, f), os.path.join(LOCAL_RAW, f))
- print(f"\nDatasets available locally in {LOCAL_RAW}:")
- print('\n'.join(f' {f}' for f in os.listdir(LOCAL_RAW)))
  ```

- **Required / optional dataset files** (place in `data/raw/`):

- | File | Domain | Notes |
- |------|--------|-------|
- | `dreaddit-train.csv` (or `.csv.zip`) | Reddit Long | |
- | `Reddit_Combi.csv` (or `.xlsx`) | Reddit Long | |
- | `Reddit_Title.csv` (or `.xlsx`) | Reddit Short | |
- | `Twitter_Full.csv` (or `.xlsx`) | Twitter Short | |
- | `Stressed_Tweets.csv` | Twitter Short | implicit label = 1 |
- | `Happy_Neutral.csv` | Optional negatives | implicit label = 0 |

- > **Already uploaded before?** Skip the `files.upload()` call; the second
- > block (Drive → local sync) is enough to restore them.
-
- ---
-
- ### Cell 4 - Preprocess Data
-
- ```python
- # ── Cell 4 ──────────────────────────────────────────────────────────────────
- # Run only if the processed CSV isn't already on Drive
- PROCESSED_DRIVE = '/content/drive/MyDrive/StressDetection/data/processed/unified_stress.csv'
- PROCESSED_LOCAL = 'data/processed/unified_stress.csv'
-
- if os.path.isfile(PROCESSED_DRIVE):
-     # Restore from Drive (fast, no reprocessing needed)
-     os.makedirs('data/processed', exist_ok=True)
-     shutil.copy2(PROCESSED_DRIVE, PROCESSED_LOCAL)
-     print(f"Restored processed data from Drive: {PROCESSED_LOCAL}")
- else:
-     # First run: build the unified CSV from raw files
-     !python data_preprocessing.py
-     shutil.copy2(PROCESSED_LOCAL, PROCESSED_DRIVE)
-     print("Processed data saved to Google Drive.")
-
- import pandas as pd
- df = pd.read_csv(PROCESSED_LOCAL)
- print(f"\nRows: {len(df):,} | Label balance:")
- print(df['label'].value_counts().to_string())
- ```
-
- The script produces `data/processed/unified_stress.csv` with columns
- `text`, `label`, `domain` plus any numeric features from Dreaddit.
-
- ---
-
- ### Cell 5 - Train the Model
-
- ```python
- # ── Cell 5 ──────────────────────────────────────────────────────────────────
- # Recommended: CNN is fastest and works well out of the box.
- # Change --model to 'deberta' or 'minilm' for transformer-based models
- # (requires more VRAM and ~3× longer training time).
-
- !python training/train.py \
-     --model cnn \
-     --epochs 15 \
-     --batch-size 64 \
-     --lr 1e-3 \
-     --weight-decay 1e-4 \
-     --dropout 0.3 \
-     --label-smoothing 0.1 \
-     --class-weighted \
-     --patience 3 \
-     --device cuda \
-     --data data/processed/unified_stress.csv \
-     --eval-set data/eval/happy_neutral_eval.csv \
-     --output checkpoints/model.pt
- ```
-
- **All training flags:**
-
- | Flag | Default | Description |
- |------|---------|-------------|
- | `--model` | `cnn` | `cnn`, `deberta`, or `minilm` |
- | `--epochs` | `10` | Max training epochs |
- | `--batch-size` | `64` | Mini-batch size |
- | `--lr` | `1e-3` | Learning rate (CNN); transformers use `2e-5` internally |
- | `--weight-decay` | `1e-4` | AdamW weight decay |
- | `--dropout` | `0.3` | Dropout rate |
- | `--label-smoothing` | `0.0` | Label smoothing for cross-entropy |
- | `--class-weighted` | off | Use inverse-frequency class weights (recommended for imbalanced data) |
- | `--patience` | `3` | Early-stopping patience (epochs without F1 improvement) |
- | `--device` | `cuda` | `cuda` or `cpu` |
- | `--data` | `data/processed/unified_stress.csv` | Preprocessed CSV |
- | `--output` | `checkpoints/model.pt` | Checkpoint save path |
- | `--eval-set` | `data/eval/happy_neutral_eval.csv` | Fixed evaluation set for false-positive monitoring |
- | `--max-length` | `256` | Transformer max token length |
-
- Training prints a live progress bar and saves the **best checkpoint by validation F1**
- to `checkpoints/model.pt`. The threshold calibrated during training is embedded
- in the checkpoint and loaded automatically by the API.
-
- ---
-
- ### Cell 6 - Save Checkpoint to Google Drive & Download
-
- ```python
- # ── Cell 6 ──────────────────────────────────────────────────────────────────
- import shutil
- from google.colab import files
-
- CKPT_LOCAL = 'checkpoints/model.pt'
- CKPT_DRIVE = '/content/drive/MyDrive/StressDetection/checkpoints/model.pt'
-
- # Persist to Google Drive (survives session resets)
- shutil.copy2(CKPT_LOCAL, CKPT_DRIVE)
- print(f"Checkpoint saved to Google Drive: {CKPT_DRIVE}")
-
- # Download to your computer (for Windows / local deployment)
- files.download(CKPT_LOCAL)
- print("Download started - check your browser's downloads folder.")
- ```
-
- > **That's all you need to transfer:** a single `model.pt` file.
- > No vocabulary file is required: the tokenizer is hash-based and
- > fully deterministic across all platforms.
-
- ---
-
- ### Cell 7 - Quick Inference Test (no server needed)
-
- Verify the model works before spinning up the full application:
-
- ```python
- # ── Cell 7 ──────────────────────────────────────────────────────────────────
- import sys, hashlib, torch
- sys.path.insert(0, '/content/StressDetection')
-
- from models.architecture import OptimizedMultichannelCNN
-
- VOCAB_SIZE = 10_000
- MAX_LEN = 256
- CKPT_LOCAL = 'checkpoints/model.pt'
-
- # ── Load checkpoint ──
- checkpoint = torch.load(CKPT_LOCAL, map_location='cpu', weights_only=True)
- model_type = checkpoint.get('model_type', 'cnn')
- threshold = float(checkpoint.get('decision_threshold', 0.5))
-
- if model_type != 'cnn':
-     print(f"Model type is '{model_type}'. Use the transformer path below.")
- else:
-     dropout = float(checkpoint.get('dropout', 0.3))
-     feature_dim = int(checkpoint.get('feature_dim', 0))
-     model = OptimizedMultichannelCNN(
-         vocab_size=VOCAB_SIZE, embed_dim=128, num_filters=64,
-         kernel_sizes=(2, 3, 5), num_classes=2,
-         dropout=dropout, aux_dim=feature_dim,
-     )
-     model.load_state_dict(checkpoint['model_state_dict'])
-     model.eval()
-
- # ── Simple hash-based tokenizer (matches API behaviour exactly) ──
- def tokenize(text, vocab_size=VOCAB_SIZE, max_len=MAX_LEN):
-     tokens = text.lower().split()[:max_len]
-     ids = [
-         int(hashlib.md5(t.encode()).hexdigest(), 16) % vocab_size
-         for t in tokens
-     ]
-     if len(ids) < max_len:
-         ids += [0] * (max_len - len(ids))
-     return ids
-
- # ── Run inference ──
- test_sentences = [
-     "I can't sleep, my mind won't stop racing",
-     "Had an amazing day with family, feeling blessed",
-     "Overwhelmed with deadlines, barely keeping up",
-     "Just finished a great workout, feeling strong",
-     "Everything is going wrong and I don't know what to do",
- ]
-
- print(f"Decision threshold: {threshold:.3f}\n")
- print(f"{'Text':<55} {'Score':>6} {'Label'}")
- print('-' * 72)
- for text in test_sentences:
-     ids = tokenize(text)
-     tensor = torch.tensor([ids], dtype=torch.long)
-     with torch.no_grad():
-         out = model(tensor)
-     prob = float(torch.softmax(out['logits'], dim=-1)[0, 1])
-     label = 'STRESS' if prob >= threshold else 'no stress'
-     print(f"{text[:54]:<55} {prob:>6.3f} {label}")
- ```
-
- Expected output: the model should clearly distinguish stressed from calm text.
-
- ---
-
- ### Cells 8–10 - Run the Full Application from Colab (Optional)
-
- > These cells start the FastAPI backend and Streamlit UI inside Colab and
- > expose them via **ngrok** public URLs so you can open the dashboard in
- > any browser. Requires a **free ngrok account** (see *Before You Start*).
-
- #### Cell 8 - Authenticate ngrok
-
- ```python
- # ── Cell 8 ──────────────────────────────────────────────────────────────────
- !pip install -q pyngrok
-
- from pyngrok import ngrok
-
- # Paste your authtoken from https://dashboard.ngrok.com/get-started/your-authtoken
- NGROK_TOKEN = "PASTE_YOUR_NGROK_AUTHTOKEN_HERE"
- ngrok.set_auth_token(NGROK_TOKEN)
- print("ngrok authenticated.")
- ```

- #### Cell 9 - Start the FastAPI Backend

- ```python
- # ── Cell 9 ──────────────────────────────────────────────────────────────────
- import subprocess, time, os

- os.makedirs('checkpoints', exist_ok=True)

- # Set security keys (auto-generated dev keys are fine for personal testing)
- os.environ.setdefault('JWT_SECRET_KEY', 'colab-dev-secret-key-change-for-production')
-
- # Generate and set a Fernet key if not already set
- if 'FERNET_KEY' not in os.environ:
-     from cryptography.fernet import Fernet
-     os.environ['FERNET_KEY'] = Fernet.generate_key().decode()
-
- # Start FastAPI in the background
- api_proc = subprocess.Popen(
-     ['uvicorn', 'api.main:app', '--host', '0.0.0.0', '--port', '8000'],
-     stdout=subprocess.PIPE, stderr=subprocess.STDOUT
- )
- time.sleep(4)  # Wait for the server to finish loading
-
- # Create a public ngrok tunnel to port 8000
- api_tunnel = ngrok.connect(8000)
- API_URL = str(api_tunnel.public_url)
- print(f"FastAPI is live at: {API_URL}")
- print(f"Interactive API docs: {API_URL}/docs")
- ```
-
- > Open **`{API_URL}/docs`** in your browser to explore all endpoints
- > interactively (register, login, analyze) with the built-in Swagger UI.
-
- #### Cell 10 - Start the Streamlit UI
-
- ```python
- # ── Cell 10 ─────────────────────────────────────────────────────────────────
- import subprocess, time, os
-
- # Tell the UI where the API lives
- os.environ['API_URL'] = API_URL
-
- # Start Streamlit in the background
- ui_proc = subprocess.Popen(
-     ['streamlit', 'run', 'ui/app.py',
-      '--server.port', '8501',
-      '--server.headless', 'true'],
-     stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
-     env={**os.environ}
- )
- time.sleep(6)  # Wait for Streamlit to compile
-
- # Create a public ngrok tunnel to port 8501
- ui_tunnel = ngrok.connect(8501)
- UI_URL = str(ui_tunnel.public_url)
- print(f"Streamlit UI is live at: {UI_URL}")
- print("Open the link above in your browser to use the dashboard.")
- ```
-
- > Click the **Streamlit URL**, register an account, and start analysing text.
-
- #### Stopping the Services
-
- ```python
- # Run this cell when you are done
- api_proc.terminate()
- ui_proc.terminate()
- ngrok.kill()
- print("All services stopped.")
- ```
-
- ---
-
- ### Reconnecting After a Disconnect
-
- If your Colab session timed out or you closed the browser tab:
-
- | Cell | Action | Skip if… |
- |------|--------|---------|
- | **Cell 1** | Mount Drive + clone repo | Never (always re-run) |
- | **Cell 2** | Install dependencies | Never (always re-run) |
- | **Cell 3** | Upload datasets | Files already in Drive: run only the sync block |
- | **Cell 4** | Preprocess data | Processed CSV is on Drive: the cell auto-restores it |
- | **Cell 5** | Train | Checkpoint already on Drive: skip, go to Cell 6 |
- | **Cell 6** | Save checkpoint | Restore from Drive: `shutil.copy2(CKPT_DRIVE, CKPT_LOCAL)` |
- | **Cells 7–10** | Test / run app | Never (re-run as needed) |
-
- **Your datasets, processed CSV, and trained checkpoint are all safe on Google Drive.**
-
- ---
-
- ### Quick-Reference Table
-
- | Step | Where | Command / Action |
- |------|-------|-----------------|
- | 1 | Colab | Switch runtime to T4 GPU |
- | 2 | Colab | Cell 1: mount Drive, clone repo |
- | 3 | Colab | Cell 2: install dependencies |
- | 4 | Colab | Cell 3: upload datasets to Drive |
- | 5 | Colab | Cell 4: preprocess → `unified_stress.csv` |
- | 6 | Colab | Cell 5: train the model |
- | 7 | Colab | Cell 6: save `model.pt` to Drive; download to PC |
- | 8 | Colab | Cell 7: quick inference test (no server needed) |
- | 9 | Colab | Cells 8–10: live app via ngrok *(optional)* |
- | 10 | Windows | Download `model.pt`; run `run_windows.bat` |
-
- ---
-
- ## Part B - Running the Application on Windows 11
-
- #### Quick Start (One-Click Launcher)
-
- After cloning the repo and placing `model.pt`, simply double-click:
-
- ```
- run_windows.bat
- ```
-
- This automatically creates the virtual environment, installs dependencies,
- starts both the FastAPI backend and Streamlit UI in separate windows, and
- opens your browser to the dashboard.
-
- #### Manual Setup
-
- ##### Step 1 - Clone the Repository
-
- ```cmd
- git clone https://github.com/anant-925/StressDetection.git
- cd StressDetection
- ```
-
- ##### Step 2 - Set Up the Python Environment
-
- ```cmd
- python -m venv venv
- venv\Scripts\activate
- pip install --upgrade pip
- pip install -r requirements.txt
- ```
-
- > Use **`venv` + `pip`**; do not use Conda.
- > Requires **Python 3.10 or newer**.
-
- ##### Step 3 - Place the Trained Checkpoint
-
- Download `model.pt` from Google Drive (`MyDrive/StressDetection/checkpoints/model.pt`)
- or from your Colab download:
-
- ```cmd
- mkdir checkpoints
- copy %USERPROFILE%\Downloads\model.pt checkpoints\model.pt
- ```
-
- The API server looks for `checkpoints/model.pt` by default.
- Override via the `STRESS_MODEL_CHECKPOINT` environment variable if needed.
-
- ##### Step 4 - Set Security Keys *(optional for local use)*
-
- ```cmd
- set JWT_SECRET_KEY=your-random-secret-key-here
- set FERNET_KEY=your-base64-fernet-key-here
- ```
-
- Generate a Fernet key (requires Step 2, `pip install -r requirements.txt`, to be done first):
-
- ```cmd
- python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
- ```
-
- If omitted, auto-generated development keys are used (fine for local testing,
- not recommended for production).
-
- ##### Step 5 - Start the FastAPI Backend *(Terminal 1)*
-
- ```cmd
- uvicorn api.main:app --host 0.0.0.0 --port 8000
- ```
-
- The model checkpoint is loaded on the first `/analyze` request.
- API docs are available at <http://localhost:8000/docs>.
-
- ##### Step 6 - Start the Streamlit UI *(Terminal 2)*
-
- ```cmd
- venv\Scripts\activate
- streamlit run ui/app.py
- ```
-
- Opens the dashboard at <http://localhost:8501>.
- The UI connects to the FastAPI backend at `http://localhost:8000` by default
- (override with the `API_URL` environment variable).
-
- ##### Step 7 - Use the Application
-
- 1. Open <http://localhost:8501> in your browser.
- 2. **Register** a new account (username ≥ 3 chars, password ≥ 8 chars).
- 3. Type text describing how you're feeling and click **Analyse**.
- 4. Results include:
-    - **Stress score** (0–100%)
-    - **Stress velocity** (trend direction over your history)
-    - **Attention heatmap** (which words influenced the prediction)
-    - **Recommended interventions** (breathing, grounding, cognitive reframes)
-    - **Crisis detection**: if crisis language is detected the app shows
-      the 988 Suicide & Crisis Lifeline and stops further processing.
- 5. Your stress history chart grows with each analysis during the session.
-
- ---
-
- ## Quick Start (local Linux/macOS)

  ```bash
- # 1. Set up the environment
- bash setup_environment.sh
-
- # 2. Place your dataset files in data/raw/
- # 3. Run preprocessing
- python data_preprocessing.py
-
- # 4. Train
- python training/train.py
-
- # 5. Start API + UI
- uvicorn api.main:app --host 0.0.0.0 --port 8000 &
- streamlit run ui/app.py
  ```

- ## Datasets
-
- Place these files in `data/raw/`:
-
- | File | Domain |
- |------|--------|
- | `dreaddit-train.csv` (or `.csv.zip`) | Reddit Long |
- | `Reddit_Combi.csv` (or `.xlsx`) | Reddit Long |
- | `Reddit_Title.csv` (or `.xlsx`) | Reddit Short |
- | `Twitter_Full.csv` (or `.xlsx`) | Twitter Short |
- | `Stressed_Tweets.csv` | Twitter Short (implicit label=1) |
- | `Happy_Neutral.csv` | Optional negatives (implicit label=0) |

- ### Happy/Neutral Evaluation Set

- The repository includes `data/eval/happy_neutral_eval.csv`, a small fixed
- set of happy/neutral sentences used to monitor false positives during training.
- You can extend it with your own examples (keep the `text,label` columns).
-
- ## Testing
-
- ```bash
- python -m pytest tests/ -v
- ```
  ---
+ title: StressDetect
+ emoji: 🧠
+ colorFrom: indigo
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
  ---

+ # 🧠 StressDetect

+ A full-stack mental health application for real-time stress detection with
+ personalized interventions, built as an end-to-end ML systems project.

+ ## Architecture

  ```
+ ┌─────────────────────────────────────────────────┐
+ │                  Streamlit UI                   │  ← port 7860 (public)
+ │   Dashboard · History & Analytics · Settings    │
+ └──────────────────┬──────────────────────────────┘
+                    │ REST (localhost)
+ ┌──────────────────▼──────────────────────────────┐
+ │                 FastAPI Backend                 │  ← port 8000 (internal)
+ │ /analyze · /history · /feedback · /personalize  │
+ └──────────────────┬──────────────────────────────┘
+                    │
+        ┌───────────┼───────────┐
+        ▼           ▼           ▼
+    CNN Model   SQLite DB  Intervention
+  (MC-Dropout) (encrypted)    Engine
  ```

+ ## Model

+ - **Multichannel 1D CNN** with multi-head self-attention (4 heads)
+ - **MC-Dropout ensemble** (3 passes) for uncertainty estimation
+ - **FPR-constrained threshold calibration** (max FPR 20%)
+ - **Focal loss** + cosine LR warmup + early stopping
+ - Checkpoint: [`Ace-119/stress-detection-cnn`](https://huggingface.co/Ace-119/stress-detection-cnn)

+ ## Safety

+ - **4-layer intervention engine** with 988 crisis circuit breaker
+ - Crisis keywords (suicide/self-harm) → immediate 988 lifeline, pipeline halts
+ - 8 trigger categories: sleep, work, exam, money, relationship, health, grief, loneliness
+ - Escalation tracker: 3+ consecutive high-stress sessions → professional referral

+ ## Features

+ | Feature | Detail |
+ |---|---|
+ | Stress scoring | CNN probability + MC-Dropout uncertainty |
+ | Temporal profiling | Adaptive threshold, velocity, volatility |
+ | Interventions | Progressive step-by-step guided flow |
+ | RL feedback loop | User + LLM-as-judge reward signal |
+ | Personalization | Per-user score bias from feedback history |
+ | Analytics | Timeline, calendar heatmap, polar chart, trigger frequency |
+ | Security | JWT auth, bcrypt passwords, AES-256 history encryption |

+ ## Quick Start (local)

  ```bash
+ git clone https://github.com/Ace-119/StressDetection
+ cd StressDetection
+ pip install -r requirements-train.txt   # full deps including training
+ python scripts/download_model.py        # pulls model.pt from HF Hub
+ make dev                                # FastAPI :8000 + Streamlit :8501
  ```

+ ## Crisis Resources

+ This app always surfaces crisis resources when needed.

+ **988 Suicide & Crisis Lifeline**: Call or text **988** (US)
+ **Crisis Text Line**: Text HOME to **741741**
+ **SAMHSA Helpline**: 1-800-662-4357 (free, confidential, 24/7)
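The new README lists "FPR-constrained threshold calibration (max FPR 20%)" without showing how the threshold is chosen, and the training code is not part of this diff. One plausible reading, sketched below under that assumption, is to take the lowest decision threshold on a validation set whose false-positive rate stays at or below the cap, which keeps recall as high as the constraint allows. The function name `calibrate_threshold` and its signature are hypothetical and may differ from the project's actual routine.

```python
from typing import Sequence


def calibrate_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    max_fpr: float = 0.20,
) -> float:
    """Return the lowest threshold t such that FPR(score >= t) <= max_fpr.

    `scores` are predicted stress probabilities, `labels` are 0 (no stress) or 1 (stress).
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        return 0.5  # no negatives to constrain on; fall back to the usual default

    # Candidate thresholds: every observed score, plus one value above the maximum
    # so that "predict nothing as stress" is always available as a last resort.
    candidates = sorted(set(scores)) + [max(scores) + 1e-6]

    # FPR can only fall as the threshold rises, so the first candidate that
    # satisfies the cap is the lowest valid threshold.
    for t in candidates:
        false_positives = sum(1 for s in negatives if s >= t)
        if false_positives / len(negatives) <= max_fpr:
            return t
    return candidates[-1]
```

A threshold chosen this way would be stored with the checkpoint and applied at inference as `score >= threshold`, matching the `decision_threshold` field the removed README reads from `model.pt`.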
requirements-train.txt ADDED
@@ -0,0 +1,17 @@
+ # Training dependencies: extends requirements.txt
+ # Usage:
+ #   pip install -r requirements-train.txt
+ #
+ # Colab:
+ #   !pip install -r requirements-train.txt
+
+ -r requirements.txt
+
+ # ── Training only ────────────────────────────────────────────────────────────
+ scikit-learn>=1.3.0
+ matplotlib>=3.7.0
+ seaborn>=0.12.0
+
+ # ── Testing ──────────────────────────────────────────────────────────────────
+ pytest>=7.4.0
+ httpx>=0.25.0
requirements.txt CHANGED
@@ -1,35 +1,27 @@
- # Resilience-First AI System for Cross-Platform Stress Detection
- # Requirements - Full Stack (Phases 1-4)
-
- # Core ML
+ # ── Inference & API ──────────────────────────────────────────────────────────
  torch>=2.0.0
  transformers>=4.30.0
-
- # Data Processing
- pandas>=2.0.0
- numpy>=1.24.0
- openpyxl>=3.1.0
-
- # Security & Authentication
- bcrypt>=4.0.0
- python-jose[cryptography]>=3.4.0
- cryptography>=46.0.5
-
- # API
  fastapi>=0.115.0
  uvicorn[standard]>=0.27.0
  pydantic>=2.0.0
+ huggingface-hub>=0.23.0
+ requests>=2.32.0
+
+ # ── Security ─────────────────────────────────────────────────────────────────
+ bcrypt>=4.0.0
+ python-jose[cryptography]>=3.4.0
+ cryptography>=42.0.7

- # UI
+ # ── UI ───────────────────────────────────────────────────────────────────────
  streamlit>=1.28.0
  plotly>=5.18.0

- # Utilities
- scikit-learn>=1.3.0
- matplotlib>=3.7.0
- seaborn>=0.12.0
+ # ── Data ─────────────────────────────────────────────────────────────────────
+ pandas>=2.0.0
+ numpy>=1.24.0
+ openpyxl>=3.1.0
  nltk>=3.8.0

- # Testing
- pytest>=7.4.0
- httpx>=0.25.0
+ # ── LLM reward (optional: set OPENAI_API_KEY or GEMINI_API_KEY to enable) ────
+ openai>=1.30.0
+ google-generativeai>=0.5.0
supervisord.conf ADDED
@@ -0,0 +1,40 @@
+ [supervisord]
+ nodaemon=true
+ logfile=/var/log/supervisord.log
+ logfile_maxbytes=10MB
+ loglevel=info
+ pidfile=/var/run/supervisord.pid
+
+ ; ── FastAPI backend ──────────────────────────────────────────────────────────
+ [program:fastapi]
+ command=uvicorn api.main:app --host 0.0.0.0 --port 8000 --workers 1 --log-level warning
+ directory=/app
+ autostart=true
+ autorestart=true
+ startretries=3
+ startsecs=5
+ stdout_logfile=/var/log/fastapi.log
+ stderr_logfile=/var/log/fastapi_err.log
+ stdout_logfile_maxbytes=5MB
+ stderr_logfile_maxbytes=5MB
+ priority=1
+
+ ; ── Streamlit UI ─────────────────────────────────────────────────────────────
+ ; priority=2 starts Streamlit after FastAPI. supervisord does not wait for the
+ ; /health endpoint, so the UI should tolerate a backend that is still starting.
+ [program:streamlit]
+ command=streamlit run ui/app.py
+     --server.port=7860
+     --server.address=0.0.0.0
+     --server.headless=true
+     --browser.gatherUsageStats=false
+ directory=/app
+ autostart=true
+ autorestart=true
+ startretries=3
+ startsecs=5
+ stdout_logfile=/var/log/streamlit.log
+ stderr_logfile=/var/log/streamlit_err.log
+ stdout_logfile_maxbytes=5MB
+ stderr_logfile_maxbytes=5MB
+ priority=2
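As far as the supervisord documentation describes, `priority` only orders process startup and `startsecs` is how long a process must stay up to count as started; neither makes Streamlit wait until the API is actually serving requests. If the UI needs the backend to be ready, one option is to poll it before the first call. The helper below is a minimal standard-library sketch of that idea. The `/health` path comes from the config comment and the base URL from the Dockerfile's `API_URL`; the function name, defaults, and where it would be called from are assumptions, not part of this commit.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_api(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers with a 2xx status or `timeout` seconds elapse.

    Returns True as soon as the endpoint responds successfully, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused, HTTP error status, or a half-started server:
            # treat all of these as "not ready yet" and retry.
            pass
        time.sleep(interval)
    return False
```

The UI could call `wait_for_api(os.environ.get("API_URL", "http://localhost:8000") + "/health")` once at startup and show a "backend starting" message while it returns False, instead of relying on start order alone.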