Multi-Task Learning for Emotional and Cognitive State Detection

Python 3.8+ | TensorFlow 2.x | License: MIT

Simultaneously predict emotional state (stress) and cognitive state (mental effort) from physiological signals using a unified deep learning architecture.

🎯 Key Innovation: A multi-task framework that reaches 70.7% average LOSO accuracy. Multimodal fusion adds 8.3 percentage points over IBI-only input, and principled data preparation adds 3.7 points over naive training.


Why Multi-Task Learning?

Traditional approaches train separate models for stress and effort detection. Our unified multi-task framework offers:

  • Joint representation learning: Shared encoders extract features relevant to both emotional and cognitive states
  • Efficient architecture: Single model (~87k parameters) predicts both dimensions simultaneously
  • Task-specific training: Principled data preparation (masking ambiguous samples) improves generalization
  • Real-time inference: One forward pass → both stress and effort predictions

Quick Start

# Install
git clone https://github.com/yisakSyn/SynheartFocus_MultitaskModel.git
cd SynheartFocus_MultitaskModel
pip install -r requirements.txt

# Prepare data (SWELL-KW dataset)
python prepare_multimodal_data_v5.py

# Train with LOSO cross-validation
python trainer_v8_masked_effortMultimodal.py --loso --data_dir ./prepared_data_v5_multimodal

# Single holdout evaluation
python trainer_v8_masked_effortMultimodal.py --holdout pp09 pp17 pp25

Architecture

Physiological Inputs (4 streams)
├─ IBI timeseries (120 samples @ 2Hz = 60s) ──┐
├─ HRV features (14 features)                 ├─→ IBI Encoder (CNN-LSTM) → Dense(32)
├─ EDA timeseries (120 samples @ 2Hz = 60s) ──┤                                 ↓
└─ EDA features (12 features)                 └─→ EDA Encoder (CNN-LSTM) → Dense(32)
                                                                                  ↓
                                                           Fusion: Concat(32+16+32+16) = 96d
                                                                                  ↓
                                                                    Dense(64) → Dense(32)
                                                                                  ↓
                                                           ┌──────────────────────┴──────────────────────┐
                                                      Stress Head                                   Effort Head
                                                     Dense(2, softmax)                            Dense(2, softmax)
                                                  (trained on all samples)                    (trained on valid only)

Model Details

IBI Encoder (CNN-LSTM):

  • Conv1D blocks: 24→32→48 filters (kernels 7→5→3)
  • LSTM(32) with attention pooling
  • HRV feature branch: Dense(24→16)
  • Output: 48 dimensions (32+16)

EDA Encoder (CNN-LSTM):

  • Conv1D blocks: 24→32→48 filters (kernels 9→7→5, larger for EDA)
  • LSTM(32) with attention pooling
  • EDA feature branch: Dense(24→16)
  • Output: 48 dimensions (32+16)

Fusion & Classification:

  • Concatenate: [IBI(32) + HRV(16) + EDA(32) + EDA_feat(16)] = 96 dims
  • Dense layers: 64 → 32 with dropout (0.5, 0.25)
  • Two task-specific output heads (softmax)

Joint Optimization:

# Loss computation
loss_stress = focal_loss(y_true_stress, y_pred_stress)  # All samples
loss_effort = masked_focal_loss(y_true_effort, y_pred_effort, mask)  # Valid only
total_loss = loss_stress + loss_effort
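
The masking idea can be illustrated without TensorFlow. The following is a minimal, dependency-free sketch, not the repository's implementation: the helper names `focal_loss_per_sample` and `mean_focal_loss` are hypothetical, label smoothing is omitted, and averaging over valid samples only is an assumption about how the masked loss is normalized.

```python
import math

def focal_loss_per_sample(y_true, y_pred, gamma=1.5, eps=1e-7):
    """Focal loss per sample for one-hot targets and softmax probabilities."""
    losses = []
    for target, probs in zip(y_true, y_pred):
        loss = 0.0
        for t, p in zip(target, probs):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            loss += -t * (1.0 - p) ** gamma * math.log(p)
        losses.append(loss)
    return losses

def mean_focal_loss(y_true, y_pred, gamma=1.5):
    """Stress-style loss: average over every sample."""
    losses = focal_loss_per_sample(y_true, y_pred, gamma)
    return sum(losses) / len(losses)

def masked_focal_loss(y_true, y_pred, mask, gamma=1.5):
    """Effort-style loss: average over samples with mask == 1 only."""
    losses = focal_loss_per_sample(y_true, y_pred, gamma)
    n_valid = sum(mask)
    if n_valid == 0:
        return 0.0  # assumption: a batch with no valid samples contributes nothing
    return sum(l * m for l, m in zip(losses, mask)) / n_valid

# Toy batch: the second sample is a masked (c2) effort label.
y_stress = [[1, 0], [0, 1], [0, 1]]
p_stress = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
y_effort = [[1, 0], [0, 0], [0, 1]]
p_effort = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
effort_mask = [1, 0, 1]

total_loss = mean_focal_loss(y_stress, p_stress) + masked_focal_loss(
    y_effort, p_effort, effort_mask
)
print(f"total loss: {total_loss:.4f}")
```

Dividing by the number of valid samples rather than the batch size keeps the effort loss on the same scale regardless of how many c2 windows a batch happens to contain.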

Parameters: ~87,000 total


Results

LOSO Cross-Validation (18 subjects)

| Metric | Accuracy | Std Dev |
|---|---|---|
| Stress Detection | 68.9% | ±12.9% |
| Effort Detection | 72.5% | ±14.1% |
| Average | 70.7% | ±13.2% |
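
For readers unfamiliar with leave-one-subject-out (LOSO) evaluation, here is a minimal sketch of how such folds and the mean ± standard deviation summary can be produced. The function names and the toy subject IDs are hypothetical, and the use of the sample standard deviation is an assumption; the training script may aggregate differently.

```python
from statistics import mean, stdev

def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx

def summarize_folds(fold_accuracies):
    """Mean and sample standard deviation of per-subject accuracies."""
    return mean(fold_accuracies), stdev(fold_accuracies)

# Toy example: one subject ID per window.
subjects = ["pp01", "pp01", "pp02", "pp03", "pp03"]
for held_out, train_idx, test_idx in loso_splits(subjects):
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because every window from the held-out subject is excluded from training, overlapping windows cannot leak between the train and test sets.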

Ablation Studies

1. Multimodal Fusion:

| Configuration | Stress | Effort | Average |
|---|---|---|---|
| IBI only (Run 5) | 60.2% | 64.6% | 62.4% |
| IBI + EDA (ours, Run 3) | 68.9% | 72.5% | 70.7% |
| Improvement | +8.7% | +7.9% | +8.3% |

Key Finding: Multimodal fusion (IBI + EDA) provides substantial improvement over single-modality IBI-only approach.

2. Task-Specific Data Preparation:

| Training Strategy | Stress | Effort | Average |
|---|---|---|---|
| All conditions (c1, c2, c3) | 68.7% | 65.2% | 67.0% |
| Task-specific masking (ours) | 68.9% | 72.5% | 70.7% |
| Improvement | +0.2% | +7.3%* | +3.7% |

*p < 0.01 (McNemar's test)

Key Finding: Masking ambiguous effort labels (c2 interruption condition) improves effort detection by 7.3% while maintaining stress accuracy.
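
McNemar's test compares two classifiers on the same test samples using only the cases where they disagree. The sketch below shows the exact (binomial) two-sided variant. Whether the reported p-value came from the exact or the chi-square form is not stated here, so treat this as an illustration; the function names and example counts are hypothetical.

```python
from math import comb

def discordant_counts(y_true, pred_a, pred_b):
    """Count samples that exactly one of the two models classifies correctly."""
    only_a = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    only_b = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    return only_a, only_b

def mcnemar_exact(only_a, only_b):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree
    k = min(only_a, only_b)
    # Under H0, each discordant sample favors either model with probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical counts: 10 samples fixed by masking, 0 broken by it.
print(mcnemar_exact(0, 10))
```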

Key Contributions Validated

Our experimental results validate two independent contributions and the framework that combines them:

  1. Multimodal Fusion (+8.3%): Combining IBI and EDA signals outperforms IBI-only approach
  2. Task-Specific Data Preparation (+3.7%): Masking ambiguous samples improves effort detection
  3. Combined Framework (70.7%): Multi-task architecture with principled data curation

Methodology

Multi-Task Learning Framework

Core Design: Unified architecture with shared encoders and task-specific output heads enables joint optimization of both emotional and cognitive state prediction.

Loss Function:

L_total = L_stress + L_effort

where:
  L_stress = FocalLoss(y_stress, pred_stress)  # All samples
  L_effort = MaskedFocalLoss(y_effort, pred_effort, mask)  # Valid samples only

Task-Specific Data Preparation

| Condition | Stress Label | Effort Label | Training Strategy |
|---|---|---|---|
| c1 (Neutral) | 0 (Low) | 0 (Low) | Both tasks trained |
| c2 (Interruption) | 1 (High) | MASKED | Stress only |
| c3 (Time Pressure) | 1 (High) | 1 (High) | Both tasks trained |

Rationale: Interruption-based tasks (c2) produce ambiguous effort labels due to individual differences in multitasking ability. NASA-TLX analysis shows 2.2× higher relative variability in effort ratings for c2 (coefficient of variation, CV=0.44) than for c3 (CV=0.20).

Implementation:

  • Stress head: Trained on all 11,332 samples
  • Effort head: Trained on 7,998 valid samples (c1 + c3 only)
  • Masking applied during loss computation, not data filtering
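
The label scheme above can be sketched as a small mapping from condition to targets. This is an illustrative sketch: the names `CONDITION_LABELS` and `build_targets` are hypothetical, while the conventions it follows (masked effort target `[0, 0]`, mask value 1 for valid and 0 for masked) come from the dataset layout described in this README.

```python
# condition -> (stress label, effort label); None marks a masked effort label
CONDITION_LABELS = {
    "c1": (0, 0),     # neutral: low stress, low effort
    "c2": (1, None),  # interruption: high stress, effort masked
    "c3": (1, 1),     # time pressure: high stress, high effort
}

def one_hot(label, n_classes=2):
    return [1 if i == label else 0 for i in range(n_classes)]

def build_targets(conditions):
    """Return (y_stress, y_effort, effort_mask) for a list of condition codes."""
    y_stress, y_effort, effort_mask = [], [], []
    for cond in conditions:
        stress, effort = CONDITION_LABELS[cond]
        y_stress.append(one_hot(stress))
        if effort is None:
            y_effort.append([0, 0])  # placeholder target, ignored by the masked loss
            effort_mask.append(0)
        else:
            y_effort.append(one_hot(effort))
            effort_mask.append(1)
    return y_stress, y_effort, effort_mask

y_stress, y_effort, effort_mask = build_targets(["c1", "c2", "c3"])
print(y_stress, y_effort, effort_mask)
```

Keeping c2 windows in the dataset and masking them at loss time lets the stress head still learn from them, which filtering them out would prevent.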

Signal Processing

IBI Extraction:

  1. R-peak detection from ECG (2048 Hz)
  2. Artifact correction (physiological constraints: 300-2000ms)
  3. Cubic spline interpolation to 2 Hz
  4. 120-sample windows (60s at 2 Hz) with 75% overlap
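
The range check in step 2 and the windowing in step 4 can be sketched as follows. This is a simplified illustration with hypothetical function names: it only flags implausible intervals (how the actual pipeline corrects them is not specified here) and it drops any incomplete window at the end of a recording, which is an assumption.

```python
def is_plausible_ibi(ibi_ms, low=300.0, high=2000.0):
    """True if an inter-beat interval lies within the physiological range."""
    return low <= ibi_ms <= high

def sliding_windows(signal, window=120, overlap=0.75):
    """Split a resampled signal into fixed-length overlapping windows."""
    step = max(1, round(window * (1.0 - overlap)))  # 30 samples = 15 s at 2 Hz
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, step)]

# Two minutes of a 2 Hz signal (240 samples) yields 5 windows.
windows = sliding_windows(list(range(240)))
print(len(windows), [w[0] for w in windows])
```

With 75% overlap, consecutive windows share 90 of their 120 samples, so each new window contributes 15 s of unseen signal.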

HRV Features (14):

  • Time-domain (7): Mean IBI, SDNN, RMSSD, pNN50, CV, Mean HR, Std HR
  • Frequency-domain (4): LF, HF, LF/HF, Total power
  • Nonlinear (3): SD1, SD2, SD1/SD2
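
As an illustration of the time-domain and nonlinear features listed above, the sketch below computes them from a list of inter-beat intervals in milliseconds using standard textbook definitions. The exact formulas in the data preparation script may differ (for example, population vs. sample variance); the function name and dictionary keys are hypothetical, and the frequency-domain features are omitted because they require a spectral estimate.

```python
import math
from statistics import mean, stdev, variance

def hrv_features(ibi_ms):
    """Time-domain and Poincare features from IBIs in ms (needs >= 3 intervals)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    var_ibi = variance(ibi_ms)   # SDNN squared
    var_diff = variance(diffs)   # SDSD squared
    heart_rates = [60000.0 / x for x in ibi_ms]  # instantaneous HR in bpm
    mean_ibi = mean(ibi_ms)
    sdnn = math.sqrt(var_ibi)
    sd1 = math.sqrt(0.5 * var_diff)
    sd2 = math.sqrt(max(0.0, 2.0 * var_ibi - 0.5 * var_diff))
    return {
        "mean_ibi": mean_ibi,
        "sdnn": sdnn,
        "rmssd": math.sqrt(mean([d * d for d in diffs])),
        "pnn50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
        "cv": sdnn / mean_ibi,
        "mean_hr": mean(heart_rates),
        "std_hr": stdev(heart_rates),
        "sd1": sd1,
        "sd2": sd2,
        "sd1_sd2": sd1 / sd2 if sd2 > 0 else float("nan"),
    }

features = hrv_features([800, 810, 790, 860, 800])
print({name: round(value, 2) for name, value in features.items()})
```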

EDA Processing:

  1. Detrending and lowpass filter (1 Hz cutoff)
  2. Downsample to 2 Hz
  3. 120-sample windows with 75% overlap

EDA Features (12):

  • Raw statistics (4): Mean, SD, Min, Max
  • Tonic component (4): Mean, SD, slope, range
  • Phasic component (4): Mean amplitude, SD, SCR count, Mean SCR height

Project Structure

multitask-stress-effort/
├── prepare_multimodal_data_v5.py    # Raw signals → ML features
├── trainer_v8_masked_effortMultimodal.py  # Multi-task training
├── Questionnaire.xlsx               # Labels from SWELL-KW
├── Processed_IBI_HR_EDA/           # MATLAB-processed signals
│   └── ppXX_date_cY_IBI_HR_EDA.mat
├── prepared_data_v5_multimodal/    # ML-ready dataset
│   ├── X_ibi.npy                   # (N, 120, 1)
│   ├── X_hrv.npy                   # (N, 14)
│   ├── X_eda.npy                   # (N, 120, 1)
│   ├── X_eda_features.npy          # (N, 12)
│   ├── y_stress.npy                # (N, 2) one-hot
│   ├── y_effort.npy                # (N, 2) one-hot, masked=[0,0]
│   ├── effort_mask.npy             # (N,) 1=valid, 0=masked
│   └── ...
└── runs/                            # Training results
    └── synheart_v8_multimodal_*/

Training Configuration

| Parameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning Rate | 2×10⁻⁴ |
| Weight Decay | 1×10⁻³ |
| Batch Size | 64 |
| Max Epochs | 200 |
| Early Stopping | 25 epochs patience |
| Loss Function | Focal Loss (γ=1.5) + Label Smoothing (ε=0.05) |
| Dropout | 0.35 (encoders), 0.50 (fusion) |
| Augmentation | Time Masking (p=0.5, width=15) |
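
Two of the settings above, time masking and label smoothing, can be sketched in a few lines. This is an illustrative sketch under assumptions: the function names are hypothetical, the masked span is filled with zeros, and a single span is masked per window. The trainer may mask differently (for example, with the window mean).

```python
import random

def time_mask(window, p=0.5, width=15, rng=random, fill=0.0):
    """With probability p, overwrite one random contiguous span of `width` samples."""
    out = list(window)
    if len(out) >= width and rng.random() < p:
        start = rng.randrange(0, len(out) - width + 1)
        out[start:start + width] = [fill] * width
    return out

def smooth_labels(one_hot, eps=0.05):
    """Move eps of the probability mass from the true class to a uniform prior."""
    k = len(one_hot)
    return [v * (1.0 - eps) + eps / k for v in one_hot]

augmented = time_mask([1.0] * 120, rng=random.Random(0))
print(sum(1 for v in augmented if v == 0.0), "samples masked")
print(smooth_labels([1, 0]))  # approximately [0.975, 0.025]
```

Smoothing should only be applied to valid targets; a masked effort target of `[0, 0]` would otherwise become a non-zero vector.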

Citation

@article{author2025multitask,
  title={Simultaneous Detection of Emotional and Cognitive States: 
         A Multi-Task Deep Learning Approach},
  author={[Your Name]},
  journal={IEEE Transactions on Affective Computing},
  year={2025},
  note={Under review}
}

Requirements

tensorflow>=2.10.0
numpy>=1.21.0
scipy>=1.7.0
scikit-learn>=1.0.0
matplotlib>=3.5.0
pandas>=1.3.0
openpyxl>=3.0.0  # For Excel label files

Full list: requirements.txt


Key References

Multi-Task Learning:

  • Caruana (1997) - "Multitask Learning"
  • Ruder (2017) - "Multi-Task Learning in Deep Neural Networks"

Dataset & Cognitive Load:

  • Koldijk et al. (2014) - "SWELL Knowledge Work Dataset"
  • Monsell (2003) - "Task Switching"
  • Hockey (1997) - "Compensatory Control Theory"

License

MIT License - see LICENSE for details.


Contact

[Yisak T] - [yisak@synheart.ai] PhD


Multi-task framework for joint emotional and cognitive state detection from physiological signals
70.7% LOSO accuracy | Multimodal IBI+EDA fusion | Task-specific training