	---
	language:
	- en

	license: apache-2.0

	tags:
	- semeval
	- semeval-2026
	- emotion
	- affect-prediction
	- temporal-nlp
	- transformers
	- roberta

	datasets:
	- semeval

	pipeline_tag: text-classification

	library_name: transformers

	metrics:
	- pearson-correlation
	---

	# AffectDynamics (Team AGI) — Longitudinal Affect Prediction Model

	AffectDynamics is a temporal affect modeling system developed for SemEval-2026 Task 2: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays.

	The model predicts emotional valence and arousal from texts that each user wrote over time. It combines transformer-based text encoding with temporal modeling and user-level conditioning to capture both stable emotional baselines and dynamic emotional changes.

	---

	# Model Details

	- Model name: AffectDynamics-SemEval2026Task2
	- Developer: Harsh Rathva
	- Institution: Sardar Vallabhbhai National Institute of Technology (SVNIT), Surat
	- Email: u24ai036@aid.svnit.ac.in

	## Architecture

	The system consists of four main components:

	### 1. Text Encoder
	- RoBERTa-Large transformer encoder
	- Produces contextual embeddings for each text input.

	Different pooling strategies are used depending on text type:

	- Essays → CLS / pooler representation
	- Feeling word lists → mean pooled token embeddings
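The two pooling strategies can be illustrated with a minimal, dependency-free sketch. The toy vectors and the function names `cls_pool` and `mean_pool` are illustrative assumptions, not taken from the released code. RoBERTa's pooler output additionally passes the first-token vector through a dense layer with tanh, which this sketch omits.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the embedding at the first position (the CLS-style token)."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of non-padding tokens only."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy 2-dimensional "embeddings" for four positions; the last one is padding.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

essay_vector = cls_pool(tokens)             # essay-style input
word_list_vector = mean_pool(tokens, mask)  # feeling-word-list input
```

Masking matters for mean pooling: without it, padding positions would pull the average toward arbitrary values.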

	### 2. Temporal Encoder
	- Unidirectional GRU
	- Models longitudinal emotional dynamics across user timelines
	- Ensures causal temporal modeling (no future information leakage)
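The causality property of a unidirectional GRU can be checked with a small scalar sketch. The weights below are fixed toy values chosen for illustration (hypothetical, not trained parameters), and the real model operates on high-dimensional text embeddings rather than scalars. The update follows the standard GRU equations with update gate `z`, reset gate `r`, and candidate state `n`.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_forward(inputs: List[float], h0: float = 0.0) -> List[float]:
    """Run a scalar GRU left to right and return the hidden state at each step."""
    w_z, u_z = 0.5, -0.3  # update gate weights (toy values)
    w_r, u_r = 0.8, 0.1   # reset gate weights (toy values)
    w_n, u_n = 1.2, 0.7   # candidate state weights (toy values)
    h = h0
    states = []
    for x in inputs:
        z = sigmoid(w_z * x + u_z * h)
        r = sigmoid(w_r * x + u_r * h)
        n = math.tanh(w_n * x + r * (u_n * h))
        h = (1.0 - z) * n + z * h
        states.append(h)
    return states


timeline = [0.5, -1.0, 2.0, 0.0]
altered = [0.5, -1.0, -3.0, 4.0]  # same first two entries, different future

states_a = gru_forward(timeline)
states_b = gru_forward(altered)

# Hidden states for the shared prefix are identical: they never see later inputs.
prefix_matches = states_a[:2] == states_b[:2]
future_differs = states_a[2] != states_b[2]
```

A bidirectional encoder would fail this check, because its backward pass lets later entries influence earlier states.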

	### 3. User Conditioning
	- Gated user embedding
	- Uses user statistics such as:
	  - number of samples
	  - timeline length
	  - emotional entropy

	This allows interpolation between user-specific and global representations.
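One way such a gate could work is sketched below, assuming a single scalar gate computed from the user statistics with a sigmoid. The gate form, the weights, the bias, and the ordering of the statistics are all assumptions made for illustration; the card does not specify them.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_user_vector(
    user_emb: List[float],
    global_emb: List[float],
    user_stats: List[float],
    gate_weights: List[float],
    gate_bias: float,
) -> List[float]:
    """Blend a user-specific and a global embedding with a scalar gate in (0, 1)."""
    logit = sum(w * s for w, s in zip(gate_weights, user_stats)) + gate_bias
    gate = sigmoid(logit)
    return [gate * u + (1.0 - gate) * g for u, g in zip(user_emb, global_emb)]


user_emb = [1.0, -1.0]
global_emb = [0.0, 0.0]
weights = [0.1, 0.05, -0.5]  # hypothetical gate weights
bias = -2.0                  # hypothetical gate bias

# Assumed stats order: number of samples, timeline length, emotional entropy.
sparse_user = gated_user_vector(user_emb, global_emb, [2.0, 1.0, 1.5], weights, bias)
dense_user = gated_user_vector(user_emb, global_emb, [60.0, 30.0, 0.5], weights, bias)
```

With these toy weights, a user with little history stays close to the global vector, while a user with a long history is represented mostly by the user-specific embedding.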

	### 4. Prediction Heads

	| Task | Description |
	|------|-------------|
	| Subtask 1 (S1) | Absolute valence and arousal prediction |
	| Subtask 2A (S2A) | Short-term emotional state change prediction |
	| Subtask 2B (S2B) | Long-term dispositional change prediction |

	---

	# Training Data

	The model was trained using the official SemEval-2026 Task 2 dataset.

	### Dataset statistics

	- Total texts: 5,285
	- Training texts: 2,764
	- Users: 182 total (137 training users)
	- Time span: 2021–2024

	Each entry contains:

	| Field | Description |
	|-------|-------------|
	| user_id | Anonymous user identifier |
	| text | Ecological essay or feeling word list |
	| timestamp | Time of writing |
	| collection_phase | Study phase |
	| valence | Emotional valence (-2 to 2) |
	| arousal | Emotional arousal (0 to 2) |
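Because the model works on per-user timelines, entries with these fields have to be grouped by `user_id` and ordered by `timestamp` before temporal modeling. The sketch below shows one way to do that on invented toy rows. The ISO date format and the definition of a change target as the difference between consecutive entries are assumptions; the official task defines the exact targets.

```python
from collections import defaultdict
from typing import Dict, List

# Invented toy rows for illustration only; not taken from the dataset.
records = [
    {"user_id": "u1", "timestamp": "2021-03-01", "valence": 1, "arousal": 1},
    {"user_id": "u2", "timestamp": "2021-03-02", "valence": -1, "arousal": 2},
    {"user_id": "u1", "timestamp": "2021-03-15", "valence": 0, "arousal": 0},
    {"user_id": "u1", "timestamp": "2021-03-08", "valence": -1, "arousal": 2},
]


def build_timelines(rows: List[dict]) -> Dict[str, List[dict]]:
    """Group rows by user and sort each group chronologically."""
    timelines: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        timelines[row["user_id"]].append(row)
    for user_rows in timelines.values():
        # ISO-8601 date strings sort chronologically as plain strings.
        user_rows.sort(key=lambda r: r["timestamp"])
    return dict(timelines)


def consecutive_deltas(timeline: List[dict], field: str) -> List[int]:
    """Difference between each entry and the previous one (assumed change target)."""
    return [b[field] - a[field] for a, b in zip(timeline, timeline[1:])]


timelines = build_timelines(records)
valence_deltas = consecutive_deltas(timelines["u1"], "valence")
arousal_deltas = consecutive_deltas(timelines["u1"], "arousal")
```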

	The texts were written by U.S. service-industry workers describing their emotional state.

	---

	# Training Details

	### Optimization

	- Optimizer: AdamW
	- Scheduler: OneCycleLR
	- Batch size: 4
	- Training epochs: 10

	### Learning Rates

	| Component | Learning Rate |
	|-----------|---------------|
	| RoBERTa encoder | 2e-6 |
	| GRU | 3e-4 |
	| Task heads | 2e-5 |
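Per-component learning rates like these are usually expressed as optimizer parameter groups. The sketch below buckets parameter names into one group per rate. The name prefixes (`encoder.`, `gru.`, `heads.`) and the example parameter names are hypothetical. In PyTorch, each group's `params` entry would hold the tensors themselves, the list would be passed to `torch.optim.AdamW`, and `OneCycleLR` accepts a list of `max_lr` values with one entry per group.

```python
from typing import Dict, List

# Learning rates from the table above; the name prefixes are hypothetical.
LR_BY_PREFIX = {
    "encoder.": 2e-6,  # RoBERTa encoder
    "gru.": 3e-4,      # temporal GRU
    "heads.": 2e-5,    # task heads
}


def build_param_groups(param_names: List[str]) -> List[Dict[str, object]]:
    """Bucket parameter names into one group per configured learning rate."""
    groups: Dict[str, List[str]] = {prefix: [] for prefix in LR_BY_PREFIX}
    for name in param_names:
        for prefix in LR_BY_PREFIX:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            # Fail loudly instead of silently training with a default rate.
            raise ValueError(f"no learning rate configured for {name!r}")
    return [
        {"params": names, "lr": LR_BY_PREFIX[prefix]}
        for prefix, names in groups.items()
    ]


names = [
    "encoder.layer.0.weight",
    "gru.weight_ih_l0",
    "heads.s1.weight",
    "heads.s2a.bias",
]
param_groups = build_param_groups(names)
```

Keeping the encoder rate two orders of magnitude below the GRU rate is a common way to fine-tune a pretrained encoder gently while newly initialized layers learn faster.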

	### Loss Functions

	| Task | Loss |
	|------|------|
	| Subtask 1 | Ordinal regression with label smoothing |
	| Subtask 2A | Smooth L1 loss |
	| Subtask 2B | Mean squared error |
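For reference, the two regression losses can be written out in plain Python. The sketch assumes the common Smooth L1 definition with a threshold `beta` of 1.0 (the PyTorch default), which the card does not confirm. The ordinal formulation for Subtask 1 is not specified in this card, so it is not sketched here.

```python
from typing import Sequence


def smooth_l1(preds: Sequence[float], targets: Sequence[float], beta: float = 1.0) -> float:
    """Mean Smooth L1: quadratic for |error| < beta, linear otherwise."""
    total = 0.0
    for p, t in zip(preds, targets):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total / len(preds)


def mse(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


preds = [0.5, 3.0]
targets = [0.0, 0.0]

s2a_loss = smooth_l1(preds, targets)  # small error is squared, large error is linear
s2b_loss = mse(preds, targets)        # large error dominates quadratically
```

On the same inputs, Smooth L1 penalizes the large error of 3.0 far less than MSE does, which makes it less sensitive to outlying change values.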

	---

	# Evaluation Results

	Official evaluation results from SemEval-2026 Task 2:

	| Task | Metric | Valence | Arousal |
	|------|--------|---------|---------|
	| Subtask 1 | Composite correlation | 0.600 | 0.452 |
	| Subtask 2A | Pearson correlation | -0.167 | -0.147 |
	| Subtask 2B | Pearson correlation | 0.086 | -0.081 |

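	The model performs well on absolute affect prediction (Subtask 1: 0.600 valence, 0.452 arousal) but poorly on change prediction: correlations on Subtask 2A are negative for both dimensions, and those on Subtask 2B are close to zero. This highlights a trade-off between temporal stability and sensitivity to emotional transitions.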
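To help interpret the sign of the Pearson values in the table, here is a minimal standard-library implementation applied to invented toy data. The official scorer may differ in details such as per-user aggregation, and the composite metric for Subtask 1 is not reproduced here.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented toy values for illustration only.
gold = [-2, -1, 0, 1, 2]
aligned_preds = [-1.0, -0.5, 0.0, 0.5, 1.0]    # right direction, wrong scale
reversed_preds = [1.0, 0.5, 0.0, -0.5, -1.0]   # systematically wrong direction

r_aligned = pearson(gold, aligned_preds)
r_reversed = pearson(gold, reversed_preds)
```

Pearson correlation ignores scale, so shrunken but correctly ordered predictions still score highly, while a negative value means predicted changes tend to point in the opposite direction from the observed ones.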

	---

	# Intended Use

	This model is intended for research purposes, including:

	- longitudinal affect modeling
	- emotion prediction from text
	- temporal NLP modeling
	- ecological momentary assessment analysis

	---

	# Limitations

	1. Stability bias
	   - Temporal modeling smooths predictions and reduces sensitivity to abrupt changes.

	2. Dataset domain
	   - Data originates from a specific population (U.S. service-industry workers).

	3. Limited users
	   - Only 137 users in training data.

	4. Change prediction difficulty
	   - Predicting emotional deltas is harder than predicting absolute states.

	---

	# Ethical Considerations

	Emotion prediction models must be used responsibly.

	Potential concerns include:

	- privacy risks from modeling personal emotional data
	- misuse for manipulation or surveillance
	- dataset demographic bias

	This model should not be used for clinical or psychological diagnosis.

	---

	# Reproducibility

	Code and training pipeline:

	https://github.com/ezylopx5/AffectDynamics-SemEval2026Task2

	Model weights:

	https://huggingface.co/Haxxsh/AffectDynamics-SemEval2026Task2

	---

	# Citation

	If you use this model, please cite: