Discourse-Aware Psycholinguistic Bangla Fake News Detector
This model combines BERT embeddings with psycholinguistic and discourse features for interpretable Bangla fake news detection.
Model Details
- Architecture: BERT + 17 psycholinguistic features + 5 discourse features
- Base Model: sagorsarker/bangla-bert-base
- Language: Bengali (Bangla)
- Task: 4-class fake news detection
- Training Data: 42,380 samples from BanFakeNews-2.0 dataset
Performance
- Test F1-Score: 84.37%
- Test Accuracy: 85.73%
- Validation F1: 84.65%
- Classes: 4 categories (0-3)
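The card does not say how the F1 score is averaged over the four classes. As a reference point, the sketch below computes accuracy and macro-averaged F1 in plain Python. Macro averaging is an assumption made here for illustration, and the label lists are made-up toy data, not model outputs.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2, 3)):
    """Unweighted mean of per-class F1 (assumed averaging; the card does not specify)."""
    f1_scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
print(accuracy(y_true, y_pred))  # 0.75
print(macro_f1(y_true, y_pred))
```

If the reported score is weighted rather than macro-averaged, each per-class F1 would instead be weighted by that class's share of the test set.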
Key Innovation
To the authors' knowledge, this is the first systematic integration of psycholinguistic theory with deep learning for Bangla fake news detection. The handcrafted features make predictions explainable while the model maintains state-of-the-art performance.
Features Extracted
Psycholinguistic Features (17):
- Emotional intensity markers (fear, anger, positive/negative sentiment)
- Uncertainty and hedging patterns
- Cognitive load indicators (repetition, disfluency)
- Deception-specific linguistic patterns (self-reference, present tense usage)
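To make the feature categories above concrete, here is a minimal sketch of how a few such signals could be computed as per-token ratios. The word lists, the function names, and the choice of ratios are all hypothetical placeholders. The card does not publish the model's actual 17 feature definitions or lexicons.

```python
import re
import unicodedata
from collections import Counter


def nfc(s):
    """Normalize to NFC so visually identical Bangla strings compare equal."""
    return unicodedata.normalize("NFC", s)


# Toy lexicons for illustration only; NOT the model's real word lists.
HEDGE_WORDS = {nfc(w) for w in ("সম্ভবত", "নাকি")}           # "probably", hearsay particle
SELF_REFERENCE = {nfc(w) for w in ("আমি", "আমার", "আমরা")}  # "I", "my", "we"


def tokenize(text):
    """Extract runs of characters from the Bengali Unicode block (U+0980-U+09FF)."""
    return re.findall(r"[\u0980-\u09FF]+", nfc(text))


def psycholinguistic_features(text):
    tokens = tokenize(text)
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    # Number of tokens that repeat an earlier token.
    repeated = sum(c - 1 for c in Counter(tokens).values())
    return {
        "hedge_ratio": sum(t in HEDGE_WORDS for t in tokens) / n,
        "self_reference_ratio": sum(t in SELF_REFERENCE for t in tokens) / n,
        "repetition_ratio": repeated / n,
        "exclamation_ratio": text.count("!") / n,
    }


sample = "আমি শুনেছি সম্ভবত আজ স্কুল বন্ধ! আমি নিশ্চিত নই।"
print(psycholinguistic_features(sample))
```

Ratios rather than raw counts keep the values comparable across articles of different lengths, which matters when they are later combined with fixed-scale embeddings.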
Discourse Features (5):
- Text coherence scores across paragraphs
- Argumentative structure analysis (claims vs evidence)
- Topic progression and transition patterns
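As a rough illustration of a paragraph-level coherence score, the sketch below uses Jaccard word overlap between adjacent paragraphs. This is a simple stand-in chosen for clarity, not the measure the model actually uses, which the card does not describe. The function names and the blank-line paragraph delimiter are assumptions.

```python
import re


def token_set(paragraph):
    """Set of Bangla word tokens (Bengali Unicode block) in a paragraph."""
    return set(re.findall(r"[\u0980-\u09FF]+", paragraph))


def adjacent_coherence(text):
    """Jaccard overlap for each pair of consecutive paragraphs."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    scores = []
    for prev, curr in zip(paragraphs, paragraphs[1:]):
        a, b = token_set(prev), token_set(curr)
        union = a | b
        scores.append(len(a & b) / len(union) if union else 0.0)
    return scores


def mean_coherence(text):
    """Single scalar feature: average overlap, or 0.0 for fewer than two paragraphs."""
    scores = adjacent_coherence(text)
    return sum(scores) / len(scores) if scores else 0.0


article = "সরকার নতুন বাজেট ঘোষণা করেছে।\n\nনতুন বাজেট আজ পাস হবে।"
print(adjacent_coherence(article))  # [0.25]
```

A production feature would more likely compare sentence or paragraph embeddings, since lexical overlap misses paraphrase and synonymy.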
Model Architecture
The model integrates:
- BERT contextual embeddings (768 dimensions)
- Psycholinguistic features (17 dimensions)
- Discourse features (5 dimensions)
- Feature fusion layer
- Final classification head
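The components listed above could be wired together as in the following PyTorch sketch. The card gives only the input dimensions (768, 17, 5) and the number of classes (4). The concatenation-based fusion, the hidden size of 256, the dropout rate, and the class name `FusionClassifier` are assumptions made for illustration, not the published implementation.

```python
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    """Hypothetical fusion head: concatenate all features, project, classify."""

    def __init__(self, bert_dim=768, psy_dim=17, disc_dim=5,
                 hidden_dim=256, num_classes=4, dropout=0.1):
        super().__init__()
        # Fusion layer over the concatenated 768 + 17 + 5 = 790-dim vector.
        self.fusion = nn.Sequential(
            nn.Linear(bert_dim + psy_dim + disc_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
        # Final classification head producing one logit per class.
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, bert_emb, psy_feats, disc_feats):
        fused = torch.cat([bert_emb, psy_feats, disc_feats], dim=-1)
        return self.classifier(self.fusion(fused))


model = FusionClassifier()
model.eval()
with torch.no_grad():
    # Random stand-ins for a batch of 2 documents.
    logits = model(torch.randn(2, 768), torch.randn(2, 17), torch.randn(2, 5))
print(logits.shape)  # torch.Size([2, 4])
```

In a full pipeline, `bert_emb` would typically be the pooled or `[CLS]` output of the base encoder, and the handcrafted features would usually be standardized first so that their scale does not dominate or vanish next to the embedding.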
Usage
```python
# Note: this model requires custom feature extraction.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NUHASHROXME/bangla-fake-news-interpretable")
# Custom model-loading code is required for the interpretable features.
```