Discourse-Aware Psycholinguistic Bangla Fake News Detector

This model combines BERT embeddings with psycholinguistic and discourse features for interpretable Bangla fake news detection.

Model Details

Architecture: BERT + 17 psycholinguistic features + 5 discourse features
Base Model: sagorsarker/bangla-bert-base
Language: Bengali (Bangla)
Task: 4-class fake news detection
Training Data: 42,380 samples from the BanFakeNews-2.0 dataset

Performance

Test F1-Score: 84.37%
Test Accuracy: 85.73%
Validation F1: 84.65%
Classes: 4 categories (0-3)
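
The card does not state how the F1 score is averaged over the four classes. As a minimal sketch, assuming macro averaging (an assumption, not confirmed by the source), accuracy and F1 for labels 0-3 could be computed as follows. The function names and the toy labels are illustrative only.

```python
from collections import Counter

NUM_CLASSES = 4  # classes 0-3, as listed above


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes=NUM_CLASSES):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); a class with no support and no predictions scores 0
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy labels, not taken from the actual test set
y_true = [0, 1, 2, 3, 0, 1]
y_pred = [0, 1, 2, 3, 1, 1]
print(f"accuracy={accuracy(y_true, y_pred):.3f}, macro F1={macro_f1(y_true, y_pred):.3f}")
```

If the reported score uses weighted or micro averaging instead, only the final aggregation step changes.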

Key Innovation

To our knowledge, this is the first systematic integration of psycholinguistic theory with deep learning for Bangla fake news detection. The 17 psycholinguistic and 5 discourse features make predictions explainable, while the model still reaches 84.37% F1 and 85.73% accuracy on the test set.

Features Extracted

Psycholinguistic Features (17):

Emotional intensity markers (fear, anger, positive/negative sentiment)
Uncertainty and hedging patterns
Cognitive load indicators (repetition, disfluency)
Deception-specific linguistic patterns (self-reference, present tense usage)
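
The card does not publish the feature extractor, so the following is only a sketch of how three of these signals (hedging, self-reference, repetition) might be computed as simple ratios. The word lists, the tokenizer, and the feature names are hypothetical placeholders, not the 17 features the model was trained on.

```python
# Hypothetical mini-lexicons; a real extractor would need curated Bangla word lists.
HEDGE_WORDS = {"সম্ভবত", "নাকি"}  # "probably", "reportedly"
SELF_REFERENCE_WORDS = {"আমি", "আমার", "আমরা"}  # "I", "my", "we"
PUNCTUATION = "।,!?.;:\"'()"


def tokenize(text):
    """Whitespace tokenizer that trims surrounding punctuation, including the danda."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.split())
    return [tok for tok in tokens if tok]


def psycholinguistic_features(text):
    """Return a few illustrative ratios, each in the range [0, 1]."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {"hedging_ratio": 0.0, "self_reference_ratio": 0.0, "repetition_ratio": 0.0}
    return {
        "hedging_ratio": sum(t in HEDGE_WORDS for t in tokens) / total,
        "self_reference_ratio": sum(t in SELF_REFERENCE_WORDS for t in tokens) / total,
        # Share of tokens that repeat an earlier token type
        "repetition_ratio": 1.0 - len(set(tokens)) / total,
    }


features = psycholinguistic_features("আমি সম্ভবত যাব। আমি যাব।")
print(features)
```

Ratios rather than raw counts keep the values comparable across articles of different lengths, which matters when they are concatenated with fixed-scale embeddings.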

Discourse Features (5):

Text coherence scores across paragraphs
Argumentative structure analysis (claims vs evidence)
Topic progression and transition patterns
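
How the coherence score is computed is not specified in the card. One simple stand-in, shown here purely as an assumption, is the mean Jaccard word overlap between adjacent paragraphs. The function names and the blank-line paragraph convention are illustrative.

```python
PUNCTUATION = "।,!?.;:\"'()"


def paragraph_tokens(paragraph):
    """Set of punctuation-trimmed tokens in one paragraph."""
    return {tok.strip(PUNCTUATION) for tok in paragraph.split()} - {""}


def coherence_score(text):
    """Mean Jaccard overlap between adjacent paragraphs (separated by blank lines)."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return 1.0  # assumption: a single paragraph counts as trivially coherent
    overlaps = []
    for left, right in zip(paragraphs, paragraphs[1:]):
        a, b = paragraph_tokens(left), paragraph_tokens(right)
        union = a | b
        overlaps.append(len(a & b) / len(union) if union else 0.0)
    return sum(overlaps) / len(overlaps)


article = (
    "সরকার নতুন বাজেট ঘোষণা করেছে।\n\n"
    "নতুন বাজেট নিয়ে বিরোধী দল সমালোচনা করেছে।"
)
print(round(coherence_score(article), 3))
```

A lexical overlap measure like this is cheap and easy to inspect, but it misses paraphrase; an embedding-based similarity would be a natural alternative.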

Model Architecture

The model integrates:

BERT contextual embeddings (768 dimensions)
Psycholinguistic features (17 dimensions)
Discourse features (5 dimensions)
Feature fusion layer
Final classification head
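
The data flow through these components can be sketched without any framework. The example below concatenates the three inputs into a 790-dimensional vector (768 + 17 + 5), applies one fusion layer, and produces probabilities for the 4 classes. The hidden size of 64, the ReLU activation, and the random weights are assumptions for illustration; the card does not give the internals of the fusion layer.

```python
import math
import random

BERT_DIM, PSY_DIM, DISC_DIM, NUM_CLASSES = 768, 17, 5, 4
HIDDEN_DIM = 64  # assumed; not stated in the card


def init_layer(rng, out_dim, in_dim):
    """Random weights and zero biases for a dense layer (illustration only)."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(weights, bias, x):
    """Dense layer: one dot product plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(params, bert_vec, psy_vec, disc_vec):
    """Concatenate -> fusion layer (ReLU) -> classification head -> probabilities."""
    fused_input = list(bert_vec) + list(psy_vec) + list(disc_vec)  # 790 dimensions
    hidden = [max(0.0, h) for h in linear(*params["fusion"], fused_input)]
    return softmax(linear(*params["head"], hidden))


rng = random.Random(0)
params = {
    "fusion": init_layer(rng, HIDDEN_DIM, BERT_DIM + PSY_DIM + DISC_DIM),
    "head": init_layer(rng, NUM_CLASSES, HIDDEN_DIM),
}
# Dummy inputs standing in for a BERT sentence embedding and the handcrafted features
probs = forward(params, [0.1] * BERT_DIM, [0.0] * PSY_DIM, [0.0] * DISC_DIM)
print(len(probs), round(sum(probs), 6))
```

Because the 22 handcrafted features enter the network as separate, named inputs, their contribution to a prediction can be inspected independently of the BERT embedding.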

Usage

# Note: This model requires custom feature extraction
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NUHASHROXME/bangla-fake-news-interpretable")
# Custom model loading code required for interpretable features

Model size: 0.2B params (Safetensors)
Tensor type: F32
