---
language:
  - en
license: mit
tags:
  - mental-health
  - bert
  - text-classification
  - nlp
  - transformers
  - explainable-ai
  - shap
datasets:
  - kaggle-sentiment-analysis-mental-health
metrics:
  - accuracy
  - f1
---

🧠 MindSense-BERT: AI-Powered Mental Health Detection


MindSense-BERT is a BERT-based model fine-tuned on real-world Reddit posts to classify text into 7 mental health categories. It combines transformer fine-tuning, class imbalance handling, and explainable AI (SHAP) to build a transparent and practical mental health analysis system.


📌 Model Overview

  • Model Type: Transformer (BERT)

  • Base Model: bert-base-uncased

  • Task: Multi-class Text Classification

  • Classes (7):

    • Normal
    • Depression
    • Anxiety
    • Bipolar
    • PTSD
    • Stress
    • Personality Disorder
  • Framework: PyTorch + Hugging Face Transformers

  • Deployment: Streamlit + Hugging Face Hub


🎯 Problem Statement

By a widely cited estimate, about 1 in 4 people worldwide experience a mental health condition at some point in their lives, but early detection is difficult due to stigma and lack of awareness.

This model aims to:

  • Analyze user-written text
  • Detect potential mental health conditions
  • Enable early awareness and intervention

📊 Dataset

  • Source: Kaggle, "Sentiment Analysis for Mental Health"
  • Size: 53,000+ Reddit posts
  • Input: statement (text)
  • Label: status (mental health category)

Categories:

  • Normal
  • Depression
  • Anxiety
  • Bipolar
  • PTSD
  • Stress
  • Personality Disorder
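
As a rough illustration of the `statement` / `status` layout described above, the sketch below reads a few rows and prints the class distribution. The sample rows are hypothetical and stand in for the real Kaggle file; only the two column names come from this README.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real dataset has 53,000+ posts.
SAMPLE_CSV = """statement,status
I feel fine today,Normal
I can't stop worrying about everything,Anxiety
,Depression
I feel hopeless and empty,Depression
Work is going well,Normal
"""

def load_rows(handle):
    """Read (statement, status) pairs, skipping rows with a missing field."""
    rows = []
    for record in csv.DictReader(handle):
        statement = (record.get("statement") or "").strip()
        status = (record.get("status") or "").strip()
        if statement and status:
            rows.append((statement, status))
    return rows

rows = load_rows(io.StringIO(SAMPLE_CSV))
label_counts = Counter(status for _, status in rows)

# Print each class with its share of the usable rows.
for label, count in label_counts.most_common():
    print(f"{label}: {count} ({count / len(rows):.0%})")
```

Checking the distribution this way is one simple way to spot under-represented classes before training.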

⚙️ Preprocessing

  • Removed URLs, special characters, and duplicate posts
  • Handled missing values
  • Lowercased text (uncased model)
  • Tokenized with the BERT tokenizer
  • Applied padding and truncation
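
The cleaning steps listed above could look roughly like the sketch below. The regular expressions and the set of characters kept are assumptions, not the exact rules used for this model. Tokenization, padding, and truncation are handled by the tokenizer call shown in the Usage section.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Assumption: keep letters, digits, whitespace, and basic punctuation.
SPECIAL_CHARS = re.compile(r"[^a-z0-9\s.,!?']")
WHITESPACE = re.compile(r"\s+")

def clean_text(text):
    """Lowercase, strip URLs and special characters, collapse whitespace."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)
    text = SPECIAL_CHARS.sub(" ", text)
    return WHITESPACE.sub(" ", text).strip()

def preprocess(statements):
    """Clean each statement, dropping missing values and duplicates."""
    seen = set()
    cleaned = []
    for statement in statements:
        if statement is None:  # missing value
            continue
        text = clean_text(statement)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned

raw = ["Check https://example.com NOW!!! #sad", None, "check now!!! sad", "  "]
print(preprocess(raw))  # ['check now!!! sad']
```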

🔬 Training Methodology

Phase 1: Baseline Models

  • TF-IDF features (10,000 max features, unigrams + bigrams)
  • Logistic Regression, Random Forest, SVM
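
A minimal sketch of these baselines with scikit-learn is shown below. The TF-IDF settings follow the list above. The classifier hyperparameters, the choice of `LinearSVC` for the SVM, and the toy training data are assumptions for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def make_baselines():
    """Build one TF-IDF pipeline per baseline classifier."""
    def tfidf():
        # 10k features, unigrams and bigrams, as listed above
        return TfidfVectorizer(max_features=10_000, ngram_range=(1, 2))

    return {
        "Logistic Regression": make_pipeline(tfidf(), LogisticRegression(max_iter=1000)),
        "Random Forest": make_pipeline(tfidf(), RandomForestClassifier(random_state=0)),
        "SVM": make_pipeline(tfidf(), LinearSVC()),  # assumption: linear kernel
    }

# Hypothetical toy data, far smaller than the real dataset
texts = [
    "i am worried and nervous all the time",
    "my heart races and i feel nervous",
    "i keep worrying about everything",
    "i had a nice day at the park",
    "work went well and i feel good",
    "dinner with friends was nice",
]
labels = ["Anxiety", "Anxiety", "Anxiety", "Normal", "Normal", "Normal"]

baselines = make_baselines()
for name, pipeline in baselines.items():
    pipeline.fit(texts, labels)
    print(name, pipeline.predict(["i am worried and nervous"]))
```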

Phase 2: BERT Fine-tuning

  • Pretrained bert-base-uncased
  • 3 epochs
  • Learning rate: 1e-5
  • Batch size: 16
  • Trained on a Google Colab T4 GPU
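
The hyperparameters above can be captured in a small config object, as in the sketch below. The `FineTuneConfig` class, the `output_dir` name, and the 80% training split are hypothetical. Only the base model, epochs, learning rate, and batch size come from this README. The `TrainingArguments` mapping assumes training through the Hugging Face `Trainer`, which the README does not confirm.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class FineTuneConfig:
    base_model: str = "bert-base-uncased"
    num_labels: int = 7
    epochs: int = 3
    learning_rate: float = 1e-5
    batch_size: int = 16

    def total_steps(self, num_train_examples):
        """Optimizer steps over all epochs, assuming no gradient accumulation."""
        return math.ceil(num_train_examples / self.batch_size) * self.epochs

def to_training_arguments(config, output_dir="mindsense-bert-checkpoints"):
    """Translate the config into Hugging Face TrainingArguments."""
    # Imported lazily so the step arithmetic works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=config.epochs,
        learning_rate=config.learning_rate,
        per_device_train_batch_size=config.batch_size,
    )

config = FineTuneConfig()
# Hypothetical split: 80% of 53,000 posts used for training.
print(config.total_steps(42_400))  # 7950
```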

Phase 3: Class Imbalance Handling

  • Weak classes identified: Stress, Personality Disorder

  • Data augmentation techniques used:

    • Word swap
    • Random deletion
    • Key phrase duplication
  • Applied custom class weights
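
The sketch below illustrates two of the augmentation techniques and one common way to derive class weights. It is not the exact code used for this model. The deletion probability and the inverse-frequency weighting formula are assumptions, and key phrase duplication is left out because this README does not define it.

```python
import random
from collections import Counter

def word_swap(text, rng):
    """Swap two randomly chosen word positions."""
    words = text.split()
    if len(words) < 2:
        return text
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)

def random_deletion(text, rng, p=0.1):
    """Drop each word with probability p, always keeping at least one word."""
    words = text.split()
    if not words:
        return text
    kept = [w for w in words if rng.random() >= p]
    return " ".join(kept) if kept else rng.choice(words)

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

rng = random.Random(0)  # seeded for repeatable augmentation
sample = "deadlines keep piling up and i cannot relax"
print(word_swap(sample, rng))
print(random_deletion(sample, rng, p=0.2))

# Hypothetical label distribution: the rarer class gets the larger weight.
labels = ["Normal"] * 6 + ["Stress"] * 2
print(balanced_class_weights(labels))
```

Weights like these are typically passed to the loss function, for example through the `weight` argument of `torch.nn.CrossEntropyLoss`, so that errors on rare classes cost more.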


📈 Model Performance

Accuracy Comparison

| Model | Accuracy |
|---|---|
| Logistic Regression | ~78% |
| Random Forest | ~74% |
| SVM | ~82% |
| BERT (initial) | ~83% |
| BERT (final) | ~87–89% |

Per-Class F1 Score

| Category | F1 Score |
|---|---|
| Normal | 0.91 |
| Depression | 0.89 |
| Anxiety | 0.86 |
| Bipolar | 0.84 |
| PTSD | 0.87 |
| Stress | 0.82 |
| Personality Disorder | 0.80 |
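
For reference, the per-class scores above can be summarized as a macro-averaged F1, which is the unweighted mean over classes. The macro figure below is derived from the table and is not a separately reported result. A support-weighted average would need per-class sample counts, which are not given here.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Per-class F1 values copied from the table above
per_class_f1 = {
    "Normal": 0.91,
    "Depression": 0.89,
    "Anxiety": 0.86,
    "Bipolar": 0.84,
    "PTSD": 0.87,
    "Stress": 0.82,
    "Personality Disorder": 0.80,
}

macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
weakest = min(per_class_f1, key=per_class_f1.get)

print(f"Macro F1: {macro_f1:.3f}")    # Macro F1: 0.856
print(f"Weakest class: {weakest}")    # Weakest class: Personality Disorder
```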

🔍 Explainability (SHAP)

This model integrates SHAP (SHapley Additive Explanations) for transparency:

  • 🔴 Red words → push the prediction toward a class
  • 🔵 Blue words → push the prediction away from that class

This improves trust and interpretability, which is critical in healthcare AI.
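
To make the idea behind the red and blue words concrete, the sketch below computes exact Shapley values for a toy scoring function. It uses neither the `shap` library nor the model's real scores. The word weights and the interaction bonus are hypothetical. Positive values correspond to "red" words and negative values to "blue" words.

```python
from itertools import permutations

# Hypothetical word weights standing in for a real model's class score.
TOY_WEIGHTS = {"hopeless": 2.0, "empty": 1.0, "fine": -1.5}

def toy_score(words):
    """Toy 'Depression' score: word weights plus one interaction bonus."""
    score = sum(TOY_WEIGHTS.get(w, 0.0) for w in words)
    if "hopeless" in words and "empty" in words:
        score += 0.5
    return score

def shapley_values(words, score_fn):
    """Exact Shapley value per word position, averaged over all orderings."""
    n = len(words)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        present = []
        previous = score_fn([])
        for idx in order:
            present.append(idx)
            # Score the text with only the words revealed so far
            current = score_fn([words[i] for i in sorted(present)])
            totals[idx] += current - previous  # marginal contribution
            previous = current
    return [total / len(orderings) for total in totals]

words = "i feel hopeless and empty".split()
for word, value in zip(words, shapley_values(words, toy_score)):
    direction = "toward" if value > 0 else "away from" if value < 0 else "neutral to"
    print(f"{word:>9}: {value:+.2f} ({direction} the class)")
```

Enumerating every ordering is only feasible for very short inputs. SHAP relies on approximations to make this tractable for real models.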


🚀 Usage

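```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "maitry30/mindsense-bert"

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "I feel completely hopeless and empty."

# Tokenize; truncation caps the input at the model's maximum length
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Run a forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class
prediction = torch.argmax(outputs.logits, dim=1).item()

# Map the index to a label name; this may be a generic "LABEL_n"
# if the uploaded config does not define id2label
print(prediction, model.config.id2label[prediction])
```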
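
The predicted index alone carries no confidence information. The sketch below shows how raw logits could be turned into a label and a probability with a softmax. The label order and the example logits are hypothetical. The real index-to-label mapping lives in the model's config (`id2label`).

```python
import math

# Hypothetical label order, for illustration only.
LABELS = [
    "Normal", "Depression", "Anxiety", "Bipolar",
    "PTSD", "Stress", "Personality Disorder",
]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits, labels=LABELS):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a single input
logits = [0.2, 3.1, 1.0, -0.5, 0.3, 0.8, -1.2]
label, confidence = top_prediction(logits)
print(f"{label} ({confidence:.1%})")
```

A low top probability can be treated as a signal to defer to human review instead of reporting a class.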

🧪 Example Predictions

| Input Text | Predicted Class |
|---|---|
| "I feel hopeless and empty" | Depression |
| "I keep having nightmares" | PTSD |
| "My mood swings a lot" | Bipolar |
| "I feel fine today" | Normal |

🛠 Tech Stack

  • Python 3.10
  • PyTorch
  • Hugging Face Transformers
  • Scikit-learn
  • SHAP
  • Pandas, NumPy
  • Matplotlib, Seaborn
  • Streamlit (UI)
  • Hugging Face Hub (model hosting)

⚠️ Limitations

  • Limited to English text
  • May struggle with sarcasm or slang
  • Depends on dataset quality
  • Not suitable for clinical diagnosis

⚖️ Ethical Considerations

  • This model is for educational purposes only
  • Not a replacement for mental health professionals
  • Should not be used for medical decisions
  • Predictions require human interpretation

🔐 Safety Notice

  • ❌ Not a diagnostic tool
  • ❌ Not for emergency use
  • ✅ Can assist awareness and research

🚀 Future Work

  • Multilingual support (Hindi + regional languages)
  • Voice-based mental health detection
  • Multi-modal AI (text + physiological signals)
  • Explainable AI improvements
  • Production deployment (AWS/Azure)

👤 Author

Maitry


📄 License

MIT License


⭐ If you found this model useful, consider giving the project a star!