---
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - emotion-classification
  - healthcare
  - distilbert
  - patient-doctor-conversations
  - text-classification
  - clinical-AI
  - mental-health
model_index:
  - name: patient-emotion-classifier
    results:
      - task:
          type: text-classification
        metrics:
          - type: accuracy
            value: 0.713
          - type: f1
            value: 0.722
---

<div align="center">

# 🤖 Patient Emotion Classifier

**Advanced AI-Powered Emotion Recognition for Healthcare Dialogues**

*Part of the Blended AI+X Initiative — Bridging Artificial Intelligence and Healthcare*

---

[![Model](https://img.shields.io/badge/Model-DistilBERT-blue)](https://huggingface.co/distilbert/distilbert-base-uncased)
[![License](https://img.shields.io/badge/License-Apache--2.0-green)](LICENSE)
[![Performance](https://img.shields.io/badge/F1--Score-72.2%25-orange)]()

</div>

## 🔬 Overview

**Patient Emotion Classifier** is a DistilBERT-based NLP model that classifies the emotional tone of patient utterances in patient-doctor conversations into six categories.

The model is part of our **AI for Healthcare (AI+X)** work, which applies transformer architectures to support more compassionate care.

### Key Capabilities

- **Multiclass Emotion Recognition** — Identifies 6 distinct emotional states in clinical dialogues
- **Healthcare-Optimized** — Specifically trained on medical conversation data
- **Production-Ready** — Deployable via REST API for real-time inference
- **Lightweight & Efficient** — Built on DistilBERT for fast inference

## 🎯 Emotion Categories

Our model classifies emotional states into **6 clinically relevant categories**:

| Category | Description |
|----------|-------------|
| 😐 **Neutral** | Objective, non-emotional statements |
| 😰 **Anxiety/Fear** | Patient expresses worry, concern, or fear |
| 😠 **Anger/Frustration** | Patient shows frustration or displeasure |
| 😢 **Sadness/Helplessness** | Patient feels down or hopeless |
| 🤔 **Confusion/Doubt** | Patient expresses uncertainty or questions |
| 🙏 **Gratitude/Relief** | Patient conveys thanks or relief |
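
For downstream code, the six categories map onto integer class ids. The sketch below assumes the ids follow the table order (0 = Neutral through 5 = Gratitude/Relief); the real order is whatever `id2label` in the model's `config.json` says, so check it there. `ID2LABEL` and `decode_logits` are illustrative names, not part of this repository.

```python
import math

# Assumed label order (table order above); verify against id2label in config.json.
ID2LABEL = {
    0: "Neutral",
    1: "Anxiety/Fear",
    2: "Anger/Frustration",
    3: "Sadness/Helplessness",
    4: "Confusion/Doubt",
    5: "Gratitude/Relief",
}


def decode_logits(logits):
    """Turn six raw logits into (label, probability) using a softmax."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Example with made-up logits; index 1 is the largest.
label, prob = decode_logits([0.1, 2.3, -0.5, 0.4, 0.0, -1.2])
print(label, round(prob, 3))
```

The returned probability is a softmax score, not a calibrated confidence, so any threshold on it would need to be tuned on held-out data.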

## 📊 Model Performance

### Overall Metrics

| Metric | Value |
|--------|-------|
| **Accuracy** | **71.3%** |
| **Macro F1** | **0.722** |
| Weighted F1 | 0.72 |

### Per-Class Performance

| Emotion | Precision | Recall | F1-Score |
|---------|-----------|--------|----------|
| Neutral | 0.75 | 0.78 | 0.76 |
| Anxiety/Fear | 0.52 | 0.63 | 0.57 |
| Anger/Frustration | 0.80 | 0.73 | 0.76 |
| Sadness/Helplessness | 0.65 | 0.55 | 0.60 |
| Confusion/Doubt | 0.60 | 0.58 | 0.59 |
| Gratitude/Relief | 0.72 | 0.75 | 0.73 |
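
Each F1-Score in the table is the harmonic mean of that row's precision and recall. As a small check, the sketch below recomputes the column from the table's own values; `PER_CLASS` and `f1_score` are illustrative names.

```python
# Precision and recall per class, copied from the table above.
PER_CLASS = {
    "Neutral": (0.75, 0.78),
    "Anxiety/Fear": (0.52, 0.63),
    "Anger/Frustration": (0.80, 0.73),
    "Sadness/Helplessness": (0.65, 0.55),
    "Confusion/Doubt": (0.60, 0.58),
    "Gratitude/Relief": (0.72, 0.75),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for name, (p, r) in PER_CLASS.items():
    print(f"{name}: {f1_score(p, r):.2f}")
```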

### Label Distribution

![Label Distribution](data/label_distribution.png)

## 🚀 Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Launch the Service

```bash
cd see
python app.py
```

### 3. Access the Interface

http://localhost:8002
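
If the service is started from a script, it can help to wait until the address above responds before opening it. The following is a minimal sketch using only the Python standard library; `wait_for_service` is a hypothetical helper, not part of `app.py`, and it only checks that something answers HTTP on the port, not that the model has loaded.

```python
import time
import urllib.error
import urllib.request

SERVICE_URL = "http://localhost:8002"  # address from the Quick Start above


def wait_for_service(url=SERVICE_URL, attempts=10, delay=1.0):
    """Poll the URL until it returns any HTTP response; True on success."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status; it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing listening yet (or a timeout); wait and retry.
            time.sleep(delay)
    return False
```

Calling `wait_for_service()` after `python app.py` has been launched returns `True` once the port answers, or `False` after all attempts fail.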

## 📚 Dataset

This model was trained on a curated subset of medical dialogues:

- **Original Source**: [Chinese MedDialog Dataset](https://tianchi.aliyun.com/dataset/92110) — Alibaba Cloud Tianchi
- **Post-Processing**: Carefully filtered, translated, and annotated for emotion classification
- **Total Samples**: 28,280 annotated dialogues
- **Categories**: 6 emotion labels
- **Language**: English

## 📚 References

1. **MedDialog Dataset**  
   Chinese Medical Dialogue Dataset. Alibaba Cloud Tianchi.  
   https://tianchi.aliyun.com/dataset/92110

2. **DistilBERT**  
   Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). *DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter*. arXiv preprint arXiv:1910.01108.  
   https://arxiv.org/abs/1910.01108

## 🏗️ Technical Specifications

| Component | Details |
|-----------|---------|
| **Base Architecture** | [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) |
| **Task Type** | 6-class emotion classification |
| **Max Sequence Length** | 512 tokens |
| **Framework** | PyTorch + Transformers |

## 📁 Project Structure

    patient-emotion-analysis/
    ├── best_model/          # Fine-tuned model weights
    ├── see/                 # Inference service
    │   ├── app.py           # Web application
    │   ├── inference.py     # Core inference logic
    │   └── templates/       # UI templates
    ├── data/                # Training & evaluation data
    ├── requirements.txt     # Dependencies
    └── README.md            # This file

---

<div align="center">

**Blended AI+X Initiative** — *Advancing Healthcare Through Intelligence*

</div>