
# Model Recommendations for Medical Text Summarization

## Executive Summary

Recommended Model: `microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf`

This is the PRIMARY model configured in `models_config.json` with `"is_active": true`.


## ⚠️ Models NOT Recommended for Medical Text

### 1. patrickvonplaten/longformer2roberta-cnn_dailymail-fp16

Status: ❌ DEPRECATED - DO NOT USE

Problem: This model produces irrelevant summaries for medical text because:

  1. Training Mismatch: Trained on news articles (CNN/DailyMail dataset), NOT medical text
  2. Domain Gap: Cannot understand:
    • Clinical terminology and medical abbreviations
    • Structured visit data and medical codes
    • ICD codes, medications, dosages
    • Clinical narrative style
  3. Not Instruction-Tuned: Cannot follow medical summarization instructions properly

What Happens: The model tries to summarize medical data as if it were a news article, resulting in nonsensical output that misses critical clinical information.

Solution: Use Phi-3-mini-4k-instruct-q4.gguf instead.


### 2. facebook/bart-large-cnn

Status: ⚠️ NOT RECOMMENDED FOR MEDICAL TEXT

Problem: Shares the same weaknesses as Longformer:

  • Trained on news articles (CNN/DailyMail)
  • Limited medical domain knowledge
  • May produce suboptimal results for clinical text

Better Alternative: Use Phi-3-mini-4k-instruct-q4.gguf


## ✅ Recommended Models

### 1. microsoft/Phi-3-mini-4k-instruct-q4.gguf (PRIMARY - ACTIVE)

Why This Model?

  • ✅ Instruction-tuned: Understands and follows complex medical summarization prompts
  • ✅ General domain knowledge: Trained on diverse data including medical/technical content
  • ✅ Efficient: GGUF quantization (Q4) provides excellent performance with lower resource usage
  • ✅ Reliable: Produces coherent, relevant medical summaries
  • ✅ Fast: CPU-optimized, works well in production

Configuration:

```json
{
  "name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
  "type": "gguf",
  "is_active": true,
  "cached": true,
  "description": "Phi-3 Mini GGUF Q4 quantized - PRIMARY MODEL",
  "use_case": "Fast patient summary generation with CPU/GPU"
}
```
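The `name` field combines the Hugging Face repository and the GGUF filename. A minimal sketch of how such a value can be split apart (a hypothetical helper, not part of the project's actual code; `split_model_name` is an assumed name):

```python
# Sketch: derive the Hugging Face repo_id and GGUF filename from the
# combined "name" field used in models_config.json.
def split_model_name(name: str) -> tuple[str, str]:
    """Split 'owner/repo/file.gguf' into (repo_id, filename)."""
    repo_id, _, filename = name.rpartition("/")
    return repo_id, filename

repo_id, filename = split_model_name(
    "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf"
)
print(repo_id)   # microsoft/Phi-3-mini-4k-instruct-gguf
print(filename)  # Phi-3-mini-4k-instruct-q4.gguf
# These two values match the explicit repo_id/filename fields in the config,
# and are what a GGUF loader such as llama-cpp-python's
# Llama.from_pretrained(repo_id=..., filename=...) expects.
```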

### 2. google/flan-t5-large (ALTERNATIVE)

Status: ✅ Good Alternative

Advantages:

  • Instruction-tuned (FLAN methodology)
  • Can follow summarization instructions
  • Smaller than Phi-3, faster inference
  • Better than BART/Longformer for structured text

Use When:

  • Need faster inference than Phi-3
  • Memory constraints
  • Simple summarization tasks
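Because both recommended models are instruction-tuned, the summarization task is expressed as an instruction prompt rather than raw text. A minimal sketch of what such a prompt might look like (the wording and the `build_summary_prompt` helper are assumptions; the service's actual prompt template may differ):

```python
# Sketch: building an instruction prompt for an instruction-tuned model.
def build_summary_prompt(visits_text: str) -> str:
    instruction = (
        "Generate a comprehensive clinical summary of the patient visits "
        "below, focusing on changes over time, active diagnoses, and "
        "current medications."
    )
    return f"{instruction}\n\n{visits_text}\n\nSummary:"

prompt = build_summary_prompt(
    "Visit 2024-01-15: CHF exacerbation, started Lasix 40mg PO daily."
)
# Instruction-tuned models (Phi-3, FLAN-T5) can act on the instruction text;
# BART/Longformer news checkpoints effectively ignore it.
```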

## Technical Background: Why News Models Fail on Medical Text

### Training Data Mismatch

News Articles (CNN/DailyMail):

```
Title: New Study Shows Coffee Benefits
Body: A recent study published in the Journal of Medicine found that...
Summary: Research indicates coffee may have health benefits including...
```

Medical Records:

```
Visit 2024-01-15:
Chief Complaint: SOB, DOE
HPI: 65F w/ PMH of HTN, DM2, presents with 3d progressive DOE...
PE: RRR, no m/r/g. Lungs CTAB. +1 bilateral LE edema...
A/P: 1. CHF exacerbation - start Lasix 40mg PO daily...
```

### What News Models Do Wrong

  1. Terminology: Can't understand medical abbreviations (SOB, DOE, HTN, DM2, CTAB, etc.)
  2. Structure: Expect narrative news format, not clinical structured data
  3. Priority: News models prioritize "interesting" content; medical needs prioritize clinical significance
  4. Context: Medical context requires understanding relationships between symptoms, diagnoses, medications
  5. Instructions: Cannot follow complex instructions like "generate a comprehensive clinical summary focusing on changes over time"
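The terminology problem above can be partially mitigated by expanding abbreviations before summarization. The sketch below is illustrative only (a hypothetical preprocessing step, not part of this project; real clinical NLP would need a vetted, much larger abbreviation dictionary):

```python
import re

# Hypothetical preprocessing: expand common clinical abbreviations so a
# general-purpose model sees plain-language terms.
ABBREVIATIONS = {
    "SOB": "shortness of breath",
    "DOE": "dyspnea on exertion",
    "HTN": "hypertension",
    "DM2": "type 2 diabetes mellitus",
    "CTAB": "clear to auscultation bilaterally",
}

def expand_abbreviations(text: str) -> str:
    # Whole-word, case-sensitive matching to avoid mangling other tokens.
    pattern = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b")
    return pattern.sub(lambda m: ABBREVIATIONS[m.group(1)], text)

print(expand_abbreviations("65F w/ PMH of HTN, DM2, presents with SOB"))
# 65F w/ PMH of hypertension, type 2 diabetes mellitus, presents with shortness of breath
```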

## Migration Guide

### If You're Currently Using Longformer or BART

Step 1: Update your API request to use the recommended model:

```json
{
  "patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf",
  "patient_summarizer_model_type": "gguf",
  "generation_mode": "gguf"
}
```

Step 2: Alternatively, omit the model specification entirely to fall back to the default (Phi-3):

```jsonc
{
  // Just omit model specification - defaults to Phi-3
  "patientid": "12345",
  "token": "your-token",
  "key": "your-key"
}
```

Step 3: Test the output quality and adjust parameters if needed:

```jsonc
{
  "max_new_tokens": 2048,  // Adjust output length
  "temperature": 0.1,      // Lower = more focused, Higher = more creative
  "top_p": 0.5             // Lower = more deterministic
}
```
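Putting the migration steps together, a request body might be assembled like this. The field names are taken from this document; the `build_request` helper and the override mechanics are assumptions, not the service's actual client code:

```python
# Sketch: assemble a request body merging required fields with the default
# generation parameters from Step 3, allowing per-call overrides.
DEFAULT_GENERATION = {
    "max_new_tokens": 2048,
    "temperature": 0.1,
    "top_p": 0.5,
}

def build_request(patientid: str, token: str, key: str, **overrides) -> dict:
    body = {"patientid": patientid, "token": token, "key": key}
    body.update(DEFAULT_GENERATION)
    body.update(overrides)  # e.g. temperature=0.3 for more varied output
    return body

req = build_request("12345", "your-token", "your-key", temperature=0.3)
```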

## Configuration Reference

### Current Active Configuration (models_config.json)

```jsonc
{
  "patient_summary_models": [
    {
      "name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
      "type": "gguf",
      "is_active": true,  // ← PRIMARY MODEL
      "cached": true,
      "description": "Phi-3 Mini GGUF Q4 quantized - PRIMARY MODEL",
      "use_case": "Fast patient summary generation with CPU/GPU",
      "repo_id": "microsoft/Phi-3-mini-4k-instruct-gguf",
      "filename": "Phi-3-mini-4k-instruct-q4.gguf"
    }
  ]
}
```
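A minimal sketch of how the active model could be selected from this config (a plausible reading of the schema shown above, with the comment stripped so it parses as strict JSON; this is not the project's actual loader code):

```python
import json

# Simplified copy of the schema above, valid strict JSON (no comments).
CONFIG = json.loads("""
{
  "patient_summary_models": [
    {"name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
     "type": "gguf", "is_active": true}
  ]
}
""")

def active_model(config: dict) -> dict:
    models = config["patient_summary_models"]
    # First entry flagged is_active wins; fall back to the first model.
    return next((m for m in models if m.get("is_active")), models[0])

print(active_model(CONFIG)["name"])
# microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf
```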

## Performance Comparison

| Model | Medical Text Quality | Speed | Memory | Instruction Following |
|-------|----------------------|-------|--------|-----------------------|
| Phi-3 GGUF Q4 | ⭐⭐⭐⭐⭐ Excellent | Fast | Low | ✅ Yes |
| FLAN-T5 Large | ⭐⭐⭐⭐ Good | Very Fast | Low | ✅ Yes |
| Longformer | ⭐ Poor (Irrelevant) | Slow | High | ❌ No |
| BART-CNN | ⭐⭐ Poor | Medium | Medium | ❌ No |

## FAQs

Q: Can I still use Longformer/BART?
A: Technically yes (they're still cached), but they are strongly discouraged: they will produce irrelevant summaries.

Q: Why are these models still in the config?
A: For backward compatibility and documentation. They're marked as deprecated with `"is_active": false`.

Q: What if Phi-3 is too slow?
A: Try `google/flan-t5-large` as an alternative. It is still instruction-tuned but smaller and faster.

Q: Can you fix Longformer to work with medical text?
A: No. The model's training is fundamentally incompatible; it would require retraining on medical data.


## Summary

✅ DO USE: `Phi-3-mini-4k-instruct-q4.gguf` (default/recommended)
✅ ALTERNATIVE: `google/flan-t5-large`
⚠️ AVOID: `facebook/bart-large-cnn`
❌ DO NOT USE: `patrickvonplaten/longformer2roberta-cnn_dailymail-fp16`

The Longformer model's irrelevant summaries stem from a fundamental training mismatch with the medical domain; this is not a bug that can be fixed.