Spaces: DTanzillo/medclear


DTanzillo committed 27 days ago

Commit 02e0bd3 (verified) · 1 parent: d8ffedc

Upload folder using huggingface_hub

Files changed (3)

README_model.md +70 -0
app.py +55 -1
requirements.txt +2 -0

README_model.md ADDED

	@@ -0,0 +1,70 @@

+---
+license: apache-2.0
+language:
+  - en
+pipeline_tag: summarization
+tags:
+  - medical
+  - simplification
+  - health-literacy
+  - flan-t5
+  - plain-language
+datasets:
+  - GEM/cochrane-simplification
+  - tttamayo/Med-EASi
+base_model: google/flan-t5-base
+---
+# MedClear V2: Medical Text Simplification
+**MedClear** translates doctor-speak into human-speak. It is a fine-tuned FLAN-T5-base model (248M parameters) that rewrites clinical notes, medical terms, and discharge summaries in plain language that patients can understand.
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+tokenizer = AutoTokenizer.from_pretrained("DTanzillo/medclear-v2-base")
+model = AutoModelForSeq2SeqLM.from_pretrained("DTanzillo/medclear-v2-base")
+text = "simplify: Patient underwent laparoscopic cholecystectomy for acute cholecystitis. EBL minimal. POD1: afebrile, tolerating PO diet."
+inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
+outputs = model.generate(**inputs, max_new_tokens=256, num_beams=4)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+## Training Data
+23,157 examples across multiple granularity levels:
+| Level | Examples | % |
+|-------|----------|---|
+| Terms | 4,989 | 21.5% |
+| Phrases | 6,660 | 28.8% |
+| Sentences | 8,000 | 34.5% |
+| Flashcards | 2,689 | 11.6% |
+| Paragraphs | 574 | 2.5% |
+| RAG-augmented | 245 | 1.1% |
+Key insight: about half of the training examples (50.3%) are at the term or phrase level, so the model learns vocabulary mappings first, then composes them into simplified text.
+## Results
+| Metric | Raw FLAN-T5 | MedClear |
+|--------|-------------|----------|
+| ROUGE-1 F1 | 0.13 | **0.36** |
+| ROUGE-2 F1 | 0.05 | **0.13** |
+| ROUGE-L F1 | 0.10 | **0.22** |
+| Eval Loss | -- | **1.712** |
+## Limitations
+- Can hallucinate on complex multi-fact clinical notes
+- Best used with a RAG pipeline (MedlinePlus) to verify outputs
+- Not a substitute for professional medical advice
+## Demo
+Try the live demo: [MedClear on HuggingFace Spaces](https://huggingface.co/spaces/DTanzillo/medclear)
+**Duke University Hackathon 2026**
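
The training-mix table in README_model.md can be re-derived from its raw counts. The sketch below recomputes each level's share and the combined term/phrase share behind the "50%" remark. The counts are copied from the table; the names `COUNTS` and `shares` are illustrative and not part of the commit.

```python
# Counts copied from the "Training Data" table in README_model.md.
COUNTS = {
    "Terms": 4989,
    "Phrases": 6660,
    "Sentences": 8000,
    "Flashcards": 2689,
    "Paragraphs": 574,
    "RAG-augmented": 245,
}


def shares(counts):
    """Return each level's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


total = sum(COUNTS.values())
pct = shares(COUNTS)
# Combined share of the two finest-grained levels.
term_phrase = 100 * (COUNTS["Terms"] + COUNTS["Phrases"]) / total

print(total)                  # 23157
print(pct["Terms"])           # 21.5
print(round(term_phrase, 1))  # 50.3
```

The per-level figures match the table, and the term/phrase share comes out slightly above one half.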

app.py CHANGED

@@ -269,5 +269,59 @@ demo = gr.Interface(
     theme=gr.themes.Soft(),
 )
 if __name__ == "__main__":
-    demo.launch()

+# Mount a Flask API so the React frontend can call /api/simplify
+import json
+from flask import Flask, request as flask_request, jsonify
+from flask_cors import CORS
+flask_app = Flask(__name__)
+CORS(flask_app)
+@flask_app.route("/api/simplify", methods=["POST"])
+def api_simplify():
+    data = flask_request.get_json()
+    if not data or "text" not in data:
+        return jsonify({"error": "Missing 'text' field"}), 400
+    clinical_text = data["text"]
+    plain_language, _ = simplify(clinical_text)
+    # Build structured annotations for React frontend
+    terms = find_terms(clinical_text)
+    annotations = []
+    for term_text, simple_def in terms:
+        pattern = re.compile(r'\b' + re.escape(term_text) + r'\b', re.IGNORECASE)
+        match = pattern.search(clinical_text)
+        if match:
+            ml = search_medlineplus(term_text)
+            ml_url = ml["url"] if ml else f"https://medlineplus.gov/search/?query={urllib.parse.quote(term_text)}"
+            ml_summary = ml["summary"] if ml else ""
+            annotations.append({
+                "term": match.group(),
+                "simple": simple_def,
+                "start": match.start(),
+                "end": match.end(),
+                "url": ml_url,
+                "medlineplus_summary": ml_summary,
+            })
+    annotations.sort(key=lambda x: x["start"])
+    return jsonify({
+        "input": clinical_text,
+        "plain_language": plain_language,
+        "source_annotations": annotations,
+        "output_annotations": [],
+    })
+@flask_app.route("/api/health", methods=["GET"])
+def api_health():
+    return jsonify({"status": "ok", "model_loaded": True})
+# Mount Flask app inside Gradio
+demo = gr.mount_gradio_app(flask_app, demo, path="/")
 if __name__ == "__main__":
+    flask_app.run(host="0.0.0.0", port=7860)
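
A client of the `/api/simplify` route added in this hunk might look like the following minimal sketch, which uses only the standard library. The base URL and the helper names `build_simplify_request`, `call_simplify`, and `inline_glosses` are assumptions for illustration and do not appear in the commit. The request shape (`text`) and the response fields (`source_annotations`, `start`, `end`, `simple`) are taken from the handler in the diff.

```python
import json
import urllib.request

# Hypothetical base URL; substitute wherever the API is actually reachable.
BASE_URL = "http://localhost:7860"


def build_simplify_request(text, base_url=BASE_URL):
    """Build the JSON POST request that the /api/simplify handler expects."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/simplify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_simplify(text, base_url=BASE_URL, timeout=60):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_simplify_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def inline_glosses(text, annotations):
    """Insert each annotation's plain-language gloss right after its term.

    Annotations are applied right to left so that the start/end offsets of
    earlier terms remain valid while the string grows.
    """
    for ann in sorted(annotations, key=lambda a: a["start"], reverse=True):
        gloss = " (" + ann["simple"] + ")"
        text = text[: ann["end"]] + gloss + text[ann["end"]:]
    return text
```

Given a reply, `inline_glosses(reply["input"], reply["source_annotations"])` yields the original note with a gloss after each recognized term. This works because `start` and `end` in the handler are character offsets into the submitted text.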

requirements.txt CHANGED

@@ -1,3 +1,5 @@
 torch
 transformers
 gradio

+flask
+flask-cors