Upload folder using huggingface_hub
- README.md +127 -0
- config.json +11 -0
- figures/fig1.png +0 -0
- figures/fig2.png +0 -0
- figures/fig3.png +0 -0
- pytorch_model.bin +3 -0
README.md
ADDED
@@ -0,0 +1,127 @@
---
license: apache-2.0
library_name: transformers
---
# MedDiagnoseAI
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

<div align="center">
<img src="figures/fig1.png" width="60%" alt="MedDiagnoseAI" />
</div>
<hr>

<div align="center" style="line-height: 1;">
<a href="LICENSE" style="margin: 2px;">
<img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>

## 1. Introduction

MedDiagnoseAI represents a breakthrough in clinical decision support systems. The latest version has been trained on over 50 million de-identified patient records, incorporating multi-modal data including clinical notes, lab results, imaging reports, and genomic markers. The model demonstrates exceptional performance across various clinical benchmarks, approaching the diagnostic accuracy of board-certified physicians in many domains.

<p align="center">
<img width="80%" src="figures/fig3.png">
</p>

Compared to the previous version, MedDiagnoseAI v2.0 shows significant improvements in differential diagnosis tasks. For instance, in the MIMIC-IV diagnostic challenge, the model's F1-score has increased from 0.72 in the previous version to 0.89 in the current version. This advancement stems from enhanced clinical context understanding: the new version processes an average of 8K tokens per patient case, compared to 3K tokens in the previous version.

Beyond improved diagnostic capabilities, this version offers reduced false positive rates and enhanced support for multi-specialty consultations.

## 2. Evaluation Results

### Comprehensive Clinical Benchmark Results

<div align="center">

| | Benchmark | ModelA | ModelB | ModelA-v2 | MedDiagnoseAI |
|---|---|---|---|---|---|
| **Core Diagnostic Tasks** | Diagnosis Accuracy | 0.620 | 0.645 | 0.658 | 0.837 |
| | Clinical Reasoning | 0.701 | 0.718 | 0.735 | 0.779 |
| | Medical Knowledge | 0.752 | 0.769 | 0.781 | 0.755 |
| **Imaging & Analysis** | Radiology Interpretation | 0.589 | 0.612 | 0.628 | 0.733 |
| | Patient Q&A | 0.634 | 0.651 | 0.667 | 0.633 |
| | Disease Classification | 0.812 | 0.829 | 0.841 | 0.853 |
| | Symptom Severity | 0.723 | 0.738 | 0.752 | 0.717 |
| **Treatment Tasks** | Treatment Planning | 0.567 | 0.589 | 0.604 | 0.702 |
| | Clinical Documentation | 0.645 | 0.662 | 0.678 | 0.645 |
| | Patient Interaction | 0.698 | 0.715 | 0.729 | 0.685 |
| | Medical Summarization | 0.756 | 0.772 | 0.785 | 0.780 |
| **Specialized Capabilities** | Medical Terminology | 0.834 | 0.849 | 0.861 | 0.840 |
| | Literature Retrieval | 0.612 | 0.631 | 0.648 | 0.612 |
| | Protocol Adherence | 0.689 | 0.708 | 0.723 | 0.728 |
| | Drug Safety | 0.778 | 0.795 | 0.812 | 0.823 |

</div>

### Overall Performance Summary
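MedDiagnoseAI posts the highest score on 7 of the 15 benchmarks in the table above. Its largest margins over ModelA-v2 are in Diagnosis Accuracy (0.837 vs. 0.658), Radiology Interpretation (0.733 vs. 0.628), and Treatment Planning (0.702 vs. 0.604), with smaller gains in Clinical Reasoning, Disease Classification, Protocol Adherence, and Drug Safety. It trails ModelA-v2 on the remaining eight benchmarks, including Medical Knowledge, Patient Q&A, and Medical Terminology.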
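
The per-benchmark comparison can be tallied directly from the table. The sketch below copies the table's scores into a dictionary and counts the benchmarks on which MedDiagnoseAI has the strictly highest score. The names `SCORES` and `tally` are illustrative and not part of any released tooling.

```python
# Scores copied from the benchmark table, in column order:
# (ModelA, ModelB, ModelA-v2, MedDiagnoseAI).
SCORES = {
    "Diagnosis Accuracy": (0.620, 0.645, 0.658, 0.837),
    "Clinical Reasoning": (0.701, 0.718, 0.735, 0.779),
    "Medical Knowledge": (0.752, 0.769, 0.781, 0.755),
    "Radiology Interpretation": (0.589, 0.612, 0.628, 0.733),
    "Patient Q&A": (0.634, 0.651, 0.667, 0.633),
    "Disease Classification": (0.812, 0.829, 0.841, 0.853),
    "Symptom Severity": (0.723, 0.738, 0.752, 0.717),
    "Treatment Planning": (0.567, 0.589, 0.604, 0.702),
    "Clinical Documentation": (0.645, 0.662, 0.678, 0.645),
    "Patient Interaction": (0.698, 0.715, 0.729, 0.685),
    "Medical Summarization": (0.756, 0.772, 0.785, 0.780),
    "Medical Terminology": (0.834, 0.849, 0.861, 0.840),
    "Literature Retrieval": (0.612, 0.631, 0.648, 0.612),
    "Protocol Adherence": (0.689, 0.708, 0.723, 0.728),
    "Drug Safety": (0.778, 0.795, 0.812, 0.823),
}


def tally(scores):
    """Split benchmarks into those MedDiagnoseAI leads and those it does not."""
    leads, trails = [], []
    for name, (*baselines, ours) in scores.items():
        # A tie with any baseline counts as "trails" (strict comparison).
        (leads if ours > max(baselines) else trails).append(name)
    return leads, trails


leads, trails = tally(SCORES)
print(f"Leads on {len(leads)} of {len(SCORES)} benchmarks: {leads}")
print(f"Trails on {len(trails)} benchmarks: {trails}")
```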

## 3. Clinical Dashboard & API Platform
We offer a HIPAA-compliant clinical dashboard and API for healthcare institutions to integrate MedDiagnoseAI. Please check our official website for more details.

## 4. How to Deploy Locally

Please refer to our deployment guide for information about running MedDiagnoseAI in your clinical environment.

Compared to previous versions, the deployment recommendations for MedDiagnoseAI have the following changes:

1. HIPAA-compliant audit logging is now supported by default.
2. Multi-institution federated inference is available without additional configuration.

The model architecture of MedDiagnoseAI-Light is optimized for edge deployment, while sharing the same clinical vocabulary as the main MedDiagnoseAI.

### System Prompt
We recommend using the following clinical system prompt:
```
You are MedDiagnoseAI, a clinical decision support assistant.
Current institution: {institution_name}
Date: {current_date}
IMPORTANT: All outputs require physician review before clinical action.
```
For example:
```
You are MedDiagnoseAI, a clinical decision support assistant.
Current institution: Johns Hopkins Hospital
Date: May 28, 2025, Wednesday.
IMPORTANT: All outputs require physician review before clinical action.
```
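A minimal sketch of filling this template programmatically, assuming the prompt is assembled in Python before being sent to the model. The helper name `build_system_prompt` and the explicit name tables are illustrative choices, not part of any released API. Deriving the weekday from the date object keeps the date and the weekday from drifting apart.

```python
from datetime import date

SYSTEM_PROMPT_TEMPLATE = (
    "You are MedDiagnoseAI, a clinical decision support assistant.\n"
    "Current institution: {institution_name}\n"
    "Date: {current_date}\n"
    "IMPORTANT: All outputs require physician review before clinical action."
)

# Explicit English names avoid any dependence on the process locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)
WEEKDAYS = (
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
)


def build_system_prompt(institution_name: str, today: date) -> str:
    """Fill the system prompt; hypothetical helper, shown for illustration."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    current_date = (
        f"{MONTHS[today.month - 1]} {today.day}, {today.year}, "
        f"{WEEKDAYS[today.weekday()]}."
    )
    return SYSTEM_PROMPT_TEMPLATE.format(
        institution_name=institution_name, current_date=current_date
    )


prompt = build_system_prompt("Johns Hopkins Hospital", date(2025, 5, 28))
print(prompt)
```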
### Temperature
We recommend setting the temperature parameter $T_{model}$ to 0.3 for clinical applications to ensure consistent and reliable outputs.
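
To illustrate why a low temperature gives more consistent outputs, the sketch below applies the conventional definition of temperature: logits are divided by $T$ before the softmax. This README does not state how MedDiagnoseAI applies $T_{model}$ internally, so treat this as an assumption. The logits are made-up illustrative values.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature (conventional definition, assumed here)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative values, not model outputs
p_default = softmax_with_temperature(logits, 1.0)
p_recommended = softmax_with_temperature(logits, 0.3)

print([round(p, 3) for p in p_default])
print([round(p, 3) for p in p_recommended])
```

With these example logits, the top option's probability rises from about 0.63 at $T = 1.0$ to about 0.96 at $T = 0.3$, so sampling picks the same option far more often.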

### Prompts for Patient Data Input
For patient record analysis, please follow the template to create prompts, where {patient_id}, {clinical_data} and {query} are arguments.
```
patient_template = \
"""[Patient ID]: {patient_id}
[Clinical Data Begin]
{clinical_data}
[Clinical Data End]
Clinical Query: {query}"""
```
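A short usage sketch for the template above, assuming it is filled with Python's `str.format`. The helper `build_patient_prompt` is a hypothetical name, and the patient values are synthetic placeholders rather than real records. Because `str.format` parses only the template, braces inside the supplied clinical text are passed through unchanged.

```python
patient_template = \
"""[Patient ID]: {patient_id}
[Clinical Data Begin]
{clinical_data}
[Clinical Data End]
Clinical Query: {query}"""


def build_patient_prompt(patient_id: str, clinical_data: str, query: str) -> str:
    """Fill the patient template; hypothetical helper, shown for illustration."""
    return patient_template.format(
        patient_id=patient_id,
        # Trim stray whitespace so the begin/end markers stay on their own lines.
        clinical_data=clinical_data.strip(),
        query=query.strip(),
    )


prompt = build_patient_prompt(
    patient_id="DEMO-0001",  # synthetic identifier
    clinical_data="Age: 54\nChief complaint: chest pain on exertion\n",
    query="What differential diagnoses should be considered?",
)
print(prompt)
```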
For literature-enhanced clinical reasoning, we recommend the following prompt template where {pubmed_results}, {cur_date}, and {clinical_question} are arguments.
```
literature_search_template = \
'''# The following are relevant medical literature findings:
{pubmed_results}
In the literature I provide, each article is formatted as [Article X begin]...[Article X end]. Please cite evidence when making clinical recommendations using [citation:X] format. Multiple citations should be listed as [citation:3][citation:5].

When providing clinical guidance:
- Today is {cur_date}.
- Filter literature by relevance to the specific clinical scenario.
- Prioritize recent systematic reviews and RCTs over case reports.
- For treatment recommendations, include level of evidence.
- Always note when evidence is limited or conflicting.
- Include relevant contraindications and drug interactions.
- Synthesize findings from multiple sources when applicable.
# Clinical question:
{clinical_question}'''
```
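The sketch below shows one way to produce the `{pubmed_results}` argument and to read citations back out of a response. The template only specifies the `[Article X begin]...[Article X end]` markers and the `[citation:X]` format, so the exact layout inside each article block is an assumption, and all function names are hypothetical. The demo uses an abbreviated stand-in template to stay short; the full `literature_search_template` can be passed in the same way.

```python
import re


def format_articles(articles):
    """Wrap each article in [Article X begin]/[Article X end], numbering from 1."""
    blocks = []
    for index, text in enumerate(articles, start=1):
        blocks.append(
            f"[Article {index} begin]\n{text.strip()}\n[Article {index} end]"
        )
    return "\n".join(blocks)


def build_literature_prompt(template, articles, cur_date, clinical_question):
    """Fill a template that uses the three documented arguments."""
    return template.format(
        pubmed_results=format_articles(articles),
        cur_date=cur_date,
        clinical_question=clinical_question,
    )


def extract_citations(answer):
    """Return cited article numbers in order of appearance."""
    return [int(n) for n in re.findall(r"\[citation:(\d+)\]", answer)]


# Abbreviated stand-in for literature_search_template (illustration only).
demo_template = (
    "{pubmed_results}\n- Today is {cur_date}.\n"
    "# Clinical question:\n{clinical_question}"
)
prompt = build_literature_prompt(
    demo_template,
    ["Placeholder abstract A.", "Placeholder abstract B."],
    "May 28, 2025",
    "Placeholder clinical question?",
)
cited = extract_citations("Supported by two sources [citation:3][citation:5].")
print(prompt)
print(cited)
```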

## 5. License
This code repository is licensed under the [Apache 2.0 License](LICENSE). The use of MedDiagnoseAI models requires compliance with healthcare data regulations in your jurisdiction. The model is NOT approved for autonomous clinical decision-making.

## 6. Contact
For research collaborations or institutional licensing, please contact us at medical@meddiagnoseai.health or submit an inquiry through our compliance portal.
config.json
ADDED
@@ -0,0 +1,11 @@
{
  "model_type": "bioclinical-bert",
  "architectures": [
    "BioClinicalBertForSequenceClassification"
  ],
  "num_labels": 1500,
  "hidden_size": 768,
  "num_attention_heads": 12,
  "vocab_size": 50000,
  "clinical_domain": true
}
figures/fig1.png
ADDED
figures/fig2.png
ADDED
figures/fig3.png
ADDED
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2c8c8dc8efef21214f508d984cb90441fe95f390b77f2511a519eb7621fa80e0
size 412