Commit 3e67708 by toolevalxm (verified) · 1 Parent(s): f384593

Upload folder using huggingface_hub

Files changed (6)
  1. README.md +127 -0
  2. config.json +11 -0
  3. figures/fig1.png +0 -0
  4. figures/fig2.png +0 -0
  5. figures/fig3.png +0 -0
  6. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,127 @@
+ ---
+ license: apache-2.0
+ library_name: transformers
+ ---
+ # MedDiagnoseAI
+ <!-- markdownlint-disable first-line-h1 -->
+ <!-- markdownlint-disable html -->
+ <!-- markdownlint-disable no-duplicate-header -->
+
+ <div align="center">
+ <img src="figures/fig1.png" width="60%" alt="MedDiagnoseAI" />
+ </div>
+ <hr>
+
+ <div align="center" style="line-height: 1;">
+ <a href="LICENSE" style="margin: 2px;">
+ <img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
+ </a>
+ </div>
+
+ ## 1. Introduction
+
+ MedDiagnoseAI is a clinical decision support model. The latest version has been trained on over 50 million de-identified patient records, incorporating multi-modal data including clinical notes, lab results, imaging reports, and genomic markers. The model performs well on a range of clinical benchmarks, approaching the diagnostic accuracy of board-certified physicians in many domains.
+
+ <p align="center">
+ <img width="80%" src="figures/fig3.png">
+ </p>
+
+ Compared to the previous version, MedDiagnoseAI v2.0 shows significant improvements in differential diagnosis tasks. For instance, in the MIMIC-IV diagnostic challenge, the model's F1 score rose from 0.72 in the previous version to 0.89 in the current version. This advancement stems from enhanced clinical context understanding: the new version processes an average of 8K tokens per patient case, compared to 3K tokens in the previous version.
+
+ Beyond improved diagnostic capabilities, this version offers reduced false positive rates and enhanced support for multi-specialty consultations.
+
+ ## 2. Evaluation Results
+
+ ### Comprehensive Clinical Benchmark Results
+
+ <div align="center">
+
+ | Category | Benchmark | ModelA | ModelB | ModelA-v2 | MedDiagnoseAI |
+ |---|---|---|---|---|---|
+ | **Core Diagnostic Tasks** | Diagnosis Accuracy | 0.620 | 0.645 | 0.658 | 0.837 |
+ | | Clinical Reasoning | 0.701 | 0.718 | 0.735 | 0.779 |
+ | | Medical Knowledge | 0.752 | 0.769 | 0.781 | 0.755 |
+ | **Imaging & Analysis** | Radiology Interpretation | 0.589 | 0.612 | 0.628 | 0.733 |
+ | | Patient Q&A | 0.634 | 0.651 | 0.667 | 0.633 |
+ | | Disease Classification | 0.812 | 0.829 | 0.841 | 0.853 |
+ | | Symptom Severity | 0.723 | 0.738 | 0.752 | 0.717 |
+ | **Treatment Tasks** | Treatment Planning | 0.567 | 0.589 | 0.604 | 0.702 |
+ | | Clinical Documentation | 0.645 | 0.662 | 0.678 | 0.645 |
+ | | Patient Interaction | 0.698 | 0.715 | 0.729 | 0.685 |
+ | | Medical Summarization | 0.756 | 0.772 | 0.785 | 0.780 |
+ | **Specialized Capabilities** | Medical Terminology | 0.834 | 0.849 | 0.861 | 0.840 |
+ | | Literature Retrieval | 0.612 | 0.631 | 0.648 | 0.612 |
+ | | Protocol Adherence | 0.689 | 0.708 | 0.723 | 0.728 |
+ | | Drug Safety | 0.778 | 0.795 | 0.812 | 0.823 |
+
+ </div>
+
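+ ### Overall Performance Summary
+ MedDiagnoseAI posts the highest score on 7 of the 15 benchmarks in the table above, with its largest margins over ModelA-v2 on Diagnosis Accuracy (0.837 vs. 0.658), Radiology Interpretation (0.733 vs. 0.628), and Treatment Planning (0.702 vs. 0.604). It also leads on Clinical Reasoning, Disease Classification, Protocol Adherence, and Drug Safety, and trails ModelA-v2 on the remaining eight benchmarks, including Medical Knowledge, Patient Q&A, and Clinical Documentation.
+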
+ ## 3. Clinical Dashboard & API Platform
+ We offer a HIPAA-compliant clinical dashboard and API for healthcare institutions to integrate MedDiagnoseAI. Please check our official website for more details.
+
+ ## 4. How to Deploy Locally
+
+ Please refer to our deployment guide for information about running MedDiagnoseAI in your clinical environment.
+
+ Compared to previous versions, the deployment recommendations for MedDiagnoseAI have changed as follows:
+
+ 1. HIPAA-compliant audit logging is now supported by default.
+ 2. Multi-institution federated inference is available without additional configuration.
+
+ MedDiagnoseAI-Light is optimized for edge deployment and shares its clinical vocabulary with the main MedDiagnoseAI model.
+
+ ### System Prompt
+ We recommend using the following clinical system prompt:
+ ```
+ You are MedDiagnoseAI, a clinical decision support assistant.
+ Current institution: {institution_name}
+ Date: {current_date}
+ IMPORTANT: All outputs require physician review before clinical action.
+ ```
+ For example:
+ ```
+ You are MedDiagnoseAI, a clinical decision support assistant.
+ Current institution: Johns Hopkins Hospital
+ Date: May 28, 2025, Wednesday.
+ IMPORTANT: All outputs require physician review before clinical action.
+ ```
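The system prompt above can be filled in programmatically. The following is a minimal sketch, not part of the released code: `build_system_prompt` is a hypothetical helper, and the date layout is an assumption modeled on the example. Deriving the weekday from the date object keeps the two from disagreeing.

```python
from datetime import date

# Template text copied from the recommended system prompt.
SYSTEM_PROMPT_TEMPLATE = (
    "You are MedDiagnoseAI, a clinical decision support assistant.\n"
    "Current institution: {institution_name}\n"
    "Date: {current_date}\n"
    "IMPORTANT: All outputs require physician review before clinical action."
)


def build_system_prompt(institution_name: str, today: date) -> str:
    """Hypothetical helper: fill the template for one institution and date."""
    # The weekday is computed from the date, never typed by hand.
    # Month and weekday names follow the process locale (English by default).
    current_date = f"{today:%B} {today.day}, {today.year}, {today:%A}."
    return SYSTEM_PROMPT_TEMPLATE.format(
        institution_name=institution_name,
        current_date=current_date,
    )


prompt = build_system_prompt("Johns Hopkins Hospital", date(2025, 5, 28))
print(prompt)
```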
+ ### Temperature
+ We recommend setting the temperature parameter $T_{model}$ to 0.3 for clinical applications to ensure consistent and reliable outputs.
+
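To illustrate why a low temperature gives more consistent outputs, here is a generic sketch of temperature-scaled softmax. It assumes the temperature divides the logits before normalization, which is the usual convention; the source does not say how MedDiagnoseAI applies $T_{model}$ internally, and the logits below are made-up illustrative values.

```python
import math


def softmax_with_temperature(logits, temperature=0.3):
    """Generic temperature-scaled softmax (illustrative, not the model's code)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate outputs
sharp = softmax_with_temperature(logits, temperature=0.3)
flat = softmax_with_temperature(logits, temperature=1.0)
print(sharp, flat)
```

At a temperature of 0.3 the top candidate receives a larger share of the probability mass than at 1.0, so repeated runs vary less.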
+ ### Prompts for Patient Data Input
+ For patient record analysis, build prompts from the following template, where {patient_id}, {clinical_data}, and {query} are arguments.
+ ```
+ patient_template = \
+ """[Patient ID]: {patient_id}
+ [Clinical Data Begin]
+ {clinical_data}
+ [Clinical Data End]
+ Clinical Query: {query}"""
+ ```
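A minimal usage sketch for the patient template, assuming it is filled with Python's `str.format`. The helper name `build_patient_prompt` and all field values are hypothetical, and the values are synthetic rather than real patient data.

```python
# Template copied from the README.
patient_template = \
"""[Patient ID]: {patient_id}
[Clinical Data Begin]
{clinical_data}
[Clinical Data End]
Clinical Query: {query}"""


def build_patient_prompt(patient_id: str, clinical_data: str, query: str) -> str:
    """Hypothetical helper: fill the three template arguments."""
    return patient_template.format(
        patient_id=patient_id,
        clinical_data=clinical_data.strip(),
        query=query.strip(),
    )


# Synthetic, de-identified example values for illustration only.
patient_prompt = build_patient_prompt(
    patient_id="demo-0001",
    clinical_data="Age: 54\nChief complaint: chest pain on exertion",
    query="List differential diagnoses to discuss with the physician.",
)
print(patient_prompt)
```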
+ For literature-enhanced clinical reasoning, we recommend the following prompt template where {pubmed_results}, {cur_date}, and {clinical_question} are arguments.
+ ```
+ literature_search_template = \
+ '''# The following are relevant medical literature findings:
+ {pubmed_results}
+ In the literature I provide, each article is formatted as [Article X begin]...[Article X end]. Please cite evidence when making clinical recommendations using [citation:X] format. Multiple citations should be listed as [citation:3][citation:5].
+
+ When providing clinical guidance:
+ - Today is {cur_date}.
+ - Filter literature by relevance to the specific clinical scenario.
+ - Prioritize recent systematic reviews and RCTs over case reports.
+ - For treatment recommendations, include level of evidence.
+ - Always note when evidence is limited or conflicting.
+ - Include relevant contraindications and drug interactions.
+ - Synthesize findings from multiple sources when applicable.
+ # Clinical question:
+ {clinical_question}'''
+ ```
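The template expects each article wrapped in `[Article X begin]...[Article X end]` markers and answers that cite with `[citation:X]`. The sketch below shows one way to build the `{pubmed_results}` argument and to read citations back out of a response. Both helper names are hypothetical, and placing the markers on their own lines is an assumption, since the source does not specify the separator.

```python
import re


def format_pubmed_results(articles):
    """Hypothetical helper: wrap each article text in numbered markers."""
    blocks = []
    for index, text in enumerate(articles, start=1):
        blocks.append(
            f"[Article {index} begin]\n{text.strip()}\n[Article {index} end]"
        )
    return "\n".join(blocks)


def extract_citations(answer):
    """Hypothetical helper: article numbers cited as [citation:X], in order."""
    return [int(n) for n in re.findall(r"\[citation:(\d+)\]", answer)]


# Placeholder article texts for illustration only.
pubmed_results = format_pubmed_results(["First abstract.", "Second abstract."])
cited = extract_citations("Evidence is limited [citation:3][citation:5].")
print(pubmed_results)
print(cited)
```

The resulting string is what would be passed as `pubmed_results` when filling `literature_search_template`.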
+
+ ## 5. License
+ This code repository is licensed under the [Apache 2.0 License](LICENSE). The use of MedDiagnoseAI models requires compliance with healthcare data regulations in your jurisdiction. The model is NOT approved for autonomous clinical decision-making.
+
+ ## 6. Contact
+ For research collaborations or institutional licensing, please contact us at medical@meddiagnoseai.health or submit an inquiry through our compliance portal.
config.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "model_type": "bioclinical-bert",
+   "architectures": [
+     "BioClinicalBertForSequenceClassification"
+   ],
+   "num_labels": 1500,
+   "hidden_size": 768,
+   "num_attention_heads": 12,
+   "vocab_size": 50000,
+   "clinical_domain": true
+ }
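A quick sanity check of this config can be sketched with the standard `json` module. The sketch deliberately avoids assuming that any library recognizes the `bioclinical-bert` model type, which the commit does not confirm. The one check it makes is the usual multi-head attention constraint that the hidden size divides evenly across the attention heads.

```python
import json

# Config contents copied from the diff above.
CONFIG_TEXT = """
{
  "model_type": "bioclinical-bert",
  "architectures": ["BioClinicalBertForSequenceClassification"],
  "num_labels": 1500,
  "hidden_size": 768,
  "num_attention_heads": 12,
  "vocab_size": 50000,
  "clinical_domain": true
}
"""

config = json.loads(CONFIG_TEXT)

# Standard multi-head attention splits the hidden size evenly across heads.
head_dim, remainder = divmod(config["hidden_size"], config["num_attention_heads"])
if remainder != 0:
    raise ValueError("hidden_size must be divisible by num_attention_heads")

print(f"per-head dimension: {head_dim}, labels: {config['num_labels']}")
```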
figures/fig1.png ADDED
figures/fig2.png ADDED
figures/fig3.png ADDED
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c8c8dc8efef21214f508d984cb90441fe95f390b77f2511a519eb7621fa80e0
+ size 412