Henrychur
/

DiagAgent-7B

@@ -1,16 +1,21 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
 tags:
 - medical
 - diagnosis
 - RL
 ---
 # DiagAgent-7B: RL-Optimized Diagnostic Agent
 <div align="center">
   <img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/logo.png" width="150"/>
   <div align="center"></div>
@@ -24,7 +29,7 @@ DiagAgent‑7B is a reinforcement learning‑optimized large language model for
 DiagAgent‑7B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
-Details can be found in our paper https://arxiv.org/abs/2510.24654
 ## Quickstart
@@ -58,27 +63,45 @@ def chat(messages, max_new_tokens=1024, temperature=0.0):
 SYSTEM_PROMPT = (
     "You are a medical AI assistant. Analyze patient information, suggest relevant tests, "
-    "and provide a final diagnosis when sufficient information is available.\n\n"
-    "RESPONSE FORMAT:\n"
-    "If more information is needed:\n"
-    "```\n"
-    "Current diagnosis: <your current best diagnosis>\n"
-    "Based on the patient's initial presentation, the following investigation(s) should be performed: <one additional test>\n"
-    "Reason: <reason for the test>\n"
-    "```\n"
-    "If sufficient information exists for diagnosis:\n"
-    "```\n"
-    "The available information is sufficient to make a diagnosis.\n"
-    "Diagnosis: <final diagnosis>\n"
-    "Reason: <brief justification>\n"
     "```"
 )
 initial_inquiry = (
-    "- Patient Information: ___ y/o F\n"
-    "- Chief Complaint: Early satiety, weight loss, abdominal pain\n"
-    "- HPI: 1-month weight loss (10 lbs), early satiety, fatigue; prior emesis; reduced intake; denies fever/chills.\n"
-    "- PMH: Asthma, hyperlipidemia, HTN, osteoarthritis, polymyalgia rheumatica, CAD (NSTEMI), osteoporosis, H. pylori, s/p TAH/USO.\n"
     "- Allergy: Lisinopril."
 )
@@ -118,11 +141,9 @@ resp = client.chat.completions.create(
 print(resp.choices[0].message.content)
 ```
 ## Evaluation Results
-The following tables are taken directly from the project evaluation. For evaluation details and scripts, see the paper and the GitHub repository.
 **Single‑Turn Evaluation**
@@ -191,37 +212,73 @@ The following tables are taken directly from the project evaluation. For evaluat
 | MDAgent          | -    | 2024.10| 21.64            |
 | **Our Method**   |      |        |                  |
 | DiagAgent-14B    | 14B  | -      | 32.86            |
-## Training Details
-DiagAgent‑7B is optimized with multi‑turn RL (GRPO) inside `DiagGym`.
-- Trajectory construction:
-  - Initial inquiry (structured patient history without final diagnosis)
-  - Iterative steps: preliminary diagnosis → recommended exam + rationale → exam result
-  - Final diagnosis focused on a single primary condition
-- Reward design:
-  - Diagnosis Accuracy (exact match)
-  - Examination Recommendation F1 (overlap with reference EHR exams)
-  - Turn Penalty (discourages >12 interaction turns)
-For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
-## Citation
-```
 @misc{qiu2025evolvingdiagnosticagentsvirtual,
-      title={Evolving Diagnostic Agents in a Virtual Clinical Environment},
       author={Pengcheng Qiu and Chaoyi Wu and Junwei Liu and Qiaoyu Zheng and Yusheng Liao and Haowen Wang and Yun Yue and Qianrui Fan and Shuai Zhen and Jian Wang and Jinjie Gu and Yanfeng Wang and Ya Zhang and Weidi Xie},
       year={2025},
       eprint={2510.24654},
       archivePrefix={arXiv},
       primaryClass={cs.CL},
-      url={https://arxiv.org/abs/2510.24654},
 }
 ```
-## Contact
-- Email: henrychur@sjtu.edu.cn
-- GitHub: https://github.com/MAGIC-AI4Med/DiagGym

 ---
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
+language:
+- en
+license: apache-2.0
 tags:
 - medical
 - diagnosis
 - RL
+pipeline_tag: text-generation
+library_name: transformers
 ---
 # DiagAgent-7B: RL-Optimized Diagnostic Agent
+📄 [Paper](https://huggingface.co/papers/2510.24654) - 🌐 [Project Page](https://arxiv.org/html/2510.24654v1) - 💻 [Code](https://github.com/MAGIC-AI4Med/DiagGym)
 <div align="center">
   <img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/logo.png" width="150"/>
   <div align="center"></div>
 DiagAgent‑7B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
+Details can be found in our paper [Evolving Diagnostic Agents in a Virtual Clinical Environment](https://huggingface.co/papers/2510.24654).
 ## Quickstart
 SYSTEM_PROMPT = (
     "You are a medical AI assistant. Analyze patient information, suggest relevant tests, "
+    "and provide a final diagnosis when sufficient information is available.
+"
+    "RESPONSE FORMAT:
+"
+    "If more information is needed:
+"
+    "```
+"
+    "Current diagnosis: <your current best diagnosis>
+"
+    "Based on the patient's initial presentation, the following investigation(s) should be performed: <one additional test>
+"
+    "Reason: <reason for the test>
+"
+    "```
+"
+    "If sufficient information exists for diagnosis:
+"
+    "```
+"
+    "The available information is sufficient to make a diagnosis.
+"
+    "Diagnosis: <final diagnosis>
+"
+    "Reason: <brief justification>
+"
     "```"
 )
 initial_inquiry = (
+    "- Patient Information: ___ y/o F
+"
+    "- Chief Complaint: Early satiety, weight loss, abdominal pain
+"
+    "- HPI: 1-month weight loss (10 lbs), early satiety, fatigue; prior emesis; reduced intake; denies fever/chills.
+"
+    "- PMH: Asthma, hyperlipidemia, HTN, osteoarthritis, polymyalgia rheumatica, CAD (NSTEMI), osteoporosis, H. pylori, s/p TAH/USO.\
+"\
     "- Allergy: Lisinopril."
 )
 print(resp.choices[0].message.content)
 ```
 ## Evaluation Results
+The following tables are taken directly from the project evaluation. For evaluation details and scripts, see the paper and the [GitHub repository](https://github.com/MAGIC-AI4Med/DiagGym).
 **Single‑Turn Evaluation**
 | MDAgent          | -    | 2024.10| 21.64            |
 | **Our Method**   |      |        |                  |
 | DiagAgent-14B    | 14B  | -      | 32.86            |
+---
+## Model Training
+### 🏥 DiagGym — Virtual Clinical Environment
+#### 📂 Data Construction
+We build **DiagGym Training Dataset** from the MIMIC‑IV EHR dataset by reorganizing each patient record into:
+- **Patient profile** — extracted from discharge notes (physical exam, chief complaint, history, allergies, family/social history, discharge diagnosis)
+- **Time‑ordered examination set** — chronologically sorted exams (lab, microbiology, radiology) linked with their results.
+The pipeline includes filtering (removing cases without physical exams or with pre‑established diagnoses), standardizing exam names, filling missing labels, and restricting to exams performed within one day before admission to ensure diagnostic relevance.
+<img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/DiagGym_data_construction.png"/>
+Following the pipeline above, we obtain 118,478 patient EHRs, covering 4,897 distinct diseases.
+On average, each case contains 29 examinations (26 laboratory, 2 microbiology, 1 radiology).
+> **Note on Data Availability**: The data source for this work is MIMIC-IV. Due to licensing restrictions, we are unable to directly open-source the processed dataset. However, we are actively communicating with the relevant parties regarding the possibility of making the dataset publicly available on [PhysioNet](https://physionet.org/).
+#### ⚙️ Training Details
+**DiagGym** is trained as a conditional generative "EHR world model" that, given a patient profile and past examinations, generates the result of the next requested examination.
+We treat all exam results (textual or numeric) as free text and train with a standard token‑wise autoregressive loss.
+<img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/DiagGym_training.png"/>
+For full training details and implementation code, see our [paper](https://huggingface.co/papers/2510.24654) and [training scripts](https://github.com/MAGIC-AI4Med/DiagGym/tree/main/DiagGym/train/).
+### 🤖 DiagAgent — RL‑Trained Diagnostic Agent
+#### 📂 Data Construction
+As shown in the figure below, we reformat DiagGym cases into **multi‑turn diagnostic trajectories** containing:
+- An **initial inquiry** (structured patient history without the final diagnosis)
+- Iterative steps of *preliminary diagnosis → recommended examination + rationale → exam result*
+- A **final diagnosis** focused on a single primary condition
+All trajectories are generated with DeepSeek‑v3 and filtered to prevent diagnosis leakage.
+<img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/DiagAgent_data_construction.png"/>
+Following this pipeline, we obtain 16,270 interactive diagnostic trajectories
+#### ⚙️ Training Details
+DiagAgent is optimized with **end‑to‑end multi‑turn reinforcement learning (GRPO)** inside the DiagGym environment.
+In each rollout, the agent starts from an initial inquiry, interacts with DiagGym by recommending examinations and receiving simulated results, and decides when to make the final diagnosis.
+The reward combines three components:
+- **Diagnosis Accuracy** — 1 if the predicted diagnosis matches the ground truth, else 0
+- **Examination Recommendation F1** — overlap between recommended and reference exams from real EHRs
+- **Turn Penalty** — discourages excessive interaction turns beyond the set limit (12)
+<img src="https://raw.githubusercontent.com/MAGIC-AI4Med/DiagGym/main/assets/DiagAgent_training.png"/>
+For full training details and implementation code, see our [paper](https://huggingface.co/papers/2510.24654) and [training scripts](https://github.com/MAGIC-AI4Med/DiagGym/tree/main/DiagAgent/train/rl/).
+## 📝 Citation & Contact
+```bibtex
 @misc{qiu2025evolvingdiagnosticagentsvirtual,
+      title={Evolving Diagnostic Agents in a Virtual Clinical Environment},
       author={Pengcheng Qiu and Chaoyi Wu and Junwei Liu and Qiaoyu Zheng and Yusheng Liao and Haowen Wang and Yun Yue and Qianrui Fan and Shuai Zhen and Jian Wang and Jinjie Gu and Yanfeng Wang and Ya Zhang and Weidi Xie},
       year={2025},
       eprint={2510.24654},
       archivePrefix={arXiv},
       primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2510.24654},
 }
 ```
+For any inquiries or feedback, don’t hesitate to contact henrychur@sjtu.edu.cn.