Add pipeline tag, library name, paper, project, and GitHub links

This PR significantly improves the model card for `Henrychur/DiagAgent-14B` by:

- Adding `pipeline_tag: text-generation` to correctly classify the model's functionality and enable the inference widget on the Hugging Face Hub.
- Adding `library_name: transformers` to integrate with the automated "how to use" widget, as evidenced by the `transformers` code snippets in the "Quickstart" section.
- Adding explicit links to the [paper](https://arxiv.org/abs/2510.24654), [project page](https://arxiv.org/html/2510.24654v1), and [GitHub repository](https://github.com/MAGIC-AI4Med/DiagGym) at the top of the card for easy access.
- Updating relative GitHub links in the "Evaluation Results" and "Training Details" sections to absolute URLs for improved navigation.
- Removing redundant paper link text in the description to keep the model card concise.
- Formatting the GitHub link in the "Contact" section as a Markdown link.

Together, these updates make the model easier to find through the Hub's task and library filters, and make the card itself easier to navigate.
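As a rough way to confirm that the metadata changes listed above landed in the card, the front matter can be checked with a few lines of Python. This is a minimal sketch, not part of the PR: the helper names `parse_front_matter` and `missing_keys`, the `REQUIRED_KEYS` tuple, and the inline `CARD` sample are hypothetical, and the parser only handles the two YAML shapes that appear in this card (`key: value` scalars and `key:` followed by `- item` entries).

```python
# Hypothetical check for the metadata keys this PR adds; not a full YAML parser.
REQUIRED_KEYS = ("pipeline_tag", "library_name", "license")

# Sample front matter copied from the updated README.md in this PR.
CARD = """---
base_model:
- Qwen/Qwen2.5-14B-Instruct
language:
- en
license: apache-2.0
tags:
- medical
- diagnosis
- RL
pipeline_tag: text-generation
library_name: transformers
---
# DiagAgent-14B: RL-Optimized Diagnostic Agent
"""


def parse_front_matter(readme_text):
    """Return the top-level front matter keys of a model card as a dict."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    metadata = {}
    current_list_key = None  # key whose `- item` entries are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter of the front matter
        if stripped.startswith("- ") and current_list_key is not None:
            metadata[current_list_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                metadata[key] = value      # scalar entry
                current_list_key = None
            else:
                metadata[key] = []         # list entry, items follow
                current_list_key = key
    return metadata


def missing_keys(readme_text, required=REQUIRED_KEYS):
    """List the required metadata keys that the card does not define."""
    metadata = parse_front_matter(readme_text)
    return [key for key in required if key not in metadata]


print(missing_keys(CARD))
```

For anything beyond a quick sanity check, a real YAML parser would be the safer choice, since this sketch ignores nesting, quoting, and comments.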


README.md +50 -30


@@ -1,14 +1,17 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-14B-Instruct
 tags:
 - medical
 - diagnosis
 - RL
 ---
 # DiagAgent-14B: RL-Optimized Diagnostic Agent
 <div align="center">
@@ -16,6 +19,10 @@ tags:
   <div align="center"></div>
 </div>
 DiagAgent‑14B is a reinforcement learning‑optimized large language model for interactive, multi‑turn diagnostic reasoning. Unlike one‑shot medical LLMs, it can:
 - recommend the most informative examinations,
@@ -24,8 +31,6 @@ DiagAgent‑14B is a reinforcement learning‑optimized large language model for
 DiagAgent‑14B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
-Details can be found in our paper https://arxiv.org/abs/2510.24654
 ## Quickstart
 ### Run locally with `transformers`
@@ -58,27 +63,45 @@ def chat(messages, max_new_tokens=1024, temperature=0.0):
 SYSTEM_PROMPT = (
     "You are a medical AI assistant. Analyze patient information, suggest relevant tests, "
-    "and provide a final diagnosis when sufficient information is available.\n\n"
-    "RESPONSE FORMAT:\n"
-    "If more information is needed:\n"
-    "```\n"
-    "Current diagnosis: <your current best diagnosis>\n"
-    "Based on the patient's initial presentation, the following investigation(s) should be performed: <one additional test>\n"
-    "Reason: <reason for the test>\n"
-    "```\n"
-    "If sufficient information exists for diagnosis:\n"
-    "```\n"
-    "The available information is sufficient to make a diagnosis.\n"
-    "Diagnosis: <final diagnosis>\n"
-    "Reason: <brief justification>\n"
     "```"
 )
 initial_inquiry = (
-    "- Patient Information: ___ y/o F\n"
-    "- Chief Complaint: Early satiety, weight loss, abdominal pain\n"
-    "- HPI: 1-month weight loss (10 lbs), early satiety, fatigue; prior emesis; reduced intake; denies fever/chills.\n"
-    "- PMH: Asthma, hyperlipidemia, HTN, osteoarthritis, polymyalgia rheumatica, CAD (NSTEMI), osteoporosis, H. pylori, s/p TAH/USO.\n"
     "- Allergy: Lisinopril."
 )
@@ -118,11 +141,9 @@ resp = client.chat.completions.create(
 print(resp.choices[0].message.content)
 ```
 ## Evaluation Results
-The following tables are taken directly from the project evaluation. For evaluation details and scripts, see the paper and the GitHub repository.
 **Single‑Turn Evaluation**
@@ -184,8 +205,8 @@ The following tables are taken directly from the project evaluation. For evaluat
 | Qwen3            | 235B | 2025.7 | 24.49            |
 | GPT-OSS          | 120B | 2025.8 | 22.26            |
 | OpenbioLLM       | 70B  | 2024.4 | 14.11            |
-| Baichuan-M1      | 14B  | 2025.2 | 17.43            |
-| MedGemma         | 27B  | 2025.7 | 20.72            |
 | **Agentic System** |    |        |                  |
 | MedAgent         | -    | 2024.1 | 19.49            |
 | MDAgent          | -    | 2024.10| 21.64            |
@@ -205,7 +226,7 @@ DiagAgent‑14B is optimized with multi‑turn RL (GRPO) inside `DiagGym`.
   - Examination Recommendation F1 (overlap with reference EHR exams)
   - Turn Penalty (discourages >12 interaction turns)
-For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
 ## Citation
 ```
@@ -220,8 +241,7 @@ For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
 }
 ```
 ## Contact
 - Email: henrychur@sjtu.edu.cn
-- GitHub: https://github.com/MAGIC-AI4Med/DiagGym

 ---
 base_model:
 - Qwen/Qwen2.5-14B-Instruct
+language:
+- en
+license: apache-2.0
 tags:
 - medical
 - diagnosis
 - RL
+pipeline_tag: text-generation
+library_name: transformers
 ---
 # DiagAgent-14B: RL-Optimized Diagnostic Agent
 <div align="center">
   <div align="center"></div>
 </div>
+This model was presented in the paper [Evolving Diagnostic Agents in a Virtual Clinical Environment](https://arxiv.org/abs/2510.24654).
+Project page: [https://arxiv.org/html/2510.24654v1](https://arxiv.org/html/2510.24654v1)
+Code: [https://github.com/MAGIC-AI4Med/DiagGym](https://github.com/MAGIC-AI4Med/DiagGym)
 DiagAgent‑14B is a reinforcement learning‑optimized large language model for interactive, multi‑turn diagnostic reasoning. Unlike one‑shot medical LLMs, it can:
 - recommend the most informative examinations,
 DiagAgent‑14B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
 ## Quickstart
 ### Run locally with `transformers`
 SYSTEM_PROMPT = (
     "You are a medical AI assistant. Analyze patient information, suggest relevant tests, "
+    "and provide a final diagnosis when sufficient information is available.\n\n"
+    "RESPONSE FORMAT:\n"
+    "If more information is needed:\n"
+    "```\n"
+    "Current diagnosis: <your current best diagnosis>\n"
+    "Based on the patient's initial presentation, the following investigation(s) should be performed: <one additional test>\n"
+    "Reason: <reason for the test>\n"
+    "```\n"
+    "If sufficient information exists for diagnosis:\n"
+    "```\n"
+    "The available information is sufficient to make a diagnosis.\n"
+    "Diagnosis: <final diagnosis>\n"
+    "Reason: <brief justification>\n"
     "```"
 )
 initial_inquiry = (
+    "- Patient Information: ___ y/o F\n"
+    "- Chief Complaint: Early satiety, weight loss, abdominal pain\n"
+    "- HPI: 1-month weight loss (10 lbs), early satiety, fatigue; prior emesis; reduced intake; denies fever/chills.\n"
+    "- PMH: Asthma, hyperlipidemia, HTN, osteoarthritis, polymyalgia rheumatica, CAD (NSTEMI), osteoporosis, H. pylori, s/p TAH/USO.\n"
     "- Allergy: Lisinopril."
 )
 print(resp.choices[0].message.content)
 ```
 ## Evaluation Results
+The following tables are taken directly from the project evaluation. For evaluation details and scripts, see the [paper](https://arxiv.org/abs/2510.24654) and the [GitHub repository](https://github.com/MAGIC-AI4Med/DiagGym).
 **Single‑Turn Evaluation**
 | Qwen3            | 235B | 2025.7 | 24.49            |
 | GPT-OSS          | 120B | 2025.8 | 22.26            |
 | OpenbioLLM       | 70B  | 2024.4 | 14.11            |
+| Baichuan-M1      | 14B  | 2025.2 | 17.43            |
+| MedGemma         | 27B  | 2025.7 | 20.72            |
 | **Agentic System** |    |        |                  |
 | MedAgent         | -    | 2024.1 | 19.49            |
 | MDAgent          | -    | 2024.10| 21.64            |
   - Examination Recommendation F1 (overlap with reference EHR exams)
   - Turn Penalty (discourages >12 interaction turns)
+For implementation and scripts, see [`DiagAgent/train/rl/` in the GitHub repository](https://github.com/MAGIC-AI4Med/DiagGym/tree/main/DiagAgent/train/rl/).
 ## Citation
 ```
 }
 ```
 ## Contact
 - Email: henrychur@sjtu.edu.cn
+- GitHub: [https://github.com/MAGIC-AI4Med/DiagGym](https://github.com/MAGIC-AI4Med/DiagGym)