Henrychur/DiagAgent-14B


Henrychur committed on Oct 30, 2025

Commit 1f311c3 · verified · 1 Parent(s): a2e4b65

Update README.md

Files changed (1)

README.md +14 -1


@@ -24,7 +24,7 @@ DiagAgent‑14B is a reinforcement learning‑optimized large language model for
 DiagAgent‑14B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
+Details can be found in our paper https://arxiv.org/abs/2510.24654
 ## Quickstart
@@ -207,6 +207,19 @@ DiagAgent‑14B is optimized with multi‑turn RL (GRPO) inside `DiagGym`.
 For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
+## Citation
+```
+@misc{qiu2025evolvingdiagnosticagentsvirtual,
+      title={Evolving Diagnostic Agents in a Virtual Clinical Environment},
+      author={Pengcheng Qiu and Chaoyi Wu and Junwei Liu and Qiaoyu Zheng and Yusheng Liao and Haowen Wang and Yun Yue and Qianrui Fan and Shuai Zhen and Jian Wang and Jinjie Gu and Yanfeng Wang and Ya Zhang and Weidi Xie},
+      year={2025},
+      eprint={2510.24654},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2510.24654},
+}
+```
 ## Contact