Commit 1f311c3 by Henrychur · verified
1 Parent(s): a2e4b65

Update README.md

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -24,7 +24,7 @@ DiagAgent‑14B is a reinforcement learning‑optimized large language model for
 
 DiagAgent‑14B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
 
-
+Details can be found in our paper https://arxiv.org/abs/2510.24654
 
 ## Quickstart
 
@@ -207,6 +207,19 @@ DiagAgent‑14B is optimized with multi‑turn RL (GRPO) inside `DiagGym`.
 
 For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
 
+## Citation
+```
+@misc{qiu2025evolvingdiagnosticagentsvirtual,
+title={Evolving Diagnostic Agents in a Virtual Clinical Environment},
+author={Pengcheng Qiu and Chaoyi Wu and Junwei Liu and Qiaoyu Zheng and Yusheng Liao and Haowen Wang and Yun Yue and Qianrui Fan and Shuai Zhen and Jian Wang and Jinjie Gu and Yanfeng Wang and Ya Zhang and Weidi Xie},
+year={2025},
+eprint={2510.24654},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2510.24654},
+}
+```
+
 
 ## Contact
 