Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ DiagAgent‑14B is a reinforcement learning‑optimized large language model for
 
 DiagAgent‑14B is trained end‑to‑end inside the `DiagGym` virtual clinical environment with multi‑turn RL (GRPO), enabling safe, closed‑loop learning without real‑world risk.
 
-
+Details can be found in our paper https://arxiv.org/abs/2510.24654
 
 ## Quickstart
 
@@ -207,6 +207,19 @@ DiagAgent‑14B is optimized with multi‑turn RL (GRPO) inside `DiagGym`.
 
 For implementation and scripts, see `DiagAgent/train/rl/` in the GitHub.
 
+## Citation
+```
+@misc{qiu2025evolvingdiagnosticagentsvirtual,
+title={Evolving Diagnostic Agents in a Virtual Clinical Environment},
+author={Pengcheng Qiu and Chaoyi Wu and Junwei Liu and Qiaoyu Zheng and Yusheng Liao and Haowen Wang and Yun Yue and Qianrui Fan and Shuai Zhen and Jian Wang and Jinjie Gu and Yanfeng Wang and Ya Zhang and Weidi Xie},
+year={2025},
+eprint={2510.24654},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2510.24654},
+}
+```
+
 
 ## Contact
 