CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment Paper • 2512.10206 • Published Dec 2025
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling Paper • 2410.13610 • Published Oct 17, 2024
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models Paper • 2505.14107 • Published May 20, 2025