CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment Paper • 2512.10206 • Published 15 days ago
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling Paper • 2410.13610 • Published Oct 17, 2024
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models Paper • 2505.14107 • Published May 20, 2025
SpeContext: Enabling Efficient Long-context Reasoning with Speculative Context Sparsity in LLMs Paper • 2512.00722 • Published 26 days ago