NurseSim-RL: A Healthcare Agent Environment for Clinical Triage
OpenEnv Challenge Entry | Berkeley RDI AgentX-AgentBeats Competition
A Gymnasium-compatible RL environment for training AI agents to perform clinical triage using the Manchester Triage System (MTS).
Overview
NurseSim-RL simulates the decision-making process of a Triage Nurse in an Accident & Emergency (A&E) department. The agent must assess patients based on their chief complaint and vital signs, then assign an appropriate triage category (1-5) according to the Manchester Triage System.
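The interaction described above can be sketched as a single-patient episode. This is a minimal illustration only: `ToyTriageEnv`, the `Patient` fields, and the three sample patients are hypothetical and are not taken from the NurseSim-RL code. The class mirrors the Gymnasium `reset`/`step` return shapes without importing the library, so it runs on the standard library alone.

```python
import random
from dataclasses import dataclass


@dataclass
class Patient:
    complaint: str
    heart_rate: int      # beats per minute
    spo2: int            # oxygen saturation, percent
    true_category: int   # 1 (Immediate) .. 5 (Non-Urgent)


# Hypothetical toy patients, for illustration only.
PATIENTS = [
    Patient("Anaphylaxis after bee sting", 135, 88, 1),
    Patient("Chest pain", 118, 93, 2),
    Patient("Minor cut to finger", 72, 99, 5),
]


class ToyTriageEnv:
    """Mimics the Gymnasium reset/step contract (assumed layout, not the real env)."""

    def __init__(self, patients=PATIENTS, seed=0):
        self._patients = list(patients)
        self._rng = random.Random(seed)
        self._current = None

    def _observe(self):
        p = self._current
        return {"complaint": p.complaint, "heart_rate": p.heart_rate, "spo2": p.spo2}

    def reset(self):
        # Gymnasium convention: reset returns (observation, info).
        self._current = self._rng.choice(self._patients)
        return self._observe(), {}

    def step(self, action):
        # The action is the assigned triage category, 1 to 5.
        if action not in (1, 2, 3, 4, 5):
            raise ValueError("action must be a triage category from 1 to 5")
        reward = 1.0 if action == self._current.true_category else 0.0
        terminated = True  # one patient per episode in this toy version
        info = {"true_category": self._current.true_category}
        # Gymnasium convention: (observation, reward, terminated, truncated, info).
        return self._observe(), reward, terminated, False, info
```

A typical loop calls `obs, info = env.reset()`, lets the agent choose a category from `obs`, and passes it to `env.step(category)`.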
Key Features
- Gymnasium-Compatible: Standard RL interface for easy integration.
- Expanded Dataset: Trained on 2,100+ synthetic patient archetypes across all 5 MTS categories.
- Safety-Aware Rewards: Heavy penalties for under-triaging critical patients.
- Fine-Tuned Agent: Llama 3.2 3B trained with Unsloth (4-bit QLoRA).
- A2A Protocol: Agent-to-Agent evaluation via AgentBeats platform.
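One way to picture the safety-aware reward is an asymmetric penalty: assigning a less urgent category than the true one (under-triage) costs more than over-triage, and more again when the patient is truly Category 1 or 2. The function name and every numeric weight below are assumptions made for illustration; the README does not state the actual reward values.

```python
def triage_reward(assigned: int, true: int) -> float:
    """Asymmetric triage reward (illustrative weights, not the project's)."""
    for category in (assigned, true):
        if category not in (1, 2, 3, 4, 5):
            raise ValueError("categories must be integers from 1 to 5")

    if assigned == true:
        return 1.0

    gap = abs(assigned - true)
    if assigned > true:
        # Under-triage: a higher number means lower urgency than the truth.
        penalty = 2.0 * gap
        if true <= 2:
            # Assumed extra weight for missing a critical patient.
            penalty *= 2.0
        return -penalty

    # Over-triage wastes resources but is the safer error.
    return -0.5 * gap
```

With these weights, sending a Category 1 patient to Category 4 scores -12.0, while the reverse mistake scores -1.5.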
Training Results & Validation
The agent was fine-tuned using Unsloth on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.
Performance Metrics (Local Validation)
The agent was evaluated on 15 gold-standard clinical scenarios, with GPT-5.2 acting as the clinical judge.
| Metric | Value | Description |
|---|---|---|
| Accuracy | 60% | Exact match with the Manchester Triage category (1-5) |
| Safety | 70%+ | Pass rate for detecting critical life threats (e.g., sepsis, anaphylaxis) |
| Latency | < 2s | Inference time per case on a T4 GPU |
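For readers who want to reproduce metrics of this kind, the sketch below computes exact-match accuracy and a simple safety pass rate from two lists of categories. The safety definition here (a critical case passes if it is not under-triaged) is an assumption; the README does not specify how the judge scores safety. The sample labels are made up and are not the project's validation data.

```python
def evaluate(predicted, expected, critical_max_category=2):
    """Return exact-match accuracy and an assumed safety pass rate."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("predicted and expected must be non-empty and equal length")

    pairs = list(zip(predicted, expected))
    exact = sum(p == e for p, e in pairs)

    # Critical cases: true category at or above the chosen urgency threshold.
    critical = [(p, e) for p, e in pairs if e <= critical_max_category]
    # A critical case passes if it was not assigned a less urgent category.
    safe = sum(p <= e for p, e in critical)

    return {
        "accuracy": exact / len(pairs),
        "safety_pass_rate": safe / len(critical) if critical else None,
    }


# Hypothetical labels for five cases.
expected = [1, 2, 2, 3, 5]
predicted = [1, 2, 3, 3, 4]
metrics = evaluate(predicted, expected)
```

Note that on a 15-scenario set, a reported accuracy of 60% corresponds to 9 exact matches.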
Key Methodology: Age-Adjusted Triage
Our validation showed that parsing age and gender from the patient description is critical for accurate risk stratification. For example, "Chest Pain" in a 72-year-old male (72M) carries a different risk than the same complaint in a 20-year-old male (20M). In validation, the model's category assignments reflected these demographic risk factors.
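As an illustration of the parsing step, the sketch below pulls a compact age and gender token such as "72M" out of a free-text description. The pattern and the `parse_demographics` helper are hypothetical; the actual description format used by NurseSim-RL may differ, and the fine-tuned model may read these details directly from the text rather than through a regex.

```python
import re
from typing import Optional, Tuple

# Matches tokens such as "72M", "34 yo F", or "34 y/o F" (assumed formats).
_DEMOGRAPHIC = re.compile(r"\b(\d{1,3})\s*(?:yo|y/o)?\s*([MF])\b")


def parse_demographics(description: str) -> Optional[Tuple[int, str]]:
    """Return (age, gender) if a demographic token is found, else None."""
    match = _DEMOGRAPHIC.search(description)
    if match is None:
        return None
    age = int(match.group(1))
    if age > 120:
        # Implausible age, treat as a false match.
        return None
    return age, match.group(2)
```

For instance, `parse_demographics("72M, chest pain radiating to left arm")` returns `(72, "M")`, and a description with no such token returns `None`.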
Clinical Framework: Manchester Triage System
| Category | Priority | Target Time | Example |
|---|---|---|---|
| 1 | Immediate | 0 min | Cardiac arrest, Anaphylaxis |
| 2 | Very Urgent | 10 min | Chest pain, Stroke |
| 3 | Urgent | 60 min | Abdominal pain, Fractures |
| 4 | Standard | 120 min | Minor injuries, Mild illness |
| 5 | Non-Urgent | 240 min | Minor cuts, GP-suitable |
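The table above maps naturally onto a small lookup structure. The sketch below encodes the same priorities and target times; the names `MTSCategory`, `MTS`, `describe`, and `is_breached` are illustrative and not part of the project's API.

```python
from typing import NamedTuple


class MTSCategory(NamedTuple):
    priority: str
    target_minutes: int


# Values copied from the Manchester Triage System table above.
MTS = {
    1: MTSCategory("Immediate", 0),
    2: MTSCategory("Very Urgent", 10),
    3: MTSCategory("Urgent", 60),
    4: MTSCategory("Standard", 120),
    5: MTSCategory("Non-Urgent", 240),
}


def describe(category: int) -> str:
    """Human-readable summary of a triage category."""
    entry = MTS[category]  # raises KeyError for categories outside 1-5
    return f"Category {category} ({entry.priority}): target time {entry.target_minutes} min"


def is_breached(category: int, waited_minutes: float) -> bool:
    """True if the patient has waited longer than the target time."""
    return waited_minutes > MTS[category].target_minutes
```

For example, `describe(2)` yields "Category 2 (Very Urgent): target time 10 min".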
Resources
- Live Demo: Hugging Face Space
- Model Weights: NurseSim-Triage-Llama-3.2-3B
- Training Report: W&B Dashboard
Acknowledgements
Built for the OpenEnv Challenge 2026 by the NurseCitizenDeveloper team. Special thanks to Clare Cable, Joanne Bosanquet, and the Burdett Trust for Nursing for their leadership in clinical innovation.
