metadata

title: NurseSim Triage
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false

NurseSim-RL: A Healthcare Agent Environment for Clinical Triage

OpenEnv Challenge Entry | Berkeley RDI AgentX-AgentBeats Competition
A Gymnasium-compatible RL environment for training AI agents to perform clinical triage using the Manchester Triage System (MTS).

🎯 Overview

NurseSim-RL simulates the decision-making process of a Triage Nurse in an Accident & Emergency (A&E) department. The agent must assess patients based on their chief complaint and vital signs, then assign an appropriate triage category (1-5) according to the Manchester Triage System.

Key Features

Gymnasium-Compatible: Standard RL interface for easy integration.
Expanded Dataset: 2,100+ synthetic patient scenarios covering all 5 MTS categories.
Safety-Aware Rewards: Heavy penalties for under-triaging critical patients.
Fine-Tuned Agent: Llama 3.2 3B fine-tuned with Unsloth (4-bit QLoRA), validated at 60% exact-match accuracy on 15 gold-standard scenarios.
NEW: Semantic RL Mode: NurseEmbed-powered text embeddings for language-conditioned agents.
Age-Aware Triage: Demographic parsing for accurate risk stratification.
A2A Protocol: Agent-to-Agent evaluation via AgentBeats platform.
Docker Deployment: Fully containerized for reproducibility.
Dual Mode: Runs as interactive demo (Gradio) or API server (A2A).
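
The safety-aware reward idea listed above can be sketched as an asymmetric penalty. This is a minimal illustration, not the environment's actual reward function: the name `triage_reward` and every numeric weight are assumptions chosen for the example.

```python
def triage_reward(true_category: int, assigned_category: int) -> float:
    """Toy asymmetric reward; all weights are illustrative assumptions.

    MTS categories run from 1 (Immediate) to 5 (Non-Urgent), so assigning a
    larger number than the true one means the patient was under-triaged.
    """
    for value in (true_category, assigned_category):
        if value not in range(1, 6):
            raise ValueError(f"MTS category must be 1-5, got {value}")

    gap = assigned_category - true_category
    if gap == 0:
        return 1.0                  # exact match
    if gap < 0:
        return -0.5 * abs(gap)      # over-triage: wasteful but safe
    penalty = -2.0 * gap            # under-triage: unsafe
    if true_category <= 2:
        penalty *= 2                # heavier still for critical patients
    return penalty


print(triage_reward(2, 2))  # exact match
print(triage_reward(3, 2))  # over-triage by one level
print(triage_reward(1, 3))  # under-triage of an Immediate patient
```

The point of the asymmetry is that an agent cannot offset a missed critical patient with a handful of easy correct answers.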

🚀 Quick Start

Run with Docker

# Pull the image
docker pull nursecitizendeveloper/nursesim-triage:latest

# Run in demo mode (Gradio UI)
docker run -p 7860:7860 nursecitizendeveloper/nursesim-triage:latest

# Run in A2A mode (API only)
docker run -e MODE=a2a -p 7860:7860 nursecitizendeveloper/nursesim-triage:latest

Test the A2A Endpoint

# Health check
curl https://nursecitizendeveloper-nursesim-triage-demo.hf.space/health

# Get agent card
curl https://nursecitizendeveloper-nursesim-triage-demo.hf.space/.well-known/agent-card.json

# Submit a task
curl -X POST https://nursecitizendeveloper-nursesim-triage-demo.hf.space/process-task \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "Chest pain",
    "vitals": {
      "heart_rate": 110,
      "blood_pressure": "90/60",
      "spo2": 94,
      "temperature": 37.2
    }
  }'
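
For callers who prefer Python over curl, the same task submission can be sketched with the standard library. The helper name `build_task_request` is hypothetical, and the response format is not documented here, so the sketch only builds the request and leaves sending it commented out.

```python
import json
import urllib.request

BASE_URL = "https://nursecitizendeveloper-nursesim-triage-demo.hf.space"


def build_task_request(complaint: str, vitals: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /process-task endpoint."""
    body = json.dumps({"complaint": complaint, "vitals": vitals}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/process-task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_task_request(
    "Chest pain",
    {"heart_rate": 110, "blood_pressure": "90/60", "spo2": 94, "temperature": 37.2},
)

# Sending is left commented out because the Space may be asleep or unreachable.
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read()))
```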

🏗️ Project Structure

NurseSim-RL/
├── nursesim_rl/           # Core environment package
│   ├── __init__.py
│   ├── TriageEnv.py       # Gymnasium environment
│   └── PatientGenerator.py # Synthetic patient generation
├── notebooks/
│   └── NurseSim_RL_Unsloth_Training.ipynb  # Training notebook
├── data/
│   ├── train.jsonl        # Training dataset (500 examples)
│   └── val.jsonl          # Validation dataset (100 examples)
├── app.py                 # Gradio demo application
├── Dockerfile             # For reproducibility
├── requirements.txt
└── README.md

🚀 Quick Start (From Source)

Installation

git clone https://github.com/NurseCitizenDeveloper/NurseSim-RL.git
cd NurseSim-RL
pip install -r requirements.txt

Using the Environment

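import gymnasium as gym
from nursesim_rl import TriageEnv  # noqa: F401 (kept so the package is loaded before gym.make)

# Create the environment by its registered ID and start an episode
env = gym.make("NurseSim-Triage-v0")
obs, info = env.reset()

# Agent takes an action: one triage decision for the current patient
action = {"triage_category": 2, "intervention": 1}  # category 2 = Very Urgent
obs, reward, terminated, truncated, info = env.step(action)

env.close()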
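
A full episode loop around `reset()` and `step()` might look like the sketch below. To keep it runnable without the package installed, it uses `ToyTriageEnv`, a hypothetical stand-in that only mimics the Gymnasium call signatures; its observation fields, rewards and one-patient episode length are assumptions, not the behaviour of the real `TriageEnv`.

```python
import random


class ToyTriageEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures.

    It is NOT the real TriageEnv: the observation fields, reward values and
    one-patient episode length are assumptions made for this example.
    """

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self._true_category = None

    def reset(self):
        self._true_category = self._rng.randint(1, 5)
        obs = {"complaint": "Chest pain", "heart_rate": 110, "spo2": 94}
        return obs, {}

    def step(self, action):
        correct = action["triage_category"] == self._true_category
        reward = 1.0 if correct else -1.0
        info = {"true_category": self._true_category}
        terminated, truncated = True, False  # one patient per episode
        return {}, reward, terminated, truncated, info


def run_episode(env, policy):
    """Run one episode and return the total reward."""
    obs, info = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total


def always_very_urgent(obs):
    # Same action shape as the snippet above: a dict with two integer fields.
    return {"triage_category": 2, "intervention": 1}


print(run_episode(ToyTriageEnv(seed=0), always_very_urgent))
```

Because `run_episode` only relies on the five-value `step()` return, the same loop should work with any Gymnasium-style environment that accepts this action dict.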

Running the Demo

Gradio Mode (Human UI):

export AGENT_MODE=gradio
export HF_TOKEN=your_hf_token_here
python app.py

AgentBeats A2A Mode (Platform Integration):

export AGENT_MODE=a2a
export HF_TOKEN=your_hf_token_here
python agent_main.py

🤖 AgentBeats Integration

This agent is fully compatible with the AgentBeats platform for automated agent evaluation via the Agent-to-Agent (A2A) protocol.

Dual-Mode Architecture

The agent supports two deployment modes:

| Mode | Purpose | Entry Point | Port |
|------|---------|-------------|------|
| Gradio | Human-facing UI for demos | `app.py` | 7860 |
| A2A | Platform integration for automated evaluation | `agent_main.py` | 8080 |

Set the mode via the AGENT_MODE environment variable.
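
As a rough sketch of how a launcher could act on `AGENT_MODE`, the snippet below maps each mode to the entry point and port from the table above. The function name `resolve_mode` is hypothetical, and treating `gradio` as the fallback when the variable is unset is an assumption.

```python
import os

# Entry points and ports copied from the table above.
MODES = {
    "gradio": {"entry_point": "app.py", "port": 7860},
    "a2a": {"entry_point": "agent_main.py", "port": 8080},
}


def resolve_mode(environ=os.environ) -> dict:
    """Return the launch settings for the mode named in AGENT_MODE."""
    # Falling back to "gradio" when the variable is unset is an assumption.
    mode = environ.get("AGENT_MODE", "gradio").strip().lower()
    if mode not in MODES:
        raise ValueError(f"AGENT_MODE must be one of {sorted(MODES)}, got {mode!r}")
    return {"mode": mode, **MODES[mode]}


print(resolve_mode({"AGENT_MODE": "a2a"}))
```

Failing loudly on an unknown value avoids silently starting the wrong server on the wrong port.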

A2A Protocol Compliance

Agent Card: .well-known/agent-card.json - Metadata and schemas
Task Processing: Structured input/output for triage assessments
Lifecycle Methods: reset(), health_check()
Protocol Version: A2A v1.0
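
To illustrate what structured task input can mean in practice, here is a minimal validation sketch for the `vitals` object used in this README's example requests. The `Vitals` dataclass, the `parse_vitals` helper and the range check are assumptions for illustration; the agent card remains the authoritative schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vitals:
    heart_rate: int
    systolic: int
    diastolic: int
    spo2: int
    temperature: float


def parse_vitals(raw: dict) -> Vitals:
    """Validate the 'vitals' object of a task payload.

    The field names match the example requests in this README; the range
    check is an illustrative assumption, not a documented rule.
    """
    # Blood pressure arrives as a "systolic/diastolic" string such as "90/60".
    systolic_text, _, diastolic_text = raw["blood_pressure"].partition("/")
    vitals = Vitals(
        heart_rate=int(raw["heart_rate"]),
        systolic=int(systolic_text),
        diastolic=int(diastolic_text),
        spo2=int(raw["spo2"]),
        temperature=float(raw["temperature"]),
    )
    if not 0 <= vitals.spo2 <= 100:
        raise ValueError(f"spo2 must be a percentage, got {vitals.spo2}")
    return vitals


task = {
    "complaint": "Chest pain",
    "vitals": {"heart_rate": 110, "blood_pressure": "90/60", "spo2": 94, "temperature": 37.2},
}
print(parse_vitals(task["vitals"]))
```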

Local Testing with AgentBeats Controller

# Install earthshaker SDK
pip install earthshaker

# Set environment variables
export HF_TOKEN=your_hf_token_here
export AGENT_MODE=a2a

# Run the controller
earthshaker run_ctrl

# Test the agent card endpoint (in another terminal)
curl http://localhost:8080/.well-known/agent-card.json | jq

# Submit a test task via A2A protocol
curl -X POST http://localhost:8080/task \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "Chest pain and shortness of breath",
    "vitals": {
      "heart_rate": 120,
      "blood_pressure": "85/55",
      "spo2": 89,
      "temperature": 37.8
    }
  }'

Docker Deployment

Build:

docker build -t nursesim-triage:latest .

Run in A2A Mode:

docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=a2a -p 8080:8080 nursesim-triage:latest

Run in Gradio Mode:

docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest

📊 Training Results & Validation

The agent was fine-tuned using Unsloth on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.

✅ Performance Metrics (Validated)

Evaluated on 15 Gold-Standard Clinical Scenarios using GPT-5.2 as a Clinical Judge.

| Metric | Value | Description |
|--------|-------|-------------|
| Accuracy | 60% | Exact match with Manchester Triage Categories (1-5) |
| Safety | 70%+ | Pass rate for critical life-threat detection (Sepsis, Anaphylaxis) |
| Training Loss | 0.19 | Final loss after 300 steps |
| Hardware | NVIDIA A100 | Google Colab |
| Training Time | 25 minutes | Using Unsloth QLoRA |

🧠 Key Methodology: Age-Aware Triage

Our validation showed that parsing age and gender from the patient description is critical for accurate risk stratification (e.g., distinguishing "Chest Pain" in a 72-year-old male, written 72M, from the same complaint in a 20-year-old male, written 20M). Once the model learned these demographic risk factors, accuracy improved from 16% to 60%.
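
A minimal sketch of the demographic parsing step, assuming descriptions carry compact tokens such as "72M" or "20F". The regular expression and the `parse_demographics` helper are illustrative and not taken from the project's code; real descriptions may need a more tolerant parser.

```python
import re
from typing import Optional, Tuple

# Matches compact tokens such as "72M" or "20F": an age followed by a sex code.
_DEMOGRAPHIC = re.compile(r"\b(\d{1,3})\s?([MF])\b", re.IGNORECASE)


def parse_demographics(description: str) -> Optional[Tuple[int, str]]:
    """Return (age, sex) from a description, or None if no token is found."""
    match = _DEMOGRAPHIC.search(description)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).upper()


print(parse_demographics("72M presenting with chest pain"))
print(parse_demographics("Chest pain, 20F, onset 2 hours ago"))
print(parse_demographics("Chest pain"))
```

Returning `None` rather than a default age keeps a missing token visible to whatever consumes the result.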

See our W&B Report for detailed training curves.

🩺 Clinical Framework: Manchester Triage System

| Category | Priority | Target Time | Example |
|----------|----------|-------------|---------|
| 1 | Immediate | 0 min | Cardiac arrest, Anaphylaxis |
| 2 | Very Urgent | 10 min | Chest pain, Stroke |
| 3 | Urgent | 60 min | Abdominal pain, Fractures |
| 4 | Standard | 120 min | Minor injuries, Mild illness |
| 5 | Non-Urgent | 240 min | Minor cuts, GP-suitable |
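
The table above translates directly into a small lookup structure. The sketch below encodes the five categories and their target times; the names `MTSCategory`, `TARGET_MINUTES`, `is_breached` and `is_under_triage` are hypothetical helpers, not part of the `nursesim_rl` package.

```python
from enum import IntEnum


class MTSCategory(IntEnum):
    IMMEDIATE = 1
    VERY_URGENT = 2
    URGENT = 3
    STANDARD = 4
    NON_URGENT = 5


# Target time in minutes for each category, as listed in the table above.
TARGET_MINUTES = {
    MTSCategory.IMMEDIATE: 0,
    MTSCategory.VERY_URGENT: 10,
    MTSCategory.URGENT: 60,
    MTSCategory.STANDARD: 120,
    MTSCategory.NON_URGENT: 240,
}


def is_breached(category: MTSCategory, minutes_waited: float) -> bool:
    """True when a patient has waited longer than the category's target."""
    return minutes_waited > TARGET_MINUTES[category]


def is_under_triage(true_category: MTSCategory, assigned: MTSCategory) -> bool:
    """True when the assigned category is less urgent than the true one."""
    return assigned > true_category


print(is_breached(MTSCategory.VERY_URGENT, 15))
print(is_under_triage(MTSCategory.IMMEDIATE, MTSCategory.URGENT))
```

Using `IntEnum` keeps the categories comparable as plain integers, so a larger value always means a less urgent patient.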

📚 Resources

Hugging Face Space: Try the Demo
Model Card: NurseSim-Triage-Llama-3.2-3B
Training Report: W&B Dashboard
Blog Post: Training AI Agents for Clinical Triage
AgentBeats Profile: NurseSim-Triage Benchmark
Leaderboard: Community Results
Docker Hub: nursecitizendeveloper/nursesim-triage

🤖 AgentBeats Integration

NurseSim-Triage implements the Agent-to-Agent (A2A) protocol for automated benchmarking:

Protocol Details

Version: a2a/v1.0
Agent Card: /.well-known/agent-card.json
Health Endpoint: /health
Task Endpoint: /process-task (POST)

Evaluation Metrics

Triage Accuracy (0-1): Fraction of correct MTS assignments
Safety Score (0-1): Penalizes dangerous under-triage
Response Quality (0-1): Clinical reasoning coherence
Response Time (ms): Computational efficiency
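
The first two metrics can be sketched as follows. The exact formulas used by the benchmark are not given here, so treating the safety score as the fraction of cases that were not under-triaged is an assumption, and the label lists are made-up toy data rather than results from this project.

```python
def _check(gold, predicted):
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal length")


def triage_accuracy(gold, predicted) -> float:
    """Fraction of cases where the predicted MTS category matches exactly."""
    _check(gold, predicted)
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def safety_score(gold, predicted) -> float:
    """Fraction of cases NOT under-triaged (assumed definition).

    A prediction is under-triage when it is a larger category number, i.e.
    less urgent, than the gold label.
    """
    _check(gold, predicted)
    return sum(p <= g for g, p in zip(gold, predicted)) / len(gold)


# Toy labels for illustration only.
gold = [1, 2, 3, 4, 5]
predicted = [1, 3, 3, 3, 5]

print(triage_accuracy(gold, predicted))
print(safety_score(gold, predicted))
```

In the toy data the second case is under-triaged (3 assigned, 2 expected) and the fourth is over-triaged, which lowers accuracy but not the safety score.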

Submit Your Agent

Register on AgentBeats
Implement the A2A protocol
Submit to NurseSim-Triage benchmark
View results on the leaderboard

🐳 Deployment

Hugging Face Spaces

Deployed on NVIDIA T4 (Medium) GPU with:

4-bit quantization (BitsAndBytesConfig)
Asynchronous model loading
Dual-mode support (Gradio + A2A)
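
Asynchronous model loading can be sketched with a background thread, so the web server answers health checks while the weights are still loading. `LazyModel` and `fake_loader` are hypothetical names; the placeholder loader stands in for the real, slow load of the 4-bit quantized model, and the Space's own implementation may differ.

```python
import threading
import time


class LazyModel:
    """Load a model on a background thread so the server can start at once."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None
        self._error = None
        self._ready = threading.Event()
        threading.Thread(target=self._load, daemon=True).start()

    def _load(self):
        try:
            self._model = self._loader()
        except Exception as exc:  # keep the error so get() can report it
            self._error = exc
        finally:
            self._ready.set()

    @property
    def ready(self) -> bool:
        return self._ready.is_set() and self._error is None

    def get(self, timeout=None):
        """Block until the model is loaded, then return it."""
        if not self._ready.wait(timeout):
            raise TimeoutError("model is still loading")
        if self._error is not None:
            raise RuntimeError("model failed to load") from self._error
        return self._model


def fake_loader():
    # Placeholder for the real, slow load of the 4-bit quantized model.
    time.sleep(0.1)
    return "model-placeholder"


model = LazyModel(fake_loader)
print("ready right after start:", model.ready)
print("loaded:", model.get(timeout=5))
```

A health endpoint could report `model.ready`, while the task endpoint calls `model.get()` with a timeout.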

Docker

# Build locally
docker build -t nursesim-triage .

# Run in demo mode
docker run -p 7860:7860 nursesim-triage

# Run in A2A mode
docker run -e MODE=a2a -p 7860:7860 nursesim-triage

Environment Variables

MODE: gradio (default) or a2a
HF_TOKEN: Hugging Face API token (for private models)
OMP_NUM_THREADS: OpenMP threads (auto-configured)

🏆 OpenEnv Challenge

This project was submitted to the OpenEnv Challenge 2026 (Berkeley RDI AgentX-AgentBeats Competition).

Key Contributions:

Novel benchmark for clinical AI evaluation
Safety-focused metrics (penalizes under-triage)
Open-source training pipeline
Reproducible Docker deployment
Community leaderboard

📄 License

MIT License - See LICENSE for details.

🙏 Acknowledgements

Mentors and Champions of Innovation:

Dr Clare Cable, Chief Executive, Burdett Trust for Nursing — For championing Relational Intelligence
Professor Joanne Bosanquet, Chief Executive, Foundation of Nursing Studies — For championing person-centred nursing
Professor Gemma Stacey, Programme Director, Nursing Now Challenge — For inspiring global nursing leadership
Aisha Holloway, Chief Nursing Officer, Scotland — For inspiring excellence
Josie Rudman MBE — Mutual Mentor & champion of nurse-led innovation

Research & Education Partners:

Kumbi Kariwo — Champion of AI equity and bias mitigation
Rohit Sagoo — Children's Nurse & Innovator in education and practice
Dr Hellena Habte-Asres — Big Data Researcher, Nurse & Innovator
Kelly Thobekile Ncube — Senior Lecturer in Adult Nursing (SFHEA) and Global Health Lecturer Volunteer Fellow

Technical Community:

OpenEnv Challenge — Berkeley RDI, PyTorch, Hugging Face, Unsloth
Manchester Triage System — Clinical framework
Unsloth AI — 2x faster fine-tuning
AgentBeats — A2A protocol infrastructure
NVIDIA — T4 GPU infrastructure

Built for the OpenEnv Challenge 2026 🏆
