Spaces: NurseCitizenDeveloper/NurseSim-Triage-Demo

GitHub Actions committed 7 days ago

Commit: 42a4250

Parent: 322cfe6

🚀 Auto-sync from GitHub: 34c87b1

Files changed (1)

README.md +18 -10

@@ -27,9 +27,10 @@ pinned: false
 ### Key Features
 - **Gymnasium-Compatible:** Standard RL interface for easy integration.
-- **Realistic Scenarios:** 15+ patient archetypes across all 5 MTS categories.
 - **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
-- **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA).
 - **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
 - **Docker Deployment:** Fully containerized for reproducibility.
 - **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
@@ -200,16 +201,23 @@ docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=a2a -p 8080:8080 nursesim-triage:
 docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
 ```
-## 📊 Training Results
-The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model:
-| Metric | Value |
-|--------|-------|
-| Final Loss | ~0.08 |
-| Training Steps | 100 |
-| Epochs | 6+ |
-| Hardware | NVIDIA A100 (Colab) |
 See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.

 ### Key Features
 - **Gymnasium-Compatible:** Standard RL interface for easy integration.
+- **Expanded Dataset:** Trained on **2,100+** synthetic patient scenarios across all 5 MTS categories.
 - **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
+- **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA) - **60% accuracy validated**.
+- **Age-Aware Triage:** Demographic parsing for accurate risk stratification.
 - **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
 - **Docker Deployment:** Fully containerized for reproducibility.
 - **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
 docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
 ```
+## 📊 Training Results & Validation
+The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.
+### ✅ Performance Metrics (Validated)
+Evaluated on 15 Gold-Standard Clinical Scenarios using GPT-5.2 as a Clinical Judge.
+| Metric | Value | Description |
+|--------|-------|-------------|
+| **Accuracy** | **60%** | Exact match with Manchester Triage Categories (1-5) |
+| **Safety** | **70%+** | Pass Rate for critical life-threat detection (Sepsis, Anaphylaxis) |
+| **Training Loss** | 0.19 | Final loss after 300 steps |
+| **Hardware** | NVIDIA A100 | Google Colab |
+| **Training Time** | 25 minutes | Using Unsloth QLoRA |
+### 🧠 Key Methodology: Age-Aware Triage
+Our validation revealed that **parsing Age and Gender** from the patient description is critical for accurate risk stratification (e.g., separating "Chest Pain" in a 72M vs 20M). The model effectively learned these demographic risk factors, improving accuracy from 16% to 60%.
 See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.
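
The "Age-Aware Triage" note added in this commit describes parsing age and gender from the patient description (the "72M" vs "20M" example). The diff does not show how that parsing is implemented, so the following is only a minimal sketch of the idea. It assumes the shorthand is an age followed by `M` or `F`; the function name `parse_demographics` and the regular expression are hypothetical and are not taken from the NurseSim code.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern (an assumption, not the project's code):
# 1-3 digits, optional whitespace, then a standalone M or F, e.g. "72M".
_DEMOGRAPHIC = re.compile(r"\b(\d{1,3})\s*([MF])\b", re.IGNORECASE)


def parse_demographics(description: str) -> Optional[Tuple[int, str]]:
    """Return (age, sex) from shorthand such as '72M', or None if absent."""
    match = _DEMOGRAPHIC.search(description)
    if match is None:
        return None
    # Normalise the sex marker to upper case so "20 f" and "20F" agree.
    return int(match.group(1)), match.group(2).upper()


if __name__ == "__main__":
    for text in ("72M, central chest pain", "20 f with ankle injury", "chest pain"):
        print(text, "->", parse_demographics(text))
```

A triage agent could pass the parsed tuple alongside the free-text complaint so that the same presenting symptom is assessed against the patient's age and sex. Real descriptions would likely need broader patterns (for example "72-year-old male"), which this sketch does not attempt.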