GitHub Actions committed on
Commit
42a4250
·
1 Parent(s): 322cfe6

🚀 Auto-sync from GitHub: 34c87b1

Files changed (1)
  1. README.md +18 -10
README.md CHANGED
@@ -27,9 +27,10 @@ pinned: false
27
 
28
  ### Key Features
29
  - **Gymnasium-Compatible:** Standard RL interface for easy integration.
30
- - **Realistic Scenarios:** 15+ patient archetypes across all 5 MTS categories.
31
  - **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
32
- - **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA).
 
33
  - **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
34
  - **Docker Deployment:** Fully containerized for reproducibility.
35
  - **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
@@ -200,16 +201,23 @@ docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=a2a -p 8080:8080 nursesim-triage:
200
  docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
201
  ```
202
 
203
- ## 📊 Training Results
204
 
205
- The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model:
206
 
207
- | Metric | Value |
208
- |--------|-------|
209
- | Final Loss | ~0.08 |
210
- | Training Steps | 100 |
211
- | Epochs | 6+ |
212
- | Hardware | NVIDIA A100 (Colab) |
 
 
 
 
 
 
 
213
 
214
  See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.
215
 
 
27
 
28
  ### Key Features
29
  - **Gymnasium-Compatible:** Standard RL interface for easy integration.
30
+ - **Expanded Dataset:** Trained on **2,100+** synthetic patient scenarios across all 5 MTS categories.
31
  - **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
32
+ - **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA) - **60% accuracy validated**.
33
+ - **Age-Aware Triage:** Demographic parsing for accurate risk stratification.
34
  - **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
35
  - **Docker Deployment:** Fully containerized for reproducibility.
36
  - **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
 
201
  docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
202
  ```
203
 
204
+ ## 📊 Training Results & Validation
205
 
206
+ The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.
207
 
208
+ ### ✅ Performance Metrics (Validated)
209
+ Evaluated on 15 Gold-Standard Clinical Scenarios using GPT-5.2 as a Clinical Judge.
210
+
211
+ | Metric | Value | Description |
212
+ |--------|-------|-------------|
213
+ | **Accuracy** | **60%** | Exact match with Manchester Triage Categories (1-5) |
214
+ | **Safety** | **70%+** | Pass Rate for critical life-threat detection (Sepsis, Anaphylaxis) |
215
+ | **Training Loss** | 0.19 | Final loss after 300 steps |
216
+ | **Hardware** | NVIDIA A100 | Google Colab |
217
+ | **Training Time** | 25 minutes | Using Unsloth QLoRA |
218
+
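Note on the metrics table above: the README does not show how Accuracy and Safety are computed, so the following is only a minimal sketch of one plausible reading. The `Result` record, the `critical` flag, and the rule that a critical case "passes" when the predicted category is at least as urgent as the gold one are all assumptions made for illustration, not the project's evaluation code. With 15 scenarios, 60% exact-match accuracy corresponds to 9 matching cases.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    expected: int   # gold-standard MTS category, 1 (most urgent) to 5
    predicted: int  # category returned by the agent
    critical: bool  # hypothetical flag for life-threat cases (e.g. sepsis)


def accuracy(results: List[Result]) -> float:
    """Fraction of cases where the predicted category equals the gold one."""
    return sum(r.expected == r.predicted for r in results) / len(results)


def safety_pass_rate(results: List[Result]) -> Optional[float]:
    """Fraction of critical cases that were not under-triaged.

    Assumption: a lower category number means higher urgency, so a case
    passes when predicted <= expected. Returns None without critical cases.
    """
    critical = [r for r in results if r.critical]
    if not critical:
        return None
    passed = sum(r.predicted <= r.expected for r in critical)
    return passed / len(critical)


# Toy data for illustration only; these are not the project's results.
results = [
    Result(expected=1, predicted=1, critical=True),   # exact match
    Result(expected=2, predicted=3, critical=True),   # under-triaged
    Result(expected=3, predicted=3, critical=False),  # exact match
    Result(expected=5, predicted=4, critical=False),  # over-triaged
]
print(accuracy(results), safety_pass_rate(results))
```

Keeping the two metrics separate reflects the table: a model can miss the exact category while still being safe, as long as it errs toward higher urgency on critical cases.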
219
+ ### 🧠 Key Methodology: Age-Aware Triage
220
+ Our validation revealed that **parsing Age and Gender** from the patient description is critical for accurate risk stratification (e.g., separating "Chest Pain" in a 72M vs 20M). The model effectively learned these demographic risk factors, improving accuracy from 16% to 60%.
221
 
222
  See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.
223
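
Note on the "Age-Aware Triage" paragraph in this diff: the change describes parsing age and gender from the patient description but does not show how. A minimal sketch of such a parser is given below. The function name `parse_demographics`, the two regular expressions, and the supported input shapes ("72M", "20-year-old female") are assumptions for illustration and are not taken from the repository.

```python
import re
from typing import Optional, Tuple

# Verbose form, e.g. "72-year-old male" or "20 year old woman".
_VERBOSE = re.compile(
    r"\b(\d{1,3})[\s-]*year[\s-]*old\b(?:\s+(male|female|man|woman)\b)?",
    re.IGNORECASE,
)
# Compact form, e.g. "72M" or "20 F".
_COMPACT = re.compile(r"\b(\d{1,3})\s*([MF])\b", re.IGNORECASE)

_GENDER_WORDS = {"male": "M", "man": "M", "female": "F", "woman": "F"}


def parse_demographics(text: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (age, gender) found in a free-text description.

    Either element is None when it cannot be found.
    """
    match = _VERBOSE.search(text)
    if match:
        word = (match.group(2) or "").lower()
        return int(match.group(1)), _GENDER_WORDS.get(word)
    match = _COMPACT.search(text)
    if match:
        return int(match.group(1)), match.group(2).upper()
    return None, None


print(parse_demographics("72M with chest pain"))
print(parse_demographics("20-year-old female, chest pain"))
print(parse_demographics("Chest pain, onset 1 hour ago"))
```

A structured `(age, gender)` pair like this could be placed in the prompt or observation so that "Chest Pain" in a 72M and in a 20M are presented as distinct cases. The compact pattern is deliberately loose and would also match text such as "3 m", so a real parser would need stricter context checks.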