GitHub Actions
committed on
Commit
·
42a4250
1
Parent(s):
322cfe6
Auto-sync from GitHub: 34c87b1
README.md
CHANGED
|
@@ -27,9 +27,10 @@ pinned: false
|
|
| 27 |
|
| 28 |
### Key Features
|
| 29 |
- **Gymnasium-Compatible:** Standard RL interface for easy integration.
|
| 30 |
-
- **
|
| 31 |
- **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
|
| 32 |
-
- **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA)
|
|
|
|
| 33 |
- **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
|
| 34 |
- **Docker Deployment:** Fully containerized for reproducibility.
|
| 35 |
- **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
|
|
@@ -200,16 +201,23 @@ docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=a2a -p 8080:8080 nursesim-triage:
|
|
| 200 |
docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
|
| 201 |
```
|
| 202 |
|
| 203 |
-
## Training Results
|
| 204 |
|
| 205 |
-
The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model
|
| 206 |
|
| 207 |
-
|
| 208 |
-
|
| 209 |
-
|
| 210 |
-
|
|
| 211 |
-
|
| 212 |
-
|
| 213 |
|
| 214 |
See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.
|
| 215 |
|
|
|
|
| 27 |
|
| 28 |
### Key Features
|
| 29 |
- **Gymnasium-Compatible:** Standard RL interface for easy integration.
|
| 30 |
+
- **Expanded Dataset:** Trained on **2,100+** synthetic patient scenarios across all 5 MTS categories.
|
| 31 |
- **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
|
| 32 |
+
- **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA) - **60% accuracy validated**.
|
| 33 |
+
- **Age-Aware Triage:** Demographic parsing for accurate risk stratification.
|
| 34 |
- **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
|
| 35 |
- **Docker Deployment:** Fully containerized for reproducibility.
|
| 36 |
- **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).
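
The "Safety-Aware Rewards" feature listed above can be illustrated with a minimal sketch. This is not the project's actual reward function: the name `triage_reward`, the penalty weights, and the critical-patient multiplier are all hypothetical, and the code assumes the usual Manchester Triage System ordering where category 1 is the most urgent and 5 the least.

```python
def triage_reward(
    predicted: int,
    actual: int,
    under_triage_penalty: float = 2.0,  # hypothetical weight
    over_triage_penalty: float = 0.5,   # hypothetical weight
) -> float:
    """Asymmetric reward sketch for MTS categories 1 (most urgent) to 5."""
    if not (1 <= predicted <= 5 and 1 <= actual <= 5):
        raise ValueError("MTS categories must be integers from 1 to 5")
    if predicted == actual:
        return 1.0
    gap = abs(predicted - actual)
    if predicted > actual:
        # Under-triage: the patient was rated less urgent than they are.
        # Assumption: categories 1 and 2 count as critical and are
        # penalized twice as heavily.
        critical_multiplier = 2.0 if actual <= 2 else 1.0
        return -under_triage_penalty * gap * critical_multiplier
    # Over-triage wastes resources but is the safer kind of error.
    return -over_triage_penalty * gap
```

With these illustrative weights, rating a category 1 patient as category 3 costs far more than the reverse mistake, which is the asymmetry the feature description implies.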
|
|
|
|
| 201 |
docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
|
| 202 |
```
|
| 203 |
|
| 204 |
+
## Training Results & Validation
|
| 205 |
|
| 206 |
+
The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.
|
| 207 |
|
| 208 |
+
### ✅ Performance Metrics (Validated)
|
| 209 |
+
Evaluated on 15 gold-standard clinical scenarios, with GPT-5.2 acting as the clinical judge.
|
| 210 |
+
|
| Metric | Value | Description |
|--------|-------|-------------|
| **Accuracy** | **60%** | Exact match with Manchester Triage Categories (1-5) |
| **Safety** | **70%+** | Pass Rate for critical life-threat detection (Sepsis, Anaphylaxis) |
| **Training Loss** | 0.19 | Final loss after 300 steps |
| **Hardware** | NVIDIA A100 | Google Colab |
| **Training Time** | 25 minutes | Using Unsloth QLoRA |
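
The accuracy and safety rows can be read as two simple ratios. The sketch below shows one plausible way to compute them. It is an assumption, not the project's evaluation code: the README states that an LLM judge produced the scores, whereas this version compares category labels directly, and the function names and the `critical_max` threshold are hypothetical.

```python
from typing import Optional, Sequence


def exact_match_accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of cases where the predicted MTS category equals the expected one."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def critical_detection_rate(
    predicted: Sequence[int],
    expected: Sequence[int],
    critical_max: int = 2,  # assumption: categories 1-2 count as critical
) -> Optional[float]:
    """Share of critical cases that were also predicted as critical."""
    critical = [(p, e) for p, e in zip(predicted, expected) if e <= critical_max]
    if not critical:
        return None  # no critical cases in this evaluation set
    return sum(p <= critical_max for p, _ in critical) / len(critical)


expected = [1, 2, 3, 4, 5]
predicted = [1, 3, 3, 4, 4]
print(exact_match_accuracy(predicted, expected))     # 0.6
print(critical_detection_rate(predicted, expected))  # 0.5
```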
|
| 218 |
+
|
| 219 |
+
### 🧠 Key Methodology: Age-Aware Triage
|
| 220 |
+
Our validation showed that **parsing age and gender** from the patient description is critical for accurate risk stratification (e.g., distinguishing chest pain in a 72-year-old male from the same complaint in a 20-year-old male). Once the model learned these demographic risk factors, accuracy improved from 16% to 60%.
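
As an illustration of the demographic parsing described here, the following sketch extracts age and sex from a free-text patient description. The README does not show how the project parses these fields, so the function name `parse_demographics`, the two regular expressions, and the supported input formats ("72M", "20-year-old woman") are assumptions made for this example.

```python
import re
from typing import Optional, Tuple

# Shorthand such as "72M" or "20 F" (uppercase sex marker only).
_SHORTHAND = re.compile(r"\b(\d{1,3})\s*([MF])\b")
# Long form such as "20-year-old woman" or "72 year old male".
_LONGFORM = re.compile(
    r"\b(\d{1,3})[- ]year[- ]old\s+(male|female|man|woman)\b", re.IGNORECASE
)


def parse_demographics(text: str) -> Optional[Tuple[int, str]]:
    """Return (age, sex) with sex as "M" or "F", or None if nothing matches."""
    match = _SHORTHAND.search(text)
    if match:
        return int(match.group(1)), match.group(2)
    match = _LONGFORM.search(text)
    if match:
        sex = "M" if match.group(2).lower() in ("male", "man") else "F"
        return int(match.group(1)), sex
    return None


print(parse_demographics("72M presenting with chest pain"))    # (72, 'M')
print(parse_demographics("20-year-old woman with chest pain"))  # (20, 'F')
```

A downstream triage prompt or feature vector could then include the parsed age explicitly instead of relying on the model to infer it from raw text.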
|
| 221 |
|
| 222 |
See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.
|
| 223 |
|