---
tags:
- sentence-transformers
- embedding
- nursing
- clinical-nlp
- healthcare
- NHS
- medical
- triage
- NEWS2
language:
- en
license: apache-2.0
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: unsloth/embeddinggemma-300m
---
# 🏥 NurseEmbed-300M
A clinical embedding model fine-tuned for **NHS nursing terminology** and **medical Q&A retrieval**.
## Model Description
NurseEmbed-300M is based on **EmbeddingGemma-300M** and trained using a **two-stage hybrid approach**:
| Stage | Dataset | Samples | Focus |
|-------|---------|---------|-------|
| **Stage 1** | `tomaarsen/miriad-4.4M-split` | 10,000 | Medical Q&A from peer-reviewed biomedical literature |
| **Stage 2** | Custom NHS Dataset | 200 | Nursing shorthand, NEWS2 scores, clinical abbreviations |
## 📊 Evaluation Results
### Medical Domain (Information Retrieval)
| Metric | Score |
|--------|-------|
| **Accuracy@1** | **81.3%** |
| Accuracy@10 | **95.4%** |
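For readers unfamiliar with the metric, the sketch below shows how Accuracy@k is commonly computed in retrieval evaluation (for example by the `InformationRetrievalEvaluator` in sentence-transformers): a query counts as a hit if a relevant document appears among its top-k results. This card does not state which evaluator produced the scores above, so the function, the toy rankings, and the one-relevant-document-per-query setup are illustrative assumptions.

```python
def accuracy_at_k(ranked_ids, relevant_ids, k):
    """Fraction of queries whose relevant document appears in the top-k results."""
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("one ranking is required per query")
    hits = sum(
        1
        for ranking, relevant in zip(ranked_ids, relevant_ids)
        if relevant in ranking[:k]
    )
    return hits / len(relevant_ids)


# Toy data (hypothetical): three queries, each with exactly one relevant document.
ranked_ids = [
    ["d1", "d7", "d3"],  # relevant document d1 is ranked first
    ["d4", "d2", "d9"],  # relevant document d2 is ranked second
    ["d5", "d6", "d8"],  # relevant document d3 is not retrieved
]
relevant_ids = ["d1", "d2", "d3"]

acc1 = accuracy_at_k(ranked_ids, relevant_ids, k=1)
acc3 = accuracy_at_k(ranked_ids, relevant_ids, k=3)
print(f"Accuracy@1 = {acc1:.3f}, Accuracy@3 = {acc3:.3f}")
```

Under this definition Accuracy@10 can never be lower than Accuracy@1, which is consistent with the two scores reported in the table.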
### Real-World Nursing Shorthand Matching
| Nursing Shorthand | Matched Definition | Similarity |
|-------------------|-------------------|------------|
| `Pt c/o SOB` | Patient reporting Shortness of Breath / Dyspnoea | **0.460** ✅ |
| `NEWS2 score is 7` | Urgent response team review required | **0.242** ✅ |
| `Given Paracetamol 1g PO` | Medication administration: Analgesic / Antipyretic | **0.224** ✅ |
| `Plan: Refer to physio for NOF rehab` | Physiotherapy referral for Neck of Femur fracture rehabilitation | **0.582** ✅ |
**All four nursing shorthand queries (4/4) were correctly matched to their formal definitions.**
## Usage
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Load the model
model = SentenceTransformer("NurseCitizenDeveloper/NurseEmbed-300M")

# Encode nursing shorthand
queries = ["Pt c/o SOB", "NEWS2 score is 7", "NOF #"]
embeddings = model.encode(queries)

# Encode the candidate definitions
documents = [
    "Patient reporting Shortness of Breath",
    "Urgent response team review required",
    "Neck of Femur fracture",
]
doc_embeddings = model.encode(documents)

# Cosine similarity matrix: one row per query, one column per document
similarities = cosine_similarity(embeddings, doc_embeddings)
print(similarities)
```
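The similarity matrix from the usage example has one row per query and one column per document. A minimal sketch of turning such scores into a best match per query follows. It uses plain Python and made-up 3-dimensional vectors in place of real `model.encode(...)` output so that it runs without downloading the model; `cosine` and `best_matches` are hypothetical helper names, not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def best_matches(query_vecs, doc_vecs):
    """Return (doc_index, score) of the highest-scoring document for each query."""
    results = []
    for q in query_vecs:
        scores = [cosine(q, d) for d in doc_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


# Toy vectors standing in for real embeddings (hypothetical values).
query_vecs = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2], [0.0, 0.2, 0.9]]
doc_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

matches = best_matches(query_vecs, doc_vecs)
for query_idx, (doc_idx, score) in enumerate(matches):
    print(f"query {query_idx} -> document {doc_idx} (cosine {score:.3f})")
```

Picking the top-ranked document rather than applying a fixed score threshold is a deliberate choice in this sketch: the similarities reported in the evaluation table range from 0.224 to 0.582, so a single cutoff would be hard to set.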
## Training Details
### Stage 1: Medical Foundation
- **Dataset**: 10,000 medical Q&A pairs
- **Epochs**: 1
- **Batch Size**: 64
- **Learning Rate**: 2e-5
- **Scheduler**: Linear
### Stage 2: Nursing Specialization
- **Dataset**: 200 NHS nursing pairs (NEWS2, abbreviations, medications)
- **Epochs**: 3
- **Batch Size**: 32
- **Learning Rate**: 1e-5 (half the Stage 1 rate)
- **Scheduler**: Cosine
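To make the two configurations concrete, the sketch below works out how many optimizer steps each stage implies and how the learning rate would decay under a linear versus a cosine schedule. It assumes one optimizer step per batch, a kept final partial batch, no gradient accumulation, and no warmup. None of these are stated in this card, so the numbers are indicative only.

```python
import math


def total_steps(num_samples, batch_size, epochs):
    """Optimizer steps, assuming one step per batch and a kept final partial batch."""
    return math.ceil(num_samples / batch_size) * epochs


def linear_lr(step, total, peak_lr):
    """Linear decay from peak_lr at step 0 to 0 at the final step (no warmup)."""
    return peak_lr * (1 - step / total)


def cosine_lr(step, total, peak_lr):
    """Cosine decay from peak_lr at step 0 to 0 at the final step (no warmup)."""
    return peak_lr * 0.5 * (1 + math.cos(math.pi * step / total))


stage1_steps = total_steps(10_000, 64, epochs=1)
stage2_steps = total_steps(200, 32, epochs=3)
print(f"Stage 1: {stage1_steps} steps, Stage 2: {stage2_steps} steps")

# Learning rate at the start, midpoint, and end of each stage.
for name, schedule, steps, peak in [
    ("Stage 1 (linear)", linear_lr, stage1_steps, 2e-5),
    ("Stage 2 (cosine)", cosine_lr, stage2_steps, 1e-5),
]:
    points = [schedule(s, steps, peak) for s in (0, steps // 2, steps)]
    print(name, [f"{lr:.2e}" for lr in points])
```

Under these assumptions Stage 2 amounts to only a few dozen updates, which is worth keeping in mind when interpreting the nursing-specific results.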
### Training Data Examples
| Anchor (Nursing Shorthand) | Positive (Formal Definition) |
|---------------------------|------------------------------|
| `Early warning score 9` | `Patient requires Emergency call` |
| `Complaint: UTI` | `Patient reporting Urinary Tract Infection` |
| `Pt c/o SOB` | `Patient reporting Shortness of Breath / Dyspnoea` |
| `Pt has NEWS2 of 9` | `Clinical deterioration level: Critical risk - Sepsis potential` |
| `Score is 1 on NEWS2` | `Clinical deterioration level: Stable` |
| `Complaint: PU` | `Patient reporting Pressure Ulcer` |
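One way to store pairs like these for contrastive training is as anchor/positive records, one JSON object per line. The sketch below encodes the six rows of the table that way. The `anchor` and `positive` field names mirror the table headers and are an assumption, not the confirmed schema of the custom NHS dataset.

```python
import json

# The six example pairs from the table, as anchor/positive records.
# Field names are assumed for illustration; the real dataset schema is not published here.
pairs = [
    {"anchor": "Early warning score 9", "positive": "Patient requires Emergency call"},
    {"anchor": "Complaint: UTI", "positive": "Patient reporting Urinary Tract Infection"},
    {"anchor": "Pt c/o SOB", "positive": "Patient reporting Shortness of Breath / Dyspnoea"},
    {
        "anchor": "Pt has NEWS2 of 9",
        "positive": "Clinical deterioration level: Critical risk - Sepsis potential",
    },
    {"anchor": "Score is 1 on NEWS2", "positive": "Clinical deterioration level: Stable"},
    {"anchor": "Complaint: PU", "positive": "Patient reporting Pressure Ulcer"},
]

# Serialize as JSON Lines: one record per line.
jsonl = "\n".join(json.dumps(pair) for pair in pairs)
print(jsonl)

# Round-trip check: every line parses back into the original record.
restored = [json.loads(line) for line in jsonl.splitlines()]
```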
## Intended Use Cases
- 🔍 **Semantic search** for nursing documentation
- 🏷️ **FHIR code suggestion** (map free text → SNOMED/LOINC)
- 📋 **Clinical handover assistance** (translate shorthand to formal language)
- 🎓 **Nursing education** (teach abbreviation meanings)
- ⚠️ **NEWS2 interpretation** (map scores to clinical actions)
## Limitations
- Nursing specialization (Stage 2) relies on a small synthetic NHS nursing dataset (200 pairs)
- Best suited for UK/NHS clinical terminology
- Should be used as an assistive tool, not a replacement for clinical judgment
## Citation
```bibtex
@misc{nurseembed-300m,
author = {Lincoln Gombedza},
title = {NurseEmbed-300M: A Clinical Embedding Model for NHS Nursing Terminology},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/NurseCitizenDeveloper/NurseEmbed-300M}
}
```
## Author
Created by **Lincoln Gombedza** ([@NurseCitizenDeveloper](https://huggingface.co/NurseCitizenDeveloper))
- 🏥 Registered Learning Disability Nurse
- 🎓 Practice Educator
- 💻 Co-Chair, Digital & Technology Working Group (Professional Strategy for Nursing and Midwifery)
- 🚀 Founder, Nursing Citizen Development Movement
Part of the **OpenEnv Challenge** submission for nurse-led AI innovation.