---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
tags:
- SSI
- Health
- Surgery
- Infection
- AMR
- Automation
- Surveillance
- Epidemiology
- Clinical
- Raw
- Text
---
# Model Card: SSI-BERT-v1

**Surgical Site Infection Detection from Clinical Notes**

---

## Model Details

### Model Architecture
- **Base Model**: BERT (Bidirectional Encoder Representations from Transformers)
- **Model Name**: `bert-base-uncased`
- **HuggingFace ID**: `google-bert/bert-base-uncased`
- **Task**: Binary Classification (SSI Detection)
- **Fine-tuned**: Yes
- **Model Size**: 340 MB
- **Parameters**: 110M

### Training Configuration
- **Framework**: PyTorch 2.7.0+
- **GPU**: NVIDIA RTX 5070 Ti (16GB VRAM)
- **Precision**: BF16 (mixed precision)
- **Optimizer**: AdamW
- **Learning Rate**: 2e-5
- **Batch Size**: 32 (with gradient accumulation)
- **Epochs**: 3
- **Training Time**: ~5-6 hours
- **Training Date**: 2025-01-15

### Tokenizer
- **Type**: WordPiece
- **Vocabulary Size**: 30,522
- **Max Sequence Length**: 512 tokens
- **Special Tokens**: `[CLS]`, `[SEP]`, `[PAD]`, `[UNK]`

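
Because the maximum sequence length is 512 tokens, longer notes are cut off when `truncation=True` is used. One possible workaround, which this card does not describe the model as using, is to split the token ids into overlapping windows and score each window separately. The sketch below is a minimal illustration under that assumption; `chunk_token_ids` and its `stride` parameter are hypothetical names, and the ids 101 and 102 are the `[CLS]` and `[SEP]` ids of the `bert-base-uncased` vocabulary.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = 512,
    stride: int = 128,
    cls_id: int = 101,  # [CLS] id in the bert-base-uncased vocabulary
    sep_id: int = 102,  # [SEP] id in the bert-base-uncased vocabulary
) -> List[List[int]]:
    """Split WordPiece ids (without special tokens) into overlapping windows.

    Hypothetical helper: each window is wrapped as [CLS] ... [SEP], so the
    body of a window holds at most max_len - 2 ids, and consecutive windows
    share `stride` ids of context.
    """
    body_len = max_len - 2
    if body_len <= 0:
        raise ValueError("max_len must leave room for [CLS] and [SEP]")
    if not 0 <= stride < body_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len - 2")

    step = body_len - stride
    chunks: List[List[int]] = []
    start = 0
    while True:
        body = token_ids[start:start + body_len]
        chunks.append([cls_id] + body + [sep_id])
        if start + body_len >= len(token_ids):
            break  # this window reached the end of the note
        start += step
    return chunks


# Example with 1,200 placeholder ids standing in for a long tokenized note
windows = chunk_token_ids(list(range(1000, 2200)))
print(len(windows), [len(w) for w in windows])
```
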

---

## Intended Use

### Primary Use Case
Epidemiological surveillance of surgical site infections (SSI) from clinical notes in healthcare systems. This model is designed for **monitoring and trend detection**, not clinical decision support.

### Use Context
- Post-operative clinical notes (0-30 days post-surgery)
- Batch processing of clinical documentation
- Surveillance alert generation
- Procedure-specific SSI rate tracking

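
As a rough illustration of procedure-specific tracking, the sketch below aggregates per-note probabilities into a flag rate per procedure. The function name, the input shape (pairs of procedure name and probability), and the `>=` comparison are assumptions made for this example; the default of 0.45 mirrors the surveillance threshold listed in this card. The result is the share of flagged notes, not a confirmed SSI rate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def flag_rate_by_procedure(
    predictions: Iterable[Tuple[str, float]],
    threshold: float = 0.45,
) -> Dict[str, dict]:
    """Count notes and flagged notes per procedure (hypothetical helper)."""
    counts = defaultdict(lambda: {"notes": 0, "flagged": 0})
    for procedure, probability in predictions:
        counts[procedure]["notes"] += 1
        if probability >= threshold:  # assumed comparison rule
            counts[procedure]["flagged"] += 1
    # Add the share of flagged notes for each procedure
    return {
        procedure: {**c, "flag_rate": c["flagged"] / c["notes"]}
        for procedure, c in counts.items()
    }


# Made-up scores for illustration only
sample = [
    ("colectomy", 0.82),
    ("colectomy", 0.10),
    ("hip_replacement", 0.47),
    ("hip_replacement", 0.05),
    ("hip_replacement", 0.30),
    ("hip_replacement", 0.12),
]
for procedure, stats in flag_rate_by_procedure(sample).items():
    print(procedure, stats)
```

Flagged notes would still go to clinician review before being counted as cases.
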
### Appropriate Uses
- ✓ Identifying potential SSI cases for further review
- ✓ Tracking SSI incidence trends across departments/procedures
- ✓ Flagging high-risk cases for clinician review
- ✓ Epidemiological research and surveillance

### Inappropriate Uses
- ✗ Standalone clinical diagnosis
- ✗ Real-time patient triage decisions
- ✗ Treatment recommendations
- ✗ Automated patient management without human review

---

## Performance Metrics

### Validation Results (Synthetic Data)
- **Accuracy**: 0.8900
- **Precision**: 0.8500
- **Recall (Sensitivity)**: 0.8800
- **Specificity**: ~0.8500
- **F1 Score**: 0.8650
- **AUC-ROC**: 0.9200
- **Dataset Size**: 200,000 test samples

### Performance Notes
- Metrics were calculated on a held-out test set of synthetic data
- Real-world performance is expected to vary (±5-10%)
- The model is optimized for recall (catching SSI cases)
- A 12% false negative rate (1 - recall) is considered acceptable for surveillance use
- A 15% false positive rate (1 - specificity) is considered manageable with clinician review

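
For reference, the quantities above are related through the standard confusion-matrix definitions. The sketch below computes them from raw counts; the function name is hypothetical and the counts in the example are made up for illustration, not the model's actual confusion matrix.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        "false_negative_rate": 1 - recall,
        "false_positive_rate": 1 - specificity,
    }


# Illustrative counts only
for name, value in classification_metrics(tp=88, fp=15, tn=85, fn=12).items():
    print(f"{name}: {value:.4f}")
```
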
### Threshold Analysis
- **Default Threshold**: 0.50
- **Surveillance Threshold**: 0.45 (optimized for sensitivity)
- **Conservative Threshold**: 0.60 (high precision)

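
This card does not describe how the thresholds were derived. One simple way to pick a sensitivity-oriented threshold on a validation set is to take the highest cut-off that still reaches a target recall. The sketch below illustrates that idea; `pick_threshold`, the `>=` decision rule, and the toy data are assumptions for this example.

```python
from typing import Sequence


def pick_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    target_recall: float = 0.90,
) -> float:
    """Return the highest threshold whose recall meets target_recall.

    A note counts as flagged when its probability is >= the threshold.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("validation set contains no positive cases")
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")

    # Try candidate thresholds from strictest to most permissive
    for t in sorted(set(probs), reverse=True):
        true_pos = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        if true_pos / positives >= target_recall:
            return t
    return min(probs)  # not reached: the lowest score always gives recall 1.0


# Toy validation scores and labels for illustration
val_probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
val_labels = [1, 1, 0, 1, 0, 0]
print(pick_threshold(val_probs, val_labels, target_recall=1.0))
```
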

---

## Training Data

### Data Source
**Synthetic clinical notes generated for validation purposes**

### Data Composition
- **Total Training Samples**: 1,000,000
- **SSI Cases (Positive)**: 150,000 (15%)
- **Normal Cases (Negative)**: 850,000 (85%)
- **Train/Val/Test Split**: 70% / 15% / 15%

### Data Characteristics
- Post-operative clinical notes (0-30 days post-surgery)
- 12 surgical procedure types represented
- Clinical terminology and medical abbreviations
- Vital signs and clinical findings
- Note length: 300-2000 characters (avg 800)

### Limitations
- **Synthetic Generation**: Notes generated using templates
- **Not Trained on Real Clinical Data**: Performance on real clinical notes may differ
- **English Only**: No multi-language support
- **US Healthcare Context**: Terminology based on US clinical practice

---

## Limitations and Bias

### Known Limitations
1. **Domain Shift**: Trained on synthetic data; real clinical text may have different patterns
2. **Class Imbalance**: Model trained with 15% SSI prevalence (real-world prevalence is roughly 5-10%)
3. **Language**: English-only, US healthcare context
4. **Temporal Bias**: No temporal ordering in training (shuffled data)
5. **Procedure Coverage**: Limited to 12 procedure types
6. **Post-operative Window**: Optimized for 0-30 days post-op only

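
The prevalence gap in item 2 matters because the share of flagged notes that are true SSI cases (positive predictive value) depends on prevalence even when sensitivity and specificity stay fixed. The sketch below applies Bayes' rule to show that dependence. It takes sensitivity 0.88 and the approximate specificity 0.85 quoted in this card as inputs and assumes, hypothetically, that both carry over unchanged to other populations.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Bayes' rule: P(SSI | flagged) for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Training prevalence (15%) versus the real-world range quoted above
for prevalence in (0.15, 0.10, 0.05):
    ppv = positive_predictive_value(0.88, 0.85, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

Lower prevalence means more flagged notes per confirmed case, which adds to the review workload.
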
### Potential Biases
- **Clinical Documentation Style**: Model may perform differently across hospitals with different documentation practices
- **Terminology Variation**: May struggle with rare/novel clinical abbreviations
- **Provider Bias**: Performance may vary by note author/department

### Generalization
- Not validated on external datasets
- Expected performance drop on out-of-distribution data
- Requires validation on real clinical data before deployment

---

## Ethical Considerations

### Fairness
- Model developed for epidemiological surveillance, not individual diagnosis
- Not intended for resource allocation decisions
- Should not be sole factor in clinical decisions

### Transparency
- Decision threshold can be adjusted for sensitivity/specificity trade-off
- Model provides probability scores for human interpretation
- Predictions should always be reviewed by clinicians

### Safety
- Model designed as surveillance tool, not clinical decision support
- Includes explicit warnings against standalone clinical use
- Requires human-in-the-loop for alert validation

### Privacy
- Model does not store patient data
- De-identified text input only
- No identifiable information in model outputs

---

## Model Inputs and Outputs

### Input
```json
{
  "text": "Clinical note text here...",
  "threshold": 0.5
}
```

### Output
```json
{
  "ssi_probability": 0.8234,
  "label": 1,
  "prediction": "SSI",
  "threshold": 0.5,
  "timestamp": "2025-01-15T16:04:43"
}
```

### Input Constraints
- Text length: 50-10,000 characters
- Language: English only
- Format: Plain text clinical notes
- Context: Post-operative (0-30 days)

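
The length and format constraints above can be checked before a note is sent to the model. The sketch below is a minimal illustration; `validate_note` is a hypothetical helper, and measuring length after stripping surrounding whitespace is an assumption. Language and post-operative context cannot be verified this way and are left to the caller.

```python
def validate_note(text: str, min_chars: int = 50, max_chars: int = 10_000) -> str:
    """Return the stripped note, or raise if it violates the length limits."""
    if not isinstance(text, str):
        raise TypeError("note must be a plain-text string")
    cleaned = text.strip()
    if len(cleaned) < min_chars:
        raise ValueError(f"note too short: {len(cleaned)} < {min_chars} characters")
    if len(cleaned) > max_chars:
        raise ValueError(f"note too long: {len(cleaned)} > {max_chars} characters")
    return cleaned


# Example: a short fragment is rejected
try:
    validate_note("Wound clean.")
except ValueError as err:
    print(err)
```
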
### Output Interpretation
- `ssi_probability` (0-1): Model-estimated probability of SSI (softmax score for the positive class)
- `label` (0 or 1): Binary classification at the chosen `threshold`
- `prediction`: Human-readable class label
- Scores <0.4: Likely negative
- Scores 0.4-0.6: Uncertain (requires review)
- Scores >0.6: Likely positive

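
The interpretation bands can be combined with the binary decision in a small helper. The sketch below is illustrative only: `interpret_score`, the extra `band` field, the `"No SSI"` label string, and the `>=` decision rule are assumptions and not part of the documented output schema.

```python
def interpret_score(probability: float, threshold: float = 0.5) -> dict:
    """Map an SSI probability to a label and an interpretation band."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")

    if probability < 0.4:
        band = "likely negative"
    elif probability <= 0.6:
        band = "uncertain (requires review)"
    else:
        band = "likely positive"

    label = int(probability >= threshold)  # assumed decision rule
    return {
        "ssi_probability": round(probability, 4),
        "label": label,
        "prediction": "SSI" if label else "No SSI",  # negative label text is assumed
        "threshold": threshold,
        "band": band,  # hypothetical extra field
    }


print(interpret_score(0.8234))
print(interpret_score(0.45, threshold=0.45))
```
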

---

## Training and Evaluation

### Training Parameters
```yaml
Model: bert-base-uncased
Optimizer: adamw_torch
Learning Rate: 2e-5
Batch Size: 32
Gradient Accumulation: 2
Gradient Checkpointing: True
Mixed Precision: BF16
Warmup Steps: 100
Max Grad Norm: 1.0
Weight Decay: 0.01
```

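
To see what these settings imply for the optimizer schedule, the arithmetic can be worked through as below. This is a sketch under two assumptions that the card does not confirm: that `Batch Size: 32` is the per-step batch (so gradient accumulation of 2 gives an effective batch of 64), and that the training split is 70% of the 1,000,000 samples. The exact step count reported by a training framework may differ slightly depending on how it rounds the last partial batch.

```python
import math


def schedule_summary(
    train_samples: int = 700_000,  # assumed: 70% of 1,000,000
    per_step_batch: int = 32,      # assumed to be the per-step batch size
    grad_accum: int = 2,
    epochs: int = 3,
    warmup_steps: int = 100,
) -> dict:
    """Derive effective batch size and optimizer-update counts."""
    effective_batch = per_step_batch * grad_accum
    updates_per_epoch = math.ceil(train_samples / effective_batch)
    total_updates = updates_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "warmup_fraction": warmup_steps / total_updates,
    }


print(schedule_summary())
```

Under these assumptions, the 100 warmup steps cover well under 1% of all optimizer updates.
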
### Evaluation Methodology
- Stratified train/val/test split (70/15/15)
- Class-weighted metrics due to imbalance
- Threshold optimization on validation set
- Held-out test set evaluation

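
A stratified split keeps the SSI prevalence the same in each partition. The sketch below shows one way to do this with the standard library; it is not the project's actual splitting code, and `stratified_split`, the seed, and the toy label list are assumptions for illustration.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int],
    fractions: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Return train/val/test index lists with per-class proportions preserved."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy data with 15% positives, matching the class balance in this card
labels = [1] * 300 + [0] * 1700
train_idx, val_idx, test_idx = stratified_split(labels)
for name, idx in (("train", train_idx), ("val", val_idx), ("test", test_idx)):
    print(name, len(idx), sum(labels[i] for i in idx) / len(idx))
```
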
### Hardware
- GPU: NVIDIA RTX 5070 Ti (16GB VRAM)
- CPU: Multi-core processor
- RAM: 16GB+
- Storage: 1GB for model + dependencies

---

## Model Versioning

### Version: 1.0.0
- **Release Date**: 2025-01-15
- **Status**: Beta/Prototype
- **Base Model**: bert-base-uncased
- **Training Epochs**: 3
- **Data**: Synthetic (1M samples)
- **Validation**: Synthetic test set

### Future Versions
- v1.1: Expected after real clinical data validation
- v2.0: Planned with ClinicalBERT base model
- v2.1: Multi-procedure optimization

---

## How to Use

### Installation
```bash
pip install transformers torch
```

### Local Inference
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Directory containing the fine-tuned checkpoint and tokenizer files
model_path = "output/models/ssi-bert-pipeline/initial/final"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path)
model.eval()  # make sure dropout is disabled for inference

text = "Clinical note here..."
# Notes longer than 512 tokens are truncated
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    # Index 1 of the softmax output is the SSI (positive) class
    probability = torch.softmax(outputs.logits, dim=1)[0, 1].item()
```

### Via API
```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text":"Clinical note text", "threshold":0.5}'
```

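
The same request can be sent from Python using only the standard library. The sketch below assumes the endpoint and payload shown in the `curl` example and assumes the response body is JSON in the shape given under Output; `build_predict_request` and `predict` are hypothetical helper names.

```python
import json
import urllib.request


def build_predict_request(
    text: str,
    threshold: float = 0.5,
    url: str = "http://localhost:8000/predict",
) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = json.dumps({"text": text, "threshold": threshold}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, threshold: float = 0.5, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_predict_request(text, threshold)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request locally; call predict(...) only when the API is running
req = build_predict_request("Clinical note text", threshold=0.45)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```
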
### Batch Processing
```bash
python cli.py monitor \
  --model-path output/models/ssi-bert-pipeline/initial/final \
  --data data/clinical_notes.csv \
  --period january_2024 \
  --save-predictions
```

---

## Citation

If you use this model, please cite:

```bibtex
@misc{ssi_bert_v1_2025,
  title={SSI-BERT-v1: BERT-based Surgical Site Infection Detection},
  author={Daryn Sutton/Ch3DS},
  year={2025},
  month={January},
  note={Trained on synthetic clinical notes for epidemiological surveillance}
}
```

---

## License

**Model**: Apache 2.0 (inherited from BERT-base-uncased)
**Documentation**: CC-BY-4.0

---

## Changelog

### Version 1.0.0 (2025-01-15)
- Initial release
- Trained on 1M synthetic clinical notes
- Validated on 200k test samples
- Performance: 89% accuracy, 0.92 AUC-ROC

---

## Contact and Support

For questions or issues:
- Documentation: See README.md
- Issue Tracking: github.com/Ch3w3y/SSIBERT
- Email: darynsutton@hotmail.com

---

## Disclaimer

This model is provided for **research and surveillance purposes only**. It is not intended for clinical diagnosis or treatment decisions. Always consult qualified healthcare professionals for clinical decisions. The developers assume no liability for misuse or unintended consequences.

---

**Model Card Last Updated**: 2025-01-15
**Model Version**: 1.0.0
**Status**: Beta (Pre-production)