---
license: mit
library_name: pytorch
pipeline_tag: image-classification
tags:
- sngp
- uncertainty-estimation
- out-of-distribution-detection
- biomedical-imaging
- digital-pathology
- histopathology
- model-calibration
- reliable-ai
datasets:
- acevedo2020
- jung2022
- tang2019
- wong2022
- kather2016
- kather2018
---

# SNGP Models for Uncertainty-Aware Biomedical Image Classification

## Model Details

### Model Description

This repository contains trained Spectral-normalized Neural Gaussian Process (SNGP) models for uncertainty-aware classification of biomedical images, including white blood cells, amyloid plaques, and colorectal histopathology.

SNGP augments a standard deep neural network in two ways: it applies spectral normalization to the hidden-layer weights and replaces the final dense layer with a Gaussian process layer. Together these changes improve uncertainty estimation and out-of-distribution (OOD) detection while requiring only a single forward pass.
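
The spectral-normalization step can be illustrated with a small NumPy sketch. This is not the code used in this repository: the function name `spectral_normalize`, the `norm_bound` parameter, and the iteration count are assumptions made for illustration. The sketch estimates a weight matrix's largest singular value by power iteration and rescales the matrix only when that value exceeds the bound.

```python
import numpy as np


def spectral_normalize(weight, norm_bound=1.0, n_iter=50, seed=0):
    """Rescale `weight` so its spectral norm is at most `norm_bound`.

    Illustrative sketch only; names and defaults are assumptions.
    Returns the rescaled matrix and the estimated spectral norm.
    """
    weight = np.asarray(weight, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.normal(size=weight.shape[0])
    u /= np.linalg.norm(u)
    # Power iteration: alternate between right and left singular vectors.
    for _ in range(n_iter):
        v = weight.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = weight @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = float(u @ weight @ v)  # estimated largest singular value
    # Shrink only when the norm exceeds the bound; never scale up.
    scale = max(1.0, sigma / norm_bound)
    return weight / scale, sigma
```

Bounding the spectral norm limits how much a layer can stretch or collapse distances between inputs, which is what lets the Gaussian process layer treat distance in feature space as a signal of uncertainty.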

- **Developed by:** Uma Meleti, Jeffrey J. Nirschl  
- **Affiliation:** University of Wisconsin-Madison  
- **Model type:** Convolutional neural network (ResNet18 backbone) with SNGP head  
- **License:** MIT  
- **Paper:** https://arxiv.org/abs/2602.02370  
- **Repository:** https://github.com/nirschl-lab/sngp_core

---

## How to Get Started with the Model
Load pretrained SNGP models from the Hugging Face Hub using the provided inference utilities.

### Installation
#### Clone repo and install
```bash
# Clone repository
git clone https://github.com/nirschl-lab/sngp_core
cd sngp_core

# Install uv
curl -Ls https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync
```

### Python API
#### SNGP inference with uncertainty quantification
```python
import torch
from scripts.example_inference import quick_sngp_inference

# Create input batch [batch_size, channels, height, width]
batch = torch.randn(4, 3, 224, 224)

# Load model from Hugging Face Hub and run inference
results = quick_sngp_inference(
    "wong_sngp_resnet18",
    batch,
    device="cuda"  # or "cpu"
)

# Outputs:
# - results["logits"]: Raw model outputs
# - results["predictions"]: Predicted class indices
# - results["confidence"]: Prediction confidence scores
# - results["variance"]: Uncertainty estimates
# - results["probabilities"]: Class probabilities

print(f"Predictions: {results['predictions'].tolist()}")
print(f"Confidence: {results['confidence'].tolist()}")
print(f"Uncertainty (variance): {results['variance'].tolist()}")
```

#### Baseline inference (deterministic)
```python
import torch
from scripts.example_inference import quick_baseline_inference

batch = torch.randn(4, 3, 224, 224)

results = quick_baseline_inference(
    "wong_baseline_resnet18",
    batch,
    device="cuda"  # or "cpu"
)

print(f"Predictions: {results['predictions'].tolist()}")
print(f"Confidence: {results['confidence'].tolist()}")
```

---

## Uses

### Direct Use
- Image classification in biomedical imaging datasets
- Estimation of predictive uncertainty via entropy/logit-based measures
- Detection of out-of-distribution (OOD) samples in medical imaging workflows
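
As a rough illustration of the entropy- and logit-based measures listed above, the sketch below computes two common uncertainty scores with NumPy. The helper names `predictive_entropy` and `max_logit_score` are hypothetical and not part of this repository. The inputs are assumed to be arrays shaped `[batch_size, num_classes]`, for example `results["probabilities"]` and `results["logits"]` converted with `.cpu().numpy()`.

```python
import numpy as np


def predictive_entropy(probabilities, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(np.asarray(probabilities, dtype=float), eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)


def max_logit_score(logits):
    """Negative maximum logit per row: a higher score means less certain."""
    return -np.max(np.asarray(logits, dtype=float), axis=-1)


# Toy example: a uniform prediction versus a confident one.
probs = np.array([[0.25, 0.25, 0.25, 0.25],
                  [0.97, 0.01, 0.01, 0.01]])
entropy = predictive_entropy(probs)
print(entropy)  # the first (uniform) row has the larger entropy
```

Both scores increase with uncertainty, so either can be thresholded or ranked directly when flagging candidate OOD samples.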

### Downstream Use
- Integration into clinical decision-support pipelines (research only)
- Benchmarking uncertainty estimation methods (SNGP vs. MC Dropout vs. deterministic baselines)
- Domain shift detection across institutions or datasets
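
One common way to benchmark OOD or domain-shift detection is the area under the ROC curve (AUROC) of an uncertainty score. The sketch below is a minimal NumPy version, assuming you already have one uncertainty score per sample for an in-distribution set and a shifted set. The function name `ood_auroc` is hypothetical, and this is not necessarily the evaluation protocol used in the paper.

```python
import numpy as np


def ood_auroc(scores_in, scores_out):
    """AUROC for separating OOD from in-distribution samples.

    Equals the probability that a random OOD sample receives a higher
    uncertainty score than a random in-distribution sample (ties count 0.5).
    """
    scores_in = np.asarray(scores_in, dtype=float)
    scores_out = np.asarray(scores_out, dtype=float)
    # Pairwise comparison of every OOD score against every in-distribution
    # score; memory grows with len(scores_out) * len(scores_in).
    diff = scores_out[:, None] - scores_in[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

An AUROC of 0.5 means the score cannot distinguish the two sets, while values near 1.0 mean shifted samples are consistently ranked as more uncertain.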

### Out-of-Scope Use
- Clinical diagnosis without expert oversight  
- Deployment in safety-critical settings without validation  
- Use on imaging modalities or domains not represented in training data  

---

## Bias, Risks, and Limitations

### Limitations
- Performance depends on how closely the target data match the training domain (scanner, staining, preprocessing)
- OOD detection is not guaranteed to capture all distribution shifts
- Models were trained on a limited set of public datasets and may not generalize to all populations

### Risks
- Misinterpretation of uncertainty estimates as calibrated probabilities
- Overconfident predictions on near-OOD samples
- Dataset-specific biases (e.g., acquisition site, staining protocols)
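
To check whether confidence scores can be read as probabilities on your own data, one option is to measure the expected calibration error (ECE) on a labeled validation set. The sketch below is a standard equal-width-bin ECE in NumPy. The function name and the default of 10 bins are assumptions, and `confidence` is assumed to hold values in `[0, 1]` such as `results["confidence"]`.

```python
import numpy as np


def expected_calibration_error(confidence, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)  # 1.0 if prediction was right
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each sample to a bin using the interior edges only, so that a
    # confidence of exactly 1.0 falls in the last bin.
    bin_ids = np.clip(np.digitize(confidence, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return float(ece)
```

A large ECE on local data suggests the confidence values should be recalibrated, or treated only as a ranking signal, before any threshold is attached to them.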

### Recommendations
- Always use with a human in the loop (e.g., pathologist review)
- Validate on local institutional data before deployment
- Use uncertainty thresholds conservatively for rejection
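
The rejection recommendation can be sketched as follows, assuming per-sample uncertainty scores (for example `results["variance"]` or an entropy) are available as arrays. The helper names, the `defer_label` value of `-1`, and the 80% coverage target are all hypothetical choices for illustration, not settings from this repository.

```python
import numpy as np


def threshold_for_coverage(val_uncertainty, target_coverage=0.8):
    """Threshold that accepts roughly `target_coverage` of validation cases."""
    val_uncertainty = np.asarray(val_uncertainty, dtype=float)
    return float(np.quantile(val_uncertainty, target_coverage))


def defer_uncertain(predictions, uncertainty, threshold, defer_label=-1):
    """Replace predictions whose uncertainty exceeds `threshold` with `defer_label`."""
    predictions = np.asarray(predictions)
    uncertainty = np.asarray(uncertainty, dtype=float)
    return np.where(uncertainty > threshold, defer_label, predictions)


# Toy example: pick a threshold on validation data, then apply it.
threshold = threshold_for_coverage([0.1, 0.2, 0.3, 0.4, 0.5])
decisions = defer_uncertain([2, 0, 1], [0.05, 0.9, 0.3], threshold)
print(decisions)  # entries equal to -1 are deferred to expert review
```

Choosing a lower `target_coverage` is the more conservative setting: more cases are deferred to a human reviewer, and fewer automated predictions are accepted.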