metadata
license: mit
library_name: pytorch
pipeline_tag: image-classification
tags:
- sngp
- uncertainty-estimation
- out-of-distribution-detection
- biomedical-imaging
- digital-pathology
- histopathology
- model-calibration
- reliable-ai
datasets:
- acevedo2020
- jung2022
- tang2019
- wong2022
- kather2016
- kather2018
SNGP Models for Uncertainty-Aware Biomedical Image Classification
Model Details
Model Description
This repository contains trained Spectral-normalized Neural Gaussian Process (SNGP) models for uncertainty-aware image classification on biomedical imaging tasks, including white blood cell, amyloid plaque, and colorectal histopathology images.
SNGP augments a standard deep neural network in two ways: it applies spectral normalization to the hidden-layer weights, and it replaces the final dense layer with a Gaussian process layer. Together these changes improve uncertainty estimation and out-of-distribution (OOD) detection while still requiring only a single forward pass.
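As a rough illustration of the spectral normalization step described above, here is a minimal pure-Python sketch. It is not the repository's implementation, which presumably relies on PyTorch. The sketch estimates a weight matrix's largest singular value by power iteration and rescales the matrix only when that value exceeds a bound. The helper names (`spectral_norm`, `spectral_normalize`) and the default `norm_bound` are assumptions chosen for illustration.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def l2_norm(vec):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def spectral_norm(matrix, n_iters=50, seed=0):
    """Estimate the largest singular value by power iteration on W^T W."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in matrix[0]]
    transposed = [list(col) for col in zip(*matrix)]
    for _ in range(n_iters):
        u = matvec(matrix, v)          # u = W v
        v = matvec(transposed, u)      # v = W^T W v
        norm = l2_norm(v)
        if norm == 0.0:                # all-zero matrix
            return 0.0
        v = [x / norm for x in v]
    return l2_norm(matvec(matrix, v))  # ||W v|| for unit-length v


def spectral_normalize(matrix, norm_bound=1.0):
    """Rescale the matrix only if its spectral norm exceeds norm_bound."""
    sigma = spectral_norm(matrix)
    if sigma <= norm_bound:
        return [list(row) for row in matrix]
    scale = norm_bound / sigma
    return [[w * scale for w in row] for row in matrix]


if __name__ == "__main__":
    weights = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(weights))
    print(spectral_normalize(weights, norm_bound=1.0))
```

Bounding the spectral norm limits how much each layer can stretch or shrink distances between inputs, which is what lets the Gaussian process layer treat distance in feature space as a signal of how far an input lies from the training data.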
- Developed by: Uma Meleti, Jeffrey J. Nirschl
- Affiliation: University of Wisconsin-Madison
- Model type: Convolutional neural network (ResNet18 backbone) with SNGP head
- License: MIT
- Paper: https://arxiv.org/abs/2602.02370
- Repository: https://github.com/nirschl-lab/sngp_core
How to Get Started with the Model
Load pretrained SNGP models from the Hugging Face Hub using the provided inference utilities.
Installation
Clone the repository and install the dependencies:
# Clone repository
git clone https://github.com/nirschl-lab/sngp_core
cd sngp_core
# Install uv
curl -Ls https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync
Python API
SNGP Inference with uncertainty quantification
import torch
from scripts.example_inference import quick_sngp_inference
# Create a dummy input batch [batch_size, channels, height, width];
# replace it with preprocessed images for real use
batch = torch.randn(4, 3, 224, 224)
# Load model from Hugging Face Hub and run inference
results = quick_sngp_inference(
    "wong_sngp_resnet18",
    batch,
    device="cuda",  # or "cpu"
)
# Outputs:
# - results["logits"]: Raw model outputs
# - results["predictions"]: Predicted class indices
# - results["confidence"]: Prediction confidence scores
# - results["variance"]: Uncertainty estimates
# - results["probabilities"]: Class probabilities
print(f"Predictions: {results['predictions'].tolist()}")
print(f"Confidence: {results['confidence'].tolist()}")
print(f"Uncertainty (variance): {results['variance'].tolist()}")
Baseline inference (deterministic)
import torch
from scripts.example_inference import quick_baseline_inference
batch = torch.randn(4, 3, 224, 224)
results = quick_baseline_inference(
    "wong_baseline_resnet18",
    batch,
    device="cuda",  # or "cpu"
)
print(f"Predictions: {results['predictions'].tolist()}")
print(f"Confidence: {results['confidence'].tolist()}")
Uses
Direct Use
- Image classification in biomedical imaging datasets
- Estimation of predictive uncertainty via entropy/logit-based measures
- Detection of out-of-distribution (OOD) samples in medical imaging workflows
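The entropy-based uncertainty measure mentioned above can be sketched in a few lines of plain Python. This is a generic illustration, not a utility shipped with the repository; the function names `predictive_entropy` and `normalized_entropy` are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(probs):
    """Entropy divided by log(K): 0 means certain, 1 means uniform."""
    num_classes = len(probs)
    if num_classes < 2:
        return 0.0
    return predictive_entropy(probs) / math.log(num_classes)


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]
    uncertain = [0.25, 0.25, 0.25, 0.25]
    print(normalized_entropy(confident))
    print(normalized_entropy(uncertain))
```

If `results["probabilities"]` from the inference example is a `[batch_size, num_classes]` tensor (an assumption about its shape), each row of `results["probabilities"].tolist()` can be passed to `predictive_entropy`. Higher entropy indicates that the probability mass is spread across more classes.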
Downstream Use
- Integration into clinical decision-support pipelines (research only)
- Benchmarking uncertainty estimation methods (SNGP vs MC Dropout vs deterministic)
- Domain shift detection across institutions or datasets
Out-of-Scope Use
- Clinical diagnosis without expert oversight
- Deployment in safety-critical settings without validation
- Use on imaging modalities or domains not represented in training data
Bias, Risks, and Limitations
Limitations
- Performance depends on dataset domain similarity (scanner, staining, preprocessing)
- OOD detection is not guaranteed to capture all distribution shifts
- Models were trained on a limited set of public datasets and may not generalize to all populations
Risks
- Misinterpretation of uncertainty estimates as calibrated probabilities
- False confidence on near-OOD samples
- Dataset-specific biases (e.g., acquisition site, staining protocols)
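One way to guard against the first risk above, reading confidence scores as if they were calibrated probabilities, is to measure expected calibration error (ECE) on held-out labeled data. The sketch below is a generic equal-width-bin ECE in plain Python, not a function from this repository; `expected_calibration_error` and its arguments are hypothetical names.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: top-class confidence per sample, each in [0, 1]
    correct:     True/False per sample (prediction matched the label)
    """
    total = len(confidences)
    if total == 0 or total != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for conf, is_correct in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += 1.0 if is_correct else 0.0
        count[idx] += 1

    ece = 0.0
    for c, h, k in zip(conf_sum, hit_sum, count):
        if k:
            ece += (k / total) * abs(h / k - c / k)
    return ece


if __name__ == "__main__":
    # Four predictions at 0.9 confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

An ECE near zero on local validation data suggests that confidence tracks accuracy on that data; it says nothing about inputs far from the validation distribution.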
Recommendations
- Always keep a human in the loop (e.g., pathologist review)
- Validate on local institutional data before deployment
- Use uncertainty thresholds conservatively for rejection
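As one possible way to set a rejection threshold, the sketch below picks the threshold as a quantile of uncertainty scores measured on local in-distribution validation data, then flags any sample above it for human review. This is an illustrative procedure under stated assumptions, not the authors' protocol; `quantile_threshold`, `flag_for_review`, and `keep_fraction` are hypothetical names.

```python
import math


def quantile_threshold(validation_scores, keep_fraction=0.95):
    """Smallest score such that keep_fraction of validation samples are <= it."""
    if not validation_scores:
        raise ValueError("validation_scores must be non-empty")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ordered = sorted(validation_scores)
    idx = math.ceil(keep_fraction * len(ordered)) - 1
    return ordered[idx]


def flag_for_review(scores, threshold):
    """True for every sample whose uncertainty exceeds the threshold."""
    return [score > threshold for score in scores]


if __name__ == "__main__":
    # Hypothetical uncertainty scores from a local validation set.
    validation = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    threshold = quantile_threshold(validation, keep_fraction=0.8)
    print(threshold)
    print(flag_for_review([7.0, 8.0, 9.5], threshold))
```

A lower `keep_fraction` is the more conservative choice: it lowers the threshold, so more samples are referred for review at the cost of fewer automated predictions.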