	---
	license: mit
	library_name: pytorch
	pipeline_tag: image-classification
	tags:
	- sngp
	- uncertainty-estimation
	- out-of-distribution-detection
	- biomedical-imaging
	- digital-pathology
	- histopathology
	- model-calibration
	- reliable-ai
	datasets:
	- acevedo2020
	- jung2022
	- tang2019
	- wong2022
	- kather2016
	- kather2018
	---

	# SNGP Models for Uncertainty-Aware Biomedical Image Classification

	## Model Details

	### Model Description

	This repository contains trained Spectral-normalized Neural Gaussian Process (SNGP) models for uncertainty-aware image classification across biomedical imaging tasks covering white blood cells, amyloid plaques, and colorectal histopathology.

	SNGP augments a standard deep neural network in two ways: it applies spectral normalization to the hidden layers, and it replaces the final dense layer with a Gaussian process layer. Together these changes improve uncertainty estimation and out-of-distribution (OOD) detection while still requiring only a single forward pass.
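
To make the spectral-normalization half of this description concrete, the sketch below estimates the largest singular value of a small weight matrix by power iteration and rescales the matrix only when that value exceeds a bound. It is a plain-Python illustration, not the repository's implementation: the function names and the bound of 0.95 are assumptions chosen for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def l2_normalize(vector):
    # Assumes a non-zero vector; a production version would guard this.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def largest_singular_value(weight, n_iters=50):
    """Estimate the spectral norm of `weight` by power iteration."""
    weight_t = transpose(weight)
    v = l2_normalize([1.0] * len(weight[0]))
    for _ in range(n_iters):
        u = l2_normalize(matvec(weight, v))
        v = l2_normalize(matvec(weight_t, u))
    # sigma = u^T W v
    return sum(ui * wi for ui, wi in zip(u, matvec(weight, v)))


def normalize_weight(weight, spectral_norm_bound=0.95):
    """Rescale `weight` so its spectral norm is at most the bound.

    Matrices already inside the bound are returned unchanged
    (hypothetical helper, assumed bound).
    """
    sigma = largest_singular_value(weight)
    scale = min(1.0, spectral_norm_bound / sigma)
    return [[w * scale for w in row] for row in weight]


weight = [[3.0, 0.0], [0.0, 1.0]]
print(largest_singular_value(weight))
print(normalize_weight(weight))
```

Bounding the spectral norm of each layer limits how much the network can stretch or collapse distances between inputs, which is what lets the Gaussian process layer treat distance in feature space as a signal of unfamiliarity.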

	- Developed by: Uma Meleti, Jeffrey J. Nirschl
	- Affiliation: University of Wisconsin-Madison
	- Model type: Convolutional neural network (ResNet18 backbone) with SNGP head
	- License: MIT
	- Paper: https://arxiv.org/abs/2602.02370
	- Repository: https://github.com/nirschl-lab/sngp_core

	---

	## How to Get Started with the Model
	Load pretrained SNGP models from the Hugging Face Hub using the provided inference utilities.

	### Installation
	#### Clone repo and install
	```bash
	# Clone repository
	git clone https://github.com/nirschl-lab/sngp_core
	cd sngp_core

	# Install uv
	curl -Ls https://astral.sh/uv/install.sh | sh

	# Install dependencies
	uv sync
	```

	#### Python API
	SNGP inference with uncertainty quantification. The examples import from `scripts/`, so run them from the repository root:
	```python
	import torch
	from scripts.example_inference import quick_sngp_inference

	# Create input batch [batch_size, channels, height, width]
	batch = torch.randn(4, 3, 224, 224)

	# Load model from Hugging Face Hub and run inference
	results = quick_sngp_inference(
	    "wong_sngp_resnet18",
	    batch,
	    device="cuda",  # or "cpu"
	)

	# Outputs:
	# - results["logits"]: Raw model outputs
	# - results["predictions"]: Predicted class indices
	# - results["confidence"]: Prediction confidence scores
	# - results["variance"]: Uncertainty estimates
	# - results["probabilities"]: Class probabilities

	print(f"Predictions: {results['predictions'].tolist()}")
	print(f"Confidence: {results['confidence'].tolist()}")
	print(f"Uncertainty (variance): {results['variance'].tolist()}")
	```
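
Once the outputs are converted with `.tolist()`, they can be triaged in plain Python. The sketch below pairs each prediction with a flag marking it for human review when its variance exceeds a cutoff. It assumes `results["variance"]` holds one scalar per sample, which the example above suggests but does not confirm; `flag_for_review` and the cutoff of 0.25 are hypothetical, and the numbers are stand-ins rather than real model outputs.

```python
def flag_for_review(predictions, variances, variance_threshold):
    """Pair each prediction with a flag saying whether it needs human review."""
    return [
        {
            "prediction": prediction,
            "variance": variance,
            "needs_review": variance > variance_threshold,
        }
        for prediction, variance in zip(predictions, variances)
    ]


# Stand-in values; in practice these would come from
# results["predictions"].tolist() and results["variance"].tolist().
predictions = [2, 0, 5, 1]
variances = [0.03, 0.41, 0.08, 0.77]

# The cutoff is an assumed placeholder and should be tuned on validation data.
triaged = flag_for_review(predictions, variances, variance_threshold=0.25)
for row in triaged:
    print(row)
```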

	#### Baseline inference (deterministic)
	```python
	import torch
	from scripts.example_inference import quick_baseline_inference

	batch = torch.randn(4, 3, 224, 224)

	results = quick_baseline_inference(
	    "wong_baseline_resnet18",
	    batch,
	    device="cuda",  # or "cpu"
	)

	print(f"Predictions: {results['predictions'].tolist()}")
	print(f"Confidence: {results['confidence'].tolist()}")
	```

	---

	## Uses

	### Direct Use
	- Image classification in biomedical imaging datasets
	- Estimation of predictive uncertainty via entropy/logit-based measures
	- Detection of out-of-distribution (OOD) samples in medical imaging workflows
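
As an illustration of an entropy-based uncertainty measure, the sketch below computes the Shannon entropy of a single class-probability vector, such as one row of `results["probabilities"]` converted to a list. The helper names are hypothetical, and the probability vectors are invented for the example.

```python
import math


def predictive_entropy(probabilities):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def normalized_entropy(probabilities):
    """Entropy divided by log(K): 0 is fully confident, 1 is uniform.

    Assumes at least two classes.
    """
    return predictive_entropy(probabilities) / math.log(len(probabilities))


confident = [0.97, 0.01, 0.01, 0.01]
ambiguous = [0.25, 0.25, 0.25, 0.25]

print(normalized_entropy(confident))
print(normalized_entropy(ambiguous))
```

Normalizing by log(K) makes scores comparable between models with different numbers of classes.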

	### Downstream Use
	- Integration into clinical decision-support pipelines (research only)
	- Benchmarking uncertainty estimation methods (SNGP vs MC Dropout vs deterministic)
	- Domain shift detection across institutions or datasets
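
One common way to benchmark OOD detection is the AUROC of an uncertainty score: the probability that a randomly chosen OOD sample receives a higher score than a randomly chosen in-distribution sample. The sketch below computes it by direct pairwise comparison, which is quadratic in the number of samples and intended only as an illustration. `ood_auroc` is a hypothetical helper, and the scores are made-up numbers, not results from these models.

```python
def ood_auroc(in_dist_scores, ood_scores):
    """AUROC of an uncertainty score for separating OOD from in-distribution.

    Ties between an OOD and an in-distribution score count as one half.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in in_dist_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(ood_scores) * len(in_dist_scores))


# Made-up uncertainty scores for illustration only.
in_distribution = [0.1, 0.4]
out_of_distribution = [0.3, 0.9]

print(ood_auroc(in_distribution, out_of_distribution))
```

Running the same function on the scores produced by each method (for example SNGP variance and the entropy of a deterministic model) on identical data gives a like-for-like comparison.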

	### Out-of-Scope Use
	- Clinical diagnosis without expert oversight
	- Deployment in safety-critical settings without validation
	- Use on imaging modalities or domains not represented in training data

	---

	## Bias, Risks, and Limitations

	### Limitations
	- Performance depends on how closely the target data match the training domain (scanner, staining, preprocessing)
	- OOD detection is not guaranteed to capture all distribution shifts
	- Models were trained on a limited set of public datasets and may not generalize to all populations

	### Risks
	- Misinterpretation of uncertainty estimates as calibrated probabilities
	- False confidence on near-OOD samples
	- Dataset-specific biases (e.g., acquisition site, staining protocols)
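
Whether confidence scores can be read as probabilities is something to measure rather than assume. A minimal sketch of one standard check, the expected calibration error (ECE) with equal-width bins, is shown below. It assumes confidences in [0, 1] and a parallel list of correctness flags from a labelled validation set; the function name and the bin count of 10 are choices made for this example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for confidence, is_correct in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, 1.0 if is_correct else 0.0))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(a for _, a in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Invented example: four predictions at 0.9 confidence, three of them correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

A large ECE on local validation data is a sign that the confidence values should not be presented to users as probabilities without recalibration.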

	### Recommendations
	- Always use with a human in the loop (e.g., pathologist review)
	- Validate on local institutional data before deployment
	- Use uncertainty thresholds conservatively for rejection
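
One simple way to set a rejection threshold from local validation data is a nearest-rank quantile: pick the uncertainty value below which a chosen fraction of in-distribution validation samples fall, and send anything above it for review. The sketch below assumes one scalar uncertainty per sample; `rejection_threshold`, `should_reject`, and the fractions used are hypothetical and would need to be chosen per deployment.

```python
import math


def rejection_threshold(validation_uncertainties, accept_fraction=0.90):
    """Uncertainty value at or below which `accept_fraction` of samples fall."""
    if not 0.0 < accept_fraction <= 1.0:
        raise ValueError("accept_fraction must be in (0, 1]")
    ordered = sorted(validation_uncertainties)
    # Nearest-rank quantile: position of the last accepted sample.
    rank = math.ceil(accept_fraction * len(ordered))
    return ordered[rank - 1]


def should_reject(uncertainty, threshold):
    """Reject (defer to a human) when uncertainty exceeds the threshold."""
    return uncertainty > threshold


# Invented validation uncertainties for illustration only.
validation = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
threshold = rejection_threshold(validation, accept_fraction=0.5)
print(threshold, should_reject(0.6, threshold))
```

Lowering `accept_fraction` makes the policy more conservative: more cases are deferred to a human, in exchange for fewer uncertain predictions being accepted automatically.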