	---
	title: Mammogram Analyzer
	emoji: 🏥
	colorFrom: pink
	colorTo: red
	sdk: docker
	app_port: 7860
	---

	# Mammogram Inference Service

	A FastAPI microservice that runs mammographic image analysis using the SensiNet dual-stream deep learning model. It accepts a public image URL or a direct file upload, runs Bayesian MC-Dropout inference, and returns a BI-RADS classification with confidence and malignancy probability scores.

	This service is the AI backend for the [Blossom](../README.md) clinical radiology platform.

	---

	## Authorship and Attribution

	### This service (the wrapper)

	The FastAPI service, inference pipeline, training script, and Blossom integration code in this repository were written by the Blossom team. They are original works that integrate the SensiNet model.

	### The AI model (SensiNet)

	The neural network architecture (`app/architecture.py`) and pretrained weights (`weights/advanced_model_best.pth`) are the work of Aredeksu and the SensiNet-Mammography project.

	> Original model: [Aredeksu/SensiNet-Mammography](https://huggingface.co/Aredeksu/SensiNet-Mammography) on Hugging Face
	> License: Apache License 2.0
	> Trained on: CBIS-DDSM mammography dataset

	We are integrators and licensees of this model, not its authors or joint authors. Full credit for the model architecture, training methodology, and pretrained weights belongs to the original authors.

	In compliance with the Apache 2.0 license:
	- The original architecture is reproduced with attribution in `app/architecture.py`
	- Modifications made by us: added the FastAPI wrapper, Bayesian MC-Dropout inference loop, BI-RADS mapping logic, and the training script for fine-tuning
	- A copy of the Apache 2.0 license is included in `LICENSE`

	### Training data

	The model was trained on the CBIS-DDSM (Curated Breast Imaging Subset of DDSM) dataset, a publicly available mammography benchmark dataset from The Cancer Imaging Archive (TCIA).

	> Lee RS, Gimenez F, Hoogi A, Miyake KK, Gorovoy M, Rubin DL. (2017). A curated mammography data set for use in computer-aided detection and diagnosis research. Scientific Data, 4, 170177. https://doi.org/10.1038/sdata.2017.177

	---

	## Architecture (SensiNet)

	```
	Input Image (299×299 RGB)
	         |
	    ┌────┴────┐
	    ▼         ▼
	Xception   EfficientNet-B3
	(2048ch)   (1536ch)
	    │         │
	    ▼         ▼
	Proj→512   Proj→512
	    └────┬────┘
	         ▼
	  Concat (1024ch)
	         ▼
	  CBAM Attention
	(channel + spatial)
	         ▼
	GlobalAvgPool → Linear(1024→512) → BN → ReLU → Dropout(0.5) → Linear(512→1)
	         ▼
	Sigmoid → malignancy probability → BI-RADS 1–5
	```

	Inference uses Bayesian MC-Dropout (10 stochastic forward passes) to estimate both the mean malignancy probability and prediction variance, which informs the confidence score.
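
The aggregation step described above can be sketched in plain Python. This is a minimal illustration, not the service's actual code: `predict_once` stands in for one dropout-enabled forward pass, and the mapping from variance to confidence is an assumption, since this README does not give the formula.

```python
import random
import statistics
from typing import Callable, Dict


def mc_dropout_summary(predict_once: Callable[[], float], n_passes: int = 10) -> Dict[str, float]:
    """Aggregate several stochastic forward passes into mean, variance, and confidence."""
    if n_passes < 2:
        raise ValueError("need at least two passes to estimate variance")
    samples = [predict_once() for _ in range(n_passes)]
    mean = statistics.fmean(samples)
    variance = statistics.pvariance(samples, mu=mean)
    # ASSUMPTION: confidence falls linearly with the standard deviation.
    # Probabilities in [0, 1] have a standard deviation of at most 0.5,
    # so 1 - 2 * std stays in [0, 1]; max() guards against float rounding.
    confidence = max(0.0, 1.0 - 2.0 * variance ** 0.5)
    return {"malignancy_probability": mean, "variance": variance, "confidence": confidence}


# Hypothetical stand-in for a model call with dropout active: a noisy probability.
rng = random.Random(0)


def noisy_forward() -> float:
    return min(1.0, max(0.0, rng.gauss(0.31, 0.05)))


summary = mc_dropout_summary(noisy_forward)
```

With a real PyTorch model, `predict_once` would run the network with its dropout layers kept in training mode so that each pass samples a different dropout mask.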

	---

	## Endpoints

	| Method | Path | Description |
	|--------|------|-------------|
	| `GET` | `/health` | Service health check, model mode, version |
	| `POST` | `/predict` | Multipart image upload → inference |
	| `POST` | `/analyze` | JSON `{ image_url }` → download → inference |

	### `/analyze` request

	```json
	{ "image_url": "https://your-storage.supabase.co/storage/v1/object/public/mammograms/..." }
	```
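
A client call to this endpoint might look like the following sketch, which uses only the Python standard library. The helper names `build_analyze_request` and `analyze` are illustrative and not part of the service; the 60-second timeout is likewise an arbitrary choice.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, image_url: str) -> urllib.request.Request:
    """Build a POST /analyze request carrying the JSON body shown above."""
    payload = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url: str, image_url: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_analyze_request(base_url, image_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running locally, `analyze("http://localhost:8000", image_url)` would return the decoded response as a dictionary.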

	### Response shape

	```json
	{
	"birads": 3,
	"confidence": 0.82,
	"malignancy_probability": 0.31,
	"findings_text": "Model prediction: Benign (probability 31.0%). Probably benign appearance. Short-interval follow-up may be considered.",
	"model_version": "sensinet-v1"
	}
	```
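
On the client side, the response can be loaded into a typed structure. The sketch below mirrors the five fields shown above; the `Prediction` class and the range checks are assumptions added for illustration (in particular, this README does not state that `confidence` is bounded to 0–1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    birads: int
    confidence: float
    malignancy_probability: float
    findings_text: str
    model_version: str


def parse_prediction(payload: dict) -> Prediction:
    """Convert a decoded JSON response into a Prediction, rejecting odd values."""
    prediction = Prediction(
        birads=int(payload["birads"]),
        confidence=float(payload["confidence"]),
        malignancy_probability=float(payload["malignancy_probability"]),
        findings_text=str(payload["findings_text"]),
        model_version=str(payload["model_version"]),
    )
    if not 1 <= prediction.birads <= 5:
        raise ValueError(f"unexpected BI-RADS category: {prediction.birads}")
    # ASSUMPTION: both scores are probabilities-like values in [0, 1].
    for name in ("confidence", "malignancy_probability"):
        value = getattr(prediction, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range: {value}")
    return prediction
```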

	BI-RADS mapping:

	| Probability | BI-RADS | Interpretation |
	|-------------|---------|---------------|
	| < 10% | 1 | Negative |
	| 10–24% | 2 | Benign |
	| 25–49% | 3 | Probably benign |
	| 50–74% | 4 | Suspicious |
	| ≥ 75% | 5 | Highly suggestive of malignancy |
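
The table can be expressed as a small lookup. This sketch assumes the bands are half-open intervals (for example, 24.5% falls in BI-RADS 2 and exactly 25% in BI-RADS 3), which the table does not spell out; the function name is illustrative.

```python
from typing import Tuple

# (exclusive upper bound, BI-RADS category, interpretation), taken from the table above.
BIRADS_BANDS = (
    (0.10, 1, "Negative"),
    (0.25, 2, "Benign"),
    (0.50, 3, "Probably benign"),
    (0.75, 4, "Suspicious"),
)


def birads_from_probability(probability: float) -> Tuple[int, str]:
    """Map a malignancy probability in [0, 1] to a BI-RADS category and label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    for upper_bound, category, label in BIRADS_BANDS:
        if probability < upper_bound:
            return category, label
    return 5, "Highly suggestive of malignancy"
```

For the sample response above, `birads_from_probability(0.31)` gives category 3, matching the `"birads": 3` field.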

	---

	## Setup

	```bash
	cd mammogram-inference-service
	python3 -m venv .venv
	source .venv/bin/activate
	pip install --upgrade pip
	pip install -r requirements.txt
	```

	### Model weights

	Download the pretrained weights from [Aredeksu/SensiNet-Mammography](https://huggingface.co/Aredeksu/SensiNet-Mammography) on Hugging Face and place the file at:

	```
	weights/advanced_model_best.pth
	```

	If the weights file is absent, the service automatically falls back to mock mode: a deterministic predictor based on pixel statistics that returns consistent (but not clinically meaningful) results. This is useful for UI development without a GPU.
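
The fallback rule can be sketched as follows. The function name `resolve_model_mode` is hypothetical, and the service's actual startup logic may differ; the sketch only combines the behaviour described here with the `MODEL_MODE` and `MODEL_WEIGHTS` variables documented in this README.

```python
import os
from pathlib import Path


def resolve_model_mode(requested_mode: str, weights_path: str) -> str:
    """Return "mock" when mock mode is forced or the weights file is missing."""
    if requested_mode.strip().lower() == "mock":
        return "mock"
    return "real" if Path(weights_path).is_file() else "mock"


# Read the settings the same way the environment-variable table describes them.
mode = resolve_model_mode(
    os.environ.get("MODEL_MODE", "real"),
    os.environ.get("MODEL_WEIGHTS", "weights/advanced_model_best.pth"),
)
```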

	### Environment variables

	| Variable | Default | Description |
	|----------|---------|-------------|
	| `MODEL_MODE` | `real` | Set to `mock` to force mock mode |
	| `MODEL_VERSION` | `sensinet-v1` | Version string returned in responses |
	| `MODEL_WEIGHTS` | `weights/advanced_model_best.pth` | Path to weights file |
	| `ALLOWED_IMAGE_HOSTS` | _(empty = allow all HTTPS)_ | Comma-separated allowlist of image hostnames (SSRF protection) |
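
As an illustration of how `ALLOWED_IMAGE_HOSTS` could be applied, the sketch below parses the comma-separated value and checks a URL against it. The function names are hypothetical and the service's real validation may be stricter; this only encodes the two rules stated in the table (HTTPS only, and an empty list allows any host).

```python
from typing import FrozenSet
from urllib.parse import urlsplit


def parse_allowed_hosts(raw: str) -> FrozenSet[str]:
    """Split a comma-separated hostname list, ignoring blanks and case."""
    return frozenset(host.strip().lower() for host in raw.split(",") if host.strip())


def is_allowed_image_url(url: str, allowed_hosts: FrozenSet[str]) -> bool:
    """Accept only HTTPS URLs whose host is allowlisted (or any host if the list is empty)."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    return not allowed_hosts or parts.hostname.lower() in allowed_hosts
```

A hostname check of this kind does not by itself cover redirects or hostnames that resolve to internal addresses, so it is best treated as one layer of SSRF protection rather than the whole of it.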

	---

	## Run

	```bash
	uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
	```

	- Swagger UI: http://localhost:8000/docs
	- Health check: http://localhost:8000/health

	In your Blossom `.env.local`:

	```env
	GCLOUD_MODEL_ENDPOINT=http://localhost:8000
	```

	---

	## Training your own weights

	If you have a local copy of the CBIS-DDSM dataset, you can fine-tune the model:

	```bash
	# Step 1: organise images into train/val folders
	python prepare_data.py \
	--images /path/to/raw/images \
	--csv /path/to/labels.csv \
	--output data

	# Step 2: train (two-phase: frozen backbones → full fine-tune)
	python train.py --data data --output weights/advanced_model_best.pth
	```

	The training script uses the same `AdvancedBreastCancerModel` architecture as the original SensiNet. Phase 1 trains only the projection layers and classifier head with the backbones frozen (20 epochs). Phase 2 fine-tunes all layers at a lower learning rate (50 epochs). The best checkpoint is saved automatically.
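
The two-phase schedule can be written down as data plus a rule for which parameters train in each phase. This is a framework-free sketch, not an excerpt from `train.py`: the parameter-name prefixes and both learning rates are assumptions chosen for illustration, and only the epoch counts come from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

# ASSUMPTION: name prefixes of the two backbones' parameters;
# the real names are defined in app/architecture.py.
BACKBONE_PREFIXES: Tuple[str, ...] = ("xception.", "efficientnet.")


@dataclass(frozen=True)
class Phase:
    name: str
    epochs: int
    train_backbones: bool
    learning_rate: float  # ASSUMPTION: illustrative value, not taken from train.py


SCHEDULE = (
    Phase("frozen backbones", epochs=20, train_backbones=False, learning_rate=1e-3),
    Phase("full fine-tune", epochs=50, train_backbones=True, learning_rate=1e-4),
)


def is_trainable(param_name: str, phase: Phase) -> bool:
    """Backbone parameters train only in phases that unfreeze the backbones."""
    return phase.train_backbones or not param_name.startswith(BACKBONE_PREFIXES)
```

In PyTorch, a rule like this would typically be applied at the start of each phase by looping over `model.named_parameters()` and calling `requires_grad_(is_trainable(name, phase))` on each parameter.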

	---

	## Important notice

	This service is intended for research and development use only. It has not been validated for clinical decision-making. Outputs must not be used to diagnose, treat, or manage patients without appropriate clinical oversight and regulatory approval. The BI-RADS scores produced are AI-generated estimates and do not replace radiologist interpretation.

	---

	## License

	The wrapper code in this repository (FastAPI service, training scripts, integration layer) is original work by the Blossom team.

	The SensiNet model architecture and weights are used under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0). Copyright belongs to the original authors at [Aredeksu/SensiNet-Mammography](https://huggingface.co/Aredeksu/SensiNet-Mammography).