	README — Wound infection classification models


	Project summary


	- Binary image classifier that labels wound images as clean (0) or infected (1).

	- Trained and evaluated multiple pretrained backbones (EfficientNetV2, MobileNetV3, ConvNeXt‑Tiny) on the same curated dataset (70/15/15 split).

	- This repo contains model weights, evaluation metrics, and example inference code.
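
	A 70/15/15 split like the one mentioned above can be sketched as follows. This is a minimal illustration, not the exact procedure used for this dataset: the train/validation/test naming, the fixed seed, the plain (non-stratified) shuffle, and the file names are all assumptions.

```python
import random

def split_70_15_15(items, seed=0):
    """Shuffle items and split them 70/15/15 (assumed train/val/test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_val = n * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Hypothetical file names, for illustration only.
paths = [f"img_{i:04d}.jpg" for i in range(200)]
train, val, test = split_70_15_15(paths)
print(len(train), len(val), len(test))
```

	Reusing the same seed for every backbone keeps the comparison on identical splits.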

	Quick notes


	- Labels are manual and non‑clinical. Models are for proof‑of‑concept research only — not for clinical use.

	- Outputs were padded to length 1000 to match the ExecuTorch runtime format used for mobile export. For normal PyTorch inference, only the first two logits are meaningful: index 0 is clean and index 1 is infected.
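
	Reading a prediction from the padded output might look like the sketch below. It uses only the standard library so it runs without PyTorch; the same indexing applies to a tensor. The function name, the example logits, and the 0.5 decision threshold are assumptions for illustration.

```python
import math

LABELS = {0: "clean", 1: "infected"}

def predict_from_padded(logits, threshold=0.5):
    """Return (label, p_infected) using only the first two logits."""
    z_clean, z_infected = logits[0], logits[1]
    # Two-class softmax, shifted by the max for numerical stability.
    m = max(z_clean, z_infected)
    e_clean = math.exp(z_clean - m)
    e_infected = math.exp(z_infected - m)
    p_infected = e_infected / (e_clean + e_infected)
    label = LABELS[1 if p_infected >= threshold else 0]
    return label, p_infected

# Hypothetical padded output: two real logits followed by 998 padding entries.
padded = [0.3, 1.7] + [-100.0] * 998
label, p = predict_from_padded(padded)
print(label, round(p, 3))
```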

	Requirements


	- Python 3.12 (recommended for reproducing the environment)

	- PyTorch (compatible with your hardware; tested with torch >=2.8,<3.0)

	- torchvision

	- numpy

	- scikit-learn

	- pillow

	- (Optional) executorch / runtime exporters if you use the .pte mobile artifacts

	Models & evaluation summary


	- Label mapping: 0 = clean, 1 = infected

	Metrics reported on the test set (per training environment/run):

	EfficientNetV2 (new dataset)


	- env6: Precision 0.826 | Recall 0.856 | F1 0.840 | TP 237 | FP 50 | TN 89 | FN 40

	- env8: Precision 0.836 | Recall 0.866 | F1 0.851 | TP 240 | FP 47 | TN 92 | FN 37

	- env7: Precision 0.814 | Recall 0.884 | F1 0.848 | TP 245 | FP 56 | TN 83 | FN 32

	MobileNetV3 (new dataset)


	- env6: Precision 0.869 | Recall 0.791 | F1 0.828 | TP 219 | FP 33 | TN 106 | FN 58

	- env8: Precision 0.898 | Recall 0.697 | F1 0.785 | TP 193 | FP 22 | TN 117 | FN 84

	- env7: Precision 0.849 | Recall 0.874 | F1 0.861 | TP 242 | FP 43 | TN 96 | FN 35

	ConvNeXt‑Tiny (new dataset)


	- env6: Precision 0.839 | Recall 0.863 | F1 0.851 | TP 239 | FP 46 | TN 93 | FN 38
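
	The precision, recall, and F1 values above follow from the confusion counts, with infected (1) as the positive class. A small sketch of the arithmetic, using the EfficientNetV2 env8 counts from this README (the helper name is ours):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 for the positive (infected) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of precision and recall
    return precision, recall, f1

# EfficientNetV2 env8: TP 240, FP 47, FN 37
p, r, f1 = prf1(tp=240, fp=47, fn=37)
print(f"{p:.3f} {r:.3f} {f1:.3f}")  # prints: 0.836 0.866 0.851
```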

	Chosen / recommended checkpoints


	- EfficientNetV2: env8 (best F1 among its runs, with balanced ROC/PR behaviour)

	- MobileNetV3: env7 (best tradeoff of F1 and latency for mobile)

	- ConvNeXt‑Tiny: env6 (best calibration + strong ROC/PR and CPU efficiency)

	Why these were chosen


	- EfficientNetV2 env8: solid ROC/PR performance and high recall for infected cases.

	- MobileNetV3 env7: highest infected-class F1 in our runs and good inference latency for mobile deployment.

	- ConvNeXt‑Tiny env6: strongest calibration (reliability diagram), ROC/PR performance on par with the best runs, and the lowest average CPU usage; selected as the most feasible single model for deployment.

	Repository contents


	- models/
		- efficientnetv2_env8.pte
		- mobilenetv3_env7.pte
		- convnext_tiny_env6.pte
		- (optional) .pte mobile artifacts if exported for ExecuTorch

	- notebooks/ or examples/
		- eval_metrics.ipynb: code to reproduce ROC/PR/reliability diagrams and confusion matrices
		- inference_example.py: minimal inference script

	- README.md (this file)

	Notes on mobile export


	- Models were exported to a mobile format (.pte) with torch.export, followed by ExecuTorch lowering with the XNNPACK partitioner (XnnpackPartitioner). If you rely on the mobile runtime artifacts, use the matching .pte file and follow the ExecuTorch runtime integration instructions.

	- Outputs were padded to 1000-length vectors for compatibility with the runtime. Padding uses a large negative value (−100) so that the padded entries receive essentially no probability mass if softmax is applied over the full vector.
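
	The effect of the padding value can be checked with a short sketch. It is standard-library Python rather than the export code itself; the helper names and example logits are assumptions, and it assumes the real logits are of moderate magnitude.

```python
import math

PAD_VALUE = -100.0
PADDED_LEN = 1000

def pad_logits(logits, length=PADDED_LEN, pad_value=PAD_VALUE):
    """Append pad_value until the vector has the requested length."""
    return list(logits) + [pad_value] * (length - len(logits))

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

two_class_logits = [0.3, 1.7]  # hypothetical clean / infected logits

# Padding with -100: the two real classes keep essentially all the mass.
probs = softmax(pad_logits(two_class_logits))
real_mass = probs[0] + probs[1]
pad_mass = sum(probs[2:])

# Padding with 0.0 instead: 998 padded entries absorb most of the mass.
probs_zero = softmax(pad_logits(two_class_logits, pad_value=0.0))
real_mass_zero = probs_zero[0] + probs_zero[1]

print(real_mass, pad_mass, real_mass_zero)
```

	With −100 the padded entries are negligible, whereas zero padding would leave the two real classes with under 1% of the total probability in this example.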

	Calibration & decision thresholds


	- Reliability diagrams show calibration issues; raw probabilities are not fully trustworthy.

	- Before using probabilities for clinical decisions, apply post‑hoc calibration (temperature scaling or isotonic regression) and re‑compute reliability diagrams and decision thresholds.

	- For single-number model selection we reported precision/recall/F1 for the infected class. For deployment, choose thresholds based on calibrated probabilities and the clinical tradeoff between false positives and false negatives.
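
	Temperature scaling, one of the post-hoc options mentioned above, can be sketched as follows. This is a minimal illustration and not code from this repo: it assumes the temperature is fit on held-out validation data, uses a simple grid search instead of a gradient-based optimiser, and the margins and labels are made-up values. For two classes, softmax over the logits equals a sigmoid of the margin (infected logit minus clean logit), so dividing the logits by T amounts to dividing the margin by T.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def nll(margins, labels, temperature):
    """Mean binary negative log-likelihood after scaling margins by 1/T."""
    total = 0.0
    for m, y in zip(margins, labels):
        p = sigmoid(m / temperature)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)

def fit_temperature(margins, labels):
    """Grid-search the temperature in [0.25, 10.0] that minimises the NLL."""
    candidates = [0.25 + 0.05 * i for i in range(196)]
    return min(candidates, key=lambda t: nll(margins, labels, t))

# Made-up validation margins (infected logit minus clean logit) and labels.
# Two confident mistakes make the raw model overconfident.
val_margins = [4.0, 3.5, -3.0, 2.5, -4.0, 3.0, -2.0, -3.5]
val_labels = [1, 1, 0, 0, 0, 1, 1, 0]

T = fit_temperature(val_margins, val_labels)
p_infected = sigmoid(3.0 / T)  # calibrated probability for a new margin of 3.0
print(T, p_infected)
```

	A fitted T above 1 softens overconfident probabilities without changing which class has the larger logit, so decision thresholds should be re-selected on the calibrated probabilities afterwards.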

	Data & labeling


	- Dataset: curated from public sources (Kaggle wound images) and manually labeled into clean vs infected.

	- Labels are non‑clinical and may contain noise. Use caution; validate with clinical experts for any real-world deployment.