| README — Wound infection classification models | |
| Project summary | |
| - Binary image classifier that labels wound images as clean (0) or infected (1). | |
| - Trained and evaluated multiple pretrained backbones (EfficientNetV2, MobileNetV3, ConvNeXt‑Tiny) on the same curated dataset (70/15/15 split). | |
| - This repo contains model weights, evaluation metrics, and example inference code. | |
| Quick notes | |
| - Labels are manual and non‑clinical. Models are for proof‑of‑concept research only — not for clinical use. | |
| - Outputs were padded to length 1000 to match the Executorch runtime format used for mobile export. For normal PyTorch inference the first two logits correspond to the two classes. | |
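Reading a prediction out of a padded output can be sketched as follows. This is a minimal, framework-agnostic illustration in plain Python, not the repository's inference script: the function name `class_probabilities` and the example logits are hypothetical, and it assumes the model output has already been converted to a flat list of 1000 floats.

```python
import math

NUM_CLASSES = 2  # label mapping from this README: 0 = clean, 1 = infected


def class_probabilities(padded_logits):
    """Softmax over the first two entries of a padded output vector."""
    logits = padded_logits[:NUM_CLASSES]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output: two real logits followed by 998 padding entries.
example_output = [0.3, 1.7] + [-100.0] * 998
probs = class_probabilities(example_output)
predicted_label = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

With a PyTorch tensor of shape (batch, 1000), the equivalent step would be slicing the first two columns before applying softmax.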
| Requirements | |
| - Python 3.12 (recommended for reproducing the environment) | |
| - PyTorch (compatible with your hardware; tested with torch >=2.8,<3.0) | |
| - torchvision | |
| - numpy | |
| - scikit-learn | |
| - pillow | |
| - (Optional) executorch / runtime exporters if you use the .pte mobile artifacts | |
| Models & evaluation summary | |
| - Label mapping: 0 = clean, 1 = infected | |
| Metrics reported on the test set (per training environment/run): | |
| EfficientNetV2 (new dataset) | |
| - env6: Precision 0.826 | Recall 0.856 | F1 0.840 | TP 237 | FP 50 | TN 89 | FN 40 | |
| - env8: Precision 0.836 | Recall 0.866 | F1 0.851 | TP 240 | FP 47 | TN 92 | FN 37 | |
| - env7: Precision 0.814 | Recall 0.884 | F1 0.848 | TP 245 | FP 56 | TN 83 | FN 32 | |
| MobileNetV3 (new dataset) | |
| - env6: Precision 0.869 | Recall 0.791 | F1 0.828 | TP 219 | FP 33 | TN 106 | FN 58 | |
| - env8: Precision 0.898 | Recall 0.697 | F1 0.785 | TP 193 | FP 22 | TN 117 | FN 84 | |
| - env7: Precision 0.849 | Recall 0.874 | F1 0.861 | TP 242 | FP 43 | TN 96 | FN 35 | |
| ConvNeXt‑Tiny (new dataset) | |
| - env6: Precision 0.839 | Recall 0.863 | F1 0.851 | TP 239 | FP 46 | TN 93 | FN 38 | |
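The precision, recall, and F1 values above follow directly from the confusion counts, with infected (1) as the positive class. A small sketch for re-deriving them; the helper name `prf1` is illustrative and not part of the repository:

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 for the positive (infected) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the EfficientNetV2 env8 row above.
precision, recall, f1 = prf1(tp=240, fp=47, fn=37)
print(f"Precision {precision:.3f} | Recall {recall:.3f} | F1 {f1:.3f}")
# Precision 0.836 | Recall 0.866 | F1 0.851
```

In every row, TP + FN = 277 and FP + TN = 139, so each run was scored on the same 416 test images (277 infected, 139 clean).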
| Chosen / recommended checkpoints | |
| - EfficientNetV2: env8 (best F1 of the three runs, with balanced ROC/PR behaviour) | |
| - MobileNetV3: env7 (best tradeoff of F1 and latency for mobile) | |
| - ConvNeXt‑Tiny: env6 (best calibration + strong ROC/PR and CPU efficiency) | |
| Why these were chosen | |
| - EfficientNetV2 env8: solid ROC/PR performance and high recall for infected cases. | |
| - MobileNetV3 env7: highest infected-class F1 in our runs and good inference latency for mobile deployment. | |
| - ConvNeXt‑Tiny env6: best calibration (per its reliability diagram), ROC/PR performance on par with the best models, and the lowest average CPU usage; selected as the most feasible single model for deployment. | |
| Repository contents | |
| - models/ | |
| - efficientnetv2_env8.pte | |
| - mobilenetv3_env7.pte | |
| - convnext_tiny_env6.pte | |
| - (optional) .pte mobile artifacts if exported for Executorch | |
| - notebooks/ or examples/ | |
| - eval_metrics.ipynb — code to reproduce ROC/PR/reliability diagrams and confusion matrices | |
| - inference_example.py — minimal inference script | |
| - README.md (this file) | |
| Notes on mobile export | |
| - Models were exported to a mobile format (.pte) with torch.export followed by Executorch lowering, using the XNNPACK partitioner (XnnpackPartitioner) for backend delegation. If you rely on the mobile runtime artifacts, use the matching .pte file and follow the Executorch runtime integration instructions. | |
| - Outputs were padded to 1000-length vectors for compatibility with the runtime. Padding uses a large negative value (−100) to avoid stealing probability mass. | |
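The effect of the padding value can be checked numerically. The sketch below is an illustration in plain Python, not the export code: `pad_logits` and the example logits are hypothetical. It compares how much softmax mass ends up on the 998 padding slots when they are filled with −100 versus 0.

```python
import math

PAD_VALUE = -100.0  # padding value stated in this README
OUTPUT_LEN = 1000   # padded output length stated in this README


def softmax(values):
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pad_logits(logits, length=OUTPUT_LEN, pad_value=PAD_VALUE):
    """Append pad_value until the vector has `length` entries."""
    return list(logits) + [pad_value] * (length - len(logits))


class_logits = [0.3, 1.7]  # hypothetical clean / infected logits

# Probability mass landing on the padding slots with each padding value.
leaked = sum(softmax(pad_logits(class_logits))[2:])
leaked_with_zero = sum(softmax(pad_logits(class_logits, pad_value=0.0))[2:])
```

With −100 the leaked mass is negligible, so a softmax over all 1000 entries gives essentially the same class probabilities as a softmax over the first two. With zero padding, most of the mass would move to the padding slots.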
| Calibration & decision thresholds | |
| - Reliability diagrams show calibration issues; raw probabilities are not fully trustworthy. | |
| - Before using probabilities for clinical decisions, apply post‑hoc calibration (temperature scaling or isotonic regression) and re‑compute reliability diagrams and decision thresholds. | |
| - For single-number model selection, we reported per-class precision/recall/F1. For deployment, choose thresholds based on calibrated probabilities and the clinical tradeoff between false positives and false negatives. | |
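Temperature scaling, one of the post-hoc options mentioned above, can be sketched as follows. This is an assumption-laden illustration rather than the project's calibration code: the validation data is made up, the function names are hypothetical, and a simple grid search stands in for a gradient-based optimiser. For a two-class model, the input is the logit margin (infected logit minus clean logit) on a held-out validation set.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def nll(margins, labels, temperature):
    """Mean negative log-likelihood of the labels at a given temperature."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(margins, labels):
        p = sigmoid(z / temperature)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(labels)


def fit_temperature(margins, labels, grid=None):
    """Pick the temperature on a grid (0.5 to 5.0) that minimises the NLL."""
    grid = grid or [0.5 + 0.05 * i for i in range(91)]
    return min(grid, key=lambda t: nll(margins, labels, t))


# Hypothetical validation margins and labels (1 = infected); two confident errors.
val_margins = [4.0, 3.5, -3.0, 2.5, -4.0, 3.0, -2.0, -3.5]
val_labels = [1, 0, 0, 1, 0, 1, 1, 0]

temperature = fit_temperature(val_margins, val_labels)
calibrated = [sigmoid(z / temperature) for z in val_margins]
```

A fitted temperature above 1 indicates overconfidence. Dividing by a positive temperature does not change the ranking of predictions, so ROC and PR curves are unaffected; only the probabilities, and therefore any probability threshold, change.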
| Data & labeling | |
| - Dataset: curated from public sources (Kaggle wound images) and manually labeled into clean vs infected. | |
| - Labels are non‑clinical and may contain noise. Use caution; validate with clinical experts for any real-world deployment. |