---
license: mit
library_name: numpy
tags:
- chest-xray
- medical-imaging
- from-scratch
- numpy
- education
pipeline_tag: image-classification
---
# CheXVision-mini: from-scratch NumPy neural network
A pure-**NumPy** multilayer perceptron (no autograd, no deep-learning framework),
with every forward and backward pass derived and coded by hand, trained for
binary chest X-ray screening (**normal vs abnormal**) on NIH ChestX-ray14.
Companion to [CheXVision](https://github.com/arudaev/chexvision) (PyTorch: a
custom CNN + a DenseNet-121 transfer model). This model is intentionally a
**fundamentals** demo: hand-written backprop verified by finite-difference
gradient checking. The headline performance belongs to the PyTorch models
(DenseNet binary AUC ≈ 0.787), not to this MLP.
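
As a rough illustration of what finite-difference gradient checking involves, the sketch below compares a hand-derived gradient against central differences on a single linear layer with BCE-with-logits. It is not the repository's checker: the function names, the toy data, and the step size `eps=1e-5` are all assumptions made for this example.

```python
import numpy as np

def bce_with_logits(z, y):
    # Numerically stable mean binary cross-entropy on logits z.
    return np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z))))

def loss_and_grad(W, b, X, y):
    # One linear layer -> logit, with gradients derived by hand.
    z = X @ W + b                          # logits, shape (n,)
    p = 0.5 * (1.0 + np.tanh(0.5 * z))     # stable sigmoid
    dz = (p - y) / len(y)                  # dL/dz for the mean BCE
    return bce_with_logits(z, y), X.T @ dz, dz.sum()

def numeric_grad(f, w, eps=1e-5):
    # Central finite differences, perturbing one coordinate at a time in place.
    g = np.zeros_like(w)
    for i in range(w.size):
        old = w.flat[i]
        w.flat[i] = old + eps
        hi = f()
        w.flat[i] = old - eps
        lo = f()
        w.flat[i] = old                    # restore the original value
        g.flat[i] = (hi - lo) / (2 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))                # toy inputs (hypothetical)
y = rng.integers(0, 2, size=8).astype(float)
W = rng.normal(size=5) * 0.1
b = 0.0

loss, gW, gb = loss_and_grad(W, b, X, y)
gW_num = numeric_grad(lambda: loss_and_grad(W, b, X, y)[0], W)

# Relative error between analytic and numeric gradients; small means they agree.
rel_err = np.linalg.norm(gW - gW_num) / (np.linalg.norm(gW) + np.linalg.norm(gW_num))
print(f"relative error: {rel_err:.2e}")
```

The same comparison extends layer by layer to a deeper network; dropout and augmentation would normally be disabled during the check so that the loss is deterministic.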
## Results: held-out test set (final)
Metrics on an **untouched test split**, at an operating threshold of 0.389
chosen on the validation set only (by maximising Youden's J). ROC-AUC is threshold-independent.
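
For readers unfamiliar with Youden's J, the sketch below shows one common way to pick such a threshold: scan every observed validation score and keep the one that maximises sensitivity + specificity − 1. The function name and the toy scores are hypothetical, and the repository may break ties or enumerate candidates differently.

```python
import numpy as np

def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    best_t, best_j = None, -np.inf
    for t in np.unique(scores):            # every observed score is a candidate
        pred = scores >= t                 # predict "abnormal" at or above t
        sens = np.mean(pred[labels])       # TP / (TP + FN)
        spec = np.mean(~pred[~labels])     # TN / (TN + FP)
        j = sens + spec - 1.0
        if j > best_j:                     # first maximum wins on ties
            best_t, best_j = float(t), float(j)
    return best_t, best_j

# Toy validation data (hypothetical), not the model's real scores.
val_scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
val_labels = np.array([0, 0, 0, 1, 0, 0, 1, 1])
threshold, j = youden_threshold(val_scores, val_labels)
print(threshold, round(j, 4))
```

Choosing the threshold on validation data only, and then freezing it, keeps the test metrics free of any tuning on the test split.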
| Metric | Test | Validation |
|---|---|---|
| ROC-AUC | **0.6502** | 0.6994 |
| Accuracy | 0.6467 | 0.6536 |
| Balanced accuracy | 0.5904 | 0.6517 |
| Precision | 0.6749 | 0.5803 |
| Recall (sensitivity) | 0.8277 | 0.6393 |
| Specificity | 0.3530 | 0.6640 |
| F1 | 0.7435 | 0.6084 |

Checkpoint selected by best validation AUC (epoch 176/200).
Samples: train 60000, val 8557, test 10000 (test positive rate 0.6187).
Test confusion matrix @ 0.389: TN=1346, FP=2467, FN=1066, TP=5121.
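
The test-column metrics follow directly from the four confusion-matrix counts above. The short sketch below recomputes them using the standard definitions; only the variable names are invented here.

```python
# Test confusion matrix at threshold 0.389, as reported above.
tn, fp, fn, tp = 1346, 2467, 1066, 5121

total = tn + fp + fn + tp
recall = tp / (tp + fn)            # sensitivity
specificity = tn / (tn + fp)
precision = tp / (tp + fp)

metrics = {
    "accuracy": (tp + tn) / total,
    "balanced_accuracy": (recall + specificity) / 2,
    "precision": precision,
    "recall": recall,
    "specificity": specificity,
    "f1": 2 * precision * recall / (precision + recall),
    "positive_rate": (tp + fn) / total,
}

for name, value in metrics.items():
    print(f"{name:18s} {value:.4f}")
```

ROC-AUC is the one table entry that cannot be recovered this way, since it depends on the full score ranking rather than on a single threshold.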
> **Note on the test split:** NIH ChestX-ray14's official `test` split is more
> positive-heavy (0.6187) than train/validation (0.4208). Because of that
> base-rate shift, plain accuracy can mislead; **ROC-AUC (threshold-independent)
> and balanced accuracy are the metrics to trust** for comparison.
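
To make the base-rate effect concrete: at a fixed threshold, accuracy is a prevalence-weighted mix of sensitivity and specificity, while balanced accuracy weights them equally. The sketch below plugs in the reported test sensitivity and specificity and asks what accuracy they would give at the train/validation positive rate. This is a hypothetical what-if that assumes sensitivity and specificity stay unchanged when only the class mix shifts; it is not a measured result.

```python
def accuracy_at_prevalence(sensitivity, specificity, prevalence):
    # accuracy = P(correct | positive) * P(positive) + P(correct | negative) * P(negative)
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

sens, spec = 0.8277, 0.3530   # reported test recall and specificity

acc_test = accuracy_at_prevalence(sens, spec, 0.6187)        # test positive rate
acc_trainlike = accuracy_at_prevalence(sens, spec, 0.4208)   # train/val positive rate
balanced = (sens + spec) / 2                                 # independent of prevalence

print(f"accuracy at 0.6187 positives: {acc_test:.4f}")
print(f"accuracy at 0.4208 positives: {acc_trainlike:.4f}")
print(f"balanced accuracy:            {balanced:.4f}")
```

The first figure reproduces the reported test accuracy; the second shows how much of that accuracy depends on the positive-heavy test split rather than on the classifier itself.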
## Architecture
MLP on 64×64 grayscale images: **4096 → 1024 → 256 → 64 → 1** logit,
ReLU activations, dropout 0.3, He initialisation.
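
A minimal sketch of a forward pass with these layer sizes is shown below. Several details are assumptions rather than facts from this card: zero-initialised biases, He-normal (not He-uniform) weights, dropout applied after every hidden ReLU, and the inverted-dropout scaling. The actual implementation lives in the linked repository.

```python
import numpy as np

LAYER_SIZES = [4096, 1024, 256, 64, 1]   # flattened 64x64 image -> 1 logit

def he_init(sizes, rng):
    # He-normal weights: std = sqrt(2 / fan_in); biases start at zero (assumption).
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x, rng=None, drop_p=0.3):
    """Return logits of shape (batch,). Dropout is active only when rng is given."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:              # hidden layers only
            h = np.maximum(h, 0.0)           # ReLU
            if rng is not None:              # training mode: inverted dropout
                mask = rng.random(h.shape) >= drop_p
                h = h * mask / (1.0 - drop_p)
    return h[:, 0]

rng = np.random.default_rng(0)
params = he_init(LAYER_SIZES, rng)
x = rng.normal(size=(2, 4096))               # two standardised, flattened images
logits = forward(params, x)                  # inference: no dropout
probs = 0.5 * (1.0 + np.tanh(0.5 * logits))  # stable sigmoid -> P(abnormal)
n_params = sum(W.size + b.size for W, b in params)
print(logits.shape, n_params)
```

Passing `rng` to `forward` switches dropout on for training; omitting it gives the deterministic inference path.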
Loss: BCE-with-logits (+ label smoothing 0.05).
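
One way to write BCE-with-logits with label smoothing in NumPy is sketched below. The smoothing scheme is an assumption: targets are pulled toward 0.5, so 1 becomes 0.975 and 0 becomes 0.025 at smoothing 0.05. The repository may define smoothing differently.

```python
import numpy as np

def bce_with_logits(z, y, smoothing=0.05):
    """Mean BCE on logits z with smoothed targets; returns (loss, dloss/dz)."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    y_s = y * (1.0 - smoothing) + 0.5 * smoothing   # assumed smoothing scheme
    # Stable form of -[y_s*log(p) + (1-y_s)*log(1-p)] with p = sigmoid(z).
    loss = np.mean(np.maximum(z, 0) - z * y_s + np.log1p(np.exp(-np.abs(z))))
    p = 0.5 * (1.0 + np.tanh(0.5 * z))              # stable sigmoid
    grad = (p - y_s) / z.size                       # gradient of the mean loss
    return loss, grad

loss, grad = bce_with_logits([0.0, 0.0], [1.0, 0.0])
print(loss, grad)
```

Working on logits rather than probabilities avoids taking `log(0)` when the sigmoid saturates, and the gradient reduces to the simple `sigmoid(z) - target` form.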
Optimizer: Adam with cosine LR decay; L2 weight decay
(weights only). Per-feature standardisation; augmentation: horizontal flip / noise / brightness.
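
The optimizer recipe can be sketched as a cosine schedule plus a hand-written Adam step in which the L2 term is added to the gradient for weight matrices but not for biases. The hyperparameter values (`lr_max=1e-3`, `weight_decay=1e-4`, Adam's usual `0.9 / 0.999 / 1e-8`) are placeholders chosen for this example, not the values used for training.

```python
import numpy as np

def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    # Decays smoothly from lr_max at step 0 to lr_min at total_steps.
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + np.cos(np.pi * step / total_steps))

def adam_step(param, grad, state, lr, weight_decay=0.0, b1=0.9, b2=0.999, eps=1e-8):
    grad = grad + weight_decay * param               # L2 penalty folded into the gradient
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])      # bias-corrected first moment
    v_hat = state["v"] / (1 - b2 ** state["t"])      # bias-corrected second moment
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

def new_state(param):
    return {"t": 0, "m": np.zeros_like(param), "v": np.zeros_like(param)}

W, b = np.ones((3, 2)), np.zeros(2)
gW, gb = np.full((3, 2), 0.5), np.full(2, 0.5)       # toy gradients (hypothetical)
sW, sb = new_state(W), new_state(b)

lr = cosine_lr(step=0, total_steps=100)
W_new = adam_step(W, gW, sW, lr, weight_decay=1e-4)  # decay on weights
b_new = adam_step(b, gb, sb, lr, weight_decay=0.0)   # no decay on biases
```

Adding the penalty to the gradient is classic L2 regularisation; decoupled weight decay (AdamW) would instead shrink the parameter directly, which is a different update.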
## Files
- `model.npz`: best weights + normalisation stats (`_norm_mean`, `_norm_std`).
- `metrics.json`: test & validation metrics, ROC/PR curves, confusion matrices, config.
- `history.json`: per-epoch train/reg/val loss, val accuracy/AUC, learning rate.
- `val_scores.npy` / `val_labels.npy`, `test_scores.npy` / `test_labels.npy`: raw scores + labels.
- `loss_curve.png`: training curves + val AUC.
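
Because the raw scores and labels are published, the reported ROC-AUC can be recomputed without any framework. The sketch below uses the rank-based (Mann-Whitney) formulation with average ranks for ties. The helper names are hypothetical, and the code assumes the `.npy` files hold 1-D arrays of equal length with labels in {0, 1}.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC-AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1                                  # extend over a run of tied scores
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank of the run
        i = j + 1
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

def auc_from_files(scores_path, labels_path):
    # e.g. auc_from_files("test_scores.npy", "test_labels.npy")
    return roc_auc(np.load(scores_path), np.load(labels_path))

# Small self-check on toy data (hypothetical values).
toy_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(toy_auc)
```

If the files match the description above, `auc_from_files("test_scores.npy", "test_labels.npy")` should land close to the test ROC-AUC in the results table.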
## Usage
```python
from chexvision_mini.inference import load_checkpoint, preprocess_image, predict_label
model, mean, std, threshold = load_checkpoint("artifacts")
x = preprocess_image("xray.png", image_size=64, mean=mean, std=std)
prob, label = predict_label(model, x, threshold) # P(abnormal), "normal"/"abnormal"
```
Or from the CLI: `python -m chexvision_mini predict --checkpoint artifacts --image xray.png`.
## Links
- Code: https://github.com/arudaev/chexvision-mini
- Parent project: https://github.com/arudaev/chexvision