arudaev
/

chexvision-mini

	---
	license: mit
	library_name: numpy
	tags:
	- chest-xray
	- medical-imaging
	- from-scratch
	- numpy
	- education
	pipeline_tag: image-classification
	---

	# CheXVision-mini — from-scratch NumPy neural network

	A pure-NumPy multilayer perceptron (no autograd, no deep-learning framework),
	with every forward and backward pass derived and coded by hand, trained for
	binary chest X-ray screening (normal vs abnormal) on NIH ChestX-ray14.

	Companion to [CheXVision](https://github.com/arudaev/chexvision) (PyTorch: a
	custom CNN + a DenseNet-121 transfer model). This model is intentionally a
	fundamentals demo: hand-written backprop verified by finite-difference
	gradient checking. The headline performance belongs to the PyTorch models
	(DenseNet binary AUC ≈ 0.787), not to this MLP.
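
The gradient check mentioned above can be illustrated on a much smaller problem. The sketch below is not the repository's code: it assumes a single linear layer with BCE-with-logits, and the helper names (`loss_and_grads`, `numerical_grad`) are made up for illustration. It compares a hand-derived gradient against a central finite difference.

```python
import numpy as np

def bce_with_logits(z, y):
    """Mean binary cross-entropy computed from logits in a numerically stable form."""
    # max(z, 0) - z*y + log(1 + exp(-|z|)) equals -[y log s + (1 - y) log(1 - s)], s = sigmoid(z)
    return np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z))))

def loss_and_grads(W, b, X, y):
    """Forward pass plus hand-derived gradients for one linear layer (illustrative)."""
    z = X @ W + b                                   # logits, shape (n,)
    loss = bce_with_logits(z, y)
    dz = (1.0 / (1.0 + np.exp(-z)) - y) / len(y)    # dL/dz = (sigmoid(z) - y) / n
    return loss, X.T @ dz, dz.sum()                 # loss, dL/dW, dL/db

def numerical_grad(f, w, eps=1e-6):
    """Central-difference estimate of df/dw, perturbing one coordinate of w at a time."""
    g = np.zeros_like(w)
    for i in range(w.size):
        old = w.flat[i]
        w.flat[i] = old + eps
        f_plus = f()
        w.flat[i] = old - eps
        f_minus = f()
        w.flat[i] = old                             # restore the parameter
        g.flat[i] = (f_plus - f_minus) / (2 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
y = rng.integers(0, 2, size=8).astype(float)
W = rng.normal(size=5)
b = np.zeros(1)

_, dW, db = loss_and_grads(W, b, X, y)
num_dW = numerical_grad(lambda: loss_and_grads(W, b, X, y)[0], W)
num_db = numerical_grad(lambda: loss_and_grads(W, b, X, y)[0], b)

# Relative error between analytic and numerical gradients; small values indicate agreement
rel_err = np.linalg.norm(dW - num_dW) / (np.linalg.norm(dW) + np.linalg.norm(num_dW))
print(f"relative error on dW: {rel_err:.2e}")
```

The same comparison extends layer by layer to a deeper network. Dropout and other stochastic steps would need to be disabled (or seeded identically) during the check so that both loss evaluations see the same function.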

	## Results — held-out test set (final)

	Metrics on an untouched test split, at an operating threshold of 0.389 chosen
	on the validation set only (by maximising Youden's J). ROC-AUC is
	threshold-independent.
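
As a rough illustration of threshold selection with Youden's J, the sketch below scans every observed score as a candidate cut and keeps the one that maximises sensitivity + specificity - 1. The function name `youden_threshold`, the `score >= threshold` convention, and the toy data are assumptions, not the repository's implementation.

```python
import numpy as np

def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A sample is predicted positive when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    best_t, best_j = None, -np.inf
    for t in np.unique(scores):            # each observed score is a candidate cut
        pred = scores >= t
        sens = np.mean(pred[labels])       # TP / (TP + FN)
        spec = np.mean(~pred[~labels])     # TN / (TN + FP)
        j = sens + spec - 1.0
        if j > best_j:                     # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j

# Toy validation scores and labels, made up for illustration
val_scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
val_labels = [0, 0, 1, 0, 1, 1]
threshold, j = youden_threshold(val_scores, val_labels)
print(threshold, round(j, 3))
```

Picking the threshold on validation data only, as the card describes, keeps the test split untouched by any tuning decision.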

	| Metric | Test | Validation |
	|---|---|---|
	| ROC-AUC | 0.6502 | 0.6994 |
	| Accuracy | 0.6467 | 0.6536 |
	| Balanced accuracy | 0.5904 | 0.6517 |
	| Precision | 0.6749 | 0.5803 |
	| Recall (sensitivity) | 0.8277 | 0.6393 |
	| Specificity | 0.3530 | 0.6640 |
	| F1 | 0.7435 | 0.6084 |

	- Checkpoint: selected by best validation AUC (epoch 176/200).
	- Samples: train 60000, val 8557, test 10000 (test positive rate 0.6187).
	- Test confusion matrix at threshold 0.389: TN=1346, FP=2467, FN=1066, TP=5121.
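
The thresholded test metrics follow directly from the confusion matrix counts above. A short sanity check using the standard definitions (the variable names are arbitrary):

```python
# Test confusion matrix at threshold 0.389, as reported in this card
tn, fp, fn, tp = 1346, 2467, 1066, 5121
total = tn + fp + fn + tp

recall = tp / (tp + fn)            # sensitivity
specificity = tn / (tn + fp)
precision = tp / (tp + fp)

metrics = {
    "accuracy": (tp + tn) / total,
    "balanced_accuracy": (recall + specificity) / 2,
    "precision": precision,
    "recall": recall,
    "specificity": specificity,
    "f1": 2 * precision * recall / (precision + recall),
    "positive_rate": (tp + fn) / total,
}
for name, value in metrics.items():
    print(f"{name:18s} {value:.4f}")
```

Rounded to four decimals, these reproduce the Test column of the table and the 0.6187 positive rate.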

	> Note on the test split: NIH ChestX-ray14's official `test` split is more
	> positive-heavy (0.6187) than train/validation
	> (0.4208). Because of that base-rate shift, plain accuracy
	> can mislead — **ROC-AUC (threshold-independent) and balanced accuracy are the
	> metrics to trust** for comparison.
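
To see why plain accuracy moves with the base rate, consider a classifier whose sensitivity and specificity are held fixed while the share of positives changes. This is an idealisation: the table shows that sensitivity and specificity themselves differ between validation and test, so the base rate is not the only thing that shifts. The helper name below is hypothetical.

```python
def accuracy_at_prevalence(sensitivity, specificity, prevalence):
    """Accuracy of a fixed classifier when a fraction `prevalence` of cases is positive."""
    return prevalence * sensitivity + (1 - prevalence) * specificity

sens, spec = 0.8277, 0.3530                                   # test operating point from the table
acc_test_mix = accuracy_at_prevalence(sens, spec, 0.6187)     # official test positive rate
acc_train_mix = accuracy_at_prevalence(sens, spec, 0.4208)    # train/validation positive rate
balanced = (sens + spec) / 2                                  # independent of prevalence

print(f"accuracy at 0.6187 positives: {acc_test_mix:.4f}")
print(f"accuracy at 0.4208 positives: {acc_train_mix:.4f}")
print(f"balanced accuracy:            {balanced:.4f}")
```

Under that assumption the same operating point scores about 0.647 accuracy on the test mix but about 0.553 on the train/validation mix, while balanced accuracy stays at about 0.590.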

	## Architecture

	MLP on 64×64 grayscale images: 4096 → 1024 → 256 → 64 → 1 logit,
	ReLU activations, dropout 0.3, He initialisation.
	Loss: BCE-with-logits (+ label smoothing 0.05).
	Optimizer: Adam with cosine LR decay; L2 weight decay
	(weights only). Per-feature standardisation; augmentation: horizontal flip / noise / brightness.
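
A minimal NumPy sketch of the forward pass and loss described above. It is not the repository's code, and several details are assumptions: He-normal (rather than He-uniform) initialisation, inverted dropout applied after every hidden ReLU, and label smoothing that pulls targets towards 0.5. All function names are illustrative.

```python
import numpy as np

LAYER_SIZES = [4096, 1024, 256, 64, 1]    # from the card: 64*64 inputs down to one logit

def he_init(sizes, rng):
    """He-normal weights (std = sqrt(2 / fan_in)) and zero biases for each layer."""
    return [(rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x, rng=None, drop_p=0.3):
    """Return logits of shape (batch,). Dropout runs only when rng is given (training)."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)                # ReLU
        if rng is not None:                           # inverted dropout (assumed placement)
            mask = rng.random(h.shape) >= drop_p
            h = h * mask / (1.0 - drop_p)
    W, b = params[-1]
    return (h @ W + b).ravel()                        # single logit per sample

def bce_with_logits(z, y, smoothing=0.05):
    """Stable BCE on logits; targets move towards 0.5 by `smoothing` (one common convention)."""
    y = y * (1.0 - smoothing) + 0.5 * smoothing
    return np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z))))

rng = np.random.default_rng(0)
params = he_init(LAYER_SIZES, rng)
x = rng.normal(size=(4, 4096))                        # stand-in for four standardised images
logits = forward(params, x)                           # inference mode: no dropout
loss = bce_with_logits(logits, np.array([0.0, 1.0, 1.0, 0.0]))
n_params = sum(W.size + b.size for W, b in params)
print(logits.shape, n_params)
```

With these layer sizes the network has 4,474,241 trainable parameters, almost all of them in the first 4096 × 1024 layer.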

	## Files

	- `model.npz` — best weights + normalisation stats (`_norm_mean`, `_norm_std`).
	- `metrics.json` — test & validation metrics, ROC/PR curves, confusion matrices, config.
	- `history.json` — per-epoch train/reg/val loss, val accuracy/AUC, learning rate.
	- `val_scores.npy` / `val_labels.npy`, `test_scores.npy` / `test_labels.npy` — raw scores + labels.
	- `loss_curve.png` — training curves + val AUC.
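
Because the raw scores and labels are published, the threshold-independent ROC-AUC can in principle be recomputed without the model. The sketch below uses the rank (Mann-Whitney U) formulation with average ranks for ties; `roc_auc` is an illustrative helper, and the assumption is that the `.npy` files hold one score and one 0/1 label per sample.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Tied scores share their average rank, so a tie counts as 0.5.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)                  # last 1-based rank within each tie group
    avg_rank = ends - (counts - 1) / 2.0      # mean rank of each tie group
    ranks = avg_rank[inverse]
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0   # Mann-Whitney U for positives
    return u / (n_pos * n_neg)

# Toy check, made up for illustration
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)
```

With the files from this repository, `roc_auc(np.load("test_scores.npy"), np.load("test_labels.npy"))` would be expected to land near the reported 0.6502, provided the files match the description above.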

	## Usage

	```python
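	from chexvision_mini.inference import load_checkpoint, preprocess_image, predict_label
	# Load the weights, normalisation stats and decision threshold from the checkpoint
	model, mean, std, threshold = load_checkpoint("artifacts")
	# Prepare a 64x64 input standardised with the stored mean/std
	x = preprocess_image("xray.png", image_size=64, mean=mean, std=std)
	# Returns P(abnormal) and the thresholded label, "normal" or "abnormal"
	prob, label = predict_label(model, x, threshold)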
	```

	Or from the CLI: `python -m chexvision_mini predict --checkpoint artifacts --image xray.png`.

	## Links

	- Code: https://github.com/arudaev/chexvision-mini
	- Parent project: https://github.com/arudaev/chexvision