PlasmoSENet: Malaria Parasite Detection from Blood Smear Microscopy
PlasmoSENet is a custom CNN architecture designed from first principles for automated malaria parasite detection in thin blood smear microscopy images. It is part of the LocalMedScan project for offline, privacy-first medical image screening.
Model Details
| Property | Value |
|---|---|
| Architecture | PlasmoSENet (custom ResNet-style with SE attention) |
| Parameters | 10,632,306 (~10.6M) |
| Model Size | 40.7 MB |
| Input | RGB 224x224 blood smear cell images |
| Output | Binary classification: Parasitized vs Uninfected |
| Training | From scratch (no pretrained weights) |
| Framework | PyTorch |
| License | MIT |
Architecture Highlights
PlasmoSENet integrates four domain-specific design elements:
Multi-Scale Stem: Parallel 3x3, 5x5, and 7x7 convolutional branches whose receptive fields are calibrated to the physical dimensions of Plasmodium erythrocytic stages:
- Ring forms (~2-3 µm): captured by the 3x3 branch
- Trophozoites (~4-6 µm): captured by the 5x5 branch
- Schizonts (~6-8 µm): captured by the 7x7 branch
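The stem described above can be sketched in PyTorch as follows. This is a reconstruction under stated assumptions: the actual `MultiScaleStem` may use different branch widths or a different fusion step; here each branch produces the full channel count and a 1x1 convolution fuses the concatenation.

```python
import torch
import torch.nn as nn

class MultiScaleStem(nn.Module):
    # Hypothetical sketch: three parallel conv branches with kernel sizes
    # matched to ring/trophozoite/schizont scales, concatenated and fused
    # by a 1x1 conv. Stride 2 halves spatial resolution (224 -> 112).
    def __init__(self, in_ch=3, out_ch=64):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, stride=2, padding=k // 2, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
        self.b3, self.b5, self.b7 = branch(3), branch(5), branch(7)
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * out_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.fuse(torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1))

stem = MultiScaleStem()
out = stem(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 64, 112, 112])
```

With `padding = k // 2` all three branches produce the same 112x112 spatial size regardless of kernel size, so the concatenation is shape-compatible.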
Squeeze-and-Excitation (SE) Channel Attention: In every residual block, an SE module learns stain-aware channel weightings that amplify the diagnostically relevant purple/blue parasite signal over the pink/red erythrocyte background.
Stochastic Depth (DropPath): Linearly increasing drop rates (0.0 to 0.2) across the 11 residual blocks, providing implicit ensemble regularization over 2^11 = 2048 subnetworks.
Kaiming Initialization: Fan-out mode for stable from-scratch convergence without ImageNet pretrained weights.
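The two regularization mechanisms above can be sketched as standard building blocks (a minimal reconstruction, not the project's exact code; reduction ratio 16 and the in-block wiring are assumptions):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Squeeze-and-Excitation: pool each channel to a scalar, pass through
    # a bottleneck MLP, and rescale channels by the sigmoid output.
    def __init__(self, channels, reduction=16):
        super().__init__()
        hidden = max(channels // reduction, 4)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))   # (B, C) channel descriptors
        return x * w[:, :, None, None]    # reweight feature maps

def drop_path(x, drop_prob, training):
    # Stochastic depth: zero an entire residual branch per sample with
    # probability drop_prob, rescaling survivors to keep the expectation.
    if drop_prob == 0.0 or not training:
        return x
    keep = 1.0 - drop_prob
    mask = x.new_empty(x.shape[0], 1, 1, 1).bernoulli_(keep)
    return x * mask / keep

# Inside a residual block (schematically):
#   out = relu(identity + drop_path(se(branch(x)), p_block, training))
# with p_block drawn from torch.linspace(0.0, 0.2, 11) per the text above.
se = SEBlock(64)
y = se(torch.randn(2, 64, 28, 28))
print(y.shape)  # torch.Size([2, 64, 28, 28])
```

At inference `drop_path` is the identity, which is why the usage example below loads the model with `drop_path_rate=0.0`.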
Architecture Table
```
Input:                              (B, 3, 224, 224)
MultiScaleStem(3 -> 64)          -> (B, 64, 112, 112)
Stage 1: 2x SEResBlock(64 -> 64)   -> (B, 64, 112, 112)
Stage 2: 3x SEResBlock(64 -> 128)  -> (B, 128, 56, 56)
Stage 3: 4x SEResBlock(128 -> 256) -> (B, 256, 28, 28)
Stage 4: 2x SEResBlock(256 -> 384) -> (B, 384, 14, 14)
Head Conv: Conv1x1 + BN + ReLU     -> (B, 384, 14, 14)
Classifier: GAP -> Dropout(0.3) -> Linear(384, 2)
```
Performance
Test Results (NIH Malaria Dataset, 2,757 test images)
| Metric | Without TTA | With 5-View TTA |
|---|---|---|
| Accuracy | 96.66% | 96.55% |
| Sensitivity | – | 96.08% |
| Specificity | – | 97.01% |
| Precision | – | 96.87% |
| F1 Score | – | 96.47% |
Confusion Matrix (with TTA)
| | Predicted Parasitized | Predicted Uninfected |
|---|---|---|
| Actual Parasitized | 1,298 | 53 |
| Actual Uninfected | 42 | 1,364 |
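The reported TTA metrics follow directly from these four counts; recomputing them is a quick consistency check:

```python
# Counts from the confusion matrix above (Parasitized = positive class).
tp, fn = 1298, 53    # Actual Parasitized row
fp, tn = 42, 1364    # Actual Uninfected row

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)    # recall on Parasitized
specificity = tn / (tn + fp)
precision   = tp / (tp + fp)
f1          = 2 * precision * sensitivity / (precision + sensitivity)

print(f"{accuracy:.2%} {sensitivity:.2%} {specificity:.2%} {precision:.2%} {f1:.2%}")
# 96.55% 96.08% 97.01% 96.87% 96.47%
```

All five values match the table, confirming the matrix and metrics are mutually consistent.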
Comparison with MobileNetV2 Baseline
| Model | Params | Size | Accuracy | Sensitivity | Specificity | F1 |
|---|---|---|---|---|---|---|
| MobileNetV2 (fine-tuned) | 3.4M | 14 MB | 97.97% | 97.41% | 98.51% | 97.95% |
| PlasmoSENet (from scratch) | 10.6M | 41 MB | 96.55% | 96.08% | 97.01% | 96.47% |
MobileNetV2 with transfer learning outperforms PlasmoSENet by 1.42 percentage points, consistent with the well-established advantage of pretrained features on moderately sized medical datasets. However, PlasmoSENet demonstrates that clinically useful accuracy (>96%) is achievable without any pretrained weights, which is significant for regulated medical device contexts.
Training Details
| Hyperparameter | Value |
|---|---|
| Optimizer | AdamW (weight_decay=0.05) |
| Learning Rate | Linear warmup 10 epochs (1e-5 to 1e-3) + cosine annealing |
| Batch Size | 64 |
| Label Smoothing | 0.1 |
| Mixup | alpha=0.2, 50% probability |
| CutMix | alpha=1.0, 50% probability |
| Stochastic Depth | 0.0 to 0.2 linearly |
| Dropout | 0.3 |
| Gradient Clipping | max_norm=1.0 |
| Mixed Precision | AMP (float16/float32) |
| Early Stopping | patience=20 (stopped at epoch 60) |
| Best Validation | 96.99% (epoch 40) |
| Hardware | NVIDIA RTX 3070 (8 GB), ~80s/epoch |
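The warmup-plus-cosine schedule can be expressed with `torch.optim.lr_scheduler` primitives. This is a sketch: the 100-epoch budget and per-epoch stepping are assumptions (training stopped early at epoch 60), and a plain `nn.Linear` stands in for PlasmoSENet.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR

model = torch.nn.Linear(10, 2)  # stand-in for PlasmoSENet
opt = AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)

# 10-epoch linear warmup from 1e-5 (= 1e-3 * 0.01) to 1e-3, then cosine decay.
warmup = LinearLR(opt, start_factor=1e-2, end_factor=1.0, total_iters=10)
cosine = CosineAnnealingLR(opt, T_max=90)  # assumed 100-epoch budget
sched = SequentialLR(opt, schedulers=[warmup, cosine], milestones=[10])

for epoch in range(100):
    # train_one_epoch(model, opt, ...) would go here
    opt.step()
    sched.step()
```

`SequentialLR` hands over from the warmup schedule to the cosine schedule at the epoch-10 milestone, matching the table above.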
Data Augmentation
- RandomResizedCrop(224, scale=0.7-1.0)
- RandomHorizontalFlip + RandomVerticalFlip
- RandomRotation(90 degrees): blood cells are rotationally invariant
- RandomAffine(translate=0.1, scale=0.9-1.1)
- ColorJitter(0.4, 0.4, 0.3, 0.05)
- GaussianBlur(kernel=3, sigma=0.1-2.0)
- RandomErasing(p=0.25, scale=0.02-0.2)
Critical Bug Fix: TransformSubset
We identified and corrected a subtle bug in standard PyTorch training pipelines that use random_split. Because Subset objects share the parent dataset reference, setting a transform through one subset overwrites the transform for all subsets. We implement a TransformSubset wrapper that maintains an independent transform per partition; without this fix, training silently runs with no augmentation, potentially inflating published results.
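A minimal version of such a wrapper looks like this (a sketch of the pattern, not necessarily the project's exact implementation; the toy dataset is purely illustrative):

```python
import torch
from torch.utils.data import Dataset

class TransformSubset(Dataset):
    # Holds its OWN transform instead of mutating the shared parent
    # dataset, so train/val/test augmentations stay independent.
    def __init__(self, dataset, indices, transform=None):
        self.dataset = dataset
        self.indices = indices
        self.transform = transform

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        x, y = self.dataset[self.indices[i]]
        if self.transform is not None:
            x = self.transform(x)
        return x, y

# Demonstration: the train transform does not leak into the val split.
class ToyDataset(Dataset):
    def __len__(self):
        return 10
    def __getitem__(self, i):
        return i, i % 2

base = ToyDataset()
idx = torch.randperm(10, generator=torch.Generator().manual_seed(42)).tolist()
train = TransformSubset(base, idx[:8], transform=lambda x: x * 100)
val = TransformSubset(base, idx[8:], transform=None)
print(train[0][0] == idx[0] * 100, val[0][0] == idx[8])  # True True
```

The key point is that each partition stores its transform on the wrapper, so assigning a training augmentation cannot silently change what the validation loader sees.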
Dataset
NIH Malaria Cell Images Dataset (Rajaraman et al., 2018)
- 27,558 annotated cell images from Giemsa-stained thin blood smears
- 13,779 Parasitized + 13,779 Uninfected (perfectly balanced)
- Split: 80% train (22,046) / 10% val (2,755) / 10% test (2,757)
- Fixed seed (42) for reproducibility
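The stated split sizes fall out of an 80/10/10 `random_split` with seed 42 (sketch below; `range(n)` stands in for the actual image dataset):

```python
import torch
from torch.utils.data import random_split

# Reproduce the 80/10/10 split sizes from the fixed seed.
n = 27558
n_train, n_val = int(0.8 * n), int(0.1 * n)
n_test = n - n_train - n_val
print(n_train, n_val, n_test)  # 22046 2755 2757

gen = torch.Generator().manual_seed(42)
train_ds, val_ds, test_ds = random_split(
    range(n), [n_train, n_val, n_test], generator=gen
)
print(len(train_ds), len(val_ds), len(test_ds))
```

Flooring the train and val fractions and giving the remainder to test reproduces the 22,046 / 2,755 / 2,757 counts exactly.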
Usage
```python
import torch
from core.plasmosenet import PlasmoSENet

# Load model
model = PlasmoSENet(num_classes=2, drop_path_rate=0.0)  # no DropPath at inference
state_dict = torch.load("model.pth", map_location="cpu", weights_only=True)
model.load_state_dict(state_dict)
model.eval()

# Preprocessing
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Inference
from PIL import Image
img = Image.open("blood_smear_cell.png").convert("RGB")
tensor = transform(img).unsqueeze(0)
with torch.inference_mode():
    probs = torch.softmax(model(tensor), dim=1)
# probs[0][0] = Parasitized probability
# probs[0][1] = Uninfected probability
```
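For the "5-View TTA" numbers reported above, a view-averaging helper along these lines can be used. The card does not specify the five views, so this sketch assumes identity, both flips, and 90/270-degree rotations, averaging softmax probabilities:

```python
import torch
import torch.nn as nn

def tta_predict(model, x):
    # Hypothetical 5-view TTA: average class probabilities over
    # symmetry-preserving views (valid because blood cells are
    # rotation- and flip-invariant, per the augmentation section).
    views = [
        x,
        torch.flip(x, dims=[3]),         # horizontal flip
        torch.flip(x, dims=[2]),         # vertical flip
        torch.rot90(x, 1, dims=[2, 3]),  # rotate 90 degrees
        torch.rot90(x, 3, dims=[2, 3]),  # rotate 270 degrees
    ]
    with torch.inference_mode():
        probs = torch.stack([torch.softmax(model(v), dim=1) for v in views])
    return probs.mean(dim=0)

# Sanity check with a stand-in model (any (B, 3, H, W) -> (B, 2) module works).
m = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 2))
p = tta_predict(m, torch.randn(1, 3, 32, 32))
print(p.shape)  # torch.Size([1, 2])
```

Averaging probabilities (rather than logits) keeps the output a valid distribution, so each row of the result still sums to 1.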
Key Achievements
- Novel domain-specific architecture: First CNN combining multi-scale stem, per-block SE attention, and stochastic depth specifically for malaria detection
- 96.55% accuracy from scratch: Clinically useful without any pretrained weights
- TransformSubset bug fix: Identified and corrected a common PyTorch training pipeline bug
- Comprehensive regularization: Demonstrated that heavy regularization (Mixup, CutMix, DropPath, label smoothing) enables training a 10.6M parameter model on just 27K images without overfitting
- GradCAM compatible: Head conv layer serves as interpretability target for clinical validation
- Unique architecture name: "PlasmoSENet" verified unique via Google Scholar (zero prior results)
Limitations
- Trained on single dataset (NIH, one geographic site)
- Binary classification only (no species/stage identification)
- Requires pre-segmented cell images
- Did not surpass transfer learning baseline on this dataset size
Citation
If you use this model, please cite:
```bibtex
@misc{plasmosenet2026,
  title={PlasmoSENet: A Multi-Scale Squeeze-and-Excitation Residual Network for Malaria Parasite Detection},
  author={Svetozar Technologies},
  year={2026},
  url={https://huggingface.co/Svetozar1993/LocalMedScan-malaria-plasmosenet}
}
```
Links
- GitHub: Svetozar-Technologies/LocalMedScan
- MobileNetV2 Model: Svetozar1993/LocalMedScan-malaria-mobilenetv2
- Dataset: NIH Malaria Cell Images