PlasmoSENet - Malaria Parasite Detection from Blood Smear Microscopy

PlasmoSENet is a custom CNN architecture designed from first principles for automated malaria parasite detection in thin blood smear microscopy images. It is part of the LocalMedScan project for offline, privacy-first medical image screening.

Model Details

Property      Value
------------  ---------------------------------------------------
Architecture  PlasmoSENet (custom ResNet-style with SE attention)
Parameters    10,632,306 (~10.6M)
Model Size    40.7 MB
Input         RGB 224x224 blood smear cell images
Output        Binary classification: Parasitized vs Uninfected
Training      From scratch (no pretrained weights)
Framework     PyTorch
License       MIT

Architecture Highlights

PlasmoSENet integrates four domain-specific design elements:

  1. Multi-Scale Stem - Parallel 3x3, 5x5, and 7x7 convolutional branches whose receptive fields are calibrated to the physical dimensions of Plasmodium erythrocytic stages:

    • Ring forms (~2-3 µm) captured by the 3x3 branch
    • Trophozoites (~4-6 µm) captured by the 5x5 branch
    • Schizonts (~6-8 µm) captured by the 7x7 branch
  2. Squeeze-and-Excitation (SE) Channel Attention - Every residual block learns stain-aware channel weightings that amplify diagnostically relevant purple/blue parasite signals over the pink/red erythrocyte background.

  3. Stochastic Depth (DropPath) - Linearly increasing drop rates (0.0 to 0.2) across the 11 residual blocks, providing implicit ensemble regularization over 2^11 = 2048 possible subnetworks.

  4. Kaiming Initialization - Fan-out mode for stable from-scratch convergence without ImageNet pretrained weights.
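The SE gating and per-sample stochastic depth described above can be sketched in PyTorch. This is a minimal illustration, not the reference implementation in core/plasmosenet.py: the layer composition, the SE reduction ratio of 16, and the 1x1 shortcut projection are all assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global pool, bottleneck MLP, sigmoid gate."""
    def __init__(self, channels, reduction=16):  # reduction=16 is an assumption
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # per-channel gate
        return x * w

class SEResBlock(nn.Module):
    """Residual block with SE attention and stochastic depth (DropPath)."""
    def __init__(self, in_ch, out_ch, stride=1, drop_path=0.0):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.se = SEBlock(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.drop_path = drop_path
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.se(self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x))))))
        if self.training and self.drop_path > 0:
            # Stochastic depth: drop the residual branch per sample, rescale survivors
            keep = (torch.rand(x.size(0), 1, 1, 1, device=x.device)
                    >= self.drop_path).to(out.dtype)
            out = out * keep / (1.0 - self.drop_path)
        return self.relu(out + self.shortcut(x))
```

At inference (model.eval()), the DropPath branch is skipped entirely, matching drop_path_rate=0.0 in the usage example below.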

Architecture Table

Input: (B, 3, 224, 224)
MultiScaleStem(3 -> 64)              -> (B, 64, 112, 112)
Stage 1: 2x SEResBlock(64 -> 64)    -> (B, 64, 112, 112)
Stage 2: 3x SEResBlock(64 -> 128)   -> (B, 128, 56, 56)
Stage 3: 4x SEResBlock(128 -> 256)  -> (B, 256, 28, 28)
Stage 4: 2x SEResBlock(256 -> 384)  -> (B, 384, 14, 14)
Head Conv: Conv1x1 + BN + ReLU      -> (B, 384, 14, 14)
Classifier: GAP -> Dropout(0.3) -> Linear(384, 2)
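The MultiScaleStem row of the table can be sketched as three stride-2 branches fused by a 1x1 projection. The per-branch channel widths and the concatenate-then-project fusion are assumptions; only the kernel sizes and the 224 -> 112 downsampling are stated by the card.

```python
import torch
import torch.nn as nn

class MultiScaleStem(nn.Module):
    """Parallel 3x3 / 5x5 / 7x7 branches sized to ring, trophozoite, and
    schizont scales, fused by a 1x1 projection (fusion scheme assumed)."""
    def __init__(self, in_ch=3, out_ch=64):
        super().__init__()
        def branch(k):
            # stride-2 conv halves spatial size: 224 -> 112
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, stride=2, padding=k // 2, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
        self.b3, self.b5, self.b7 = branch(3), branch(5), branch(7)
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * out_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.fuse(torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1))
```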

Performance

Test Results (NIH Malaria Dataset, 2,757 test images)

Metric       Without TTA  With 5-View TTA
-----------  -----------  ---------------
Accuracy          96.66%           96.55%
Sensitivity            -           96.08%
Specificity            -           97.01%
Precision              -           96.87%
F1 Score               -           96.47%

Confusion Matrix (with TTA)

                   Predicted +  Predicted -
  Actual +              1,298           53
  Actual -                 42        1,364
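The headline TTA metrics can be recomputed directly from this confusion matrix:

```python
# Counts from the confusion matrix above (Parasitized = positive class).
tp, fn = 1298, 53   # Parasitized classified correctly / as Uninfected
fp, tn = 42, 1364   # Uninfected classified as Parasitized / correctly

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)          # recall on Parasitized
specificity = tn / (tn + fp)          # recall on Uninfected
precision   = tp / (tp + fp)
f1          = 2 * precision * sensitivity / (precision + sensitivity)

print(f"{accuracy:.2%} {sensitivity:.2%} {specificity:.2%} "
      f"{precision:.2%} {f1:.2%}")
# -> 96.55% 96.08% 97.01% 96.87% 96.47%
```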

Comparison with MobileNetV2 Baseline

Model                       Params  Size   Accuracy  Sensitivity  Specificity  F1
--------------------------  ------  -----  --------  -----------  -----------  ------
MobileNetV2 (fine-tuned)    3.4M    14 MB  97.97%    97.41%       98.51%       97.95%
PlasmoSENet (from scratch)  10.6M   41 MB  96.55%    96.08%       97.01%       96.47%

MobileNetV2 with transfer learning outperforms PlasmoSENet by 1.42 percentage points in accuracy, consistent with the well-established advantage of pretrained features on moderately sized medical datasets. However, PlasmoSENet demonstrates that clinically useful accuracy (>96%) is achievable without any pretrained weights, a significant property for regulated medical device contexts.

Training Details

Hyperparameter     Value
-----------------  ----------------------------------------------------------
Optimizer          AdamW (weight_decay=0.05)
Learning Rate      Linear warmup over 10 epochs (1e-5 to 1e-3), then cosine annealing
Batch Size         64
Label Smoothing    0.1
Mixup              alpha=0.2, 50% probability
CutMix             alpha=1.0, 50% probability
Stochastic Depth   0.0 to 0.2, linearly increasing
Dropout            0.3
Gradient Clipping  max_norm=1.0
Mixed Precision    AMP (float16/float32)
Early Stopping     patience=20 (stopped at epoch 60)
Best Validation    96.99% (epoch 40)
Hardware           NVIDIA RTX 3070 (8 GB), ~80 s/epoch
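The warmup-plus-cosine schedule can be written as a small function of the epoch index. The annealing horizon (100 scheduled epochs here) is an assumption, since the card only states that early stopping ended training at epoch 60:

```python
import math

def lr_at(epoch, total_epochs=100, warmup=10, base_lr=1e-3, start_lr=1e-5):
    """Linear warmup start_lr -> base_lr over `warmup` epochs,
    then cosine annealing toward zero over the remaining epochs.
    total_epochs=100 is an assumed scheduling horizon."""
    if epoch < warmup:
        return start_lr + (base_lr - start_lr) * epoch / warmup
    t = (epoch - warmup) / (total_epochs - warmup)  # 0 -> 1 over annealing
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```

An equivalent schedule can also be built from torch.optim.lr_scheduler.LambdaLR with this function as the multiplier.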

Data Augmentation

  • RandomResizedCrop(224, scale=0.7-1.0)
  • RandomHorizontalFlip + RandomVerticalFlip
  • RandomRotation(90 degrees) - blood cells are rotationally invariant
  • RandomAffine(translate=0.1, scale=0.9-1.1)
  • ColorJitter(0.4, 0.4, 0.3, 0.05)
  • GaussianBlur(kernel=3, sigma=0.1-2.0)
  • RandomErasing(p=0.25, scale=0.02-0.2)

Critical Bug Fix: TransformSubset

We identified and corrected a subtle bug in standard PyTorch training pipelines that use random_split. Because Subset objects share a reference to the parent dataset, setting a transform through one subset overwrites the transform for all subsets. We implement a TransformSubset wrapper that maintains an independent transform per partition; without this fix, training runs with no augmentation, potentially inflating published results.
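A minimal sketch of such a wrapper (the project's actual implementation may differ) wraps each partition with its own transform, assuming the underlying dataset is created with transform=None so it yields raw images:

```python
from torch.utils.data import Dataset

class TransformSubset(Dataset):
    """Give a Subset (or any dataset of (image, label) pairs) its own
    transform, so train/val/test partitions no longer share state via
    the parent dataset."""
    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, idx):
        img, label = self.subset[idx]   # base dataset must yield raw images
        if self.transform is not None:
            img = self.transform(img)
        return img, label
```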

Dataset

NIH Malaria Cell Images Dataset (Rajaraman et al., 2018)

  • 27,558 annotated cell images from Giemsa-stained thin blood smears
  • 13,779 Parasitized + 13,779 Uninfected (perfectly balanced)
  • Split: 80% train (22,046) / 10% val (2,755) / 10% test (2,757)
  • Fixed seed (42) for reproducibility
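The split sizes above follow from a seeded random_split; a sketch of the seeding (the exact call site in the project is an assumption):

```python
import torch
from torch.utils.data import random_split

n = 27558                       # total NIH images
sizes = [22046, 2755, 2757]     # 80 / 10 / 10 split from this card
assert sum(sizes) == n

gen = torch.Generator().manual_seed(42)   # fixed seed for reproducibility
train, val, test = random_split(range(n), sizes, generator=gen)
print(len(train), len(val), len(test))    # -> 22046 2755 2757
```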

Usage

import torch
from core.plasmosenet import PlasmoSENet

# Load model
model = PlasmoSENet(num_classes=2, drop_path_rate=0.0)  # No DropPath at inference
state_dict = torch.load("model.pth", map_location="cpu", weights_only=True)
model.load_state_dict(state_dict)
model.eval()

# Inference
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

from PIL import Image
img = Image.open("blood_smear_cell.png").convert("RGB")
tensor = transform(img).unsqueeze(0)

with torch.inference_mode():
    probs = torch.softmax(model(tensor), dim=1)
    # probs[0][0] = Parasitized probability
    # probs[0][1] = Uninfected probability
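The card does not specify which five views the 5-View TTA uses; a plausible sketch that averages softmax probabilities over the identity, both flips, and two rotations (all valid because blood cells are rotationally invariant) is:

```python
import torch

def tta_predict(model, tensor):
    """Average softmax probabilities over five views of a (1, 3, H, W) batch.
    The choice of views here is an assumption: identity, horizontal flip,
    vertical flip, and 90/270 degree rotations."""
    views = [
        tensor,
        torch.flip(tensor, dims=[3]),         # horizontal flip
        torch.flip(tensor, dims=[2]),         # vertical flip
        torch.rot90(tensor, 1, dims=[2, 3]),  # 90 degrees
        torch.rot90(tensor, 3, dims=[2, 3]),  # 270 degrees
    ]
    with torch.inference_mode():
        probs = torch.stack([torch.softmax(model(v), dim=1) for v in views])
    return probs.mean(dim=0)  # (1, num_classes)
```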

Key Achievements

  1. Novel domain-specific architecture: First CNN combining multi-scale stem, per-block SE attention, and stochastic depth specifically for malaria detection
  2. 96.55% accuracy from scratch: Clinically useful without any pretrained weights
  3. TransformSubset bug fix: Identified and corrected a common PyTorch training pipeline bug
  4. Comprehensive regularization: Demonstrated that heavy regularization (Mixup, CutMix, DropPath, label smoothing) enables training a 10.6M parameter model on just 27K images without overfitting
  5. GradCAM compatible: Head conv layer serves as interpretability target for clinical validation
  6. Unique architecture name: "PlasmoSENet" verified unique via Google Scholar (zero prior results)
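For achievement 5, a minimal Grad-CAM over a chosen conv layer (such as the head conv) can be sketched as follows; this is a generic illustration, not the project's interpretability code:

```python
import torch

def gradcam(model, tensor, target_class, layer):
    """Minimal Grad-CAM: weight the layer's feature maps by their
    spatially averaged gradients w.r.t. the target logit, ReLU, normalize."""
    feats, grads = {}, {}
    h1 = layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    try:
        logits = model(tensor)
        model.zero_grad()
        logits[0, target_class].backward()
    finally:
        h1.remove()
        h2.remove()
    w = grads["a"].mean(dim=(2, 3), keepdim=True)      # per-channel weights
    cam = torch.relu((w * feats["a"]).sum(dim=1))[0]   # weighted sum, ReLU
    return cam / (cam.max() + 1e-8)                    # normalize to [0, 1]
```

On PlasmoSENet, pointing `layer` at the head conv would yield a 14x14 heatmap to upsample over the input cell image.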

Limitations

  • Trained on single dataset (NIH, one geographic site)
  • Binary classification only (no species/stage identification)
  • Requires pre-segmented cell images
  • Did not surpass transfer learning baseline on this dataset size

Citation

If you use this model, please cite:

@misc{plasmosenet2026,
  title={PlasmoSENet: A Multi-Scale Squeeze-and-Excitation Residual Network for Malaria Parasite Detection},
  author={Svetozar Technologies},
  year={2026},
  url={https://huggingface.co/Svetozar1993/LocalMedScan-malaria-plasmosenet}
}
