Malware Classification CNN on Malimg

Trained models for 25-class malware family classification on the Malimg dataset. Four checkpoints from a phase-based optimization study, going from a baseline CNN (89.06%) to an EfficientNetB0-based model (98.48% with TTA).

GitHub (code, reports, training curves): github.com/ffftuanxxx/malware-classification-CNN-optimized

Checkpoints

| File | Phase | Architecture | Val Accuracy | Macro F1 | Size |
|---|---|---|---|---|---|
| baseline/pesi.h5 | Baseline | 3-block Conv + Flatten + Dense(256) (59M params) | 89.06% | 86.45% | 227 MB |
| phase1/best_model.h5 | Phase 1 | Conv + BN + GAP + Dense (<1M params) | 67.28% | 39.28% | 5.2 MB |
| phase2/best_model.h5 | Phase 2 | EfficientNetB0 + Dense head, two-stage fine-tune | 94.91% | 83.95% | 30 MB |
| phase3/best_model.h5 | Phase 3 | Phase 2 + Focal Loss + oversampling + Cosine + TTA | 98.48% | 95.78% | 30 MB |

All metrics are on the 923-sample Malimg val split.

How to Load

```python
from huggingface_hub import hf_hub_download
import tensorflow as tf

# Download any checkpoint
path = hf_hub_download(repo_id="XRailgunX/malware-cnn-malimg",
                       filename="phase3/best_model.h5")

# Phase 1/2/3 saved full models; load directly
model = tf.keras.models.load_model(path, compile=False)

# Baseline is weights-only; rebuild the architecture first
# (see run_malimg_classifier.py in the GitHub repo), then:
# model.load_weights(path)
```

For Phase 3, which uses a custom Focal Loss, pass compile=False when loading (custom loss functions are not serialized with the model), then recompile if you plan to continue training.
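A recompile for continued training might look like the sketch below. This focal-loss definition is a generic reimplementation using the reported hyperparameters (γ=2.0, α=0.25), not necessarily the exact function in the GitHub repo:

```python
import tensorflow as tf

def focal_loss(gamma=2.0, alpha=0.25):
    """Generic focal loss for one-hot targets (hypothetical sketch;
    the exact training-time definition lives in the GitHub repo)."""
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
        cross_entropy = -y_true * tf.math.log(y_pred)
        # Down-weight easy examples: small weight when y_pred is confident
        weight = alpha * tf.pow(1.0 - y_pred, gamma)
        return tf.reduce_sum(weight * cross_entropy, axis=-1)
    return loss

# After load_model(path, compile=False):
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
#               loss=focal_loss(), metrics=["accuracy"])
```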

Input Format

  • Input shape: (256, 256, 3) RGB (grayscale Malimg images replicated to 3 channels for ImageNet-pretrained backbones).
  • Preprocessing:
    • Baseline / Phase 1: rescale to [0, 1] (x / 255.0).
    • Phase 2 / Phase 3 (EfficientNet-based): keep raw [0, 255] float; EfficientNet has a built-in Normalization layer. Applying rescale=1./255 externally will cause predictions to collapse.
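The two preprocessing paths can be sketched as follows (a minimal NumPy-only helper; resizing is assumed done upstream, and the phase names are illustrative):

```python
import numpy as np

def preprocess(img_u8, phase):
    """img_u8: (256, 256) uint8 grayscale Malimg image.
    Returns a (1, 256, 256, 3) float32 batch-ready array."""
    # Replicate grayscale to 3 channels for ImageNet-pretrained backbones
    x = np.repeat(img_u8[..., None], 3, axis=-1).astype("float32")
    if phase in ("baseline", "phase1"):
        x /= 255.0  # rescale to [0, 1]
    # phase2 / phase3: leave raw [0, 255]; EfficientNet normalizes internally
    return x[None, ...]  # add batch dimension
```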

Classes

25 Malimg malware families (class indices in alphabetical order): Adialer.C, Agent.FYI, Allaple.A, Allaple.L, Alueron.gen!J, Autorun.K, C2LOP.P, C2LOP.gen!g, Dialplatform.B, Dontovo.A, Fakerean, Instantaccess, Lolyda.AA1, Lolyda.AA2, Lolyda.AA3, Lolyda.AT, Malex.gen!J, Obfuscator.AD, Rbot!gen, Skintrim.N, Swizzor.gen!E, Swizzor.gen!I, VB.AT, Wintrim.BX, Yuner.A.
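To map a predicted index back to a family name, the alphabetical ordering above can be captured as a list (a sketch; verify it matches the label order your own data pipeline produced):

```python
# Class index -> family name, in the alphabetical order listed above
MALIMG_CLASSES = [
    "Adialer.C", "Agent.FYI", "Allaple.A", "Allaple.L", "Alueron.gen!J",
    "Autorun.K", "C2LOP.P", "C2LOP.gen!g", "Dialplatform.B", "Dontovo.A",
    "Fakerean", "Instantaccess", "Lolyda.AA1", "Lolyda.AA2", "Lolyda.AA3",
    "Lolyda.AT", "Malex.gen!J", "Obfuscator.AD", "Rbot!gen", "Skintrim.N",
    "Swizzor.gen!E", "Swizzor.gen!I", "VB.AT", "Wintrim.BX", "Yuner.A",
]

# Usage: idx = int(probs.argmax()); name = MALIMG_CLASSES[idx]
```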

Training Details (Phase 3, best model)

  • Backbone: EfficientNetB0 (ImageNet pretrained)
  • Head: GlobalAveragePooling2D → Dense(256, relu) + BN + Dropout(0.5) → Dense(25, softmax)
  • Stage 1 (10 epochs, frozen base, Adam 1e-3): 68% → 91% val acc
  • Stage 2 (30 epochs, top 20 layers unfrozen, Cosine decay from 1e-4 to 1.3e-6): best val_loss 0.01090 at epoch 24
  • Loss: Focal Loss (γ=2.0, α=0.25)
  • Augmentation: width/height_shift=0.1, horizontal_flip=True, zoom=0.1, brightness=[0.9,1.1] (NO vertical flip, rotation, or color jitter)
  • Minority oversampling: bootstrap each class with <200 samples to 200
  • TTA at inference: 5 random augmentations averaged → +1.41pp over single-view prediction
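The TTA step above amounts to averaging softmax outputs over several augmented views. A minimal NumPy sketch (predict_tta, predict_fn, and augment_fn are hypothetical names; the repo's exact TTA code may differ):

```python
import numpy as np

def predict_tta(predict_fn, x, augment_fn, n_views=5, rng=None):
    """Average class probabilities over n_views randomly augmented copies.
    predict_fn: maps a (1, 256, 256, 3) batch to (1, n_classes) probabilities.
    augment_fn: applies one random augmentation (shift/flip/zoom/brightness)."""
    rng = rng or np.random.default_rng()
    views = [augment_fn(x, rng) for _ in range(n_views)]
    probs = np.stack([predict_fn(v)[0] for v in views])  # (n_views, n_classes)
    return probs.mean(axis=0)
```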

Training curves, per-class F1, and full technical report are in the GitHub repo.

Per-class F1 (Phase 3)

Perfect (F1 = 1.00) on 19 of 25 classes. Remaining:

| Class | Support | F1 |
|---|---|---|
| Autorun.K | 10 | 0.86 |
| C2LOP.P | 14 | 0.86 |
| C2LOP.gen!g | 20 | 0.95 |
| Swizzor.gen!E | 12 | 0.77 |
| Swizzor.gen!I | 13 | 0.53 |
| Yuner.A | 80 | 0.98 |

Swizzor.gen!I is the hardest remaining case: a near-visual duplicate of Swizzor.gen!E. The single-model upper bound on Malimg seems to sit around this level; reaching 99%+ typically requires multi-model ensembles or byte-level auxiliary features.

Limitations

  • Trained and evaluated only on Malimg (25 PE malware families, grayscale byte images). Does not transfer to mobile malware, scripts, or obfuscated packers outside this distribution.
  • Not calibrated for out-of-distribution detection. Softmax confidence on non-Malimg inputs is unreliable.
  • Evaluation is on the val split that was also used for model selection, so the reported numbers are slightly optimistic vs a held-out test run.

Citation

If you build on this work, please also cite the upstream baseline repo cridin1/malware-classification-CNN and the Malimg dataset (Nataraj et al., 2011).
