Cyborg-B144: Vision-Based Malware Threat Classification

Cyborg-B144 is a vision-based malware classification model built on EfficientNet-B5 and trained on grayscale malware binary visualizations. The model predicts high-level malware threat categories from visual representations of binaries.

🔗 Training & Evaluation Code
GitHub repository: https://github.com/red-archh/cyborg-b144-hf-Model

🔎 Model Overview

Model Name: Cyborg-B144
Architecture: EfficientNet-B5
Framework: PyTorch
Image Size: 456×456
Classes: 5 malware threat categories
Training Dataset: Malimg (25 families regrouped into 5 super-families)

Threat Categories

Label	Category
0	Adware
1	Ransomware
2	Spyware
3	Trojan
4	Worm

The original 25 malware families from Malimg were mapped into five broader operational threat categories to better reflect real-world security taxonomy.
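
The label table above can be turned into a small decoding helper. The sketch below is framework-free and assumes the classifier emits five raw scores (logits) in label order and that confidence is the softmax probability of the top class. Neither detail is confirmed by this card, and `decode_prediction` is a hypothetical name.

```python
import math

# Label indices as listed in the Threat Categories table.
ID2LABEL = {0: "Adware", 1: "Ransomware", 2: "Spyware", 3: "Trojan", 4: "Worm"}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits):
    """Return the most likely category name and its probability."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "confidence": probs[best]}


# Hypothetical logits for one image (illustrative values only).
example = decode_prediction([0.1, -1.2, 0.3, 0.0, 4.5])
print(example["label"])  # prints "Worm"
```

Keeping the index-to-name mapping in one dictionary avoids label drift between training and inference code.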

📊 Performance

Test Accuracy: 100.00%

Classification Report:

Precision: 1.0000 (all classes)
Recall: 1.0000 (all classes)
F1-score: 1.0000 (all classes)

A normalized confusion matrix shows perfect separation across the five threat categories.
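
For readers who want to see how these numbers relate, the sketch below derives per-class precision, recall, and F1 from a confusion matrix and row-normalizes it. It assumes rows are true classes and columns are predicted classes. The counts are toy values for three classes, not this model's results.

```python
def normalize_rows(cm):
    """Row-normalize a confusion matrix so each true-class row sums to 1."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def per_class_metrics(cm):
    """Compute (precision, recall, f1) per class; rows = true, cols = predicted."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Toy 3-class counts, NOT the model's actual results.
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
toy_metrics = per_class_metrics(toy_cm)
toy_normalized = normalize_rows(toy_cm)
```

A purely diagonal matrix yields 1.0 for every metric, which is the situation the report above describes.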

⚠️ Important Context

The Malimg dataset is known to be structurally separable, especially when malware families are grouped into higher-level threat categories. The perfect test accuracy therefore reflects dataset characteristics and benchmark saturation, not robustness to real-world or zero-day malware.

This model is intended for research and educational purposes.

🧠 Training Details

Full fine-tuning (no frozen backbone)
AdamW optimizer
Class-weighted cross-entropy
Mixed precision training (AMP)
Separate validation and test splits
Verified zero file overlap between splits
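
To illustrate the class-weighted cross-entropy item, here is a minimal framework-free sketch. It assumes inverse-frequency weights, which is one common scheme. The card does not state how the weights were actually computed, and the function names and label counts are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cross_entropy(logits_batch, targets, weights):
    """Weighted mean of per-sample cross-entropy losses.

    Normalizes by the sum of the target weights, the same convention
    PyTorch's CrossEntropyLoss uses with reduction="mean".
    """
    total_loss = 0.0
    total_weight = 0.0
    for logits, target in zip(logits_batch, targets):
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        nll = log_sum_exp - logits[target]  # -log softmax(logits)[target]
        total_loss += weights[target] * nll
        total_weight += weights[target]
    return total_loss / total_weight


# Hypothetical imbalanced label list (illustrative only).
train_labels = [0] * 10 + [1] * 30 + [2] * 60
class_weights = inverse_frequency_weights(train_labels, num_classes=3)
loss = weighted_cross_entropy(
    [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]], [0, 2], class_weights
)
```

In a PyTorch training loop the same effect is usually obtained by passing the weights as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.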

🖼️ Input Format

RGB image (grayscale visualizations are converted to three channels)
Resized to 456×456
Normalized using ImageNet mean and std
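
The normalization step can be sketched per pixel as follows. The constants are the standard ImageNet channel statistics for inputs scaled to [0, 1]. The helper names are hypothetical, resizing is omitted, and the repository's actual preprocessing code may differ.

```python
# Standard ImageNet channel statistics (RGB order), for inputs scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def gray_to_rgb(value):
    """Replicate one grayscale byte (0-255) into an (R, G, B) pixel."""
    return (value, value, value)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# Extremes of the grayscale range after normalization.
black = normalize_pixel(gray_to_rgb(0))
white = normalize_pixel(gray_to_rgb(255))
```

In a PyTorch pipeline these steps are typically expressed with `torchvision.transforms.Resize`, `ToTensor`, and `Normalize` rather than written by hand.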

🚀 Inference Example (PyTorch)

from inference import load_model, predict_image
from PIL import Image

# Load the fine-tuned checkpoint
model = load_model("cyborgb144_best.pth")

# Convert to three-channel RGB, the input format the model expects
image = Image.open("sample.png").convert("RGB")

result = predict_image(image, model)
print(result)

🌐 Hugging Face Inference API

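import requests

API_URL = "https://api-inference.huggingface.co/models/KarthikRaj666/cyborg-b144"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

# Send the raw image bytes as the request body
with open("sample.png", "rb") as f:
    response = requests.post(API_URL, headers=headers, data=f, timeout=60)

response.raise_for_status()  # fail loudly on HTTP errors such as an invalid token
print(response.json())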

Output format

{
    "label": "worm",
    "confidence": 0.9987
}
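
A client may want to validate a response before acting on it. The sketch below assumes the body has exactly the shape shown above (a single object with `label` and `confidence`). The real API response may be structured differently, and `parse_prediction` is a hypothetical helper.

```python
import json

# Lower-case names of the five threat categories listed in this card.
KNOWN_LABELS = {"adware", "ransomware", "spyware", "trojan", "worm"}


def parse_prediction(payload):
    """Validate a JSON prediction string and return (label, confidence)."""
    data = json.loads(payload)
    label = str(data["label"]).lower()
    confidence = float(data["confidence"])
    if label not in KNOWN_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return label, confidence


label, confidence = parse_prediction('{"label": "worm", "confidence": 0.9987}')
```

Rejecting unknown labels early makes a changed or malformed response fail visibly instead of propagating downstream.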

⚖️ Limitations

Trained on a single benchmark dataset (Malimg).
Does not process executable files directly; inputs must already be rendered as images.
Does not analyze runtime behavior.
Not evaluated on obfuscated or zero-day malware.
Not intended for production antivirus deployment.
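
Because the model only sees images, a file's bytes must first be rendered as a grayscale picture. The sketch below shows the general idea: one byte per pixel, laid out in fixed-width rows. The width of 256 and the zero padding are assumptions for illustration, not the rendering used to build the training images.

```python
def bytes_to_grayscale_rows(data, width=256):
    """Arrange raw bytes into rows of `width` pixels (one byte = one 0-255 pixel).

    The last row is zero-padded so every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))  # pad the final partial row
        rows.append(row)
    return rows


# Harmless stand-in bytes; a real pipeline would read a file's contents instead.
sample = bytes(range(10))
image_rows = bytes_to_grayscale_rows(sample, width=4)
```

The resulting rows can be saved as an image and then passed through the resize and normalization steps described under Input Format.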

🔭 Roadmap

Cyborg-B144 is the first release in the Cyborg model series.

Planned future release:

Cyborg-B256

EfficientNet-B7 backbone
Multi-dataset training
Improved cross-dataset robustness
Enhanced threat generalization

🛡️ Ethical Considerations

This model operates on static malware visualizations and does not interact with executable binaries. It is designed strictly for academic and research purposes.

Users are responsible for ensuring safe and ethical usage.

📜 License

Apache-2.0

👤 Author

Karthik Raj Panuganti

Developed as part of an independent research initiative exploring vision-based malware classification.
