Cyborg-B144: Vision-Based Malware Threat Classification

Cyborg-B144 is a vision-based malware classification model built on EfficientNet-B5 and trained on grayscale malware binary visualizations. The model predicts high-level malware threat categories from visual representations of binaries.
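
Malimg-style visualizations are commonly produced by reading a binary as a stream of 8-bit values and laying those values out as rows of grayscale pixels. The sketch below illustrates that idea only. The function name `bytes_to_grayscale_rows` and the fixed width of 256 are assumptions for illustration and are not taken from the Cyborg-B144 repository.

```python
from typing import List


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay out raw bytes as rows of 8-bit grayscale pixel values (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        # Zero-pad the final row so that every row has the same width.
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


# Tiny demonstration on synthetic bytes (not a real binary).
print(bytes_to_grayscale_rows(bytes(range(10)), width=4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 0]]
```

The resulting rows could then be saved as an image with any imaging library before being passed to the model.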


🔗 Training & Evaluation Code
GitHub repository: https://github.com/red-archh/cyborg-b144-hf-Model


🔎 Model Overview

  • Model Name: Cyborg-B144
  • Architecture: EfficientNet-B5
  • Framework: PyTorch
  • Image Size: 456×456
  • Classes: 5 malware threat categories
  • Training Dataset: Malimg (25 families regrouped into 5 super-families)

Threat Categories

Label   Category
0       Adware
1       Ransomware
2       Spyware
3       Trojan
4       Worm

The original 25 malware families from Malimg were mapped into five broader operational threat categories to better reflect real-world security taxonomy.
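
A regrouping of this kind can be expressed as a lookup from family name to category, followed by a lookup from category to integer label. The sketch below is illustrative only: the family-to-category assignments shown are hypothetical examples and are not the mapping used to train Cyborg-B144, and only a handful of the 25 families are listed.

```python
from collections import Counter
from typing import Dict, Iterable

# Label order taken from the Threat Categories table.
CATEGORIES = ["Adware", "Ransomware", "Spyware", "Trojan", "Worm"]
CATEGORY_TO_LABEL: Dict[str, int] = {name: idx for idx, name in enumerate(CATEGORIES)}

# HYPOTHETICAL assignments for illustration; the actual mapping lives in the
# training code and may differ.
FAMILY_TO_CATEGORY: Dict[str, str] = {
    "Allaple.A": "Worm",
    "Yuner.A": "Worm",
    "C2LOP.P": "Trojan",
    "Skintrim.N": "Trojan",
    "Lolyda.AA1": "Spyware",
}


def family_to_label(family: str) -> int:
    """Map a family name to its integer threat-category label."""
    if family not in FAMILY_TO_CATEGORY:
        raise KeyError(f"no category assigned for family {family!r}")
    return CATEGORY_TO_LABEL[FAMILY_TO_CATEGORY[family]]


def label_counts(families: Iterable[str]) -> Counter:
    """Count samples per integer label, useful for checking class balance."""
    return Counter(family_to_label(f) for f in families)


print(label_counts(["Allaple.A", "Yuner.A", "C2LOP.P"]))
```

Raising on an unknown family, rather than defaulting to a category, keeps unmapped samples from silently entering a class.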


📊 Performance

Test Accuracy: 100.00%

Classification Report:

  • Precision: 1.0000 (all classes)
  • Recall: 1.0000 (all classes)
  • F1-score: 1.0000 (all classes)

A normalized confusion matrix shows perfect separation across the five threat categories.
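
For readers who want to reproduce these figures from raw predictions, the sketch below shows how per-class precision, recall, and F1 follow from a confusion matrix, and how the matrix is row-normalized. It is a generic illustration in plain Python, not the project's evaluation code, and the counts in the example are synthetic.

```python
from typing import List, Tuple


def per_class_metrics(cm: List[List[int]]) -> List[Tuple[float, float, float]]:
    """Return (precision, recall, F1) per class.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp    # predicted k, true otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def normalize_rows(cm: List[List[int]]) -> List[List[float]]:
    """Divide each row by its total so that each row sums to 1 (per true class)."""
    normalized = []
    for row in cm:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Synthetic counts: a purely diagonal matrix gives 1.0 for every metric.
synthetic = [[2, 0], [0, 3]]
print(per_class_metrics(synthetic))
print(normalize_rows(synthetic))
```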

⚠️ Important Context

The Malimg dataset is known to be structurally separable, especially when malware families are grouped into higher-level threat categories. The perfect test accuracy reflects dataset characteristics and benchmark saturation rather than real-world zero-day robustness.

This model is intended for research and educational purposes.


🧠 Training Details

  • Full fine-tuning (no frozen backbone)
  • AdamW optimizer
  • Class-weighted cross-entropy
  • Mixed precision training (AMP)
  • Separate validation and test splits
  • Verified zero file overlap between splits
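
Two of the items above can be sketched with the standard library alone. The weighting scheme shown (inverse class frequency, `N / (K * n_c)`) and the use of SHA-256 content hashes for the overlap check are assumptions; the source does not state which formula or check was used. The helper names are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import Iterable, List, Set


def inverse_frequency_weights(labels: List[int], num_classes: int) -> List[float]:
    """Weight each class by N / (K * n_c), so rarer classes get larger weights."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, so renamed duplicates are still caught."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def overlapping_digests(split_a: Iterable[Path], split_b: Iterable[Path]) -> Set[str]:
    """Return content hashes that appear in both splits (empty set means no overlap)."""
    return {file_digest(p) for p in split_a} & {file_digest(p) for p in split_b}


# Synthetic labels: class 0 is three times as frequent as class 1.
print(inverse_frequency_weights([0, 0, 0, 1], num_classes=2))
```

In PyTorch, a weight list like this can be converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.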

🖼️ Input Format

  • RGB image
  • Resized to 456×456
  • Normalized using ImageNet mean and std
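
As a worked example of the normalization step: a grayscale pixel converted to RGB carries the same value in all three channels, but each channel is shifted and scaled by a different ImageNet statistic. The sketch below uses the standard ImageNet mean and std values; the function name is hypothetical, and resizing is left to an image library.

```python
from typing import Tuple

# Standard ImageNet channel statistics (R, G, B), for inputs scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_gray_pixel(value: int) -> Tuple[float, float, float]:
    """Normalize one 8-bit grayscale value as it would appear in R, G, and B."""
    if not 0 <= value <= 255:
        raise ValueError("pixel value must be in [0, 255]")
    scaled = value / 255.0  # bring the 8-bit value into [0, 1]
    r, g, b = ((scaled - m) / s for m, s in zip(IMAGENET_MEAN, IMAGENET_STD))
    return (r, g, b)


# A black pixel and a white pixel after normalization.
print(normalize_gray_pixel(0))
print(normalize_gray_pixel(255))
```

In a PyTorch pipeline the same arithmetic is typically applied to whole tensors with `torchvision.transforms.Normalize`.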

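🚀 Inference Example (PyTorch)

# load_model and predict_image come from the project's inference module
# (assumed to be inference.py in the repository linked above).
from inference import load_model, predict_image
from PIL import Image

# Load the fine-tuned checkpoint.
model = load_model("cyborgb144_best.pth")

# The model expects a three-channel image, so convert grayscale inputs to RGB.
image = Image.open("sample.png").convert("RGB")

result = predict_image(image, model)
print(result)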

🌐 Hugging Face Inference API

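import requests

# Replace YOUR_HF_TOKEN with a valid Hugging Face access token.
API_URL = "https://api-inference.huggingface.co/models/KarthikRaj666/cyborg-b144"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

# Send the raw image bytes as the request body.
with open("sample.png", "rb") as f:
    response = requests.post(API_URL, headers=headers, data=f)

# Raise on HTTP errors instead of printing an error payload.
response.raise_for_status()
print(response.json())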

Output format

{
    "label": "worm",
    "confidence": 0.9987
}
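
A caller will usually want to validate this payload before acting on it. The sketch below assumes the response has exactly the shape shown above, with a lowercase label drawn from the five threat categories. The function name and the `min_confidence` threshold of 0.5 are hypothetical choices, not part of the model's API.

```python
import json
from typing import Any, Dict, Tuple

# Assumed label spelling, based on the lowercase "worm" in the example output.
KNOWN_LABELS = {"adware", "ransomware", "spyware", "trojan", "worm"}


def parse_prediction(
    payload: Dict[str, Any], min_confidence: float = 0.5
) -> Tuple[str, float, bool]:
    """Return (label, confidence, is_confident) or raise ValueError on bad input."""
    if "label" not in payload or "confidence" not in payload:
        raise ValueError("payload must contain 'label' and 'confidence'")
    label = str(payload["label"]).lower()
    confidence = float(payload["confidence"])
    if label not in KNOWN_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return label, confidence, confidence >= min_confidence


example = json.loads('{"label": "worm", "confidence": 0.9987}')
print(parse_prediction(example))
# ('worm', 0.9987, True)
```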

⚖️ Limitations

  • Trained on a single benchmark dataset (Malimg).

  • Does not analyze executable files directly; it classifies image visualizations of binaries.

  • Does not analyze runtime behavior.

  • Not evaluated on obfuscated or zero-day malware.

  • Not intended for production antivirus deployment.


🔭 Roadmap

Cyborg-B144 is the first release in the Cyborg model series.

Planned future release:

Cyborg-B256

  • EfficientNet-B7 backbone

  • Multi-dataset training

  • Improved cross-dataset robustness

  • Enhanced threat generalization


🛡️ Ethical Considerations

This model operates on static malware visualizations and does not interact with executable binaries. It is designed strictly for academic and research purposes.

Users are responsible for ensuring safe and ethical usage.


📜 License

Apache-2.0


👤 Author

Karthik Raj Panuganti

Developed as part of an independent research initiative exploring vision-based malware classification.
