---
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
library_name: peft
tags:
  - cybersecurity
  - malware-analysis
  - peft
  - lora
  - qlora
  - mixtral
language:
  - en
pipeline_tag: text-generation
license: apache-2.0
---

# Fathom Plan A LoRA Adapter (Mixtral-8x7B-Instruct)

This repository contains the Plan A LoRA adapter for the Fathom final-year project (FYP):

> *Fathom: An LLM-Powered Automated Malware Analysis Framework*

The adapter is trained on a curated cybersecurity instruction-tuning corpus to improve analyst-style security outputs over the base `mistralai/Mixtral-8x7B-Instruct-v0.1` model.

## What This Is

- **Type:** PEFT LoRA adapter (not a full standalone model)
- **Base model required:** `mistralai/Mixtral-8x7B-Instruct-v0.1`
- **Training style:** QLoRA (base model loaded in 4-bit NF4, bf16 compute)
- **Scope:** Plan A MVP uplift for cybersecurity and malware-analysis assistance

## Key Training Setup

- Sequence length: 2048
- Batch size per device: 2
- Gradient accumulation: 8 (effective batch size 16)
- Learning rate: 2e-4 (cosine scheduler)
- Steps: 3000 (completed run)
- LoRA rank/alpha: r=32, alpha=64
- LoRA targets: `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention-only)
- Optimizer: `paged_adamw_8bit`
- Precision: bf16
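As a rough check on the adapter's size, the attention-only targets above imply the trainable-parameter count sketched below. The Mixtral dimensions used here (hidden size 4096, 32 layers, grouped-query attention with 1024-dim `k_proj`/`v_proj` outputs) are assumptions taken from the public base-model config, not stated in this card:

```python
# Back-of-the-envelope count of trainable LoRA parameters for the setup above.
# Assumed Mixtral-8x7B dimensions (not stated in this card): hidden_size=4096,
# num_hidden_layers=32, 32 query heads with 8 KV heads (head_dim=128), so
# k_proj/v_proj map 4096 -> 1024 while q_proj/o_proj map 4096 -> 4096.

r = 32  # LoRA rank from the training setup

def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # A LoRA adapter adds two low-rank matrices: A (rank x in) and B (out x rank).
    return rank * (in_features + out_features)

hidden, kv_out, layers = 4096, 1024, 32
per_layer = (
    lora_params(hidden, hidden, r)    # q_proj
    + lora_params(hidden, kv_out, r)  # k_proj
    + lora_params(hidden, kv_out, r)  # v_proj
    + lora_params(hidden, hidden, r)  # o_proj
)
total = per_layer * layers
print(f"{total:,} trainable parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the adapter trains roughly 27M parameters, a tiny fraction of the ~47B-parameter base model.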

## Hardware Used

Training was run on RunPod:

- GPU: 1x NVIDIA A100 PCIe 80 GB
- vCPU: 8
- RAM: 125 GB
- Disk: 200 GB
- Location: CA

## Data Summary

Curated cybersecurity instruction corpus drawn from mixed sources (CyberMetric, Trendyol CyberSec, ShareGPT Cybersecurity, downsampled NIST, MITRE ATT&CK, and CVE/IR/malware-focused sets).

Final working files:

- `train.jsonl`: 120,912 samples
- `eval.jsonl`: 1,915 samples
- `cybermetric_80.jsonl`: 80 held-out MCQs
- `malware_eval_25.jsonl`: 25 expert malware prompts
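The sample counts above can be verified with a few lines of Python. The record schema shown here (Alpaca-style `instruction`/`input`/`output` fields) is an assumption for illustration; the actual files may use a different layout:

```python
import io
import json

def count_jsonl(fp) -> int:
    """Count non-empty JSONL records, checking each line parses as one object."""
    n = 0
    for line in fp:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)       # each line must be a standalone JSON object
        assert "instruction" in record  # assumed schema, not confirmed by this card
        n += 1
    return n

# Demo with an in-memory file; a real check would open train.jsonl instead.
demo = io.StringIO(
    '{"instruction": "Classify the alert.", "input": "...", "output": "benign"}\n'
    '{"instruction": "Map to ATT&CK.", "input": "...", "output": "T1053"}\n'
)
print(count_jsonl(demo))
```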

## Evaluation Results

### Standard post-eval settings

Generation settings used for a fair base-vs-adapter comparison:

- `do_sample=False`
- `temperature=0.0`
- `max_new_eval=64`
- `max_new_cyber=48`
- `max_new_malware=256`
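These settings amount to greedy decoding with a per-track `max_new_tokens` budget. A minimal sketch of how they map onto `generate()` kwargs; the track names and the `generation_config` helper are illustrative, not taken from the evaluation code:

```python
# Greedy decoding shared by all tracks (assumed mapping to generate() kwargs).
GEN_KWARGS = {"do_sample": False, "temperature": 0.0}

# Per-track new-token budgets from the settings above.
MAX_NEW_TOKENS = {"eval": 64, "cybermetric": 48, "malware": 256}

def generation_config(track: str) -> dict:
    """Combine the shared greedy settings with the track's token budget."""
    return {**GEN_KWARGS, "max_new_tokens": MAX_NEW_TOKENS[track]}

print(generation_config("malware"))
```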

### Baseline (corrected) vs fine-tuned

| Metric | Baseline | Fine-tuned | Delta |
|---|---|---|---|
| Eval mean overlap | 0.3283 | 0.3631 | +0.0349 |
| Eval exact match rate | 0.0000 | 0.2193 | +0.2193 |
| CyberMetric-80 accuracy | 0.825 | 0.900 | +0.075 |
| Malware structure | 0.44 | 0.84 | +0.40 |
| Malware ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.24 | 0.20 | -0.04 |
| Malware evidence awareness | 0.48 | 0.52 | +0.04 |
| Malware analyst usefulness | 0.52 | 0.56 | +0.04 |

### Malware-only rerun with longer output budget

To test truncation effects on the malware prompts, both the base and the fine-tuned model were rerun with `max_new_malware=512` (the 25 malware prompts only).

| Rubric axis | Base (512) | Fine-tuned (512) | Delta |
|---|---|---|---|
| Structure | 0.56 | 0.88 | +0.32 |
| ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.36 | 0.28 | -0.08 |
| Evidence awareness | 0.56 | 0.64 | +0.08 |
| Analyst usefulness | 0.64 | 0.80 | +0.16 |

Interpretation: structure, evidence awareness, and analyst usefulness improved strongly, but malware reasoning remains the main gap to address in future iterations.

## Limitations

- This is a Plan A MVP adapter, not a fully specialized malware reverse-engineering model.
- Malware causal reasoning still needs improvement via targeted data and/or evidence-grounded training (Plan B).
- Outputs should be treated as analyst assistance, not as an autonomous verdict.

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter_repo = "umer07/fathom-mixtral-lora-plan-a"

# Load the base model in 4-bit NF4, matching the QLoRA training configuration.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Mixtral ships without a pad token

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map={"": 0},  # place the whole model on GPU 0
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)

# Attach the Plan A LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter_repo)
model.eval()

prompt = """### Instruction:
Analyze the malware behavior and map likely ATT&CK techniques.

### Input:
Sample creates scheduled task persistence and launches encoded PowerShell.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    # Greedy decoding, matching the evaluation settings above.
    out = model.generate(**inputs, max_new_tokens=512, do_sample=False, temperature=0.0)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
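The usage snippet hard-codes one Alpaca-style prompt; a small helper can build the same template for arbitrary inputs. The template is inferred from that snippet alone; the actual training prompts may include system text not shown here:

```python
# Build the Alpaca-style prompt used in the usage snippet above.
# Template inferred from this card's example, not from the training code.
def build_prompt(instruction: str, context: str = "") -> str:
    parts = [f"### Instruction:\n{instruction}"]
    if context:
        # The "### Input:" section is optional; omit it for context-free tasks.
        parts.append(f"### Input:\n{context}")
    parts.append("### Response:\n")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Analyze the malware behavior and map likely ATT&CK techniques.",
    "Sample creates scheduled task persistence and launches encoded PowerShell.",
)
print(prompt)
```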

## Project Status

- Core Plan A training/evaluation cycle: completed
- The GPU instance used for training has been deleted
- No additional training is currently in progress

## Citation

If you use this adapter, please cite the Fathom Plan A project report/thesis and reference the base model (`mistralai/Mixtral-8x7B-Instruct-v0.1`).