# Fathom Plan A LoRA Adapter (Mixtral-8x7B-Instruct)

This repository contains the Plan A LoRA adapter for the Fathom FYP project:
"Fathom: An LLM-Powered Automated Malware Analysis Framework".

The adapter is trained on a curated cybersecurity instruction-tuning corpus to improve analyst-style security outputs over the base mistralai/Mixtral-8x7B-Instruct-v0.1 model.
## What This Is

- Type: PEFT LoRA adapter (not a full standalone model)
- Base model required: mistralai/Mixtral-8x7B-Instruct-v0.1
- Training style: QLoRA (4-bit NF4 base loading, bf16 compute)
- Scope: Plan A MVP uplift for cybersecurity and malware-analysis assistance
## Key Training Setup
- Sequence length: 2048
- Batch: 2
- Gradient accumulation: 8 (effective batch size 16)
- Learning rate: 2e-4 (cosine scheduler)
- Steps: 3000 (completed run)
- LoRA rank/alpha: r=32, alpha=64
- LoRA targets: q_proj, k_proj, v_proj, o_proj (attention-only)
- Optimizer: paged_adamw_8bit
- Precision: bf16
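The hyperparameters above can be sketched as a `peft` configuration. This is an illustrative reconstruction, not the actual training script; values not stated in this card (e.g. `lora_dropout`) are assumptions.

```python
# Sketch of the Plan A QLoRA adapter config described above.
# Hyperparameters follow the list in "Key Training Setup"; lora_dropout
# and bias are assumed defaults, not documented in this card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                     # LoRA rank
    lora_alpha=64,            # scaling alpha (alpha / r = 2.0)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention-only
    lora_dropout=0.05,        # assumed value
    bias="none",
    task_type="CAUSAL_LM",
)
```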
## Hardware Used

Training was run on RunPod:
- GPU: NVIDIA A100 PCIe 80GB (1x)
- vCPU: 8
- RAM: 125 GB
- Disk: 200 GB
- Location: CA
## Data Summary
Curated cybersecurity instruction corpus with mixed sources (CyberMetric, Trendyol CyberSec, ShareGPT Cybersecurity, NIST downsampled, MITRE ATT&CK, CVE/IR/malware-focused sets).
Final working files used:

- train.jsonl: 120,912 samples
- eval.jsonl: 1,915 samples
- cybermetric_80.jsonl: 80 held-out MCQs
- malware_eval_25.jsonl: 25 expert malware prompts
## Evaluation Results

### Standard post-eval settings

Generation settings used for a fair base-vs-adapter comparison:

- do_sample=False
- temperature=0.0
- max_new_eval=64
- max_new_cyber=48
- max_new_malware=256
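These settings amount to greedy decoding with a per-task output budget. A minimal sketch of how they could be organized (the names `GEN_KWARGS`, `MAX_NEW_TOKENS`, and `generation_kwargs` are illustrative, not from the Fathom codebase):

```python
# Shared greedy-decoding settings from the post-eval configuration above.
GEN_KWARGS = {"do_sample": False}

# Per-task max_new_tokens budgets listed in this card.
MAX_NEW_TOKENS = {
    "eval": 64,         # eval.jsonl prompts
    "cybermetric": 48,  # CyberMetric-80 MCQs
    "malware": 256,     # expert malware prompts (512 in the rerun below)
}

def generation_kwargs(task):
    """Merge the shared decoding settings with the task's token budget."""
    return {**GEN_KWARGS, "max_new_tokens": MAX_NEW_TOKENS[task]}
```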
### Baseline (corrected) vs Fine-tuned
| Metric | Baseline | Fine-tuned | Delta |
|---|---|---|---|
| Eval mean overlap | 0.3283 | 0.3631 | +0.0349 |
| Eval exact match rate | 0.0000 | 0.2193 | +0.2193 |
| CyberMetric-80 accuracy | 0.825 | 0.900 | +0.075 |
| Malware structure | 0.44 | 0.84 | +0.40 |
| Malware ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.24 | 0.20 | -0.04 |
| Malware evidence awareness | 0.48 | 0.52 | +0.04 |
| Malware analyst usefulness | 0.52 | 0.56 | +0.04 |
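The card does not define "mean overlap" or "exact match rate"; one plausible reading is token-level reference overlap and normalized string equality, sketched below under that assumption:

```python
def token_overlap(prediction, reference):
    """Fraction of reference tokens that also appear in the prediction.

    One plausible reading of the 'mean overlap' metric above; the exact
    definition used in the evaluation is not documented in this card.
    """
    pred_tokens = set(prediction.lower().split())
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    return sum(tok in pred_tokens for tok in ref_tokens) / len(ref_tokens)

def exact_match(prediction, reference):
    """1.0 if the whitespace-stripped strings are identical, else 0.0."""
    return float(prediction.strip() == reference.strip())
```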
### Malware-only rerun with longer output budget

To test truncation effects on the malware prompts, both the base and fine-tuned models were rerun with max_new_malware=512 (25 prompts only).
| Rubric axis | Base (512) | Fine-tuned (512) | Delta |
|---|---|---|---|
| Structure | 0.56 | 0.88 | +0.32 |
| ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.36 | 0.28 | -0.08 |
| Evidence awareness | 0.56 | 0.64 | +0.08 |
| Analyst usefulness | 0.64 | 0.80 | +0.16 |
Interpretation: structure, evidence awareness, and analyst usefulness improved strongly, but malware reasoning remains the main gap for future iterations.
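The rubric values above are consistent with per-prompt judgments averaged over the 25 malware prompts (e.g. 21/25 = 0.84 for structure). The averaging step, sketched here, is an assumption about the judging procedure, which is not documented in this card:

```python
def rubric_axis_score(per_prompt_scores):
    """Average per-prompt rubric judgments (e.g. 0/1 over 25 prompts)
    into a single axis score, as one plausible reading of the tables above."""
    return sum(per_prompt_scores) / len(per_prompt_scores)
```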
## Limitations
- This is a Plan A MVP adapter, not a fully specialized malware reverse-engineering model.
- Malware causal reasoning still needs improvement via targeted data and/or evidence-grounded training (Plan B).
- Outputs should be treated as analyst assistance, not an autonomous verdict.
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter_repo = "umer07/fathom-mixtral-lora-plan-a"

# Load the base model in 4-bit NF4, matching the QLoRA training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)

# Attach the Plan A LoRA adapter on top of the quantized base
model = PeftModel.from_pretrained(model, adapter_repo)
model.eval()

prompt = """### Instruction:
Analyze the malware behavior and map likely ATT&CK techniques.
### Input:
Sample creates scheduled task persistence and launches encoded PowerShell.
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    # Greedy decoding (do_sample=False), matching the evaluation settings;
    # temperature is omitted since it is ignored under greedy decoding
    out = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Project Status
- Core Plan A training/evaluation cycle: completed
- GPU instance used for training has been deleted
- No additional training is currently in progress
## Citation
If you use this adapter, please cite your project report/thesis for Fathom Plan A and reference the base model (mistralai/Mixtral-8x7B-Instruct-v0.1).