---
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
library_name: peft
tags:
- cybersecurity
- malware-analysis
- peft
- lora
- qlora
- mixtral
language:
- en
pipeline_tag: text-generation
license: apache-2.0
---
# Fathom Plan A LoRA Adapter (Mixtral-8x7B-Instruct)
This repository contains the **Plan A** LoRA adapter for the Fathom FYP project:
**"Fathom: An LLM-Powered Automated Malware Analysis Framework"**
The adapter is trained on a curated cybersecurity instruction-tuning corpus to improve analyst-style security outputs over the base `mistralai/Mixtral-8x7B-Instruct-v0.1` model.
## What This Is
- **Type:** PEFT LoRA adapter (not a full standalone model)
- **Base model required:** `mistralai/Mixtral-8x7B-Instruct-v0.1`
- **Training style:** QLoRA (4-bit NF4 base loading, bf16 compute)
- **Scope:** Plan A MVP uplift for cybersecurity and malware-analysis assistance
## Key Training Setup
- **Sequence length:** 2048
- **Per-device batch size:** 2
- **Gradient accumulation:** 8 (effective batch size 16)
- **Learning rate:** 2e-4 (cosine scheduler)
- **Steps:** 3000 (completed run)
- **LoRA rank/alpha:** r=32, alpha=64
- **LoRA targets:** `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention-only)
- **Optimizer:** paged_adamw_8bit
- **Precision:** bf16
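As a sanity check, the effective batch size and approximate token budget implied by the settings above can be computed directly. This is an upper bound that assumes every sequence is packed to the full 2048-token length:

```python
# Hyperparameters from the training setup above.
per_device_batch = 2
grad_accum = 8
steps = 3000
seq_len = 2048

effective_batch = per_device_batch * grad_accum  # 16 sequences per optimizer step
sequences_seen = steps * effective_batch         # 48,000 sequences over the run
max_tokens_seen = sequences_seen * seq_len       # ~98.3M tokens (upper bound)

print(effective_batch, sequences_seen, max_tokens_seen)
```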
## Hardware Used
Training was run on RunPod:
- **GPU:** NVIDIA A100 PCIe 80GB (1x)
- **vCPU:** 8
- **RAM:** 125 GB
- **Disk:** 200 GB
- **Location:** CA
## Data Summary
Curated cybersecurity instruction corpus with mixed sources (CyberMetric, Trendyol CyberSec, ShareGPT Cybersecurity, NIST downsampled, MITRE ATT&CK, CVE/IR/malware-focused sets).
Final working files used:
- `train.jsonl`: 120,912 samples
- `eval.jsonl`: 1,915 samples
- `cybermetric_80.jsonl`: 80 held-out MCQs
- `malware_eval_25.jsonl`: 25 expert malware prompts
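A minimal sketch for verifying the working files, assuming each line of a `.jsonl` file is one JSON object (the field names are not specified here, so the helper only counts records and tallies top-level key sets rather than assuming a schema):

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Count records in a JSONL file and tally which top-level keys appear."""
    n_records = 0
    key_sets = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            n_records += 1
            key_sets[tuple(sorted(record))] += 1
    return n_records, key_sets
```

For example, `summarize_jsonl("train.jsonl")` should report 120,912 records for the training split listed above.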
## Evaluation Results
### Standard post-eval settings
Generation settings used for a fair base-vs-adapter comparison (greedy decoding with per-task `max_new_tokens` budgets):
- `do_sample=False`
- `temperature=0.0`
- `max_new_eval=64` (held-out eval set)
- `max_new_cyber=48` (CyberMetric-80)
- `max_new_malware=256` (malware prompts)
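The settings above can be expressed as a small helper. The task labels (`eval`, `cyber`, `malware`) are illustrative names for this card's three evaluation sets, not identifiers from the actual eval script:

```python
# Per-task max_new_tokens budgets from the evaluation settings above.
GEN_BUDGETS = {"eval": 64, "cyber": 48, "malware": 256}


def generation_kwargs(task, max_new_override=None):
    """Build greedy-decoding kwargs for model.generate() for a given eval task."""
    return {
        "do_sample": False,  # greedy decoding for a fair base-vs-adapter comparison
        "max_new_tokens": max_new_override or GEN_BUDGETS[task],
    }
```

Under this framing, the malware-only rerun described below corresponds to `generation_kwargs("malware", max_new_override=512)`.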
#### Baseline (corrected) vs Fine-tuned
| Metric | Baseline | Fine-tuned | Delta |
|---|---:|---:|---:|
| Eval mean overlap | 0.3283 | 0.3631 | +0.0349 |
| Eval exact match rate | 0.0000 | 0.2193 | +0.2193 |
| CyberMetric-80 accuracy | 0.825 | 0.900 | +0.075 |
| Malware structure | 0.44 | 0.84 | +0.40 |
| Malware ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.24 | 0.20 | -0.04 |
| Malware evidence awareness | 0.48 | 0.52 | +0.04 |
| Malware analyst usefulness | 0.52 | 0.56 | +0.04 |
### Malware-only rerun with longer output budget
To test for truncation effects on the malware prompts, both the base and the fine-tuned model were rerun with `max_new_malware=512` (25 prompts only).
| Rubric axis | Base (512) | Fine-tuned (512) | Delta |
|---|---:|---:|---:|
| Structure | 0.56 | 0.88 | +0.32 |
| ATT&CK correctness | 0.16 | 0.20 | +0.04 |
| Malware reasoning | 0.36 | 0.28 | -0.08 |
| Evidence awareness | 0.56 | 0.64 | +0.08 |
| Analyst usefulness | 0.64 | 0.80 | +0.16 |
Interpretation: structure/evidence/usefulness improved strongly, but malware reasoning remains the main gap for future iterations.
## Limitations
- This is a **Plan A MVP adapter**, not a fully specialized malware reverse-engineering model.
- Malware causal reasoning still needs improvement via targeted data and/or evidence-grounded training (Plan B).
- Outputs should be treated as analyst assistance, not as autonomous verdicts.
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter_repo = "umer07/fathom-mixtral-lora-plan-a"

# 4-bit NF4 quantization, matching the QLoRA training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)

# Attach the Plan A LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter_repo)
model.eval()

prompt = """### Instruction:
Analyze the malware behavior and map likely ATT&CK techniques.
### Input:
Sample creates scheduled task persistence and launches encoded PowerShell.
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    # Greedy decoding, matching the evaluation settings.
    out = model.generate(**inputs, max_new_tokens=512, do_sample=False, temperature=0.0)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Project Status
- Core Plan A training/evaluation cycle: **completed**
- The GPU instance used for training has been deleted
- No additional training is currently in progress
## Citation
If you use this adapter, please cite your project report/thesis for Fathom Plan A and reference the base model (`mistralai/Mixtral-8x7B-Instruct-v0.1`).