update
README.md
---
language: en
license: cc-by-nc-4.0
tags:
- cybersecurity
- malware-analysis
- att&ck
- threat-intelligence
- mixtral
- lora
- peft
- expert-adapters
- cape-sandbox
- digital-forensics
library_name: peft
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
inference: false
metrics:
- accuracy
---

# **Fathom**: Specialized Cybersecurity Analysis Model

**Mixtral-8x7B-Instruct-v0.1 + 10× LoRA adapters (rank=32, bf16)**
**Primary adapter:** `unified-v2` (general cybersecurity + malware analysis)
**9 expert adapters** for domain-specific routing (static/dynamic analysis, network, forensics, threat intel, etc.)

**Hugging Face Hub:** [`umer07/fathom-mixtral`](https://huggingface.co/umer07/fathom-mixtral)
**Datasets:** [`umer07/fathom-expert-data`](https://huggingface.co/datasets/umer07/fathom-expert-data)

**Fathom** turns raw sandbox reports (CAPE, Joe Sandbox, etc.) into high-quality ATT&CK-mapped malware analysis. It outperforms general-purpose models on cybersecurity tasks while remaining fully open-source and runnable on a single AMD MI300X / A100 80GB.

---

## Model Overview

- **Base:** Mixtral-8x7B-Instruct-v0.1 (full bf16, no quantization)
- **Training:** Direct PEFT+TRL (LlamaFactory dropped due to ROCm issues)
- **Adapters:** 1 unified + 9 expert LoRA adapters (all rank=32, α=16)
- **Hardware:** AMD MI300X (205.8 GB VRAM), full bf16 training
- **Key Innovation:** Evidence extraction layer + structured behavioral prompts, giving a **9× improvement** in real ATT&CK mapping (Parent F1 from 0.095 to 0.841)

**Designed for:**
- Malware analysts & threat hunters
- SOC / DFIR teams
- CAPE / sandbox report enrichment
- Automated ATT&CK technique extraction

---

## Benchmark Results

All results use the **real Fathom pipeline** (`[INST]` chat template + 8192 context + structured evidence from CAPE extraction layer v3). Greedy decoding, bf16.

### 1. General Cybersecurity Knowledge (vs. Closed & Open Models)

| Benchmark              | Fathom unified-v2 | GPT-4 (ref) | GPT-3.5 (ref) | Base Mixtral-8x7B | Llama-2-70B (ref) |
|------------------------|-------------------|-------------|---------------|-------------------|-------------------|
| **CyberMetric-80**     | **91.25%**        | ~87%        | ~67%          | 82.5%             | ~57%              |
| MMLU Computer Security | **79.0%**         | ~82%        | ~65%          | -                 | ~54%              |
| MMLU Security Studies  | **64.0%**         | ~74%        | ~60%          | -                 | ~48%              |
| TruthfulQA MC1         | **65.0%**         |             |               |                   |                   |

**Visual bar comparison (CyberMetric-80):**

```
Fathom unified-v2  ████████████████████ 91.25%
GPT-4              ██████████████████ ~87%
Base Mixtral       █████████████████ 82.5%
GPT-3.5            ██████████████ ~67%
Llama-2-70B        ████████████ ~57%
```

### 2. Expert Adapter Comparison (CyberMetric-80)

| Adapter                 | Score      | Specialty                      |
|-------------------------|------------|--------------------------------|
| `unified-v2`            | **91.25%** | All-domain baseline            |
| `expert-e8-analyst`     | **91.25%** | Analyst Q&A & reporting        |
| `expert-e3-network`     | 90.00%     | Network traffic / C2 analysis  |
| `expert-e4-forensics`   | 90.00%     | Memory & disk forensics        |
| `expert-e6-detection`   | 88.75%     | Detection engineering          |
| `expert-e7-reports`     | 88.75%     | Structured report generation   |
| `expert-e2-dynamic`     | 85.00%     | Behavioral / sandbox analysis  |
| `expert-e1-static`      | 83.75%     | Static PE + evasion detection  |
| `expert-e9-cot`         | 87.50%     | Chain-of-thought reasoning     |
| `expert-e5-threatintel` | 81.25%     | Threat intel & actor profiling |
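
The adapter names in the table suggest one adapter per analysis domain. The README does not describe how routing is done, so the following is only a minimal sketch of one possible approach: a lookup from a task label to an adapter name, with `unified-v2` as the fallback. The task labels and the `pick_adapter` and `activate` helpers are hypothetical; only the adapter names come from the table. `set_adapter` is the PEFT call for switching between adapters that have already been loaded on the model.

```python
# Hypothetical routing table: the task labels are invented for illustration,
# the adapter names are the ones listed in the comparison table.
ADAPTER_FOR_TASK = {
    "static": "expert-e1-static",
    "dynamic": "expert-e2-dynamic",
    "network": "expert-e3-network",
    "forensics": "expert-e4-forensics",
    "threatintel": "expert-e5-threatintel",
    "detection": "expert-e6-detection",
    "report": "expert-e7-reports",
    "analyst": "expert-e8-analyst",
    "cot": "expert-e9-cot",
}
DEFAULT_ADAPTER = "unified-v2"


def pick_adapter(task: str) -> str:
    """Return the adapter name for a task label, falling back to unified-v2."""
    return ADAPTER_FOR_TASK.get(task.strip().lower(), DEFAULT_ADAPTER)


def activate(model, task: str) -> str:
    """Switch an already-loaded PEFT model to the adapter chosen for `task`.

    Assumes each adapter was registered beforehand under the same name,
    for example via PeftModel.load_adapter(..., adapter_name=name).
    """
    name = pick_adapter(task)
    model.set_adapter(name)  # activates a previously loaded adapter
    return name
```

Falling back to `unified-v2` for unknown labels keeps the sketch safe to call with free-form input, since that adapter is described as the all-domain baseline.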

### 3. Core Contribution: Real ATT&CK Mapping Accuracy

**Progression table** (same model weights, only input pipeline improved):

| Configuration                        | Exact F1  | Parent F1 | Parent F1 gain vs. naive |
|--------------------------------------|-----------|-----------|--------------------------|
| Raw API list (naive)                 | 0.083     | 0.095     | -                        |
| Structured prompt (manual)           | 0.370     | 0.429     | +0.334                   |
| Real Fathom evidence layer           | 0.534     | 0.508     | +0.413                   |
| **Real pipeline + full context fix** | **0.868** | **0.841** | **+0.746**               |

**With the model weights held fixed, these results indicate that the architecture (evidence extraction + structured prompts) matters more than additional fine-tuning.**
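
To make the difference between a raw API list and a structured behavioral prompt concrete, here is a minimal sketch of the idea. It is not the Fathom evidence extraction layer, which this README does not describe in detail: the `BEHAVIOR_GROUPS` mapping, the group names, and both helper functions are hypothetical, chosen only to illustrate grouping API calls by behavior and dropping calls that carry no behavioral signal.

```python
# Hypothetical mapping from Windows API names to behavior groups.
# Real extraction logic would be far richer; this is illustration only.
BEHAVIOR_GROUPS = {
    "RegOpenKeyExW": "registry_query",
    "RegQueryValueExW": "registry_query",
    "FindFirstFileW": "file_discovery",
    "FindNextFileW": "file_discovery",
    "WriteProcessMemory": "process_injection",
    "CreateRemoteThread": "process_injection",
    "InternetOpenA": "network_http",
    "HttpSendRequestA": "network_http",
    "DnsQuery_A": "network_dns",
}


def group_api_calls(api_calls):
    """Group known API calls by behavior, deduplicated, in first-seen order."""
    groups = {}
    for api in api_calls:
        group = BEHAVIOR_GROUPS.get(api)
        if group is None:
            continue  # unmapped calls (e.g. loader noise) are dropped
        seen = groups.setdefault(group, [])
        if api not in seen:
            seen.append(api)
    return groups


def build_structured_prompt(api_calls):
    """Render grouped evidence as a short, structured prompt body."""
    lines = ["Observed behaviors:"]
    for group, apis in group_api_calls(api_calls).items():
        lines.append(f"- {group}: {', '.join(apis)}")
    lines.append("List the MITRE ATT&CK technique IDs supported by this evidence.")
    return "\n".join(lines)
```

For example, `build_structured_prompt(["LdrLoadDll", "RegOpenKeyExW", "WriteProcessMemory"])` yields one line per behavior group and omits `LdrLoadDll`, which is not in the mapping.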

### 4. Real Malware Analysis: CAPE Pipeline (malscore 10/10 samples)

| Sample      | Family   | GT T-codes                            | Predicted T-codes                                                     | Exact F1  | Parent F1 | Family ID |
|-------------|----------|---------------------------------------|-----------------------------------------------------------------------|-----------|-----------|-----------|
| 12          | Emotet   | T1012, T1071, T1071.004, T1083        | T1012, T1055, T1071, T1071.004, T1083                                 | 0.889     | 0.857     | 100% conf |
| 15          | Formbook | T1012, T1055, T1071, T1071.004, T1083 | T1003, T1012, T1027.002, T1055, T1059, T1071, T1071.004, T1083, T1497 | 0.714     | 0.667     | 85% conf  |
| 16          | Dridex   | T1012, T1055, T1071, T1071.004, T1083 | T1012, T1055, T1071, T1071.004, T1083                                 | **1.000** | **1.000** | 68% conf  |
| **Average** |          |                                       |                                                                       | **0.868** | **0.841** | -         |
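
The README does not include the scoring script, so the following is a sketch under one assumption: Exact F1 is a set-based F1 over full technique IDs, and Parent F1 is the same metric after collapsing sub-techniques such as `T1071.004` to their parent `T1071`. The helper names are hypothetical. This definition reproduces the Emotet and Formbook rows of the table.

```python
def parent(technique: str) -> str:
    """Collapse a sub-technique ID (T1071.004) to its parent (T1071)."""
    return technique.split(".", 1)[0]


def f1(gt, pred) -> float:
    """Set-based F1 between ground-truth and predicted technique IDs."""
    gt, pred = set(gt), set(pred)
    tp = len(gt & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gt)
    return 2 * precision * recall / (precision + recall)


def exact_and_parent_f1(gt, pred):
    """Return (Exact F1, Parent F1) for one sample."""
    exact = f1(gt, pred)
    collapsed = f1({parent(t) for t in gt}, {parent(t) for t in pred})
    return exact, collapsed


# Emotet row (sample 12) from the table
gt = ["T1012", "T1071", "T1071.004", "T1083"]
pred = ["T1012", "T1055", "T1071", "T1071.004", "T1083"]
exact, collapsed = exact_and_parent_f1(gt, pred)
print(round(exact, 3), round(collapsed, 3))  # 0.889 0.857
```

Note that Parent F1 can be lower than Exact F1, as in this row: collapsing `T1071.004` into `T1071` removes one true positive from both sets, so the single false positive (`T1055`) weighs more heavily.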

### 5. Additional Benchmarks

- **ATT&CK Mapping MCQ (30 handcrafted questions):** 80%
- **MMLU Machine Learning:** 60%
- **MMLU Electrical Engineering:** 64%
- **Rigorous ground-truth F1 (23 test cases):** Exact = 0.184, Parent = 0.344 (synthetic); real CAPE Parent F1 = 0.841 after pipeline fixes

### 6. Key Discovery: Mal-API-2019 Analysis

We evaluated Fathom on the public **Mal-API-2019** dataset (Catak & Yazı, arXiv:1905.01999), which contains 7,107 API call sequences from Cuckoo Sandbox.

| Variant                    | Accuracy | Macro F1 |
|----------------------------|----------|----------|
| Raw API sequences          | 12.6%    | 0.030    |
| Filtered behavioral groups | 10.9%    | 0.052    |

### Insight

Raw API sequences alone are insufficient for reliable family classification. The dataset contains heavy loader noise, and families share nearly identical behavioral APIs. Ground-truth labels come from static AV signatures, not behavioral semantics.

> "In contrast, Fathom's full evidence extraction pipeline achieves 0.841 Parent F1 on real CAPEv2 reports. This demonstrates that structured behavioral evidence + multi-source context (not raw API text) is the critical enabler for production-grade malware analysis."
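
For context on how to read the accuracy and macro F1 figures together, here is a small sketch of both metrics on toy data. The labels are an assumption for illustration, not the real dataset: eight equally sized classes (Mal-API-2019 is usually described as covering eight malware families, though its real class sizes are not equal) and a classifier that always predicts the same family. The function names are hypothetical.

```python
def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: 8 balanced classes, constant prediction (illustrative assumption)
families = [f"family_{i}" for i in range(8)]
y_true = [fam for fam in families for _ in range(10)]
y_pred = ["family_0"] * len(y_true)

print(round(accuracy(y_true, y_pred), 3))  # 0.125
print(round(macro_f1(y_true, y_pred), 3))  # 0.028
```

A constant predictor on this toy setup lands at 12.5% accuracy and about 0.028 macro F1, which is close to the raw-API row above. That is consistent with, though not proof of, the model's predictions carrying almost no family signal when given raw API text.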

---

## How to Use

### Loading the unified model (recommended for most users)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
adapter = "umer07/fathom-mixtral"  # unified-v2 at root

# Load the base model in bf16 and spread it across the available devices.
# Mixtral is supported natively by transformers, so trust_remote_code is not needed.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the unified LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(model, adapter, adapter_name="unified-v2")
model.eval()
```
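
Once the model and tokenizer are loaded, a request can be sketched as follows. This is a minimal example under assumptions, not the Fathom pipeline itself: `build_inst_prompt` and `analyze` are hypothetical helpers, the prompt uses the `[INST] ... [/INST]` markers of the Mixtral instruct format, and `do_sample=False` corresponds to the greedy decoding mentioned in the benchmark section. The structured evidence text is expected to come from your own extraction step.

```python
def build_inst_prompt(evidence: str, question: str) -> str:
    """Wrap evidence and a question in Mixtral's [INST] ... [/INST] markers.

    The tokenizer adds the beginning-of-sequence token itself.
    """
    return f"[INST] {evidence.strip()}\n\n{question.strip()} [/INST]"


def analyze(model, tokenizer, evidence: str, question: str, max_new_tokens: int = 512) -> str:
    """Generate an answer with greedy decoding and return only the new text.

    `model` and `tokenizer` are the objects created in the loading snippet.
    Keep the prompt within the 8192-token context used for the benchmarks.
    """
    prompt = build_inst_prompt(evidence, question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding
    )
    # Strip the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call would then look like `analyze(model, tokenizer, evidence_text, "Map this behavior to MITRE ATT&CK techniques.")`, where `evidence_text` is a placeholder for your structured sandbox evidence.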

---

## Limitations

- Sub-technique precision is lower than parent-technique precision (a common pattern across LLMs)
- Family identification improves significantly with KSPN enrichment
- Rare/exotic TTPs (UAC bypass, ICMP C2) have low recall
- Prompt injection and attribution hallucination remain base-model weaknesses (mitigable with system prompt hardening)

---

## Training & Datasets

- **Unified-v2:** 123,912 rows (1 epoch)
- **Experts:** 9 specialized datasets (total > 200k rows after augmentation)
- **Evasive dataset (NEW):** 25,160 obfuscated C++ samples (92 evasion combinations)
- **ThreatIntel upgrade:** 9,532 rows (URLhaus + GTFOBins + MITRE CTI)

---

## Citation

```bibtex
@misc{fathom2026,
  title={Fathom: Expert Cybersecurity Analysis with Mixtral LoRA Adapters},
  author={Umer},
  year={2026},
  howpublished={\url{https://huggingface.co/umer07/fathom-mixtral}},
}
```