VERIS Classifier v2

A fine-tuned Mistral-7B-Instruct-v0.3 model that classifies cybersecurity incident descriptions into the VERIS (Vocabulary for Event Recording and Incident Sharing) framework.

Given a plain-English incident description, the model outputs structured JSON with its predicted VERIS categories for action, actor, asset, and attribute.

Try the live demo: no API key is required, and it runs on ZeroGPU.

Example

Input:

An employee at a hospital clicked a phishing email, which installed ransomware that encrypted patient records.

Output:

{
  "action": {"malware": {"variety": ["Ransomware"]}, "social": {"variety": ["Phishing"]}},
  "actor": {"external": {"variety": ["Unaffiliated"], "motive": ["Financial"]}},
  "asset": {"assets": [{"variety": "S - Database"}]},
  "attribute": {"availability": {"variety": ["Obscuration"]}}
}
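Downstream tools often want the nested classification as flat VERIS-style paths. The sketch below is one possible way to do that and to check that the four top-level sections are present. The helper names (`flatten`, `missing_sections`) and the trimmed sample are illustrative assumptions, not part of the model or of any VERIS tooling.

```python
import json

# The four top-level sections named in this model card.
REQUIRED_SECTIONS = ("action", "actor", "asset", "attribute")

def flatten(node, prefix=""):
    """Yield (dotted_path, value) pairs from a nested classification."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        # List items share the path of their parent key.
        for item in node:
            yield from flatten(item, prefix)
    else:
        yield prefix, node

def missing_sections(classification):
    """Return the required top-level sections that are absent."""
    return [s for s in REQUIRED_SECTIONS if s not in classification]

# Hypothetical sample, trimmed from the shape of the example output.
sample = json.loads("""
{
  "action": {"social": {"variety": ["Phishing"]}},
  "actor": {"external": {"variety": ["Unaffiliated"], "motive": ["Financial"]}},
  "asset": {"assets": [{"variety": "S - Database"}]},
  "attribute": {"availability": {"variety": ["Obscuration"]}}
}
""")

pairs = list(flatten(sample))
for path, value in pairs:
    print(f"{path}: {value}")
```

Flat paths such as `actor.external.motive` are convenient for counting categories across many incidents or for writing rows to a spreadsheet.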

Training Details

Parameter	Value
Base model	mistralai/Mistral-7B-Instruct-v0.3
Method	QLoRA (4-bit NF4 quantization + LoRA)
LoRA rank (r)	16
LoRA alpha	32
LoRA dropout	0.05
Target modules	All linear (q, k, v, o, gate, up, down)
Training examples	9,813 train / 517 eval
Epochs	3
Batch size	2 x 4 gradient accumulation = 8 effective
Learning rate	2e-4 (cosine schedule, 10% warmup)
Precision	bf16
Optimizer	AdamW
Max sequence length	2,048 tokens
Hardware	NVIDIA A10G (24GB VRAM)
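
To put the table in perspective, the sketch below works out what these settings imply: the number of trainable LoRA parameters, the LoRA scaling factor, and the approximate number of optimizer steps. The layer dimensions are the published Mistral-7B configuration values. The step counts are a rough estimate, since the exact rounding used by the training script is not stated here.

```python
import math

# Published Mistral-7B dimensions (from the base model's config).
HIDDEN, INTERMEDIATE, LAYERS = 4096, 14336, 32
KV_DIM = 8 * 128  # 8 key/value heads x head dim 128 (grouped-query attention)

# (input features, output features) of each adapted linear layer.
TARGETS = {
    "q": (HIDDEN, HIDDEN), "k": (HIDDEN, KV_DIM), "v": (HIDDEN, KV_DIM),
    "o": (HIDDEN, HIDDEN), "gate": (HIDDEN, INTERMEDIATE),
    "up": (HIDDEN, INTERMEDIATE), "down": (INTERMEDIATE, HIDDEN),
}

def lora_param_count(rank):
    """Each adapted layer adds two matrices: A (rank x in) and B (out x rank)."""
    per_block = sum(rank * (n_in + n_out) for n_in, n_out in TARGETS.values())
    return per_block * LAYERS

rank, alpha = 16, 32
trainable = lora_param_count(rank)
scaling = alpha / rank  # LoRA updates are scaled by alpha / r

# Approximate schedule: 9,813 training examples, effective batch of 8, 3 epochs.
effective_batch = 2 * 4
steps_per_epoch = math.ceil(9813 / effective_batch)
total_steps = steps_per_epoch * 3
warmup_steps = total_steps // 10  # 10% warmup

print(trainable, scaling, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the adapter has about 41.9 million trainable parameters, well under 1% of the roughly 7 billion in the base model, and training runs for roughly 3,700 optimizer steps.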

Training Data

Fine-tuned on vibesecurityguy/veris-classifier-training, which contains:

10,019 classification examples: synthetic incident descriptions generated from real VCDB (VERIS Community Database) records, paired with their ground-truth VERIS classifications
311 Q&A pairs: questions and answers about the VERIS framework itself

The source classifications come from 8,559 real-world incidents in VCDB, spanning healthcare, finance, retail, government, and other industries.
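
As a quick consistency check, the dataset counts can be reconciled with the train/eval split listed under Training Details. This is only arithmetic on the numbers in this card. Reading the split as covering both example types is an assumption that the matching totals support but do not prove.

```python
classification_examples = 10_019
qa_pairs = 311
train_size, eval_size = 9_813, 517
source_incidents = 8_559

total = classification_examples + qa_pairs
# The two breakdowns of the dataset should describe the same total.
assert total == train_size + eval_size

eval_fraction = eval_size / total
descriptions_per_incident = classification_examples / source_incidents

print(total, round(eval_fraction, 3), round(descriptions_per_incident, 2))
```

The totals match at 10,330 examples, with about 5% held out for evaluation. The ratio of about 1.17 descriptions per source incident suggests that some incidents have more than one synthetic description.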

How to Use

With Transformers + PEFT

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "mistralai/Mistral-7B-Instruct-v0.3"
adapter = "vibesecurityguy/veris-classifier-v2"

# Load the base model in bf16 (the training precision), then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()

messages = [
    {"role": "system", "content": "You are a VERIS classification expert..."},
    {"role": "user", "content": "Classify this incident: An employee lost a laptop containing unencrypted customer data."}
]

# Render the chat template to a prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
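
The decoded text may carry extra text around the JSON object, so it helps to extract and parse the JSON before using it. Below is a minimal sketch using only the standard library. The helper name `extract_first_json` and the sample string are hypothetical, and the sample is not real model output.

```python
import json

def extract_first_json(text):
    """Return the first JSON object embedded in text, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None

# Hypothetical decoded text, for illustration only.
decoded = 'Classification: {"action": {"error": {"variety": ["Loss"]}}} Done.'
result = extract_first_json(decoded)
print(result)
```

Returning `None` instead of raising makes it easy to flag malformed generations for analyst review, in line with the limitations noted below.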

Intended Use

This model is designed for:

Classifying cybersecurity incidents into the VERIS framework
Answering questions about VERIS categories and taxonomy
Assisting incident response teams with structured data entry

Limitations

VCDB bias: Training data over-represents healthcare (HIPAA mandatory disclosure) and US-based incidents
Schema version: Trained primarily on VERIS 1.3.x schema; may not cover all 1.4 additions
Not a replacement for human analysis: Output should be reviewed by a security analyst
English only: Trained on English-language incident descriptions

Model Card Authors

Peter Shamoon (@vibesecurityguy)
