# CVE Analyst -- QLoRA Fine-tuned Ministral-3-3B-Instruct-2512-BF16
Fine-tuned mistralai/Ministral-3-3B-Instruct-2512-BF16 on the AlicanKiraz0/All-CVE-Records-Training-Dataset dataset using QLoRA (quantized Low-Rank Adaptation), a parameter-efficient fine-tuning method.
## Model Description
This adapter specialises the base model for CVE vulnerability analysis -- given a vulnerability identifier and context, the model produces structured technical analyses including exploitation vectors, impact assessment, and remediation strategies.
| Property | Value |
|---|---|
| Base model | mistralai/Ministral-3-3B-Instruct-2512-BF16 |
| Method | QLoRA |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Training epochs | 3 |
| Learning rate | 0.0002 |
| Effective batch size | 12 |
| Max sequence length | 4096 |
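The rank and alpha values above determine how small the trainable update is relative to the frozen base weights. As a rough illustration (the 2048 hidden size below is assumed for the sketch, not stated on this card), a single square projection layer looks like this:

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA replaces a full d_in x d_out weight update with two low-rank
    factors: A (d_in x rank) and B (rank x d_out)."""
    return rank * (d_in + d_out)

# Scaling applied to the low-rank update: alpha / rank
lora_scaling = 32 / 16  # alpha=32, r=16 -> 2.0

# Illustrative: one 2048x2048 projection (hidden size is an assumption)
full_params = 2048 * 2048                       # 4,194,304 frozen params
lora_params = lora_trainable_params(2048, 2048, 16)  # 65,536 trainable params
```

At these settings the adapter trains well under 2% of each adapted layer's parameters, which is what makes fine-tuning feasible on an 8 GB GPU.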
## Training Data

- Dataset: AlicanKiraz0/All-CVE-Records-Training-Dataset
- Attribution: dataset created by AlicanKiraz0
- License: Apache 2.0
- Sample size: 10000 rows (filtered to fit within 4096 tokens)
| Split | Count |
|---|---|
| train | 8000 |
| val | 1000 |
| test | 1000 |
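The card does not state the exact splitting procedure; a minimal sketch that reproduces the split sizes above with a seeded shuffle might look like:

```python
import random

def make_splits(n_rows=10000, train=8000, val=1000, seed=42):
    """Shuffle row indices and carve out train/val/test index lists.
    The seed value is an assumption for reproducibility, not from the card."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return {
        "train": idx[:train],
        "val": idx[train:train + val],
        "test": idx[train + val:],
    }

splits = make_splits()  # {'train': 8000 ids, 'val': 1000 ids, 'test': 1000 ids}
```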
## Training Procedure
- Hardware: NVIDIA RTX 4070 (8 GB VRAM)
- Training time: 33h 22m 55s
- Framework: HuggingFace Transformers + PEFT + bitsandbytes
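On an 8 GB card the effective batch size of 12 is almost certainly reached via gradient accumulation rather than a single large batch. The per-device/accumulation split below is an assumption for illustration:

```python
per_device_batch = 2   # assumed: what fits in 8 GB VRAM with 4-bit weights
grad_accum_steps = 6   # assumed: 2 x 6 = effective batch of 12
effective_batch = per_device_batch * grad_accum_steps

# Optimizer steps per epoch over the 8000-row training split
steps_per_epoch = 8000 // effective_batch  # 666
```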
## Evaluation Results

### Before vs After
| Metric | Original | Fine-tuned |
|---|---|---|
| BLEU | 4.53 | 32.87 |
| ROUGE-1 | 19.80 | 34.33 |
| ROUGE-2 | 4.39 | 21.38 |
| ROUGE-L | 10.77 | 30.71 |
| Perplexity | 5.11 | 2.01 |
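For context on the perplexity numbers: perplexity is the exponential of the mean per-token negative log-likelihood, so the drop from 5.11 to 2.01 corresponds to the mean token loss falling from about 1.63 to about 0.70 nats:

```python
import math

def perplexity(mean_nll):
    """Perplexity = exp of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)

# Mean token losses implied by the reported perplexities
loss_before = math.log(5.11)  # ~1.631 nats/token
loss_after = math.log(2.01)   # ~0.698 nats/token
```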
### CVE-ID Extraction
| Metric | Original | Fine-tuned |
|---|---|---|
| Precision | 79.37 | 100.00 |
| Recall | 92.59 | 92.59 |
| F1 | 85.47 | 96.15 |
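The card does not include the evaluation script; a minimal sketch of how CVE-ID precision/recall/F1 can be computed (regex extraction against a reference set, with the scoring convention assumed) is:

```python
import re

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

def cve_prf(predicted_text, reference_ids):
    """Extract CVE IDs from model output and score against the reference set."""
    pred = set(CVE_RE.findall(predicted_text))
    ref = set(reference_ids)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

p, r, f = cve_prf(
    "Related to CVE-2024-1234 and CVE-2023-9999.",
    ["CVE-2024-1234", "CVE-2023-9999", "CVE-2022-0001"],
)
# Both extracted IDs are correct (precision 1.0) but one reference ID is missed
```

Note that an F1 of 96.15 follows directly from the reported precision (100.0) and recall (92.59) via the harmonic mean.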
## Limitations
- The model was trained on a subset (10000 samples) of the full dataset; coverage of less common CVE types may be limited.
- Maximum sequence length during training was 4096 tokens; very long analyses will be truncated.
- The model inherits the base model's biases and limitations.
- Responses should be reviewed by a qualified security professional before being used in production or advisory contexts.
## How to Use
```python
from peft import PeftModel
from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend

# Load base model and tokenizer
base = Mistral3ForConditionalGeneration.from_pretrained(
    "mistralai/Ministral-3-3B-Instruct-2512-BF16",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = MistralCommonBackend.from_pretrained(
    "mistralai/Ministral-3-3B-Instruct-2512-BF16"
)

# Load the QLoRA adapter on top of the frozen base model
model = PeftModel.from_pretrained(base, "noman-asif/CVE-Analyst-Ministral-3B-QLoRA")

# Generate an analysis
messages = [{"role": "user", "content": "Provide a comprehensive technical analysis of CVE-2024-1234."}]
inputs = tokenizer.apply_chat_template(
    messages, return_dict=True, return_tensors="pt", add_generation_prompt=True
)
outputs = model.generate(**inputs.to(model.device), max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## License
This model and its adapter weights are released under the Apache 2.0 License.