How to use with the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

# Downloads the quantised weights from the Hugging Face Hub on first use.
llm = Llama.from_pretrained(
	repo_id="dolutech/MinimoSec-V4-4B-GGUF",
	filename="MinimoSec-V4-4b.Q4_K_M.gguf",
)

response = llm.create_chat_completion(
	messages=[
		# "Explain what process hollowing is and how to detect it."
		{"role": "user", "content": "Explica o que é process hollowing e como o detetar."},
	]
)
print(response["choices"][0]["message"]["content"])

🛡️ MinimoSec V4

Fine-Tuned Cybersecurity LLM — Gemma 4 E4B

Cybersecurity-specialised language model for Portuguese-speaking analysts


---

📌 Model Description

MinimoSec V4 is a cybersecurity-specialised language model fine-tuned from Google Gemma 4 E4B using supervised fine-tuning (SFT) with Low-Rank Adaptation (LoRA) via the Unsloth framework.

The model was trained on 22,571 Portuguese-language cybersecurity examples covering threat analysis, malware identification, MITRE ATT&CK mapping, YARA rule generation, IOC extraction, and digital forensics. It is designed to assist security analysts, SOC teams, and researchers in Portuguese-speaking environments.

Specification Detail
Primary Language Portuguese (pt-PT / pt-BR)
Domain Cybersecurity, Threat Intelligence, Digital Forensics
Base Model google/gemma-4-e4b-it
Training Epochs 1 (V4-final with 3 epochs in development)
Quantisation Available Q4_K_M GGUF (~5.3 GB)

📊 CyberBench-Hard v1.0

Specialized Cybersecurity Benchmark for Small-Scale SFT Models


About the Benchmark

CyberBench-Hard is a specialized cybersecurity knowledge evaluation benchmark composed of 50 expert-level questions distributed across 10 categories. Questions are designed to test deep technical reasoning, factual accuracy, and hallucination resistance across critical information security domains.

This document presents partial results for categories D (Malware Analysis & Reverse Engineering) and G (MITRE ATT&CK & Threat Intelligence), evaluated on MinimoSec-V4-4B, a small-scale language model with specialized cybersecurity fine-tuning.


Evaluated Model

Field Detail
Model MinimoSec-V4-4B
Base Architecture Gemma 3 4B (4 billion parameters)
Fine-tuning SFT (Supervised Fine-Tuning)
Dataset 22,000 cybersecurity-focused samples
Specialization Offensive & Defensive Cybersecurity
Evaluator Lucas Catão de Moraes
Date April 2026
Methodology Manual per-dimension evaluation with weighted criteria

Evaluation Criteria

Dimension Weight Description
Factual Correctness 30% Technical accuracy of the information presented
Technical Depth 25% Level of detail and demonstrated expertise
Completeness 20% Coverage of all sub-items in the question
Clarity & Structure 15% Organization, didactics, and readability
Absence of Hallucinations 10% Absence of fabricated terms, concepts, or data
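The weighted score for each question can be reproduced directly from the five dimension scores. As a minimal sketch (the function name `weighted_score` is ours, not part of the benchmark tooling):

```python
# Dimension weights from the evaluation criteria table above.
WEIGHTS = {
    "factual": 0.30,
    "depth": 0.25,
    "completeness": 0.20,
    "clarity": 0.15,
    "hallucinations": 0.10,
}

def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (0-10) into the single weighted score."""
    return round(sum(WEIGHTS[dim] * value for dim, value in scores.items()), 2)

# D1 (Static / Dynamic Analysis) dimension scores reproduce the reported 6.10.
d1 = {"factual": 6.0, "depth": 5.5, "completeness": 6.0, "clarity": 7.5, "hallucinations": 6.0}
```

For example, `weighted_score(d1)` returns 6.10, matching the Category D table.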

Scoring Scale

Score Classification
9.0 – 10.0 Expert-Level
7.5 – 8.9 Advanced
6.0 – 7.4 Intermediate
4.0 – 5.9 Basic
< 4.0 Insufficient
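The scoring bands above amount to a simple threshold function (a sketch; the name `classify` is ours):

```python
def classify(score: float) -> str:
    """Map a 0-10 weighted score to the benchmark's classification band."""
    if score >= 9.0:
        return "Expert-Level"
    if score >= 7.5:
        return "Advanced"
    if score >= 6.0:
        return "Intermediate"
    if score >= 4.0:
        return "Basic"
    return "Insufficient"
```

For instance, D4's 7.10 falls just under the Advanced cut-off and classifies as Intermediate, while G1's 2.95 classifies as Insufficient.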

Category D — Malware Analysis & Reverse Engineering

# Topic Factual Depth Completeness Clarity Hallucinations Score Classification
D1 Static / Dynamic Analysis 6.0 5.5 6.0 7.5 6.0 6.10 Intermediate
D2 Packer / Crypter / Unpacking 5.0 4.5 3.5 7.5 5.5 5.00 Basic
D3 Process Hollowing (T1055.012) 7.0 6.0 5.5 8.0 6.5 6.55 Intermediate
D4 DKOM / Kernel Rootkit 7.0 6.5 7.0 8.5 7.0 7.10 Intermediate
D5 DGA / C2 / ML Detection 6.5 5.0 6.0 7.5 7.5 6.28 Intermediate
Category D Average 6.21 Intermediate

Category G — MITRE ATT&CK & Threat Intelligence

# Topic Factual Depth Completeness Clarity Hallucinations Score Classification
G1 MITRE ATT&CK Hierarchy 2.0 3.0 2.0 7.0 1.5 2.95 Insufficient
G2 IoCs vs IoAs / SIEM / SOAR 6.5 5.5 7.0 8.5 5.5 6.55 Intermediate
G3 Kill Chain / Diamond Model 5.5 4.5 5.5 8.0 4.0 5.48 Basic
G4 Threat Hunting / LOLBins 6.0 6.0 6.5 8.0 5.0 6.30 Intermediate
G5 STIX / TAXII 5.0 4.0 5.5 7.5 4.0 5.13 Basic
Category G Average 5.28 Basic

Overall Summary

Category Average Classification Best Response Worst Response
D — Malware & RE 6.21 Intermediate D4: DKOM / Rootkit (7.10) D2: Packer / Crypter (5.00)
G — MITRE & Threat Intel 5.28 Basic G2: IoCs vs IoAs (6.55) G1: MITRE ATT&CK (2.95)
Global Average (D + G) 5.74 Basic
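The category and global averages are plain unweighted means of the per-question weighted scores, which can be checked in a few lines:

```python
# Per-question weighted scores from the Category D and G tables.
category_d = [6.10, 5.00, 6.55, 7.10, 6.28]  # D1-D5
category_g = [2.95, 6.55, 5.48, 6.30, 5.13]  # G1-G5

d_avg = sum(category_d) / len(category_d)  # 6.206 -> reported as 6.21
g_avg = sum(category_g) / len(category_g)  # 5.282 -> reported as 5.28
global_avg = (d_avg + g_avg) / 2           # 5.744 -> reported as 5.74
```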

Key Findings

  • Best overall response: D4 — DKOM / Kernel Rootkit (7.10 — Intermediate)
  • Worst overall response: G1 — MITRE ATT&CK Hierarchy (2.95 — Insufficient)
  • Strongest dimension: Clarity & Structure (average 7.75 across all 10 responses)
  • Weakest dimension: Absence of Hallucinations (average 4.85 across all 10 responses)
  • Highest internal variance: Category G (range from 2.95 to 6.55 = Δ3.60)

MinimoSec-V4-4B — Model Analysis

For a 4 billion parameter cybersecurity-specialized model, the CyberBench-Hard results reveal the following:

  1. SFT dataset quality is the determining factor. Category D (better training coverage) outperformed Category G by nearly 1 point, confirming that dataset curation matters more than model size alone. MinimoSec-V4-4B performs at Intermediate level in domains where its training data was strongest.

  2. The model excels at structure and clarity. The Clarity & Structure dimension scored between 7.0–8.5 across all responses, indicating that SFT successfully taught MinimoSec-V4-4B professional formatting and technical communication patterns.

  3. Factual accuracy and hallucinations are the primary limiters. MinimoSec-V4-4B tends to fabricate terms, IDs, and configurations when pushed beyond its training coverage, rather than expressing uncertainty. This is the most critical area for improvement.

  4. The observed performance ceiling for 4B + SFT is ~7.0. MinimoSec-V4-4B's best response scored 7.10 (DKOM / Kernel Rootkit). To reach Advanced classification (7.5+), recommended next steps include: scale-up of the base model, post-SFT alignment via DPO/RLHF, and expanded dataset curation with expert technical review.

  5. MinimoSec-V4-4B is suitable as an intermediate-level cybersecurity assistant for educational and study purposes in its well-trained domains, but should not be used as an authoritative technical reference without human verification.


Benchmark Reference

CyberBench-Hard v1.0 — Proprietary benchmark for evaluating specialized cybersecurity knowledge in language models. 50 expert-level questions across 10 categories. Developed and administered in April 2026.

Full benchmark categories: Cryptography & PKI (A), Active Directory & Kerberos (B), Network Security & Protocols (C), Malware Analysis & RE (D), Cloud & Container Security (E), Web Application Security (F), MITRE ATT&CK & Threat Intel (G), Digital Forensics & IR (H), AI/LLM Security (I), Multi-Stage Scenarios (J).

This document presents partial results for categories D and G (10 out of 50 questions). MinimoSec-V4-4B was evaluated on these categories as representative samples of its cybersecurity knowledge capabilities.


🚀 Quick Start

Ollama (Recommended)

ollama run hf.co/dolutech/MinimoSec-V4-GGUF:MinimoSec-V4.Q4_K_M.gguf

LM Studio

  1. Download MinimoSec-V4-4b.Q4_K_M.gguf from the GGUF repository
  2. Load it manually in LM Studio
  3. Note: Also download MinimoSec-V4-4b.BF16-mmproj.gguf for multimodal (vision) support

Python (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "dolutech/MinimoSec-V4-4B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # "Create a YARA rule to detect ransomware that encrypts .docx and .xlsx files."
    {"role": "user", "content": "Cria uma regra YARA para detetar ransomware que encripta ficheiros .docx e .xlsx."}
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# do_sample=True is required for temperature / top_p to take effect.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=1.0, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

💬 Recommended System Prompt

És o MinimoSec V4, um assistente especializado em cibersegurança desenvolvido pela Dolutech.
Respondes sempre em Português de Portugal.
És especialista em MITRE ATT&CK, regras YARA, análise de malware, IOCs, threat intelligence e forense digital.
Forneces respostas técnicas, precisas e estruturadas.

(English translation: You are MinimoSec V4, a cybersecurity assistant developed by Dolutech. You always answer in European Portuguese. You are a specialist in MITRE ATT&CK, YARA rules, malware analysis, IOCs, threat intelligence, and digital forensics. You provide technical, precise, and structured answers.)
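In practice the recommended system prompt is sent as the first message of every conversation. A minimal sketch (the helper `build_messages` is ours; the OpenAI-style message schema is what llama-cpp-python and `tokenizer.apply_chat_template` accept — if a particular chat template rejects the `system` role, prepend the text to the first user message instead):

```python
SYSTEM_PROMPT = (
    "És o MinimoSec V4, um assistente especializado em cibersegurança "
    "desenvolvido pela Dolutech. Respondes sempre em Português de Portugal. "
    "És especialista em MITRE ATT&CK, regras YARA, análise de malware, IOCs, "
    "threat intelligence e forense digital. "
    "Forneces respostas técnicas, precisas e estruturadas."
)

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend the recommended system prompt to a single user turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

# "List three typical IOCs of a ransomware attack."
msgs = build_messages("Lista três IOCs típicos de um ataque de ransomware.")
```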

📋 Training Details

Parameter Value
Base model google/gemma-4-e4b-it
Framework Unsloth 2026.4.5
Method SFT + LoRA
LoRA rank 16
LoRA alpha 16
Target modules q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training epochs 1
Max sequence length 2048
Batch size 2 (gradient accumulation 4)
Dataset size 22,571 examples
Dataset language Portuguese
Hardware 1× NVIDIA Tesla A100
Quantisation 4-bit (bitsandbytes, training) / Q4_K_M GGUF (inference)
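As a rough numeric illustration of the LoRA settings above (not the actual Unsloth training code): LoRA adds a low-rank update ΔW = (α/r)·B·A to each target projection. With rank 16 and alpha 16 the scaling factor is exactly 1, and because B starts at zero, training begins from the unmodified base weights:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 64, 64   # toy projection size; real Gemma projections are much larger
r, alpha = 16, 16      # LoRA rank and alpha from the table above

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init

scale = alpha / r        # 16 / 16 = 1.0
delta = scale * (B @ A)  # low-rank update, rank <= r

# At initialisation the update is zero: the adapted layer equals the base layer.
assert np.allclose(W + delta, W)

# After training B is non-zero, but the update's rank stays bounded by r,
# so only (d_out + d_in) * r parameters are trained per projection.
B_trained = rng.normal(size=(d_out, r))
delta_trained = scale * (B_trained @ A)
assert np.linalg.matrix_rank(delta_trained) <= r
```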

⚠️ Limitations & Development Phase

This model is in an active research and development phase. The dataset is continuously being improved and future versions will address current limitations.

  • Trained with an evolving dataset; the model may reproduce inconsistent information, including incorrect CVEs, imprecise MITRE ATT&CK sub-techniques, or YARA/SIGMA rules with invalid syntax
  • Optimised for Portuguese (PT/BR); responses in English may be less precise
  • 4B active parameter model (MoE); complex multi-step reasoning may require enabling thinking mode (<think>)
  • Not a replacement for a certified security analyst — use exclusively as a study and assistive tool
  • Internal benchmarks indicate an average score of 6.33/10 on advanced cybersecurity scenarios; improvements expected in upcoming versions

Roadmap

  • V5: expanded dataset focused on specific CVEs, exact MITRE ATT&CK sub-techniques, and valid SIGMA/YARA rules
  • V5: multi-epoch training with continuous eval loss monitoring
  • V5: comparative benchmark against Gemma 4 base as reference baseline

📜 License

This model is released under the Gemma Terms of Use. The fine-tuning dataset and weights are provided for research and educational purposes.


🏢 About

Developed by Dolutech — cybersecurity research and open-source tooling for Portuguese-speaking communities.

Website HuggingFace Model Repo GGUF Repo


MinimoSec V4 — Bringing specialised cybersecurity intelligence to Portuguese-speaking analysts. 🇵🇹🇧🇷
