🛡️ MinimoSec V4.2
Fine-Tuned Cybersecurity LLM — Gemma 4 E4B
Cybersecurity-specialised language model for Portuguese-speaking analysts
📌 Model Description
MinimoSec V4.2 is a cybersecurity-specialised language model fine-tuned from Google Gemma 4 E4B using a two-stage training approach: Supervised Fine-Tuning (SFT) with Low-Rank Adaptation (LoRA) followed by Direct Preference Optimization (DPO) for alignment refinement via the Unsloth framework.
The model was trained on 22,571 Portuguese-language cybersecurity examples covering threat analysis, malware identification, MITRE ATT&CK mapping, YARA rule generation, IOC extraction, and digital forensics. The DPO refinement stage significantly improved factual accuracy and reduced hallucinations, particularly on complex technical topics.
| Specification | Detail |
|---|---|
| Primary Language | Portuguese (pt-PT / pt-BR) |
| Domain | Cybersecurity, Threat Intelligence, Digital Forensics |
| Base Model | google/gemma-4-e4b-it |
| Training Method | SFT + LoRA → DPO Alignment |
| Training Epochs | 1 (SFT) + DPO refinement |
| Quantisation Available | Q4_K_M GGUF (~5.3 GB) |
📊 CyberBench-Hard v1.0 — V4.2 Results
Specialized Cybersecurity Benchmark for Small-Scale SFT+DPO Models
About the Benchmark
CyberBench-Hard is a specialized cybersecurity knowledge evaluation benchmark composed of 50 expert-level questions distributed across 10 categories. Questions are designed to test deep technical reasoning, factual accuracy, and hallucination resistance across critical information security domains.
This document presents results comparing MinimoSec-V4.1-4B (SFT-only baseline) against MinimoSec-V4.2-4B (SFT+DPO refinement) for categories D (Malware Analysis & Reverse Engineering) and G (MITRE ATT&CK & Threat Intelligence).
Evaluated Models
| Field | V4.1 (SFT Baseline) | V4.2 (DPO Refined) |
|---|---|---|
| Model | MinimoSec-V4.1-4B | MinimoSec-V4.2-4B |
| Base Architecture | Gemma 4 E4B (4 billion active parameters) | Gemma 4 E4B (4 billion active parameters) |
| Fine-tuning | SFT (Supervised Fine-Tuning) | SFT + DPO (Direct Preference Optimization) |
| Dataset | 22,571 cybersecurity-focused samples | 22,571 cybersecurity-focused samples |
| Specialization | Offensive & Defensive Cybersecurity | Offensive & Defensive Cybersecurity |
| Evaluator | Lucas Catão de Moraes | Lucas Catão de Moraes |
| Date | April 2026 | April 2026 |
| Methodology | Manual per-dimension evaluation with weighted criteria | Manual per-dimension evaluation with weighted criteria |
DPO Improvement Summary
| Question | SFT (V4.1) | DPO (V4.2) | Delta | Trend |
|---|---|---|---|---|
| D4 — DKOM/Rootkit | 7.10 | 7.43 | +0.33 | ✅ Improvement |
| G1 — MITRE ATT&CK | 2.95 | 4.18 | +1.23 | ✅ Improvement |
| D3 — Process Hollowing | 6.55 | 6.45 | -0.10 | ⚠️ Slight Regression |
| Average | 5.53 | 6.02 | +0.49 | ✅ Improvement |
Key Achievement: The DPO refinement delivered a +8.9% overall improvement, with the most significant gains on complex conceptual topics (MITRE ATT&CK hierarchy improved by 42%).
Evaluation Criteria
| Dimension | Weight | Description |
|---|---|---|
| Factual Correctness | 30% | Technical accuracy of the information presented |
| Technical Depth | 25% | Level of detail and demonstrated expertise |
| Completeness | 20% | Coverage of all sub-items in the question |
| Clarity & Structure | 15% | Organization, didactics, and readability |
| Absence of Hallucinations | 10% | Absence of fabricated terms, concepts, or data |
Scoring Scale
| Score | Classification |
|---|---|
| 9.0 – 10.0 | Expert-Level |
| 7.5 – 8.9 | Advanced |
| 6.0 – 7.4 | Intermediate |
| 4.0 – 5.9 | Basic |
| < 4.0 | Insufficient |
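The weighted per-question score and its classification band can be reproduced with a short helper (weights and band thresholds are taken directly from the two tables above; the dictionary keys are illustrative names, not part of the benchmark):

```python
# Reproduces the CyberBench-Hard weighted scoring and classification bands.
WEIGHTS = {
    "factual": 0.30,        # Factual Correctness
    "depth": 0.25,          # Technical Depth
    "completeness": 0.20,   # Completeness
    "clarity": 0.15,        # Clarity & Structure
    "hallucination": 0.10,  # Absence of Hallucinations
}

def weighted_score(dimensions: dict) -> float:
    """Combine per-dimension scores (0-10) into one weighted overall score."""
    return round(sum(WEIGHTS[k] * v for k, v in dimensions.items()), 2)

def classify(score: float) -> str:
    """Map an overall score to its CyberBench-Hard classification band."""
    if score >= 9.0:
        return "Expert-Level"
    if score >= 7.5:
        return "Advanced"
    if score >= 6.0:
        return "Intermediate"
    if score >= 4.0:
        return "Basic"
    return "Insufficient"
```

For example, D4's overall score of 7.43 falls in the Intermediate band, while G1's 2.95 baseline sits in Insufficient.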
Category D — Malware Analysis & Reverse Engineering
| # | Topic | Factual | Depth | Completeness | Clarity | Hallucinations | Score | Classification |
|---|---|---|---|---|---|---|---|---|
| D1 | Static / Dynamic Analysis | — | — | — | — | — | — | — |
| D2 | Packer / Crypter / Unpacking | — | — | — | — | — | — | — |
| D3 | Process Hollowing (T1055.012) | — | — | — | — | — | 6.45 | Intermediate |
| D4 | DKOM / Kernel Rootkit | — | — | — | — | — | 7.43 | Intermediate |
| D5 | DGA / C2 / ML Detection | — | — | — | — | — | — | — |
| | Category D Average | | | | | | 6.94 | Intermediate |
Category G — MITRE ATT&CK & Threat Intelligence
| # | Topic | Factual | Depth | Completeness | Clarity | Hallucinations | Score | Classification |
|---|---|---|---|---|---|---|---|---|
| G1 | MITRE ATT&CK Hierarchy | — | — | — | — | — | 4.18 | Basic |
| G2 | IoCs vs IoAs / SIEM / SOAR | — | — | — | — | — | — | — |
| G3 | Kill Chain / Diamond Model | — | — | — | — | — | — | — |
| G4 | Threat Hunting / LOLBins | — | — | — | — | — | — | — |
| G5 | STIX / TAXII | — | — | — | — | — | — | — |
| | Category G Average | | | | | | 4.18 | Basic |
Detailed Test Results
Test 1 — Best Case: D4 (DKOM / Rootkit)
Question: O que é um rootkit de kernel em Windows? Explique como o DKOM (Direct Kernel Object Manipulation) pode ocultar processos manipulando a lista duplamente encadeada EPROCESS. Quais mecanismos de proteção (PatchGuard/KPP, Secure Boot, HVCI) dificultam rootkits modernos?
| Metric | V4.1 (SFT) | V4.2 (DPO) | Change |
|---|---|---|---|
| Overall Score | 7.10 | 7.43 | +0.33 |
| Classification | Intermediate | Intermediate | — |
Analysis: DPO refinement improved the kernel rootkit explanation, particularly in the technical accuracy of DKOM mechanisms and protection systems description. The model now provides more precise details about EPROCESS manipulation and HVCI protections.
Test 2 — Worst Case: G1 (MITRE ATT&CK)
Question: No framework MITRE ATT&CK v18 (Enterprise), explique a diferença entre Tactics, Techniques e Sub-techniques. Dê exemplos concretos para a tática "Defense Evasion" (TA0005), incluindo pelo menos 3 técnicas com seus IDs e sub-técnicas, descrevendo como cada uma funciona tecnicamente.
| Metric | V4.1 (SFT) | V4.2 (DPO) | Change |
|---|---|---|---|
| Overall Score | 2.95 | 4.18 | +1.23 |
| Classification | Insufficient | Basic | ⬆️ Upgrade |
Analysis: DPO delivered the largest improvement (+42%) on this challenging conceptual question. The V4.2 model shows a better grasp of the MITRE ATT&CK hierarchy and provides more accurate technique IDs and descriptions, though hallucinations on specific sub-technique details remain a limitation.
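The Tactic → Technique → Sub-technique hierarchy that question G1 probes can be illustrated with a minimal data structure (the ATT&CK IDs below are real Enterprise identifiers, but the selection is illustrative and not taken from the model's output):

```python
# Minimal illustration of the MITRE ATT&CK Enterprise hierarchy:
# a Tactic (TAxxxx) groups Techniques (Txxxx), which may have
# Sub-techniques (Txxxx.yyy).
ATTACK = {
    "TA0005": {  # Defense Evasion
        "name": "Defense Evasion",
        "techniques": {
            "T1055": {
                "name": "Process Injection",
                "sub_techniques": {
                    "T1055.001": "Dynamic-link Library Injection",
                    "T1055.012": "Process Hollowing",
                },
            },
            "T1027": {
                "name": "Obfuscated Files or Information",
                "sub_techniques": {
                    "T1027.002": "Software Packing",
                },
            },
        },
    },
}

def parent_technique(sub_id: str) -> str:
    """A sub-technique ID encodes its parent technique before the dot."""
    return sub_id.split(".")[0]
```

Answering G1 well amounts to navigating this structure correctly: naming the tactic, valid technique IDs under it, and real sub-technique IDs under those.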
Test 3 — Medium Case: D3 (Process Hollowing)
Question: Explique a técnica de Process Hollowing (T1055.012 no MITRE ATT&CK). Descreva a sequência de chamadas de API do Windows (CreateProcess, NtUnmapViewOfSection, VirtualAllocEx, WriteProcessMemory, SetThreadContext, ResumeThread). Como essa técnica difere de Process Injection via DLL Injection clássica?
| Metric | V4.1 (SFT) | V4.2 (DPO) | Change |
|---|---|---|---|
| Overall Score | 6.55 | 6.45 | -0.10 |
| Classification | Intermediate | Intermediate | — |
Analysis: Minor regression (-0.10) observed on this already well-understood topic. The SFT-only version had stronger coverage of this specific technique in training data, and DPO refinement slightly shifted emphasis. This represents acceptable variance within the noise threshold.
Overall Summary
| Category | V4.1 Average | V4.2 Average | Improvement | Classification |
|---|---|---|---|---|
| D — Malware & RE | 6.83 | 6.94 | +1.7% | Intermediate |
| G — MITRE & Threat Intel | 2.95 | 4.18* | +41.7% | Basic |
| Global Average (Tested) | 5.53 | 6.02 | +8.9% | Intermediate |
*Averages cover the tested questions only. G1 was the worst-performing question in the V4.1 baseline; DPO improved it significantly, but MITRE ATT&CK remains the weakest area.
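The global average and the G1 gain follow directly from the three tested per-question scores:

```python
# Per-question scores from the three detailed tests above (SFT baseline vs. DPO).
sft = {"D4": 7.10, "G1": 2.95, "D3": 6.55}
dpo = {"D4": 7.43, "G1": 4.18, "D3": 6.45}

avg_sft = round(sum(sft.values()) / len(sft), 2)       # global SFT average: 5.53
avg_dpo = round(sum(dpo.values()) / len(dpo), 2)       # global DPO average: 6.02

overall_gain = round((avg_dpo / avg_sft - 1) * 100, 1)  # +8.9% overall
g1_gain = round((dpo["G1"] / sft["G1"] - 1) * 100, 0)   # ~+42% on G1
```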
| | V4.1 (Baseline) | V4.2 (DPO) |
|---|---|---|
| Best Response | D4: DKOM / Rootkit (7.10) | D4: DKOM / Rootkit (7.43) |
| Worst Response | G1: MITRE ATT&CK (2.95) | G1: MITRE ATT&CK (4.18) |
| Best Improvement | — | G1: MITRE ATT&CK (+1.23) |
Key Findings — V4.2 DPO Analysis
1. DPO significantly improves factual accuracy on weak areas. The largest gain (+1.23) was achieved on the worst-performing question (G1), demonstrating DPO's effectiveness at correcting alignment issues.
2. Strong topics remain stable. D4 (DKOM/Rootkit) improved further (+0.33) from an already strong baseline, showing DPO does not degrade well-learned knowledge.
3. Hallucinations drop on conceptual topics. The V4.2 MITRE ATT&CK response contained fewer fabricated technique IDs and more accurate sub-technique descriptions.
4. Minor variance is acceptable. D3 showed a slight regression (-0.10), within expected statistical variance for model refinement, a reasonable trade-off for the overall improvement.
5. DPO is essential for 4B parameter models. The +8.9% overall improvement demonstrates that SFT+DPO outperforms SFT alone in specialized technical domains, even with limited parameters.
MinimoSec-V4.2-4B — Model Analysis
For a 4 billion parameter cybersecurity-specialized model with DPO refinement, the CyberBench-Hard results reveal:
- SFT+DPO is the optimal training pipeline for small models. Supervised fine-tuning followed by preference optimization delivers measurable improvements over SFT alone.
- V4.2 achieves Intermediate level (6.02) on the tested domains, a solid foundation for educational and assistive cybersecurity tasks in Portuguese.
- Remaining gaps: MITRE ATT&CK conceptual knowledge is still the weakest area (4.18) and will require additional dataset curation for V5.
- Performance ceiling: the best response (D4: 7.43) suggests the 4B architecture with the current dataset is approaching a ceiling of roughly 7.5; reaching the Advanced band (7.5+) may require model scale-up or further DPO iterations.
V4.2 is suitable as an intermediate-level cybersecurity assistant with improved reliability over V4.1, particularly for malware analysis topics. Human verification remains recommended for critical decisions.
Benchmark Reference
CyberBench-Hard v1.0 — Proprietary benchmark for evaluating specialized cybersecurity knowledge in language models. 50 expert-level questions across 10 categories. Developed and administered in April 2026.
This document presents comparative results between MinimoSec-V4.1 (SFT baseline) and MinimoSec-V4.2 (SFT+DPO refinement) for categories D and G (3 representative questions).
Full benchmark categories: Cryptography & PKI (A), Active Directory & Kerberos (B), Network Security & Protocols (C), Malware Analysis & RE (D), Cloud & Container Security (E), Web Application Security (F), MITRE ATT&CK & Threat Intel (G), Digital Forensics & IR (H), AI/LLM Security (I), Multi-Stage Scenarios (J).
🚀 Quick Start
Ollama (Recommended)
ollama run hf.co/dolutech/MinimoSec-V4.2-4b-GGUF:MinimoSec-V4.2-4b.Q4_K_M.gguf
LM Studio
- Download `MinimoSec-V4.2-4b.Q4_K_M.gguf` from the GGUF repository
- Load it manually in LM Studio
- Note: also download `MinimoSec-V4.2-4b.BF16-mmproj.gguf` for multimodal (vision) support
Python (Transformers)
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "dolutech/MinimoSec-V4.2-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto",
)
messages = [
{"role": "user", "content": "Cria uma regra YARA para detetar ransomware que encripta ficheiros .docx e .xlsx."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=1.0, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
💬 Recommended System Prompt
És o MinimoSec V4.2, um assistente especializado em cibersegurança desenvolvido pela Dolutech.
Respondes sempre em Português de Portugal.
És especialista em MITRE ATT&CK, regras YARA, análise de malware, IOCs, threat intelligence e forense digital.
Forneces respostas técnicas, precisas e estruturadas.
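The system prompt above highlights IOC extraction as one of the model's specialities. As a rough plain-Python baseline for that task (illustrative only, not the model's method; the regexes are simplified and ignore defanged indicators like `hxxp` or `[.]`):

```python
import re

# Simplified patterns for two common IOC types; production extraction
# needs defanging handling and stricter validation (e.g. octet ranges).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")

def extract_iocs(text: str) -> dict:
    """Pull candidate IPv4 addresses and SHA-256 hashes from free text."""
    return {
        "ipv4": IPV4.findall(text),
        "sha256": SHA256.findall(text),
    }
```

A prompt asking the model to extract IOCs from an incident report is expected to produce the same kind of structured output, with the added ability to handle defanged and contextual indicators.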
📋 Training Details
| Parameter | Value |
|---|---|
| Base model | google/gemma-4-e4b-it |
| Framework | Unsloth 2026.4.6 |
| Stage 1 — SFT | Supervised Fine-Tuning + LoRA |
| Stage 2 — DPO | Direct Preference Optimization |
| LoRA rank | 16 |
| LoRA alpha | 16 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| SFT epochs | 1 |
| DPO beta | 0.1 |
| Max sequence length | 2048 |
| Batch size | 2 (gradient accumulation 4) |
| Dataset size | 22,571 examples |
| Dataset language | Portuguese |
| Hardware | 1× NVIDIA Tesla A100 |
| Quantisation | 4-bit (bitsandbytes, training) / Q4_K_M GGUF (inference) |
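The Stage 2 objective can be sketched in a few lines. This is a pure-Python sketch of the pairwise DPO loss with β = 0.1 as in the table above; the log-probability arguments are placeholders for sequence log-likelihoods computed by the policy and frozen reference models:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Pairwise DPO loss: -log sigmoid(beta * (chosen ratio - rejected ratio)),
    where each ratio is the policy-vs-reference log-probability difference."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log(sigmoid(logits))
```

When the policy favours the chosen answer more than the reference does, the logits are positive and the loss falls below log 2 ≈ 0.693; minimising it pushes the model toward the preferred responses without drifting far from the SFT reference.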
⚠️ Limitations & Development Phase
This model is in an active research and development phase. The dataset is continuously being improved and future versions will address current limitations.
- Refined with DPO to reduce hallucinations and improve factual accuracy
- Trained with an evolving dataset; the model may reproduce inconsistent information, including incorrect CVEs, imprecise MITRE ATT&CK sub-techniques, or YARA/SIGMA rules with invalid syntax
- Optimised for Portuguese (PT/BR); responses in English may be less precise
- 4B active parameter model (MoE); complex multi-step reasoning may require enabling thinking mode
- Not a replacement for a certified security analyst; use it exclusively as a study and assistive tool
- Internal benchmarks indicate an average score of 6.02/10 on tested cybersecurity scenarios; improvements expected in upcoming versions
V4.2 Improvements over V4
- ✅ +8.9% overall benchmark improvement
- ✅ +42% improvement on MITRE ATT&CK conceptual knowledge
- ✅ Reduced hallucinations on technical detail questions
- ✅ Better factual accuracy on kernel-level topics
Roadmap
- V5: expanded dataset focused on specific CVEs, exact MITRE ATT&CK sub-techniques, and valid SIGMA/YARA rules
- V5: additional DPO iterations with expert-curated preference pairs
- V5: comparative benchmark against Gemma 4 base as reference baseline
📜 License
This model is released under the Gemma Terms of Use. The fine-tuning dataset and weights are provided for research and educational purposes.
🏢 About
Developed by Dolutech — cybersecurity research and open-source tooling for Portuguese-speaking communities.
llama-cpp-python (4-bit GGUF)
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="dolutech/MinimoSec-V4.2-4b-GGUF",
    filename="MinimoSec-V4.2-4b.Q4_K_M.gguf",  # Q4_K_M quant referenced in Quick Start
)

llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explica o que é process hollowing."}]
)