Use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="masafy/masafee-ctf-7b",
	filename="masafee-ctf-7b.q4_k_m.gguf",
)
llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
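
create_chat_completion returns an OpenAI-style dictionary rather than a plain string. A minimal sketch of pulling the reply text out of it follows. The helper name extract_reply and the stand-in response are hypothetical, used only so the snippet runs without downloading the model.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices", [])
    if not choices:
        return ""
    return choices[0]["message"]["content"].strip()


# Hypothetical stand-in with the same shape as the dict returned by
# llm.create_chat_completion(...); replace it with the real return value.
fake_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": " Paris. "},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(fake_response))
```

In real use, pass the return value of llm.create_chat_completion(...) to the helper instead of the stand-in.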

Masafee CTF 7B

QLoRA fine-tune of Qwen 2.5 Coder 7B Instruct on CTFtime writeups, trained entirely on a single NVIDIA GeForce RTX 3060 12 GB in 12 h 17 m of wall-clock time, with no cloud compute.


📄 Paper: English (5 pp.) · Japanese (6 pp.) · evaluation report

This is part of the "Masafee" personal GPU research series. It is the second release, after masafee-lora (a Stable Diffusion LoRA of the same name).

Model details

| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Method | QLoRA (r=32, α=64, 4-bit) via unsloth |
| Training data | justinwangx/CTFtime: 18,013 writeup chunks → ~5,200 × 2048-token packed sequences (10.6M tokens) |
| Strategy | Continued pretraining on raw writeup text (no instruction-format conversion) |
| Learning rate | 2e-4, cosine schedule, 10 warmup steps |
| Epochs | 2 |
| Hardware | NVIDIA GeForce RTX 3060 12 GB |
| Wall time | 12 h 17 m |
| Final train loss | 1.62 |
| Final eval loss | 1.644 |
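
The figures listed above can be cross-checked with simple arithmetic. The sketch below recomputes the token count and the LoRA scaling factor (α / r) from the stated values. The throughput estimate is an assumption of this sketch, not a reported result: it treats the whole wall-clock time as training time, ignoring evaluation and checkpointing.

```python
# Values taken from the model details; PACKED_SEQUENCES is approximate (~5,200).
SEQ_LEN = 2048
PACKED_SEQUENCES = 5_200
EPOCHS = 2
LORA_R, LORA_ALPHA = 32, 64
WALL_SECONDS = 12 * 3600 + 17 * 60  # 12 h 17 m

tokens_per_epoch = SEQ_LEN * PACKED_SEQUENCES
tokens_seen = tokens_per_epoch * EPOCHS
lora_scale = LORA_ALPHA / LORA_R  # factor applied to the low-rank update

# Rough estimate only: assumes every second of wall time was spent training.
tokens_per_second = tokens_seen / WALL_SECONDS

print(f"tokens per epoch:   {tokens_per_epoch / 1e6:.2f}M")
print(f"tokens seen:        {tokens_seen / 1e6:.2f}M")
print(f"LoRA scale (a / r): {lora_scale}")
print(f"approx. throughput: {tokens_per_second:.0f} tokens/s")
```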

Files in this repository

  • adapter/: LoRA adapter for use with PEFT
    • adapter_config.json, adapter_model.safetensors
    • tokenizer.json, tokenizer_config.json, chat_template.jinja
  • masafee-ctf-7b.q4_k_m.gguf: single-file Q4_K_M GGUF (4.4 GB) for Ollama / llama.cpp

Usage

With Transformers + PEFT

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.bfloat16,
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
model = PeftModel.from_pretrained(base, "masafy/masafee-ctf-7b", subfolder="adapter")

prompt = "How would you approach a CTF challenge that gives you an ELF binary with a gets() call?"
msgs = [{"role": "user", "content": prompt}]
ids = tokenizer.apply_chat_template(msgs, return_tensors="pt", add_generation_prompt=True).to("cuda")
out = model.generate(ids, max_new_tokens=400, do_sample=False)
print(tokenizer.decode(out[0][ids.shape[1]:], skip_special_tokens=True))

With Ollama (GGUF)

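# Download the GGUF into the current directory so the relative FROM path in the Modelfile resolves
huggingface-cli download masafy/masafee-ctf-7b masafee-ctf-7b.q4_k_m.gguf --local-dir .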

cat > Modelfile <<'MFILE'
FROM ./masafee-ctf-7b.q4_k_m.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
MFILE

ollama create masafee-ctf-7b -f Modelfile
ollama run masafee-ctf-7b
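
The TEMPLATE in the Modelfile above wraps each message in ChatML markers before the text reaches the model. As a rough illustration of what it produces, the sketch below builds the same string in plain Python. The function name render_chatml is hypothetical, and this mirrors the template's structure only. It is not Ollama's own rendering code.

```python
from typing import Optional


def render_chatml(messages: list[dict], system: Optional[str] = None) -> str:
    """Build a ChatML prompt with the same layout as the Modelfile TEMPLATE."""
    parts = []
    if system:
        # Optional system block, emitted only when a system prompt is set.
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Open the assistant turn; generation ends at one of the stop tokens.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml(
    [{"role": "user", "content": "What does gets() make exploitable?"}],
    system="You are a CTF explainer.",
)
print(prompt)
```

The two PARAMETER stop lines in the Modelfile correspond to the markers used here: generation halts as soon as the model emits <|im_end|> or begins a new <|im_start|> turn.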

Evaluation summary

Full report: GitHub EVALUATION.md · EVALUATION_ja.md

| Benchmark | Base Qwen | masafee-ctf-7b | Foundation-Sec-8B |
|---|---|---|---|
| CyberMetric-500 accuracy | 86.20% | 84.00% | 82.60% |
| NYU CTF subset Pass@1 (30 Q.) | 13.3% | 0.0% | 6.7% |
| Hedging phrases (sum / 30) | - | 7 | 77 |

The 95% confidence intervals of the three CyberMetric scores overlap (about ±3.1 pp each at n=500). NYU CTF Bench was evaluated under a single-shot, non-agentic protocol, which is strictly weaker than the official benchmark setup. The stylistic divergence from Foundation-Sec-8B (an 11× difference in hedging phrases, 77 vs 7) reflects the respective training-data domains (CTF writeups vs SOC analysis), not a quality ranking.
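
To see where a margin of roughly ±3 pp at n=500 comes from, the sketch below computes a normal-approximation (Wald) interval for each CyberMetric score. The choice of interval is an assumption of this sketch, since the text does not say which method the evaluation report used.

```python
import math


def wald_half_width(accuracy: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation 95% CI for a proportion."""
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)


# CyberMetric-500 accuracies from the evaluation summary.
scores = {
    "Base Qwen": 0.862,
    "masafee-ctf-7b": 0.840,
    "Foundation-Sec-8B": 0.826,
}

intervals = {}
for name, acc in scores.items():
    hw = wald_half_width(acc, n=500)
    intervals[name] = (acc - hw, acc + hw)
    print(f"{name}: {acc:.1%} +/- {hw * 100:.1f} pp")

# The intervals share a common region if the largest lower bound
# is still below the smallest upper bound.
overlap = max(lo for lo, _ in intervals.values()) < min(hi for _, hi in intervals.values())
print("all intervals overlap:", overlap)
```

Under this approximation the half-widths fall between about 3.0 and 3.3 pp, in line with the ±3.1 pp quoted above.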

Limitations

  • Style overfitting: continued pretraining on raw writeup text causes the model to emit writeup-formatted narrative that can consume the output budget before producing a final answer.
  • Hallucinated writeups: on out-of-distribution CTF prompts, the model occasionally generates plausible-but-wrong writeups for unrelated problems.
  • No agentic capability gain over the base model: for solving real CTF challenges, use a larger model or an agent harness.

The model is intended as a CTF-style explainer and demonstrator of QLoRA on a consumer GPU, not as a CTF-solving agent.

License

  • LoRA adapter weights and GGUF in this repository: research and personal use only. These are derivative of CTFtime writeups whose copyright belongs to individual contributors; redistribution or commercial use is not permitted without explicit permission from those original authors.
  • Code / scripts / documentation / paper in the GitHub repository: MIT.
  • Base model (Qwen/Qwen2.5-Coder-7B-Instruct): Apache 2.0.

Citation

@software{suzuki_masafee_ctf_7b_2026,
  author       = {Suzuki, Masato},
  title        = {{Masafee CTF 7B: QLoRA Fine-Tuning of a 7B Code Model on
                   CTF Writeups for Stylistic and Knowledge Adaptation}},
  year         = {2026},
  version      = {v1.1.2},
  doi          = {10.5281/zenodo.20413080},
  url          = {https://doi.org/10.5281/zenodo.20413080},
  orcid        = {0009-0000-7977-2756}
}

Made by masafykun · masafy.org · ORCID · 🐾
