Masafee CTF 7B
QLoRA fine-tune of Qwen 2.5 Coder 7B Instruct on CTFtime writeups, trained entirely on a single NVIDIA GeForce RTX 3060 12 GB in 12 h 17 m of wall-clock time, with no cloud compute.
Paper: English (5 pp.) · Japanese (6 pp.) · evaluation report
This is part of the "Masafee" personal GPU research series: the second release after masafee-lora (a Stable Diffusion LoRA of the same name).
Model details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Method | QLoRA (r=32, α=64, 4-bit) via Unsloth |
| Training data | justinwangx/CTFtime → 18,013 writeup chunks → ~5,200 × 2048-token packed sequences (10.6M tokens) |
| Strategy | Continued pretraining on raw writeup text (no instruction-format conversion) |
| Learning rate | 2e-4, cosine schedule, 10 warmup steps |
| Epochs | 2 |
| Hardware | NVIDIA GeForce RTX 3060 12 GB |
| Wall time | 12 h 17 m |
| Final train loss | 1.62 |
| Final eval loss | 1.644 |
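The figures in the table can be cross-checked with a little arithmetic. The sketch below recomputes the token count from the packed-sequence numbers, derives an average throughput from the wall time, and shows one common shape for a "cosine schedule with warmup". The exact scheduler used for training is not stated here, so `cosine_lr` is an assumption (linear warmup, then cosine decay to zero), and `total_steps` is a hypothetical parameter because the batch size is not given.

```python
import math

# Figures quoted in the model-details table.
NUM_SEQUENCES = 5_200                 # approximate number of packed sequences
SEQ_LEN = 2_048                       # tokens per packed sequence
EPOCHS = 2
WALL_SECONDS = 12 * 3600 + 17 * 60    # 12 h 17 m
PEAK_LR = 2e-4
WARMUP_STEPS = 10


def tokens_per_epoch() -> int:
    """Tokens seen in one pass over the packed dataset."""
    return NUM_SEQUENCES * SEQ_LEN


def average_throughput() -> float:
    """Tokens per wall-clock second, averaged over the whole run.

    This treats the full wall time as training time, which slightly
    understates the true training throughput if evaluation is included.
    """
    return tokens_per_epoch() * EPOCHS / WALL_SECONDS


def cosine_lr(step: int, total_steps: int) -> float:
    """Assumed schedule: linear warmup to PEAK_LR, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(f"tokens per epoch: {tokens_per_epoch():,}")
    print(f"average throughput: {average_throughput():.0f} tokens/s")
    # total_steps=650 is a hypothetical value chosen only for illustration.
    for step in (0, 5, 10, 330, 650):
        print(step, f"{cosine_lr(step, total_steps=650):.2e}")
```

The product of roughly 5,200 sequences and 2,048 tokens agrees with the 10.6M-token figure in the table.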
Files in this repository
- adapter/: LoRA adapter for use with PEFT
  - adapter_config.json, adapter_model.safetensors
  - tokenizer.json, tokenizer_config.json, chat_template.jinja
- masafee-ctf-7b.q4_k_m.gguf: single-file Q4_K_M GGUF (4.4 GB) for Ollama / llama.cpp
Usage
With Transformers + PEFT
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in bfloat16 and move it to the GPU.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.bfloat16,
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

# Attach the LoRA adapter stored in this repository's adapter/ subfolder.
model = PeftModel.from_pretrained(base, "masafy/masafee-ctf-7b", subfolder="adapter")

prompt = "How would you approach a CTF challenge that gives you an ELF binary with a gets() call?"
msgs = [{"role": "user", "content": prompt}]

# Apply the chat template and tokenize in one step.
ids = tokenizer.apply_chat_template(msgs, return_tensors="pt", add_generation_prompt=True).to("cuda")

# Greedy decoding; print only the newly generated tokens.
out = model.generate(ids, max_new_tokens=400, do_sample=False)
print(tokenizer.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
With Ollama (GGUF)
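# Download the GGUF into the current directory so the Modelfile's relative path resolves.
huggingface-cli download masafy/masafee-ctf-7b masafee-ctf-7b.q4_k_m.gguf --local-dir .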
cat > Modelfile <<'MFILE'
FROM ./masafee-ctf-7b.q4_k_m.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
MFILE
ollama create masafee-ctf-7b -f Modelfile
ollama run masafee-ctf-7b
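Once the model has been created with `ollama create`, it can also be queried programmatically. The following is a minimal sketch using only the Python standard library against Ollama's local REST endpoint. It assumes the Ollama server is running on its default address (`http://localhost:11434`) and that the model name matches the one passed to `ollama create` above. The helper names `build_chat_request` and `chat` are illustrative, not part of this repository.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local address
MODEL_NAME = "masafee-ctf-7b"                   # name given to `ollama create`


def build_chat_request(prompt: str) -> dict:
    """Build a non-streaming chat payload with a single user message."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def chat(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

Calling `chat("How would you approach a CTF challenge that gives you an ELF binary with a gets() call?")` should return the model's answer as a string, provided the server is reachable.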
Evaluation summary
Full report: GitHub EVALUATION.md · EVALUATION_ja.md
| Benchmark | Base Qwen | masafee-ctf-7b | Foundation-Sec-8B |
|---|---|---|---|
| CyberMetric-500 accuracy | 86.20% | 84.00% | 82.60% |
| NYU CTF subset Pass@1 (30 Q.) | 13.3% | 0.0% | 6.7% |
| Hedging phrases (sum / 30) | n/a | 7 | 77 |
The three CyberMetric scores have overlapping 95% confidence intervals (±3.1 pp at n=500). NYU CTF Bench was evaluated under a single-shot, non-agentic protocol, which is strictly weaker than the official benchmark setup. The stylistic divergence from Foundation-Sec-8B (an 11× hedging ratio) reflects the two models' respective training-data domains (CTF writeups vs. SOC analysis), not a quality ranking.
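The confidence band quoted above can be approximated with the usual normal (Wald) interval for a proportion. The evaluation report may have used a different method, so this is an assumption rather than a reproduction of it. The sketch computes the half-width for each CyberMetric score and checks whether the intervals overlap pairwise.

```python
import math
from itertools import combinations

Z_95 = 1.96  # normal quantile for a two-sided 95% interval
N_QUESTIONS = 500

# CyberMetric-500 accuracies from the evaluation table.
SCORES = {
    "Base Qwen": 0.862,
    "masafee-ctf-7b": 0.840,
    "Foundation-Sec-8B": 0.826,
}


def ci_half_width(accuracy: float, n: int) -> float:
    """Half-width of the Wald 95% interval for a binomial proportion."""
    return Z_95 * math.sqrt(accuracy * (1.0 - accuracy) / n)


def interval(accuracy: float, n: int) -> tuple[float, float]:
    """Lower and upper bounds of the Wald 95% interval."""
    half = ci_half_width(accuracy, n)
    return accuracy - half, accuracy + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    for name, acc in SCORES.items():
        low, high = interval(acc, N_QUESTIONS)
        print(f"{name}: {acc:.1%} [{low:.1%}, {high:.1%}]")
    for (name_a, acc_a), (name_b, acc_b) in combinations(SCORES.items(), 2):
        overlap = intervals_overlap(
            interval(acc_a, N_QUESTIONS), interval(acc_b, N_QUESTIONS)
        )
        print(f"{name_a} vs {name_b}: overlap={overlap}")
```

Under this approximation, an accuracy near 85% at n=500 gives a half-width of about 3.1 pp, which matches the figure quoted above.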
Limitations
- Style overfitting: continued pretraining on raw writeup text causes the model to emit writeup-formatted narrative that can consume the output budget before producing a final answer.
- Hallucinated writeups: on out-of-distribution CTF prompts, the model occasionally generates plausible-but-wrong writeups for unrelated problems.
- No agentic capability gain over the base model: for solving real CTF challenges, use a larger model or an agent harness.
The model is intended as a CTF-style explainer and demonstrator of QLoRA on a consumer GPU, not as a CTF-solving agent.
License
- LoRA adapter weights and GGUF in this repository: research and personal use only. These are derivative of CTFtime writeups whose copyright belongs to individual contributors; redistribution or commercial use is not permitted without explicit permission from those original authors.
- Code / scripts / documentation / paper in the GitHub repository: MIT.
- Base model (Qwen/Qwen2.5-Coder-7B-Instruct): Apache 2.0.
Citation
@software{suzuki_masafee_ctf_7b_2026,
  author  = {Suzuki, Masato},
  title   = {{Masafee CTF 7B: QLoRA Fine-Tuning of a 7B Code Model on
             CTF Writeups for Stylistic and Knowledge Adaptation}},
  year    = {2026},
  version = {v1.1.2},
  doi     = {10.5281/zenodo.20413080},
  url     = {https://doi.org/10.5281/zenodo.20413080},
  orcid   = {0009-0000-7977-2756}
}
Made by masafykun · masafy.org · ORCID