# CVE Backport Code Generation – Qwen2.5-Coder-32B
Fine-tuned Qwen2.5-Coder-32B-Instruct for security patch backporting via per-hunk code generation.
Instead of generating unified diffs directly, this model takes a vulnerable code region and a fix description, and outputs the fixed version of the code. A programmatic diff then produces the final patch. This plays to LLM strengths in code completion and avoids format-sensitivity issues.
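The programmatic-diff step can be sketched with Python's standard `difflib`; the helper name and the toy `strcpy` hunk below are illustrative, not the tool's actual API:

```python
import difflib

def hunk_to_patch(original: str, fixed: str, path: str) -> str:
    """Rebuild a unified diff from the model's fixed region.

    `original` is the vulnerable region sent to the model and `fixed`
    is the model's raw output; names are illustrative, not the CLI's API.
    """
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))

# Toy example: the "model output" replaces an unbounded strcpy.
vulnerable = "int n = strlen(s);\nchar buf[8];\nstrcpy(buf, s);\n"
fixed = (
    "int n = strlen(s);\nchar buf[8];\n"
    "strncpy(buf, s, sizeof(buf) - 1);\nbuf[sizeof(buf) - 1] = '\\0';\n"
)
print(hunk_to_patch(vulnerable, fixed, "lib/url.c"))
```

Because the diff is computed programmatically from two real code states, the resulting patch is always syntactically valid, regardless of how the model formats its output.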
## Quick Start
The easiest way to use this model is with the cve-backport-tool CLI, which handles the full pipeline: parse upstream patch, extract per-hunk regions, call the model, and reconstruct a unified diff.
```sh
# Download and serve the model
./setup.sh

# Generate a backport patch
python3 cve-backport.py \
    --cve CVE-2025-3887 \
    --package gstreamer-plugins-bad \
    --patch upstream.patch \
    --obs-fetch --backend openai --retry 3
```
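The `--retry 3` flag retries failed model calls. A generic retry wrapper of the kind such a CLI might use can be sketched as follows; `call_with_retry` is an illustrative helper, not the tool's actual code:

```python
import time

def call_with_retry(fn, retries=3, delay=1.0):
    """Call fn(), retrying up to `retries` times on any exception.

    Illustrative sketch of a --retry-style wrapper; uses a simple
    linear backoff between attempts.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return fn()
        except Exception as e:
            last_error = e
            time.sleep(delay * (attempt + 1))  # back off before retrying
    raise last_error
```

Retries matter here because a single generation occasionally produces an unusable region (e.g. truncated output); re-sampling usually recovers.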
## GGUF Downloads

| File | Quant | Size | Dataset | Notes |
|---|---|---|---|---|
| cve-backport-codegen-v3-q8_0.gguf | Q8_0 | 33 GB | 35,667 examples | Recommended – highest precision |
| cve-backport-codegen-v2-q8_0.gguf | Q8_0 | 33 GB | 24,452 examples | Previous release |
| cve-backport-codegen-v1-q8_0.gguf | Q8_0 | 33 GB | 17,007 examples | First release |
Older GGUFs (v1, v2) are available in the legacy repo.
## Evaluation
Per-hunk evaluation on held-out test cases the model never saw during training:
| Metric | v1 | v2 | v3 |
|---|---|---|---|
| Average recall | 91% | 94% | 94% |
| Average precision | – | 93% | 98% |
| Exact match | – | 15/20 | 16/20 |
| Perfect hunks (>=95%) | 16/18 | 17/20 | 17/20 |
| Fail (<10%) | 1/18 | 0/20 | 0/20 |
By tier:
- Identical (upstream patch applies directly): 95% recall, 98% precision
- Adapted (line numbers/context differ): 89% recall, 97% precision
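The card does not spell out how per-hunk recall and precision are computed; one plausible line-level definition can be sketched like this (the function and the "changed line" criterion are assumptions, not the actual evaluation code):

```python
def hunk_line_metrics(original, gold_fixed, model_fixed):
    """Line-level recall/precision for one hunk (hypothetical definition).

    A "changed" line is one that does not appear in the original region.
    recall    = gold changed lines the model also produced / gold changed lines
    precision = model changed lines that appear in gold    / model changed lines
    """
    orig = set(original)
    gold_changed = [l for l in gold_fixed if l not in orig]
    model_changed = [l for l in model_fixed if l not in orig]
    recall = sum(l in model_changed for l in gold_changed) / max(len(gold_changed), 1)
    precision = sum(l in gold_changed for l in model_changed) / max(len(model_changed), 1)
    return recall, precision
```

Under this reading, high precision with lower recall (the "Adapted" tier) means the model's edits are almost always correct but it sometimes misses part of a multi-line fix.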
## Training Details
| | v1 | v2 | v3 |
|---|---|---|---|
| Dataset size | 17,007 | 24,452 | 35,667 |
| Learning rate | 2e-4 | 2e-4 | 1e-4 |
| Epochs | 2 | 2 | 2 |
| Training time | 13h | 27h | 41h |
| Hardware | H100 NVL | H100 NVL | H100 NVL |
| Method | QLoRA r=64 α=128 | QLoRA r=64 α=128 | QLoRA r=64 α=128 |
v3 uses a lower learning rate (1e-4 vs 2e-4) for stability with the larger dataset, and includes data quality filtering to remove toxic examples (XML test data, huge outputs) that caused training instability in earlier runs.
## Prompt Format
ChatML format. Each prompt covers one hunk region with 15 lines of context padding:
**System:**

```
You are a security patch backporting assistant.
Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.

Rules:
- Output ONLY the fixed code, nothing else – no explanations, no markdown fences
- Preserve exact formatting, indentation, and style of the original
- Make ONLY the changes described in the fix – do not modify anything else
- Do not add comments about what you changed
```

**User:**

````
## File: lib/url.c
## Lines: 100-130

```c
<vulnerable source code region>
```

## Fix
CVE-2024-1234: fix buffer overflow in url_parse()

```diff
<upstream patch>
```
````

**Assistant:** the fixed source code, raw, with no fences.
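Assembling the full ChatML string can be sketched as below. The `<|im_start|>`/`<|im_end|>` tokens follow the ChatML convention used by Qwen2.5 models; in practice the tokenizer's chat template produces this framing, and `build_prompt` is a hypothetical helper:

```python
TICKS = "`" * 3  # build code fences without embedding literal ones

SYSTEM_PROMPT = (
    "You are a security patch backporting assistant.\n"
    "Given vulnerable source code and a description of the upstream fix, "
    "output the FIXED version of the code."
)

def build_prompt(path: str, start: int, end: int, region: str,
                 fix_desc: str, upstream_patch: str, lang: str = "c") -> str:
    # User turn mirrors the per-hunk layout shown above.
    user = (
        f"## File: {path}\n"
        f"## Lines: {start}-{end}\n"
        f"{TICKS}{lang}\n{region}\n{TICKS}\n"
        f"## Fix\n{fix_desc}\n"
        f"{TICKS}diff\n{upstream_patch}\n{TICKS}"
    )
    # ChatML framing; the final assistant header is left open so the
    # model generates the fixed code as the completion.
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

When serving via an OpenAI-compatible endpoint, you would pass the system and user strings as separate chat messages instead and let the server apply the template.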
## Training Data

anicka/cve-backport-codegen-dataset – 35,667 per-hunk examples from openSUSE maintenance patches, covering 90+ packages and 2,300+ CVEs.
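Carving a per-hunk example out of a source file (the hunk plus 15 lines of context padding, as described under Prompt Format) can be sketched like this; `extract_region` is an illustrative helper, not the dataset's actual extraction code:

```python
def extract_region(lines, hunk_start, hunk_len, pad=15):
    """Slice a source file around a hunk with `pad` context lines per side.

    `hunk_start` is 1-based, as in unified diff @@ headers.
    Returns (first_line, last_line, region_lines), clamped to the file.
    """
    lo = max(hunk_start - 1 - pad, 0)               # 0-based start index
    hi = min(hunk_start - 1 + hunk_len + pad, len(lines))  # exclusive end
    return lo + 1, hi, lines[lo:hi]
```

The padding gives the model enough surrounding code to anchor the fix even when the target branch has drifted from upstream line numbers.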
## Intended Use

This model assists with security patch backporting in Linux distribution maintenance. It is a research tool: all generated patches must be reviewed by a maintainer before application. "Applies and builds" validates mechanical correctness, not semantic correctness.
## Links
- Tool: github.com/openSUSE/cve-backport-tool
- Dataset: anicka/cve-backport-codegen-dataset
- Legacy repo (v1/v2 GGUFs + safetensors): anicka/cve-backport-codegen-qwen25-32b-v1
## License
Apache-2.0 (inherited from Qwen2.5-Coder-32B-Instruct).