# CVE Backport Codegen v5: Qwen2.5-Coder-32B QLoRA
Fine-tuned code generation model for backporting upstream CVE security fixes to older SUSE/openSUSE package versions. Given vulnerable source code and an upstream fix description, the model outputs the corrected code. A separate tool then diffs the output against the original to produce a patch.
This is a per-hunk code generation approach: the model sees one region of source code at a time and returns the fixed version, rather than generating raw unified diffs. This yields higher accuracy than patch-format models because the model works in its natural domain (code) rather than a meta-format (diffs).
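The diffing step can be illustrated with Python's standard `difflib`. This is a minimal sketch, not the actual tool's code: the function name `backport_patch` and its arguments are hypothetical, and it assumes the region is addressed by 1-based inclusive line numbers.

```python
import difflib


def backport_patch(source: str, fixed_region: str, start: int, end: int, path: str) -> str:
    """Splice the model's fixed region into the file and return a unified diff.

    start/end are 1-based inclusive line numbers of the region shown to the model
    (an assumption of this sketch).
    """
    original = source.splitlines(keepends=True)
    if fixed_region and not fixed_region.endswith("\n"):
        fixed_region += "\n"  # keep the spliced region newline-terminated
    replacement = fixed_region.splitlines(keepends=True)
    # Replace lines start..end of the original with the model output
    patched = original[: start - 1] + replacement + original[end:]
    diff = difflib.unified_diff(
        original, patched, fromfile=f"a/{path}", tofile=f"b/{path}"
    )
    return "".join(diff)
```

Diffing whole files rather than the isolated region keeps the hunk line numbers relative to the file, so the result can be applied with standard patch tooling.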
MoE sibling now available: anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b reaches 91.9% recall on the same n=100 eval (within 1.2 pt of this model) while running ~10× faster at inference, thanks to Qwen3-Coder-30B-A3B's sparse 3B-active MoE architecture. Same training data, same config style, trained in 1/5 the wall time on a single H100.
## What's New in v5
v5 uses a unified codegen-only dataset — all 36,166 training examples follow the same 3-turn format (system / user with code + fix description / assistant with fixed code). v4 mixed in 5-turn test-generation examples; v5 drops those to focus entirely on codegen quality.
| Metric | v5 | v4 | v1 |
|---|---|---|---|
| Recall | 93.1% | 93% | 91% |
| Precision | 94.4% | 95% | — |
| Exact match | 83/100 | 87/100 | — |
| Adapted recall | 90.0% | 86% | 71% |
| Identical recall | 93.7% | 94% | 94% |
Adapted-tier recall has steadily improved: 71% (v1) → 86% (v4) → 90% (v5). The codegen-only dataset gives the model a cleaner training signal for the core task.
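The uniform 3-turn format described above is easy to check mechanically. The sketch below assumes each JSONL record stores its turns under a `messages` key with `role` and `content` fields; that field layout is an assumption for illustration and may differ from the published dataset.

```python
import json

EXPECTED_ROLES = ["system", "user", "assistant"]


def is_three_turn(record: dict) -> bool:
    """True if the record has exactly system / user / assistant turns with content."""
    turns = record.get("messages", [])  # "messages" is an assumed field name
    roles = [turn.get("role") for turn in turns]
    return roles == EXPECTED_ROLES and all(turn.get("content") for turn in turns)


def count_invalid(jsonl_text: str) -> int:
    """Count non-empty JSONL lines that do not follow the 3-turn format."""
    return sum(
        not is_three_turn(json.loads(line))
        for line in jsonl_text.splitlines()
        if line.strip()
    )
```

A check like this would flag any leftover 5-turn test-generation examples of the kind v4 mixed in.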
## Model Details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-32B-Instruct |
| Method | QLoRA (4-bit NF4, double quantization, bf16 compute) |
| LoRA rank / alpha | 64 / 128 |
| LoRA dropout | 0.05 |
| LoRA targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training data | 36,166 train / 1,834 eval examples |
| Epochs | 2 (8,228 steps) |
| Effective batch size | 8 (1 × grad_accum 8) |
| Learning rate | 1e-4 (cosine schedule, 5% warmup) |
| Max sequence length | 4,096 tokens |
| Optimizer | AdamW fused, weight decay 0.01 |
| Hardware | 2× NVIDIA H100 NVL 94GB |
| Training time | 46.1 hours |
| Train loss (avg) | 0.0215 |
| Eval loss (final) | 0.00602 |
| PEFT version | 0.18.1 |
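For readers who want to map the table onto library objects, the sketch below expresses the quantization and LoRA rows as `transformers` and `peft` config objects. It assumes the table maps one-to-one onto these arguments; the teapot wrapper's actual code may be organized differently, and `task_type` is not stated in the table.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bf16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA rank 64, alpha 128, dropout 0.05 on all attention and MLP projections
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: not listed in the table
)
```

With alpha set to twice the rank, the standard LoRA scaling factor (alpha / r) is 2.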
## Files
This repository contains:
- LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`): merge with the base model using PEFT
- GGUF Q8_0 (`cve-backport-codegen-v5-q8_0.gguf`, 33 GB): ready for llama.cpp / Ollama
## Reproduction via Teapot
This model was trained via the teapot training pipeline. Once the cve-backport dataset is prepared, the full reproduction is a four-step sequence:
```bash
git clone https://github.com/anicka-net/teapot
cd teapot
pip install -e .

# 1. Compose training data from the cve-backport module
teapot compose configs/cve-backport.config \
  --output train-cve-backport.jsonl

# 2. Generate the QLoRA-HF launch script
teapot train configs/cve-backport.config \
  --backend qlora-hf \
  --train-data train-cve-backport.jsonl \
  --eval-data eval-cve-backport.jsonl \
  --output train-cve-backport.sh

# 3. Train (2× H100 NVL 94GB; ~46 hours)
bash train-cve-backport.sh

# 4. Final adapter is at output-teapot-cve-backport/final/
```
The teapot config (`configs/cve-backport.config`) pins all the hyperparameters: `method: qlora`, `epochs: 2`, `lr: 1e-4`, `batch_size: 1`, `gradient_accumulation: 8`, `lora_r: 64`, `lora_alpha: 128`, `max_length: 4096`, `warmup_ratio: 0.05`, `hardware.gpus: 2`. See the config file in the teapot repo for the full declaration.

The `qlora-hf` backend invokes `python3 -m teapot.train_qlora_hf`, which is a thin wrapper over the Hugging Face Trainer with bitsandbytes 4-bit quantization and PEFT LoRA. Training data is composed from the cve-backport-codegen-dataset HF repo (the `domain/cve-backport` teapot module fetches it automatically).
## Evaluation
Evaluated on 100 held-out examples (zero CVE overlap with training) using the Q8_0 GGUF served via llama-server (temperature=0, ctx=8192).
### Overall
| Metric | Value |
|---|---|
| Avg recall | 93.1% |
| Avg precision | 94.4% |
| Exact match | 83/100 |
| Perfect (100% recall) | 90/100 |
| Failures (0% recall) | 3/100 |
### By Tier
| Tier | Count | Avg Recall | Perfect |
|---|---|---|---|
| Identical (upstream applies as-is) | 85 | 93.7% | 77/85 |
| Adapted (requires modification) | 15 | 90.0% | 13/15 |
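As a consistency check, the per-tier rows can be recombined into the overall figures. The short sketch below only uses numbers from the tables in this section; because the tier averages are themselves rounded to one decimal, the recomputed overall recall matches the reported 93.1% only up to rounding.

```python
# Per-tier figures copied from the table above
tiers = {
    "identical": {"count": 85, "avg_recall": 93.7, "perfect": 77},
    "adapted": {"count": 15, "avg_recall": 90.0, "perfect": 13},
}

total = sum(t["count"] for t in tiers.values())
# Count-weighted mean of the tier recalls
overall_recall = sum(t["count"] * t["avg_recall"] for t in tiers.values()) / total
perfect = sum(t["perfect"] for t in tiers.values())

print(f"{total} examples, overall recall about {overall_recall:.1f}%, {perfect} perfect")
```

The weighted mean comes out at roughly 93.1%, and the perfect counts sum to the 90/100 reported in the overall table.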
### Failure Analysis
The 3 zero-recall cases are all complex libvirt patches (multi-function adaptations across large files with significant structural differences between versions). These are known hard cases that likely require an agentic approach with source tree context.
## Training Data
The v5 dataset contains real SUSE/openSUSE maintenance patches paired with their upstream CVE fixes, converted to a per-hunk codegen format:
- 36,166 train + 1,834 eval examples (strict CVE-level split, zero overlap)
- All examples use a 3-turn ChatML format (system / user / assistant)
- Per-hunk extraction with 15-line context padding, nearby hunks merged
- Covers C, C++, Python, shell, Java, JavaScript, Go, and more
- Sources: openSUSE Build Service maintenance incidents
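The per-hunk extraction with 15-line padding and hunk merging listed above can be sketched as follows. The function name `padded_regions` is hypothetical, and the merge rule (join regions that overlap or touch after padding) is an assumption; the dataset's actual threshold for "nearby" is not stated here.

```python
def padded_regions(changed, total_lines, context=15):
    """Expand changed line ranges by `context` lines and merge nearby ones.

    changed: list of (start, end) 1-based inclusive ranges touched by the fix.
    Returns a sorted list of (start, end) regions clamped to the file bounds.
    """
    regions = []
    for start, end in sorted(changed):
        lo = max(1, start - context)
        hi = min(total_lines, end + context)
        if regions and lo <= regions[-1][1] + 1:
            # Overlaps or touches the previous region: merge (assumed rule)
            regions[-1] = (regions[-1][0], max(regions[-1][1], hi))
        else:
            regions.append((lo, hi))
    return regions
```

Under these assumptions, a single changed line 115 in a long file yields the region 100-130, the same shape as the `## Lines: 100-130` header in the input format.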
### Input Format
````
## File: path/to/file.c
## Lines: 100-130
```c
/* 15 lines before the change */
vulnerable_code_here();
/* 15 lines after the change */
```
## Fix
Description of what the upstream patch changes in this region.
````
### Output Format
The model outputs the fixed version of the code region (just the code, no diff headers or markup).
## Usage
### With llama.cpp / llama-server (GGUF)
```bash
llama-server \
  --model cve-backport-codegen-v5-q8_0.gguf \
  --port 8403 \
  --n-gpu-layers 99 \
  --ctx-size 8192
```
### With the CVE Backport Tool
The recommended way to use this model is via the cve-backport-tool, which handles patch parsing, source extraction, model inference, and diff generation:
```bash
python3 cve-backport.py \
  --cve CVE-2024-1234 \
  --package openssl-1.1.1d \
  --patch upstream.patch \
  --source-dir /path/to/source/ \
  --backend openai \
  --retry 3
```
### With transformers + PEFT (adapter)
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in bf16, placed automatically across available devices
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    torch_dtype="bfloat16",
    device_map="auto",
)
# Attach the LoRA adapter from this repository
model = PeftModel.from_pretrained(base, "anicka/cve-backport-codegen-v5-qwen25-32b")
# The tokenizer is loaded from the base model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
```
### Prompt Template (ChatML)
````
<|im_start|>system
You are a security patch backporting assistant.
Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.
Rules:
- Output ONLY the fixed code, nothing else
- Preserve all surrounding context exactly
- Apply only the described fix
<|im_end|>
<|im_start|>user
## File: crypto/bn/bn.h
## Lines: 280-310
```c
/* source code region */
```
## Fix
Add bounds check for BN_num_bits to prevent buffer over-read.
<|im_end|>
<|im_start|>assistant
````
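When the model is served through llama-server's OpenAI-compatible endpoint, the server applies the ChatML wrapping, so a client only needs to build the system and user messages. The sketch below is one way to do that with the standard library. The helper names are hypothetical, the exact line breaks inside the system prompt and the `## Fix` heading level are assumptions, and the URL reuses port 8403 from the llama-server example in this README.

```python
import json
import urllib.request

# System prompt text from the template; exact blank lines are an assumption
SYSTEM_PROMPT = "\n".join([
    "You are a security patch backporting assistant.",
    "Given vulnerable source code and a description of the upstream fix, "
    "output the FIXED version of the code.",
    "Rules:",
    "- Output ONLY the fixed code, nothing else",
    "- Preserve all surrounding context exactly",
    "- Apply only the described fix",
])


def build_user_message(path, start, end, code, fix, lang="c"):
    """Format one code region and its fix description as the user turn."""
    fence = "`" * 3
    return (
        f"## File: {path}\n"
        f"## Lines: {start}-{end}\n"
        f"{fence}{lang}\n{code.rstrip()}\n{fence}\n"
        f"## Fix\n{fix.strip()}"
    )


def build_payload(user_message):
    """Chat-completions request body; temperature 0 mirrors the evaluation setup."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0,
    }


def request_fix(payload, url="http://localhost:8403/v1/chat/completions"):
    """POST the payload to a running llama-server and return the fixed code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A caller would pass the result of `build_payload(build_user_message(...))` to `request_fix` and then diff the returned region against the original source.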
## Limitations
- **Best at identical-tier patches** (upstream fix applies directly) — 93.7% recall
- **Good at adapted patches** (90% recall) but complex multi-function adaptations
across structurally different versions remain challenging
- **Context window**: 4,096 token training limit means very large functions or
multi-file patches may be truncated
- **No compilation feedback**: the model generates code in a single pass without
verifying it compiles. Use `--retry` in the CLI tool for iterative correction.
- Always review generated patches before applying to production systems
## Related
- **MoE sibling**: [anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b](https://huggingface.co/anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b) — Qwen3-Coder-30B-A3B (3B active, MoE), 91.9% recall on the same n=100 eval, ~10× faster inference
- **openSUSE mirror**: [openSUSE/CVE-Backport-Qwen2.5-Coder-32B](https://huggingface.co/openSUSE/CVE-Backport-Qwen2.5-Coder-32B)
- **CLI tool**: [openSUSE/cve-backport-tool](https://github.com/openSUSE/cve-backport-tool)
- **Dataset**: [anicka/cve-backport-codegen-dataset](https://huggingface.co/datasets/anicka/cve-backport-codegen-dataset)
- **Training pipeline**: [teapot](https://github.com/anicka-net/teapot)
- **Previous version (v1)**: [anicka/cve-backport-codegen-qwen25-32b-v1](https://huggingface.co/anicka/cve-backport-codegen-qwen25-32b-v1)
## Citation
```bibtex
@misc{cve-backport-codegen-v5,
title={CVE Backport Codegen v5: Fine-tuned Qwen2.5-Coder-32B for Security Patch Backporting},
author={Anna Maresova},
year={2026},
url={https://huggingface.co/anicka/cve-backport-codegen-v5-qwen25-32b}
}
```