# CVE Backport Codegen v5: Qwen2.5-Coder-32B QLoRA

Fine-tuned code generation model for backporting upstream CVE security fixes to older SUSE/openSUSE package versions. Given vulnerable source code and an upstream fix description, the model outputs the corrected code. A separate tool then diffs the output against the original to produce a patch.

This is a per-hunk code generation approach: the model sees one region of source code at a time and returns the fixed version, rather than generating raw unified diffs. This yields higher accuracy than patch-format models because the model works in its natural domain (code) rather than a meta-format (diffs).
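The diff step described above can be sketched with Python's standard `difflib`. This is a minimal illustration, not the actual cve-backport-tool implementation; the function name `region_to_patch`, the sample C lines, and the file path are assumptions made for the example.

```python
import difflib

def region_to_patch(original: str, fixed: str, path: str) -> str:
    """Return a unified diff that turns `original` into `fixed`."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)

# Hypothetical vulnerable region and the model's fixed version of it.
original = "int n = read_len(buf);\ncopy(dst, buf, n);\n"
fixed = (
    "int n = read_len(buf);\n"
    "if (n > DST_MAX)\n"
    "    return -1;\n"
    "copy(dst, buf, n);\n"
)

patch = region_to_patch(original, fixed, "src/example.c")
print(patch)
```

Because the model only ever returns code, any diff library can produce the patch; hunk headers and line counts never depend on the model getting them right.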

MoE sibling now available: anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b reaches 91.9% recall on the same n=100 eval (within 1.2 pt of this model) while running ~10× faster at inference, thanks to Qwen3-Coder-30B-A3B's sparse 3B-active MoE architecture. Same training data, same config style, trained in 1/5 the wall time on a single H100.

## What's New in v5

v5 uses a unified codegen-only dataset: all 36,166 training examples follow the same 3-turn format (system / user with code + fix description / assistant with fixed code). v4 mixed in 5-turn test-generation examples; v5 drops those to focus entirely on codegen quality.

| Metric | v5 | v4 | v1 |
|---|---|---|---|
| Recall | 93.1% | 93% | 91% |
| Precision | 94.4% | 95% | n/a |
| Exact match | 83/100 | 87/100 | n/a |
| Adapted recall | 90.0% | 86% | 71% |
| Identical recall | 93.7% | 94% | 94% |

Adapted-tier recall has steadily improved: 71% (v1) → 86% (v4) → 90% (v5). The codegen-only dataset gives the model a cleaner training signal for the core task.

## Model Details

| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-32B-Instruct |
| Method | QLoRA (4-bit NF4, double quantization, bf16 compute) |
| LoRA rank / alpha | 64 / 128 |
| LoRA dropout | 0.05 |
| LoRA targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training data | 36,166 train / 1,834 eval examples |
| Epochs | 2 (8,228 steps) |
| Effective batch size | 8 (1 × grad_accum 8) |
| Learning rate | 1e-4 (cosine schedule, 5% warmup) |
| Max sequence length | 4,096 tokens |
| Optimizer | AdamW fused, weight decay 0.01 |
| Hardware | 2× NVIDIA H100 NVL 94GB |
| Training time | 46.1 hours |
| Train loss (avg) | 0.0215 |
| Eval loss (final) | 0.00602 |
| PEFT version | 0.18.1 |
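As a rough sketch, the quantization and adapter settings listed above map onto the keyword arguments of `transformers.BitsAndBytesConfig` and `peft.LoraConfig`. The values come from the table; keeping them as plain dicts and the `task_type` value are assumptions made so the snippet runs without any dependencies. The actual teapot training script may construct these differently.

```python
# Keyword arguments mirroring the Model Details settings. In a real run they
# would be passed to transformers.BitsAndBytesConfig(**bnb_kwargs) and
# peft.LoraConfig(**lora_kwargs); plain dicts keep this sketch dependency-free.
bnb_kwargs = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",          # 4-bit NF4
    "bnb_4bit_use_double_quant": True,     # double quantization
    "bnb_4bit_compute_dtype": "bfloat16",  # bf16 compute
}

lora_kwargs = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the model card
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(lora_scaling)  # 2.0
```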

## Files

This repository contains:

- LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`): merge with the base model using PEFT
- GGUF Q8_0 (`cve-backport-codegen-v5-q8_0.gguf`, 33 GB): ready for llama.cpp / Ollama

## Reproduction via Teapot

This model was trained via the teapot training pipeline. Once the cve-backport dataset is prepared, the full reproduction is a four-step sequence after installing teapot:

```bash
git clone https://github.com/anicka-net/teapot
cd teapot
pip install -e .

# 1. Compose training data from the cve-backport module
teapot compose configs/cve-backport.config \
    --output train-cve-backport.jsonl

# 2. Generate the QLoRA-HF launch script
teapot train configs/cve-backport.config \
    --backend qlora-hf \
    --train-data train-cve-backport.jsonl \
    --eval-data eval-cve-backport.jsonl \
    --output train-cve-backport.sh

# 3. Train (2× H100 NVL 94GB; ~46 hours)
bash train-cve-backport.sh

# 4. Final adapter is at output-teapot-cve-backport/final/
```

The teapot config (`configs/cve-backport.config`) pins all the hyperparameters: `method: qlora`, `epochs: 2`, `lr: 1e-4`, `batch_size: 1`, `gradient_accumulation: 8`, `lora_r: 64`, `lora_alpha: 128`, `max_length: 4096`, `warmup_ratio: 0.05`, `hardware.gpus: 2`. See the config file in the teapot repo for the full declaration.

The `qlora-hf` backend invokes `python3 -m teapot.train_qlora_hf`, which is a thin wrapper over the Hugging Face Trainer with bitsandbytes 4-bit quantization and PEFT LoRA. Training data is composed from the `cve-backport-codegen-dataset` HF repo (the `domain/cve-backport` teapot module fetches it automatically).

## Evaluation

Evaluated on 100 held-out examples (zero CVE overlap with training) using the Q8_0 GGUF served via llama-server (temperature=0, ctx=8192).

### Overall

| Metric | Value |
|---|---|
| Avg recall | 93.1% |
| Avg precision | 94.4% |
| Exact match | 83/100 |
| Perfect (100% recall) | 90/100 |
| Failures (0% recall) | 3/100 |

### By Tier

| Tier | Count | Avg Recall | Perfect |
|---|---|---|---|
| Identical (upstream applies as-is) | 85 | 93.7% | 77/85 |
| Adapted (requires modification) | 15 | 90.0% | 13/15 |
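The per-tier figures are consistent with the overall numbers if the overall recall is read as the count-weighted mean of the tier averages. That reading is an assumption (the card does not spell out the aggregation), but the arithmetic below reproduces the reported 93.1% and 90/100 from the tier rows alone.

```python
# (tier name, example count, average recall in percent, perfect-recall count)
tiers = [
    ("identical", 85, 93.7, 77),
    ("adapted", 15, 90.0, 13),
]

total = sum(count for _, count, _, _ in tiers)
# Count-weighted mean of the per-tier average recalls.
overall_recall = sum(count * recall for _, count, recall, _ in tiers) / total
perfect = sum(p for _, _, _, p in tiers)

print(f"{overall_recall:.1f}% recall over {total} examples, {perfect} perfect")
```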

### Failure Analysis

The 3 zero-recall cases are all complex libvirt patches (multi-function adaptations across large files with significant structural differences between versions). These are known hard cases that likely require an agentic approach with source tree context.

## Training Data

The v5 dataset contains real SUSE/openSUSE maintenance patches paired with their upstream CVE fixes, converted to a per-hunk codegen format:

- 36,166 train + 1,834 eval examples (strict CVE-level split, zero overlap)
- All examples use a 3-turn ChatML format (system / user / assistant)
- Per-hunk extraction with 15-line context padding, nearby hunks merged
- Covers C, C++, Python, shell, Java, JavaScript, Go, and more
- Sources: openSUSE Build Service maintenance incidents
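A quick sanity check for the 3-turn structure might look like the sketch below. The record layout (a `messages` list of `role`/`content` dicts) and the helper name `is_three_turn` are assumptions for illustration; the dataset's real field names may differ.

```python
import json

EXPECTED_ROLES = ["system", "user", "assistant"]

def is_three_turn(record: dict) -> bool:
    """True if the record holds exactly system, user, assistant turns, in that order."""
    roles = [turn.get("role") for turn in record.get("messages", [])]
    return roles == EXPECTED_ROLES

# Hypothetical JSONL line in the assumed layout.
line = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a security patch backporting assistant."},
        {"role": "user", "content": "## File: path/to/file.c"},
        {"role": "assistant", "content": "fixed_code_here();"},
    ]
})

print(is_three_turn(json.loads(line)))
```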

### Input Format

````
## File: path/to/file.c
## Lines: 100-130

```c
/* 15 lines before the change */
vulnerable_code_here();
/* 15 lines after the change */
```

## Fix

Description of what the upstream patch changes in this region.
````

### Output Format

The model outputs the fixed version of the code region (just the code,
no diff headers or markup).
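
A user message in the input format can be assembled with a small helper. This is a sketch only: the function name and parameters are hypothetical, and the `## Fix` heading level is assumed from the `## File:` and `## Lines:` lines.

```python
FENCE = "`" * 3  # built at runtime so the example can itself sit inside a fenced block

def build_user_message(path, start, end, code, fix, lang="c"):
    """Format one source region plus its fix description as a user turn."""
    return (
        f"## File: {path}\n"
        f"## Lines: {start}-{end}\n\n"
        f"{FENCE}{lang}\n{code.rstrip()}\n{FENCE}\n\n"
        f"## Fix\n\n{fix.strip()}\n"
    )

message = build_user_message(
    "crypto/bn/bn.h", 280, 310,
    "/* source code region */",
    "Add bounds check for BN_num_bits to prevent buffer over-read.",
)
print(message)
```

The reply to such a message is the fixed region as plain code, so it can be compared line by line with the `code` argument that was sent.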

## Usage

### With llama.cpp / llama-server (GGUF)

```bash
llama-server \
    --model cve-backport-codegen-v5-q8_0.gguf \
    --port 8403 \
    --n-gpu-layers 99 \
    --ctx-size 8192
```
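
Once the server is running, it can be queried over its OpenAI-compatible chat endpoint. The sketch below uses only the standard library; the helper names are hypothetical, the system prompt is abbreviated to its first line, and `temperature` is set to 0 to match the evaluation setup. Nothing is sent until `send` is called.

```python
import json
import urllib.request

# Port taken from the llama-server command above.
SERVER_URL = "http://localhost:8403/v1/chat/completions"

def build_chat_request(system_prompt: str, user_message: str) -> urllib.request.Request:
    """Prepare a chat-completions POST request without sending it."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0,
    }
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

request = build_chat_request(
    "You are a security patch backporting assistant.",
    "## File: path/to/file.c",
)
```

With a live server, `send(request)` returns the fixed code region as a string.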

### With the CVE Backport Tool

The recommended way to use this model is via the cve-backport-tool, which handles patch parsing, source extraction, model inference, and diff generation:

```bash
python3 cve-backport.py \
    --cve CVE-2024-1234 \
    --package openssl-1.1.1d \
    --patch upstream.patch \
    --source-dir /path/to/source/ \
    --backend openai \
    --retry 3
```

### With transformers + PEFT (adapter)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct",
    torch_dtype="bfloat16",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "anicka/cve-backport-codegen-v5-qwen25-32b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
```

### Prompt Template (ChatML)

````
<|im_start|>system
You are a security patch backporting assistant.

Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.

Rules:
- Output ONLY the fixed code, nothing else
- Preserve all surrounding context exactly
- Apply only the described fix
<|im_end|>
<|im_start|>user
## File: crypto/bn/bn.h
## Lines: 280-310

```c
/* source code region */
```

## Fix

Add bounds check for BN_num_bits to prevent buffer over-read.
<|im_end|>
<|im_start|>assistant
````


## Limitations

- **Best at identical-tier patches** (upstream fix applies directly) — 93.7% recall
- **Good at adapted patches** (90% recall) but complex multi-function adaptations
  across structurally different versions remain challenging
- **Context window**: 4,096 token training limit means very large functions or
  multi-file patches may be truncated
- **No compilation feedback**: the model generates code in a single pass without
  verifying it compiles. Use `--retry` in the CLI tool for iterative correction.
- Always review generated patches before applying to production systems

## Related

- **MoE sibling**: [anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b](https://huggingface.co/anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b) — Qwen3-Coder-30B-A3B (3B active, MoE), 91.9% recall on the same n=100 eval, ~10× faster inference
- **openSUSE mirror**: [openSUSE/CVE-Backport-Qwen2.5-Coder-32B](https://huggingface.co/openSUSE/CVE-Backport-Qwen2.5-Coder-32B)
- **CLI tool**: [openSUSE/cve-backport-tool](https://github.com/openSUSE/cve-backport-tool)
- **Dataset**: [anicka/cve-backport-codegen-dataset](https://huggingface.co/datasets/anicka/cve-backport-codegen-dataset)
- **Training pipeline**: [teapot](https://github.com/anicka-net/teapot)
- **Previous version (v1)**: [anicka/cve-backport-codegen-qwen25-32b-v1](https://huggingface.co/anicka/cve-backport-codegen-qwen25-32b-v1)

## Citation

```bibtex
@misc{cve-backport-codegen-v5,
  title={CVE Backport Codegen v5: Fine-tuned Qwen2.5-Coder-32B for Security Patch Backporting},
  author={Anna Maresova},
  year={2026},
  url={https://huggingface.co/anicka/cve-backport-codegen-v5-qwen25-32b}
}
```
