Instructions for using openSUSE/CVE-Backport-Qwen2.5-Coder-32B with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- llama-cpp-python
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="openSUSE/CVE-Backport-Qwen2.5-Coder-32B",
    filename="cve-backport-codegen-v3-q8_0.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with llama.cpp:
Install from brew
```
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0

# Run inference directly in the terminal:
llama-cli -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Install from WinGet (Windows)
```
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0

# Run inference directly in the terminal:
llama-cli -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Use pre-built binary
```
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0

# Run inference directly in the terminal:
./llama-cli -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Build from source code
```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0

# Run inference directly in the terminal:
./build/bin/llama-cli -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Use Docker
```
docker model run hf.co/openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
- LM Studio
- Jan
- vLLM
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with vLLM:
Install from pip and serve the model
```
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "openSUSE/CVE-Backport-Qwen2.5-Coder-32B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openSUSE/CVE-Backport-Qwen2.5-Coder-32B",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```
- Ollama
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Ollama:
```
ollama run hf.co/openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
- Unsloth Studio
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for openSUSE/CVE-Backport-Qwen2.5-Coder-32B to start chatting
```
Install Unsloth Studio (Windows)
```
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for openSUSE/CVE-Backport-Qwen2.5-Coder-32B to start chatting
```
Use HuggingFace Spaces for Unsloth
```
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for openSUSE/CVE-Backport-Qwen2.5-Coder-32B to start chatting
```
- Pi
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Pi:
Start the llama.cpp server
```
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Configure the model in Pi
```
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to ~/.pi/agent/models.json:
```
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0" }
      ]
    }
  }
}
```
Run Pi
```
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Hermes Agent:
Start the llama.cpp server
```
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Configure Hermes
```
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Run Hermes
```
hermes
```
- Docker Model Runner
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Docker Model Runner:
```
docker model run hf.co/openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
- Lemonade
How to use openSUSE/CVE-Backport-Qwen2.5-Coder-32B with Lemonade:
Pull the model
```
# Download Lemonade from https://lemonade-server.ai/
lemonade pull openSUSE/CVE-Backport-Qwen2.5-Coder-32B:Q8_0
```
Run and chat with the model
```
lemonade run user.CVE-Backport-Qwen2.5-Coder-32B-Q8_0
```
List all available models
```
lemonade list
```
CVE Backport Code Generation — Qwen2.5-Coder-32B (v5)
Fine-tuned Qwen2.5-Coder-32B-Instruct for security patch backporting via per-hunk code generation. Maintained as part of the openSUSE security tooling effort, alongside the cve-backport-tool CLI.
Instead of generating unified diffs, this model takes a vulnerable code region and a fix description, and outputs the fixed version of the code. A programmatic diff then produces the final patch.
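That generate-then-diff step can be sketched with Python's difflib (the helper name and sample strings here are illustrative, not part of the released tool):

```python
import difflib

def hunk_to_patch(original: str, fixed: str, path: str) -> str:
    """Diff the model's fixed code region against the original region
    to produce a unified-diff patch fragment."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))

# Hypothetical example: the model inserts a bounds check.
original = "int n = BN_num_bits(b);\nuse(n);\n"
fixed = "int n = BN_num_bits(b);\nif (n > MAX) return -1;\nuse(n);\n"
patch = hunk_to_patch(original, fixed, "crypto/bn/bn.h")
```

Keeping the diff programmatic means the model never has to emit well-formed hunk headers or line offsets itself.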
MoE variant available: An MoE-based alternative built on Qwen3-Coder-30B-A3B (3B active parameters) is hosted at anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b. It scores 91.9% recall on the same 100-example eval — 1.2 pt below this dense model — while running roughly 10× faster at inference due to sparse MoE activation. Recommended for bulk CVE backport workflows where throughput matters.
Quick Start
```
git clone https://github.com/openSUSE/cve-backport-tool
cd cve-backport-tool
./setup.sh   # downloads GGUF, registers with ollama
python3 cve-backport.py \
  --cve CVE-2024-1234 \
  --package curl \
  --patch upstream-fix.patch \
  --obs-fetch --obs-project openSUSE:Leap:15.6:Update \
  --retry 3
```
GGUF Downloads
| File | Quant | Size | Notes |
|---|---|---|---|
| cve-backport-codegen-v5-q8_0.gguf | Q8_0 | 33 GB | Recommended (v5, 93.1% recall, 94.4% precision, codegen-only) |
| cve-backport-codegen-v4-q8_0.gguf | Q8_0 | 33 GB | v4, 93% recall, 95% precision (includes test-generation training) |
| cve-backport-codegen-v3-q8_0.gguf | Q8_0 | 33 GB | v3, 94% recall, 98% precision (legacy, smaller eval set) |
Evaluation (v5)
Per-hunk evaluation on 100 held-out examples the model never saw during training:
| Metric | v5 | v4 | v3 (n=20) |
|---|---|---|---|
| Average recall | 93.1% | 93% | 94% |
| Average precision | 94.4% | 95% | 98% |
| Exact match | 83/100 | 87/100 | 16/20 |
| Failures (<10% recall) | 3/100 | 4/100 | 0/20 |
By tier:
- Identical (upstream patch applies directly): 93.7% recall (77/85 perfect)
- Adapted (line numbers/context differ): 90.0% recall (13/15 perfect)
Adapted-tier recall has steadily improved: 71% (v1) → 86% (v4) → 90% (v5).
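The card does not spell out the exact recall/precision definition, but one plausible line-level reading (recall = fraction of expected changed lines the model produced; precision = fraction of model-changed lines that were expected) can be sketched as follows; this is an assumed metric, not the project's published scorer:

```python
from collections import Counter

def line_change_metrics(original: str, expected: str, generated: str):
    """Illustrative per-hunk metric: compare changed lines (lines added
    relative to the original region) as multisets."""
    orig = Counter(original.splitlines())
    exp_changes = Counter(expected.splitlines()) - orig
    gen_changes = Counter(generated.splitlines()) - orig
    hit = sum((exp_changes & gen_changes).values())
    recall = hit / max(sum(exp_changes.values()), 1)
    precision = hit / max(sum(gen_changes.values()), 1)
    return recall, precision
```

With this definition, an exact-match generation scores 1.0/1.0, and an edit in the wrong place lowers both numbers.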
What changed in v5
v5 uses a codegen-only dataset — all 36,166 training examples follow the same 3-turn format. v4 mixed in 772 five-turn test-generation examples which diluted codegen focus. Dropping those and training for 2 epochs (vs 1 in v4) improved adapted-tier recall.
Comparison with Frontier Models
Same eval, same 100 examples, optimized prompts with markdown stripping:
| Model | Recall | Precision | Exact | Failures |
|---|---|---|---|---|
| CVE Backport v5 (32B fine-tuned) | 93% | 94% | 83/100 | 3 |
| Gemini 3.1 Pro (frontier, zero-shot) | 27% | 24% | 10/100 | 50 |
| Gemini 2.0 Flash (frontier, zero-shot) | 13% | 17% | 4/100 | 81 |
Fine-tuning on 36K domain-specific examples outperforms frontier models by 3-7x on this task.
Prompt Format
ChatML format. Each prompt covers one hunk region with 15 lines of context padding.
Code Generation (3-turn)
System:
You are a security patch backporting assistant.
Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.
Rules:
- Output ONLY the fixed code, nothing else — no explanations, no markdown fences
- Preserve exact formatting, indentation, and style of the original
- Make ONLY the changes described in the fix — do not modify anything else
- Do not add comments about what you changed
User:
## File: crypto/bn/bn.h
## Lines: 280-310
```c
/* vulnerable source code region with 15 lines of context */
```
## Fix
Add bounds check for BN_num_bits to prevent buffer over-read (CVE-2024-XXXX).
Assistant: The fixed version of the code region (just the code, no markup).
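The user turn above can be assembled mechanically from a hunk location; a minimal sketch (the helper name and slicing details are illustrative, not from the released tooling):

```python
def build_user_prompt(path, lines, start, end, fix_desc, context=15):
    """Build the per-hunk user turn: the hunk region plus `context`
    lines of padding on each side, using 1-based line numbers."""
    lo = max(start - context, 1)
    hi = min(end + context, len(lines))
    region = "\n".join(lines[lo - 1:hi])
    fence = "`" * 3  # markdown code fence
    return (
        f"## File: {path}\n"
        f"## Lines: {lo}-{hi}\n"
        f"{fence}c\n{region}\n{fence}\n"
        f"## Fix\n{fix_desc}\n"
    )

prompt = build_user_prompt(
    "crypto/bn/bn.h",
    [f"line{i}" for i in range(1, 101)],  # stand-in file contents
    40, 42,
    "Add bounds check for BN_num_bits (CVE-2024-XXXX).",
)
```

The reported line range covers the padded region, so the programmatic diff afterwards can be anchored back to the right place in the file.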
Training
| Parameter | Value |
|---|---|
| Base model | Qwen2.5-Coder-32B-Instruct |
| Method | QLoRA (4-bit NF4, bf16 compute, double quantization) |
| LoRA rank / alpha | 64 / 128 |
| Epochs | 2 (8,228 steps) |
| Training data | 36,166 train / 1,834 eval (codegen-only, all 3-turn) |
| Effective batch size | 8 |
| Learning rate | 1e-4 (cosine, 5% warmup) |
| Max sequence length | 4,096 tokens |
| Hardware | 2× NVIDIA H100 NVL 94GB |
| Training time | 46.1 hours |
| Final eval loss | 0.00602 |
Reproduction via Teapot
This model is reproducible via the teapot training pipeline. End to end, it is a four-step sequence:
```
git clone https://github.com/anicka-net/teapot
cd teapot
pip install -e .

# 1. Compose training data from the cve-backport module
teapot compose configs/cve-backport.config \
  --output train-cve-backport.jsonl

# 2. Generate the QLoRA-HF launch script
teapot train configs/cve-backport.config \
  --backend qlora-hf \
  --train-data train-cve-backport.jsonl \
  --eval-data eval-cve-backport.jsonl \
  --output train-cve-backport.sh

# 3. Train (2× H100 NVL 94GB; ~46 hours)
bash train-cve-backport.sh

# 4. Final adapter is at output-teapot-cve-backport/final/
```
The teapot config (configs/cve-backport.config) pins all the hyperparameters listed in the Training table above. The qlora-hf backend invokes teapot.train_qlora_hf, a thin wrapper over the HuggingFace Trainer with bitsandbytes 4-bit quantization and PEFT LoRA.
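The hyperparameters from the Training table map onto the standard bitsandbytes/PEFT objects roughly as below. This is a sketch of what such a backend configures, not teapot's actual code; its config keys may differ:

```python
# Config fragment only: mirrors the Training-table hyperparameters
# using the standard Hugging Face / PEFT APIs.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA base quantization
    bnb_4bit_quant_type="nf4",              # 4-bit NF4
    bnb_4bit_use_double_quant=True,         # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute
)

lora_config = LoraConfig(
    r=64,                                   # LoRA rank
    lora_alpha=128,                         # LoRA alpha
    task_type="CAUSAL_LM",
)
```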
LoRA Adapter and MoE Variant
The LoRA adapter for this model is hosted at anicka/cve-backport-codegen-v5-qwen25-32b for use with PEFT/transformers.
An MoE variant trained on the same dataset is available at anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b — built on Qwen3-Coder-30B-A3B (3B active params), 91.9% recall on the same n=100 eval, ~10× faster inference.
Known Issues
- The 3 failure cases (0% recall) are all complex libvirt patches involving multi-function adaptations across large files with significant structural differences. These likely require an agentic approach with source tree context.
- Very long hunks (>2000 tokens) may be truncated due to the 4096-token training context.
- Always review generated patches before applying to production systems.
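The truncation risk above can be pre-flagged before generation with a cheap length check. The ~4 characters-per-token ratio below is a rough heuristic, not the model's real tokenizer; for exact counts, tokenize with the Qwen2.5 tokenizer instead:

```python
def likely_truncated(hunk_text: str, budget: int = 2000) -> bool:
    """Flag hunks whose estimated token count exceeds the budget.
    Crude ~4 chars/token estimate; use the real tokenizer for accuracy."""
    est_tokens = len(hunk_text) / 4
    return est_tokens > budget

assert likely_truncated("x" * 10000)      # ~2500 estimated tokens
assert not likely_truncated("x" * 4000)   # ~1000 estimated tokens
```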
License
Apache-2.0 (inherited from Qwen2.5-Coder-32B-Instruct).
Model tree for openSUSE/CVE-Backport-Qwen2.5-Coder-32B: base model Qwen/Qwen2.5-32B.