openSUSE CVE Backport LoRA v1
A LoRA adapter for nvidia/Nemotron-Mini-4B-Instruct fine-tuned on openSUSE security patch backporting data.
⚠️ Status: Experimental (V1)
This is an experimental V1 model. Testing showed that while the model learned patch format and structure, it tends to hallucinate content rather than produce correct backports.
Key Findings from V1 Testing
- ✅ Model learned unified diff syntax and structure
- ✅ Model can complete partial patches in correct format
- ❌ Model hallucinates content instead of reasoning about the actual fix
- ❌ Without CVE context, the model doesn't know what to fix
- ❌ Without code context, the model doesn't know where to apply the fix
V2 Development
We are developing a V2 approach that provides the model with the same information humans have when backporting:
- CVE Description - What the vulnerability is
- Upstream Fix - The original patch from upstream
- Target Code Context - The actual functions being modified
See the opensuse-cve-backport-dataset for more information.
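A V2 training example would bundle those three context sources into a single prompt. The sketch below is illustrative only: the field names, section headers, and template are assumptions, not the actual schema of the opensuse-cve-backport-dataset.

```python
# Sketch of assembling a V2 prompt from the three context sources.
# All names and the template layout are illustrative assumptions,
# not the dataset's real schema.

def build_v2_prompt(cve_description: str, upstream_fix: str, target_context: str) -> str:
    """Combine CVE description, upstream patch, and target code into one prompt."""
    return (
        "## CVE Description\n"
        f"{cve_description}\n\n"
        "## Upstream Fix\n"
        f"{upstream_fix}\n\n"
        "## Target Code Context\n"
        f"{target_context}\n\n"
        "## Backported Patch\n"
    )

prompt = build_v2_prompt(
    "Buffer overflow in the parser allows remote DoS.",  # illustrative CVE text
    "--- a/parse.c\n+++ b/parse.c\n@@ -10,1 +10,1 @@\n"
    "-    memcpy(buf, src, n);\n+    memcpy(buf, src, n < sizeof buf ? n : sizeof buf);",
    "void parse(const char *src, size_t n) { char buf[64]; /* target function */ }",
)
```

The key design point is that the model generates only the final section, conditioned on everything a human backporter would read first.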
Model Details
- Base Model: nvidia/Nemotron-Mini-4B-Instruct (4.3B parameters)
- LoRA Rank: 64, Alpha: 128
- Trainable Parameters: ~92M (2.1%)
- Training Data: ~5,600 security patch examples
- Training Time: ~38 hours on NVIDIA H100
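The hyperparameters above correspond roughly to the following PEFT configuration. This is a sketch for orientation: the rank and alpha come from the table, but the target modules and dropout are assumptions, since the exact training configuration is not reproduced here.

```python
from peft import LoraConfig

# Rank and alpha match the model details above; the remaining
# values (dropout, target_modules) are assumptions for illustration.
lora_config = LoraConfig(
    r=64,                # LoRA rank
    lora_alpha=128,      # LoRA scaling factor
    lora_dropout=0.05,   # assumed value
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)
```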
Training Data
Trained on maintenance incident patches from openSUSE Build Service (OBS), covering:
- openssl, openssh, curl, glibc
- libxml2, ImageMagick, GraphicsMagick
- bind, nginx, apache2, systemd
- And 80+ other userspace C packages
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"nvidia/Nemotron-Mini-4B-Instruct",
torch_dtype=torch.bfloat16,
device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-Mini-4B-Instruct")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "anicka/opensuse-backport-lora-v1")
# Generate
prompt = "..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Limitations
- V1 model learns format but not reasoning
- Best used as a starting point, not for production
- Requires human review of all generated patches
- Does not replace security expertise
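Since V1 reliably produces well-formed diffs but not necessarily correct ones, a cheap structural check on the output is a useful first-pass filter before the mandatory human review. A minimal sketch using only the standard library:

```python
import re

def looks_like_unified_diff(text: str) -> bool:
    """Cheap structural check: does the output resemble a unified diff?
    This validates format only, NOT correctness -- every generated
    patch still requires human review."""
    has_old_file = re.search(r"^--- ", text, re.MULTILINE)
    has_new_file = re.search(r"^\+\+\+ ", text, re.MULTILINE)
    has_hunk = re.search(r"^@@ -\d+(,\d+)? \+\d+(,\d+)? @@", text, re.MULTILINE)
    return bool(has_old_file and has_new_file and has_hunk)

sample = "--- a/x.c\n+++ b/x.c\n@@ -1,3 +1,3 @@\n-old\n+new\n"
assert looks_like_unified_diff(sample)
assert not looks_like_unified_diff("not a patch")
```

Outputs that fail even this check can be discarded automatically; outputs that pass it still go to a human reviewer.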
License
Apache 2.0
Citation
If you use this model, please cite:
@misc{opensuse-backport-lora-v1,
author = {anicka},
title = {openSUSE CVE Backport LoRA v1},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/anicka/opensuse-backport-lora-v1}
}