openSUSE CVE Backport LoRA v1

A LoRA adapter for nvidia/Nemotron-Mini-4B-Instruct fine-tuned on openSUSE security patch backporting data.

⚠️ Status: Experimental (V1)

This is an experimental v1 model. Testing showed that while it learned the patch format and structure, it tends to hallucinate content rather than produce correct backports.

Key Findings from V1 Testing

  • ✅ Model learned unified diff syntax and structure
  • ✅ Model can complete partial patches in correct format
  • ❌ Model hallucinates content instead of reasoning about the actual fix
  • ❌ Without CVE context, model doesn't understand what to fix
  • ❌ Without code context, model doesn't understand where to fix
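The "format and structure" the model did learn is the standard unified diff format. For reference, Python's standard difflib can produce the same format; the C snippet below is a hypothetical one-line bounded-copy fix, not a real openSUSE patch:

```python
import difflib

# Hypothetical vulnerable function and its fixed version.
before = [
    "void set_name(char *dst, const char *src) {",
    "    strcpy(dst, src);",
    "}",
]
after = [
    "void set_name(char *dst, const char *src) {",
    "    strncpy(dst, src, NAME_MAX);",
    "    dst[NAME_MAX - 1] = '\\0';",
    "}",
]

# unified_diff yields ---/+++ headers, @@ hunk markers, and +/- lines,
# exactly the surface form the V1 model reproduces.
patch = "\n".join(
    difflib.unified_diff(before, after, fromfile="a/name.c", tofile="b/name.c", lineterm="")
)
print(patch)
```

V1 can complete a patch that looks like this; the failure mode is that the +/- content it fills in is invented rather than derived from the actual CVE fix.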

V2 Development

We are developing a V2 approach that provides the model with the same information humans have when backporting:

  1. CVE Description - What the vulnerability is
  2. Upstream Fix - The original patch from upstream
  3. Target Code Context - The actual functions being modified
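Concretely, a V2 prompt might assemble these three pieces into a single input. The section names and layout below are illustrative assumptions, not the released training format:

```python
# Hypothetical V2 prompt builder: field names and ordering are assumptions
# for illustration, not the actual dataset schema.
def build_backport_prompt(cve_description: str, upstream_fix: str, code_context: str) -> str:
    return (
        "### CVE Description\n" + cve_description.strip() + "\n\n"
        "### Upstream Fix\n" + upstream_fix.strip() + "\n\n"
        "### Target Code Context\n" + code_context.strip() + "\n\n"
        "### Backported Patch\n"  # model completes from here
    )

prompt = build_backport_prompt(
    "Heap buffer overflow when parsing overlong headers.",
    "--- a/parse.c\n+++ b/parse.c\n@@ ... @@",
    "static int parse_header(...) { ... }",
)
```

The point is that the model sees what to fix (the CVE text), how upstream fixed it (the patch), and where the target code differs, mirroring the information a human backporter works from.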

See the opensuse-cve-backport-dataset for more information.

Model Details

  • Base Model: nvidia/Nemotron-Mini-4B-Instruct (4.3B parameters)
  • LoRA Rank: 64, Alpha: 128
  • Trainable Parameters: ~92M (2.1%)
  • Training Data: ~5,600 security patch examples
  • Training Time: ~38 hours on NVIDIA H100
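The trainable-parameter count follows directly from the LoRA rank: each adapted weight matrix of shape (d_out, d_in) gains two low-rank factors totaling r * (d_in + d_out) parameters. A rough estimate, using assumed Nemotron-like layer shapes (the real ~92M figure is somewhat lower, likely because the actual projection shapes differ, e.g. grouped-query attention shrinks the k/v matrices):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # A maps d_in -> r, B maps r -> d_out
    return r * (d_in + d_out)

r = 64
hidden, ffn, layers = 3072, 9216, 32  # assumed shapes, not the exact config

attn = 4 * lora_params(hidden, hidden, r)  # q, k, v, o projections
mlp = lora_params(hidden, ffn, r) + lora_params(ffn, hidden, r)  # up, down
total = layers * (attn + mlp)
print(f"~{total / 1e6:.0f}M trainable parameters")  # prints ~101M
```

At rank 64 this stays around 2% of the 4.3B base parameters, which is why the adapter trains and ships so cheaply relative to a full fine-tune.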

Training Data

Trained on maintenance incident patches from openSUSE Build Service (OBS), covering:

  • openssl, openssh, curl, glibc
  • libxml2, ImageMagick, GraphicsMagick
  • bind, nginx, apache2, systemd
  • And 80+ other userspace C packages

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "nvidia/Nemotron-Mini-4B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-Mini-4B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "anicka/opensuse-backport-lora-v1")

# Generate
prompt = "..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Limitations

  • V1 model learns format but not reasoning
  • Best used as a starting point, not for production
  • Requires human review of all generated patches
  • Does not replace security expertise

License

Apache 2.0

Citation

If you use this model, please cite:

@misc{opensuse-backport-lora-v1,
  author = {anicka},
  title = {openSUSE CVE Backport LoRA v1},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/anicka/opensuse-backport-lora-v1}
}