# SecuCoder
SecuCoder is a fine-tuned version of Llama 3.1 8B Instruct trained to generate secure Python code and remediate security vulnerabilities. It is part of a research pipeline that combines supervised fine-tuning (SFT), structured prompting, and retrieval-augmented generation (RAG) to reduce the number of vulnerabilities in automatically generated Python code.
## Model Description
| Field | Details |
|---|---|
| Base model | meta-llama/Llama-3.1-8B-Instruct |
| Fine-tuning method | QLoRA (NF4 4-bit) + LoRA adapters |
| Training dataset | SecuCoder Messages Corpus |
| Training examples | 5,708 (train) + 317 (validation) |
| Epochs | 2 |
| Format | Merged safetensors (bfloat16) |
| Language | English |
| Domain | Python secure coding |
## Intended Use
SecuCoder is designed for:
- Vulnerability remediation — given a Python snippet with a security flaw, produce a corrected version.
- Secure code generation — generate Python code from a natural language specification, avoiding common weaknesses.
- Vulnerability classification — identify whether a Python snippet is secure or vulnerable.
The model has been evaluated against the untuned Llama 3.1 8B Instruct baseline using static analysis tools (Bandit and Semgrep) and shows a measurable improvement on the security metrics reported in the Evaluation section below.
This model is intended for research and educational purposes. It should not be used as the sole security review mechanism in production systems.
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ivitopow/secucoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a secure Python assistant. Help identify, explain, and fix security issues in Python code. Prefer safe, practical, and production-ready solutions."
    },
    {
        "role": "user",
        "content": "Fix the security vulnerability in this Python code.\n\n```python\nname = request.args.get('name')\nresp = make_response(\"Your name is \" + name)\n```\n\nCWE: CWE-079"
    }
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Usage with Ollama

A quantized GGUF version (Q4_K_M, ~4.6 GB) is available at ivitopow/secucoder-GGUF:

```
ollama create secucoder -f Modelfile
ollama run secucoder
```
## Training Details

### Method
The model was trained using QLoRA (Quantized Low-Rank Adaptation): the base model is loaded in 4-bit NF4 precision via BitsAndBytes, and low-rank adapters are attached to all projection layers. After training, the adapters are merged back into the base model and saved as standard safetensors.
### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
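The quantization and adapter settings above map directly onto `BitsAndBytesConfig` from transformers and `LoraConfig` from peft. A minimal configuration sketch follows; the actual training script is not published, so any detail beyond the tables (e.g. double quantization, trainer wiring) is an assumption and omitted here:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# NF4 4-bit storage with bfloat16 compute, per the training details above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters on all projection layers, per the LoRA Configuration table
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

After training, `merge_and_unload()` on the PEFT model folds the adapters back into the base weights, which matches the merged-safetensors format described above.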
### Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 2 |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with 3% warmup |
| Optimizer | paged_adamw_8bit |
| Gradient checkpointing | Enabled |
| Precision | bfloat16 compute, NF4 storage |
| Sequence length | 2048 tokens |
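For reference, the cosine schedule with 3% warmup from the table behaves as sketched below. This is a plain-Python rendering of the standard formula, not the exact trainer implementation:

```python
import math

def lr_at_step(step: int, total_steps: int,
               base_lr: float = 2e-4, warmup_frac: float = 0.03) -> float:
    """Cosine decay with linear warmup over the first 3% of steps."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # linear ramp from 0 up to base_lr
        return base_lr * step / warmup_steps
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

With 1,000 total steps the learning rate ramps linearly for the first 30 steps, peaks at 2e-4, then decays toward zero.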
### Training Data
Trained on the SecuCoder Messages Corpus, a dataset of 6,342 Python security examples in chat format covering:
- Vulnerability fix (`fix`) — 4,037 examples across 20+ CWE categories
- Security conversations (`conversation`) — 2,210 multi-turn examples
- Vulnerability classification (`classify`) — 52 examples
- Secure code generation (`prompt_to_code`) — 43 examples
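Records in the corpus use the standard chat `messages` format. A hypothetical shape for one `fix` example is sketched below; the exact schema of the SecuCoder Messages Corpus is not reproduced here, so every field name other than `role`/`content` is an assumption:

```python
# Illustrative only: "task" and "cwe" are assumed metadata fields,
# and the message contents are abbreviated placeholders.
record = {
    "task": "fix",
    "cwe": "CWE-079",
    "messages": [
        {"role": "system",
         "content": "You are a secure Python assistant."},
        {"role": "user",
         "content": "Fix the security vulnerability in this Python code. ..."},
        {"role": "assistant",
         "content": "Escape user input before reflecting it in the response. ..."},
    ],
}

roles = [m["role"] for m in record["messages"]]
```

The system/user/assistant turn structure is what `apply_chat_template` consumes in the Usage example above.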
## Evaluation
SecuCoder was evaluated as part of a 5-variant ablation study. Each variant adds one technique over the previous one:
| Variant | Technique | Overall Score |
|---|---|---|
| `llama31_8b` | Baseline (no fine-tuning) | 60.34 |
| `secucoder_v1` | + SFT (LoRA, FP16) | 60.43 |
| `secucoder_v1-q4` | + Q4_K_M quantization | 61.46 |
| `secucoder_v1-q4_prompting` | + Structured security prompt | 64.46 |
| `secucoder_v1-q4_prompting_rag` | + RAG (OWASP, CWE, Python docs) | 77.11 |
Overall score = mean `sample_score` over non-truncated samples (higher is better, max 100). The full SecuCoder system (`secucoder_v1-q4_prompting_rag`) achieves a 27.8% relative improvement over the untuned baseline.
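The quoted improvement is relative to the baseline score: (77.11 − 60.34) / 60.34 ≈ 27.8%. As a quick check:

```python
baseline, full_system = 60.34, 77.11

# relative gain of the full system over the untuned baseline, in percent
relative_gain = (full_system - baseline) / baseline * 100
print(round(relative_gain, 1))  # 27.8
```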
### Evaluation Methodology

Generated code was scanned with Bandit and Semgrep using weighted severity scores:

```
penalty = Σ(bandit_high × 2.0 + bandit_medium × 1.25 + bandit_low × 0.75)
        + Σ(semgrep_error × 5.0 + semgrep_warning × 3.0 + semgrep_info × 1.0)

sample_score = 100 / (1 + 8 × penalty_per_loc)
```
Samples with invalid syntax score 0. Truncated samples are excluded from the overall score.
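The scoring rule translates directly into code. A sketch assuming finding counts are grouped by severity and `penalty_per_loc` is the penalty normalized by lines of code (function and field names are ours, not from the published harness):

```python
def sample_score(bandit: dict, semgrep: dict, loc: int) -> float:
    """Weighted Bandit/Semgrep penalty, normalized per line of code."""
    penalty = (
        bandit.get("high", 0) * 2.0
        + bandit.get("medium", 0) * 1.25
        + bandit.get("low", 0) * 0.75
        + semgrep.get("error", 0) * 5.0
        + semgrep.get("warning", 0) * 3.0
        + semgrep.get("info", 0) * 1.0
    )
    penalty_per_loc = penalty / max(loc, 1)
    return 100.0 / (1 + 8 * penalty_per_loc)

# A sample with no findings scores 100; findings pull the score toward 0.
```

For example, a single medium-severity Bandit finding in a 100-line sample gives a penalty of 1.25, a per-LOC penalty of 0.0125, and a score of 100 / 1.1 ≈ 90.9.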
## Related Resources
| Resource | Link |
|---|---|
| Training dataset | ivitopow/secucoder |
| GGUF / Ollama version | ivitopow/secucoder-GGUF |
| Base model | meta-llama/Llama-3.1-8B-Instruct |
## Limitations
- The model only covers Python. It has not been evaluated on other languages.
- Static analysis tools (Bandit, Semgrep) do not detect all vulnerability types. Logic flaws or runtime-dependent issues may not be caught.
- The model was fine-tuned on a specific set of CWE categories. It may underperform on vulnerability types not well represented in the training data.
- As with all generative models, outputs should be reviewed by a developer before use in production.
## License

This model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

It is built on top of Llama 3.1, which is subject to Meta's Llama 3.1 Community License. Please review both licenses before use.
## Citation

```bibtex
@misc{secucoder2025,
  title  = {SecuCoder: Fine-tuning Llama 3.1 8B for Secure Python Code Generation},
  author = {SecuCoder Project},
  year   = {2025},
  url    = {https://huggingface.co/ivitopow/secucoder},
  note   = {CC-BY-NC-SA-4.0}
}
```