---
license: cc-by-nc-sa-4.0
language:
- en
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- code
- security
- python
- lora
- qlora
- fine-tuned
- cybersecurity
- secure-coding
- vulnerability
- cwe
- peft
- transformers
task_categories:
- text-generation
task_ids:
- language-modeling
---
# SecuCoder
SecuCoder is a fine-tuned version of [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) trained to generate secure Python code and remediate security vulnerabilities. It is part of a research pipeline that combines supervised fine-tuning (SFT), structured prompting, and retrieval-augmented generation (RAG) to reduce the number of vulnerabilities in automatically generated Python code.

---
## Model Description
| Field | Details |
|---|---|
| **Base model** | `meta-llama/Llama-3.1-8B-Instruct` |
| **Fine-tuning method** | QLoRA (NF4 4-bit) + LoRA adapters |
| **Training dataset** | [SecuCoder Messages Corpus](https://huggingface.co/datasets/ivitopow/secucoder) |
| **Training examples** | 5,708 (train) + 317 (validation) |
| **Epochs** | 2 |
| **Format** | Merged safetensors (bfloat16) |
| **Language** | English |
| **Domain** | Python secure coding |
---
## Intended Use
SecuCoder is designed for:
- **Vulnerability remediation** — given a Python snippet with a security flaw, produce a corrected version.
- **Secure code generation** — generate Python code from a natural language specification, avoiding common weaknesses.
- **Vulnerability classification** — identify whether a Python snippet is secure or vulnerable.

The model has been evaluated against the untuned Llama 3.1 8B Instruct baseline using static analysis tools (Bandit + Semgrep). Fine-tuning alone changes the overall score only marginally (60.43 versus 60.34); most of the gain appears once structured prompting and RAG are added (see Evaluation).
> This model is intended for research and educational purposes. It should not be used as the sole security review mechanism in production systems.
---
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ivitopow/secucoder"

# Load the merged bfloat16 weights; device_map="auto" places them on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a secure Python assistant. Help identify, explain, and fix security issues in Python code. Prefer safe, practical, and production-ready solutions."
    },
    {
        "role": "user",
        "content": "Fix the security vulnerability in this Python code.\n\n```python\nname = request.args.get('name')\nresp = make_response(\"Your name is \" + name)\n```\n\nCWE: CWE-079"
    }
]

# Render the conversation with the model's chat template and open the assistant turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Low-temperature sampling keeps the output close to deterministic.
output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
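
The user message in the example above follows a fixed pattern: an instruction, the vulnerable snippet in a Python code fence, and a `CWE:` line. A minimal sketch of a helper that reproduces this layout is shown below. The name `build_fix_messages` is hypothetical and not part of the model or dataset tooling, and whether the model requires exactly this layout is an assumption based only on the example.

```python
FENCE = "`" * 3  # built programmatically so the example nests cleanly in Markdown

SYSTEM_PROMPT = (
    "You are a secure Python assistant. Help identify, explain, and fix "
    "security issues in Python code. Prefer safe, practical, and "
    "production-ready solutions."
)


def build_fix_messages(code: str, cwe: str) -> list[dict[str, str]]:
    """Return chat messages asking the model to fix `code` for the given CWE."""
    user = (
        "Fix the security vulnerability in this Python code.\n\n"
        f"{FENCE}python\n{code.strip()}\n{FENCE}\n\n"
        f"CWE: {cwe}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


messages = build_fix_messages(
    "name = request.args.get('name')\n"
    "resp = make_response(\"Your name is \" + name)",
    "CWE-079",
)
print(messages[1]["content"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` exactly as in the example above.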
### Usage with Ollama
A quantized GGUF version (Q4_K_M, ~4.6 GB) is available at [`ivitopow/secucoder-GGUF`](https://huggingface.co/ivitopow/secucoder-GGUF):
```bash
ollama create secucoder -f Modelfile
ollama run secucoder
```
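
The `ollama create` command reads a `Modelfile`, which is not reproduced in this card. As a rough sketch, the Python snippet below renders one that points at a locally downloaded GGUF file. The filename `secucoder-Q4_K_M.gguf` is a placeholder (check the GGUF repository for the real one), and reusing the sampling parameters and system prompt from the Transformers example is an assumption, not the project's published configuration. Save the printed text as `Modelfile` next to the GGUF file before running the commands above.

```python
def render_modelfile(gguf_path: str, system_prompt: str) -> str:
    """Build Ollama Modelfile text that points at a local GGUF file."""
    return "\n".join([
        f"FROM {gguf_path}",
        "PARAMETER temperature 0.1",
        "PARAMETER top_p 0.9",
        f'SYSTEM """{system_prompt}"""',
        "",
    ])


# Placeholder filename: an assumption, not taken from the GGUF repository.
modelfile_text = render_modelfile(
    "./secucoder-Q4_K_M.gguf",
    "You are a secure Python assistant. Help identify, explain, and fix "
    "security issues in Python code. Prefer safe, practical, and "
    "production-ready solutions.",
)
print(modelfile_text)
```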
---
## Training Details
### Method
The model was trained using **QLoRA** (Quantized Low-Rank Adaptation): the base model is loaded in 4-bit NF4 precision via BitsAndBytes, and low-rank adapters are attached to all projection layers. After training, the adapters are merged back into the base model and saved as standard safetensors.
### LoRA Configuration
| Parameter | Value |
|---|---|
| Rank (`r`) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
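
To give a sense of scale, the sketch below estimates how many trainable parameters this configuration adds. It assumes the standard LoRA parameterization (an `r × in` matrix and an `out × r` matrix per targeted layer) and the layer dimensions from the base model's published configuration. The figure is not reported by the SecuCoder project itself.

```python
# Llama 3.1 8B dimensions (taken from the base model's published config).
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size (MLP width)
KV_DIM = 1024          # num_key_value_heads (8) * head_dim (128)
NUM_LAYERS = 32
RANK = 16
ALPHA = 32

# (in_features, out_features) of each targeted linear layer.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int = RANK) -> int:
    """Parameters in one adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"per layer: {per_layer:,}")   # 1,310,720
print(f"total:     {total:,}")       # 41,943,040
print(f"scaling:   {scaling}")       # 2.0
```

Under these assumptions the adapters add roughly 42 million trainable parameters, a small fraction of the 8B-parameter base model.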
### Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 2 |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with 3% warmup |
| Optimizer | `paged_adamw_8bit` |
| Gradient checkpointing | Enabled |
| Precision | bfloat16 compute, NF4 storage |
| Sequence length | 2048 tokens |
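
The learning-rate schedule can be illustrated with a small standalone function. This is a sketch under stated assumptions: linear warmup from zero over the first 3% of steps, then a half-cosine decay to zero, which is how cosine schedules with warmup are commonly implemented. The total step count is hypothetical, since the card does not state batch size or step count.

```python
import math

PEAK_LR = 2e-4
WARMUP_RATIO = 0.03


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; not taken from the training run
for step in (0, 15, 30, 515, 1000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

With these numbers the rate climbs to `2e-4` by step 30, is at half the peak midway through the decay, and reaches zero at the final step.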
### Training Data
Trained on the [SecuCoder Messages Corpus](https://huggingface.co/datasets/ivitopow/secucoder), a dataset of 6,342 Python security examples in chat format covering:
- **Vulnerability fix** (`fix`) — 4,037 examples across 20+ CWE categories
- **Security conversations** (`conversation`) — 2,210 multi-turn examples
- **Vulnerability classification** (`classify`) — 52 examples
- **Secure code generation** (`prompt_to_code`) — 43 examples
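
The task mix is heavily skewed toward fixes and conversations. The short sketch below recomputes the total and each task's share from the counts listed above; it uses no information beyond those four numbers.

```python
task_counts = {
    "fix": 4037,
    "conversation": 2210,
    "classify": 52,
    "prompt_to_code": 43,
}

total = sum(task_counts.values())
print(f"total examples: {total}")  # 6342, matching the corpus size stated above

# Share of each task type in the corpus.
shares = {task: count / total for task, count in task_counts.items()}
for task, share in shares.items():
    print(f"{task:<15} {task_counts[task]:>5}  {share:6.1%}")
```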
---
## Evaluation
SecuCoder was evaluated as part of a 5-variant ablation study. Each variant adds one technique over the previous one:
| Variant | Technique | Overall Score |
|---|---|---|
| `llama31_8b` | Baseline (no fine-tuning) | 60.34 |
| `secucoder_v1` | + SFT (LoRA, FP16) | 60.43 |
| `secucoder_v1-q4` | + Q4_K_M quantization | 61.46 |
| `secucoder_v1-q4_prompting` | + Structured security prompt | 64.46 |
| `secucoder_v1-q4_prompting_rag` | + RAG (OWASP, CWE, Python docs) | **77.11** |

Overall score = mean `sample_score` over non-truncated samples (higher is better, max 100). The full SecuCoder system (`secucoder_v1-q4_prompting_rag`) scores 16.77 points above the untuned baseline, a **27.8% relative improvement**.
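
The relative improvement and the contribution of each step can be recomputed directly from the table. The snippet below is a small worked check that uses only the scores listed above.

```python
scores = {
    "llama31_8b": 60.34,
    "secucoder_v1": 60.43,
    "secucoder_v1-q4": 61.46,
    "secucoder_v1-q4_prompting": 64.46,
    "secucoder_v1-q4_prompting_rag": 77.11,
}

baseline = scores["llama31_8b"]
names = list(scores)

# Gain of each variant over the previous one, in score points.
step_gains = {
    names[i]: scores[names[i]] - scores[names[i - 1]]
    for i in range(1, len(names))
}

final = scores["secucoder_v1-q4_prompting_rag"]
relative_gain = (final - baseline) / baseline * 100

for name, gain in step_gains.items():
    print(f"{name:<32} {gain:+.2f}")
print(f"relative gain over baseline: {relative_gain:.1f}%")
```

The per-step gains show that RAG accounts for most of the difference (+12.65 points), while fine-tuning alone adds +0.09.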
### Evaluation Methodology
Generated code was scanned with **Bandit** and **Semgrep** using weighted severity scores:
```
penalty = Σ(bandit_high × 2.0 + bandit_medium × 1.25 + bandit_low × 0.75)
+ Σ(semgrep_error × 5.0 + semgrep_warning × 3.0 + semgrep_info × 1.0)
sample_score = 100 / (1 + 8 × penalty_per_loc)
```
Samples with invalid syntax score 0. Truncated samples are excluded from the overall score.
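
The scoring rule can be sketched in a few lines of Python. The formula does not define `penalty_per_loc`; the sketch assumes it is the total penalty divided by the number of lines of code in the sample. The function name and input format are illustrative, not the project's evaluation code.

```python
BANDIT_WEIGHTS = {"high": 2.0, "medium": 1.25, "low": 0.75}
SEMGREP_WEIGHTS = {"error": 5.0, "warning": 3.0, "info": 1.0}


def sample_score(bandit: dict, semgrep: dict, loc: int, valid_syntax: bool = True) -> float:
    """Score one generated sample from its finding counts per severity."""
    if not valid_syntax:
        return 0.0  # samples with invalid syntax score 0
    penalty = sum(BANDIT_WEIGHTS[sev] * n for sev, n in bandit.items())
    penalty += sum(SEMGREP_WEIGHTS[sev] * n for sev, n in semgrep.items())
    # Assumption: penalty_per_loc = penalty / lines of code.
    penalty_per_loc = penalty / max(1, loc)
    return 100.0 / (1.0 + 8.0 * penalty_per_loc)


# One medium Bandit finding in a 20-line sample:
# penalty = 1.25, penalty_per_loc = 0.0625, score = 100 / 1.5
print(round(sample_score({"medium": 1}, {}, loc=20), 2))  # 66.67
print(sample_score({}, {}, loc=20))                       # 100.0
```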
---
## Related Resources
| Resource | Link |
|---|---|
| Training dataset | [ivitopow/secucoder](https://huggingface.co/datasets/ivitopow/secucoder) |
| GGUF / Ollama version | [ivitopow/secucoder-GGUF](https://huggingface.co/ivitopow/secucoder-GGUF) |
| Base model | [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
---
## Limitations
- The model only covers **Python**. It has not been evaluated on other languages.
- Static analysis tools (Bandit, Semgrep) do not detect all vulnerability types. Logic flaws or runtime-dependent issues may not be caught.
- The model was fine-tuned on a specific set of CWE categories. It may underperform on vulnerability types not well represented in the training data.
- As with all generative models, outputs should be reviewed by a developer before use in production.
---
## License
This model is released under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.

It is built on top of Llama 3.1, which is subject to the [Llama 3.1 Community License](https://www.llama.com/llama3_1/license/). Please review both licenses before use.

---
## Citation
```bibtex
@misc{secucoder2025,
title = {SecuCoder: Fine-tuning Llama 3.1 8B for Secure Python Code Generation},
author = {SecuCoder Project},
year = {2025},
url = {https://huggingface.co/ivitopow/secucoder},
note = {CC-BY-NC-SA-4.0}
}
```