---
license: cc-by-nc-sa-4.0
language:
- en
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- code
- security
- python
- lora
- qlora
- fine-tuned
- cybersecurity
- secure-coding
- vulnerability
- cwe
- peft
- transformers
task_categories:
- text-generation
task_ids:
- language-modeling
---

# SecuCoder

SecuCoder is a fine-tuned version of [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) trained to generate secure Python code and remediate security vulnerabilities. It is part of a research pipeline that combines supervised fine-tuning (SFT), structured prompting, and retrieval-augmented generation (RAG) to reduce the number of vulnerabilities in automatically generated Python code.

---

## Model Description

| Field | Details |
|---|---|
| **Base model** | `meta-llama/Llama-3.1-8B-Instruct` |
| **Fine-tuning method** | QLoRA (NF4 4-bit) + LoRA adapters |
| **Training dataset** | [SecuCoder Messages Corpus](https://huggingface.co/datasets/ivitopow/secucoder) |
| **Training examples** | 5,708 (train) + 317 (validation) |
| **Epochs** | 2 |
| **Format** | Merged safetensors (bfloat16) |
| **Language** | English |
| **Domain** | Python secure coding |

---

## Intended Use

SecuCoder is designed for:

- **Vulnerability remediation** — given a Python snippet with a security flaw, produce a corrected version.
- **Secure code generation** — generate Python code from a natural-language specification while avoiding common weaknesses.
- **Vulnerability classification** — identify whether a Python snippet is secure or vulnerable.

The model has been evaluated against the untuned Llama 3.1 8B Instruct baseline using static analysis tools (Bandit and Semgrep) and shows a measurable improvement in security metrics; see the Evaluation section below.

> This model is intended for research and educational purposes. It should not be used as the sole security review mechanism in production systems.

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ivitopow/secucoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a secure Python assistant. Help identify, explain, and fix security issues in Python code. Prefer safe, practical, and production-ready solutions."
    },
    {
        "role": "user",
        "content": "Fix the security vulnerability in this Python code.\n\n```python\nname = request.args.get('name')\nresp = make_response(\"Your name is \" + name)\n```\n\nCWE: CWE-079"
    }
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

### Usage with Ollama

A quantized GGUF version (Q4_K_M, ~4.6 GB) is available at [`ivitopow/secucoder-GGUF`](https://huggingface.co/ivitopow/secucoder-GGUF):

```bash
ollama create secucoder -f Modelfile
ollama run secucoder
```
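
The `Modelfile` referenced above is not shown in this card; a minimal sketch might look like the following (the GGUF filename here is an assumption, so substitute the actual file you downloaded from the GGUF repo):

```
FROM ./secucoder.Q4_K_M.gguf
PARAMETER temperature 0.1
```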

---

## Training Details

### Method

The model was trained using **QLoRA** (Quantized Low-Rank Adaptation): the base model is loaded in 4-bit NF4 precision via bitsandbytes, low-rank adapters are attached to all attention and MLP projection layers, and only the adapter weights are updated during training. Afterwards, the adapters are merged back into the base model and saved as standard safetensors.
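
The load, attach, train, and merge flow described above can be sketched roughly as follows. This is an illustrative reconstruction using the Hugging Face `transformers` and `peft` APIs, not the project's actual training script; output paths and abbreviated adapter arguments are assumptions.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 1. Load the base model with NF4 4-bit weights (QLoRA storage precision)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# 2. Attach trainable low-rank adapters; only these weights are updated
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32))

# ... supervised fine-tuning runs here (e.g. with an SFT trainer) ...

# 3. Merge the adapters back into the base weights and save as safetensors
merged = model.merge_and_unload()
merged.save_pretrained("secucoder-merged", safe_serialization=True)
```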

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (`r`) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
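
In `peft` terms, the table above corresponds roughly to the following `LoraConfig` (a sketch reconstructed from the table; any argument not listed there is an assumption):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    task_type="CAUSAL_LM",   # causal language modeling (assumed)
    r=16,                    # adapter rank
    lora_alpha=32,           # scaling factor
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
)
```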

### Training Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 2 |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with 3% warmup |
| Optimizer | `paged_adamw_8bit` |
| Gradient checkpointing | Enabled |
| Precision | bfloat16 compute, NF4 storage |
| Sequence length | 2048 tokens |

### Training Data

Trained on the [SecuCoder Messages Corpus](https://huggingface.co/datasets/ivitopow/secucoder), a dataset of 6,342 Python security examples in chat format covering:

- **Vulnerability fix** (`fix`) — 4,037 examples across 20+ CWE categories
- **Security conversations** (`conversation`) — 2,210 multi-turn examples
- **Vulnerability classification** (`classify`) — 52 examples
- **Secure code generation** (`prompt_to_code`) — 43 examples

---

## Evaluation

SecuCoder was evaluated as part of a five-variant ablation study, in which each variant adds one technique on top of the previous one:

| Variant | Technique | Overall Score |
|---|---|---|
| `llama31_8b` | Baseline (no fine-tuning) | 60.34 |
| `secucoder_v1` | + SFT (LoRA, FP16) | 60.43 |
| `secucoder_v1-q4` | + Q4_K_M quantization | 61.46 |
| `secucoder_v1-q4_prompting` | + Structured security prompt | 64.46 |
| `secucoder_v1-q4_prompting_rag` | + RAG (OWASP, CWE, Python docs) | **77.11** |

The overall score is the mean `sample_score` over non-truncated samples (higher is better; maximum 100). The full SecuCoder system (`secucoder_v1-q4_prompting_rag`) achieves a **27.8% relative improvement** over the untuned baseline.

### Evaluation Methodology

Generated code was scanned with **Bandit** and **Semgrep**, and the findings were combined into a weighted severity penalty, normalized by the sample's lines of code:

```
penalty = 2.0 × bandit_high + 1.25 × bandit_medium + 0.75 × bandit_low
        + 5.0 × semgrep_error + 3.0 × semgrep_warning + 1.0 × semgrep_info

penalty_per_loc = penalty / lines_of_code
sample_score = 100 / (1 + 8 × penalty_per_loc)
```

Samples with invalid syntax score 0. Truncated samples are excluded from the overall score.
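
The scoring rule can be expressed as a small helper. This is an illustrative re-implementation of the formulas in this section, not the project's actual evaluation code; the function name and keyword arguments are mine:

```python
def sample_score(bandit_high=0, bandit_medium=0, bandit_low=0,
                 semgrep_error=0, semgrep_warning=0, semgrep_info=0,
                 lines_of_code=1):
    """Weighted-severity score in (0, 100]; 100 means no findings at all."""
    penalty = (2.0 * bandit_high + 1.25 * bandit_medium + 0.75 * bandit_low
               + 5.0 * semgrep_error + 3.0 * semgrep_warning + 1.0 * semgrep_info)
    penalty_per_loc = penalty / lines_of_code  # normalize by sample length
    return 100.0 / (1.0 + 8.0 * penalty_per_loc)

print(sample_score(lines_of_code=50))                  # clean sample -> 100.0
print(sample_score(bandit_high=1, lines_of_code=100))  # one high finding -> ~86.2
```

Note that Semgrep errors (weight 5.0) dominate the penalty, so a single error-level finding in a short sample drags the score down sharply.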

---

## Related Resources

| Resource | Link |
|---|---|
| Training dataset | [ivitopow/secucoder](https://huggingface.co/datasets/ivitopow/secucoder) |
| GGUF / Ollama version | [ivitopow/secucoder-GGUF](https://huggingface.co/ivitopow/secucoder-GGUF) |
| Base model | [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |

---

## Limitations

- The model covers **Python only**; it has not been evaluated on other languages.
- Static analysis tools (Bandit, Semgrep) do not detect every vulnerability class; logic flaws and runtime-dependent issues may go unnoticed.
- The model was fine-tuned on a specific set of CWE categories and may underperform on vulnerability types that are poorly represented in the training data.
- As with all generative models, outputs should be reviewed by a developer before production use.

---

## License

This model is released under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.

It is built on top of Llama 3.1, which is subject to [Meta's Llama 3.1 Community License](https://llama.meta.com/llama3/license/). Please review both licenses before use.

---

## Citation

```bibtex
@misc{secucoder2025,
  title  = {SecuCoder: Fine-tuning Llama 3.1 8B for Secure Python Code Generation},
  author = {SecuCoder Project},
  year   = {2025},
  url    = {https://huggingface.co/ivitopow/secucoder},
  note   = {CC-BY-NC-SA-4.0}
}
```