scthornton/deepseek-coder-6.7b-securecode

@@ -1,207 +1,207 @@
 ---
-license: other
 base_model: deepseek-ai/deepseek-coder-6.7b-instruct
-tags:
-  - security
-  - cybersecurity
-  - secure-coding
-  - ai-security
-  - owasp
-  - code-generation
-  - qlora
-  - lora
-  - fine-tuned
-  - securecode
-datasets:
-  - scthornton/securecode
 library_name: peft
 pipeline_tag: text-generation
-language:
-  - code
-  - en
 ---
-# DeepSeek Coder 6.7B SecureCode
-<div align="center">
-![Parameters](https://img.shields.io/badge/params-6.7B-blue.svg)
-![Dataset](https://img.shields.io/badge/dataset-2,185_examples-green.svg)
-![OWASP](https://img.shields.io/badge/OWASP-Top_10_2021_+_LLM_Top_10_2025-orange.svg)
-![Method](https://img.shields.io/badge/method-QLoRA_4--bit-purple.svg)
-**Security-specialized code model fine-tuned on the [SecureCode](https://huggingface.co/datasets/scthornton/securecode) dataset**
-[Dataset](https://huggingface.co/datasets/scthornton/securecode) | [Paper (arXiv:2512.18542)](https://arxiv.org/abs/2512.18542) | [Model Collection](https://huggingface.co/collections/scthornton/securecode) | [perfecXion.ai](https://perfecxion.ai)
-</div>
----
-## What This Model Does
-This model generates **secure code** when developers ask about building features. Instead of producing vulnerable implementations (like 45% of AI-generated code does), it:
-- Identifies the security risks in common coding patterns
-- Provides vulnerable *and* secure implementations side by side
-- Explains how attackers would exploit the vulnerability
-- Includes defense-in-depth guidance: logging, monitoring, SIEM integration, infrastructure hardening
-The model was fine-tuned on **2,185 security training examples** covering both traditional web security (OWASP Top 10 2021) and AI/ML security (OWASP LLM Top 10 2025).
-## Model Details
-| | |
-|---|---|
-| **Base Model** | [DeepSeek Coder 6.7B Instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) |
-| **Parameters** | 6.7B |
-| **Architecture** | DeepSeek |
-| **Tier** | Tier 2: Mid-size Code Specialist |
-| **Method** | QLoRA (4-bit NormalFloat quantization) |
-| **LoRA Rank** | 16 (alpha=32) |
-| **Target Modules** | `q_proj, k_proj, v_proj, o_proj` (4 modules) |
-| **Training Data** | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples) |
-| **Hardware** | NVIDIA A100 40GB |
-The base model is a strong code generator with fill-in-the-middle support, and it is competitive with larger models on coding benchmarks.
-## Quick Start
-```python
-from peft import PeftModel
-from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
-import torch
-# Load with 4-bit quantization (matches training)
-bnb_config = BitsAndBytesConfig(
-    load_in_4bit=True,
-    bnb_4bit_quant_type="nf4",
-    bnb_4bit_compute_dtype=torch.bfloat16,
-)
-base_model = AutoModelForCausalLM.from_pretrained(
-    "deepseek-ai/deepseek-coder-6.7b-instruct",
-    quantization_config=bnb_config,
-    device_map="auto",
-)
-tokenizer = AutoTokenizer.from_pretrained("scthornton/deepseek-coder-6.7b-securecode")
-model = PeftModel.from_pretrained(base_model, "scthornton/deepseek-coder-6.7b-securecode")
-# Ask a security-relevant coding question
-messages = [
-    {"role": "user", "content": "How do I implement JWT authentication with refresh tokens in Python?"}
-]
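-# add_generation_prompt=True appends the assistant turn marker so the model answers instead of continuing the user turn
-inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
-# temperature only takes effect when sampling is enabled
-outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)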
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-```
 ## Training Details
-### Dataset
-Trained on the full **[SecureCode](https://huggingface.co/datasets/scthornton/securecode)** unified dataset:
-- **2,185 total examples** (1,435 web security + 750 AI/ML security)
-- **20 vulnerability categories** across OWASP Top 10 2021 and OWASP LLM Top 10 2025
-- **12+ programming languages** and **49+ frameworks**
-- **4-turn conversational structure**: feature request, vulnerable/secure implementations, advanced probing, operational guidance
-- **100% incident grounding**: every example tied to real CVEs, vendor advisories, or published attack research
-### Hyperparameters
-| Parameter | Value |
-|-----------|-------|
-| LoRA rank | 16 |
-| LoRA alpha | 32 |
-| LoRA dropout | 0.05 |
-| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj` (4 attention projections) |
-| Quantization | 4-bit NormalFloat (NF4) |
-| Learning rate | 2e-4 |
-| LR scheduler | Cosine with 100-step warmup |
-| Epochs | 3 |
-| Per-device batch size | 2 |
-| Gradient accumulation | 8x |
-| Effective batch size | 16 |
-| Max sequence length | 4096 tokens |
-| Optimizer | paged_adamw_8bit |
-| Precision | bf16 |
-**Notes:** Compact LoRA targeting attention layers only (4 modules). Extended 4096-token context.
-## Security Coverage
-### Web Security (1,435 examples)
-OWASP Top 10 2021: Broken Access Control, Cryptographic Failures, Injection, Insecure Design, Security Misconfiguration, Vulnerable Components, Authentication Failures, Software Integrity Failures, Logging/Monitoring Failures, SSRF.
-Languages: Python, JavaScript, Java, Go, PHP, C#, TypeScript, Ruby, Rust, Kotlin, YAML.
-### AI/ML Security (750 examples)
-OWASP LLM Top 10 2025: Prompt Injection, Sensitive Information Disclosure, Supply Chain Vulnerabilities, Data/Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector/Embedding Weaknesses, Misinformation, Unbounded Consumption.
-Frameworks: LangChain, OpenAI, Anthropic, HuggingFace, LlamaIndex, ChromaDB, Pinecone, FastAPI, Flask, vLLM, CrewAI, and 30+ more.
-## SecureCode Model Collection
-This model is part of the **SecureCode** collection of 8 security-specialized models:
-| Model | Base | Size | Tier | HuggingFace |
-|-------|------|------|------|-------------|
-| Llama 3.2 SecureCode | meta-llama/Llama-3.2-3B-Instruct | 3B | Accessible | [`llama-3.2-3b-securecode`](https://huggingface.co/scthornton/llama-3.2-3b-securecode) |
-| Qwen2.5 Coder SecureCode | Qwen/Qwen2.5-Coder-7B-Instruct | 7B | Mid-size | [`qwen2.5-coder-7b-securecode`](https://huggingface.co/scthornton/qwen2.5-coder-7b-securecode) |
-| DeepSeek Coder SecureCode | deepseek-ai/deepseek-coder-6.7b-instruct | 6.7B | Mid-size | [`deepseek-coder-6.7b-securecode`](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode) |
-| CodeGemma SecureCode | google/codegemma-7b-it | 7B | Mid-size | [`codegemma-7b-securecode`](https://huggingface.co/scthornton/codegemma-7b-securecode) |
-| CodeLlama SecureCode | codellama/CodeLlama-13b-Instruct-hf | 13B | Large | [`codellama-13b-securecode`](https://huggingface.co/scthornton/codellama-13b-securecode) |
-| Qwen2.5 Coder 14B SecureCode | Qwen/Qwen2.5-Coder-14B-Instruct | 14B | Large | [`qwen2.5-coder-14b-securecode`](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode) |
-| StarCoder2 SecureCode | bigcode/starcoder2-15b-instruct-v0.1 | 15B | Large | [`starcoder2-15b-securecode`](https://huggingface.co/scthornton/starcoder2-15b-securecode) |
-| Granite 20B Code SecureCode | ibm-granite/granite-20b-code-instruct-8k | 20B | XL | [`granite-20b-code-securecode`](https://huggingface.co/scthornton/granite-20b-code-securecode) |
-Choose based on your deployment constraints: **3B** for edge/mobile, **7B** for general use, **13B-15B** for deeper reasoning, **20B** for maximum capability.
-## SecureCode Dataset Family
-| Dataset | Examples | Focus | Link |
-|---------|----------|-------|------|
-| **SecureCode** | 2,185 | Unified (web + AI/ML) | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) |
-| SecureCode Web | 1,435 | Web security (OWASP Top 10 2021) | [scthornton/securecode-web](https://huggingface.co/datasets/scthornton/securecode-web) |
-| SecureCode AI/ML | 750 | AI/ML security (OWASP LLM Top 10 2025) | [scthornton/securecode-aiml](https://huggingface.co/datasets/scthornton/securecode-aiml) |
-## Intended Use
-**Use this model for:**
-- Training AI coding assistants to write secure code
-- Security education and training
-- Vulnerability research and secure code review
-- Building security-aware development tools
-**Do not use this model for:**
-- Offensive exploitation or automated attack generation
-- Circumventing security controls
-- Any activity that violates the base model's license
-## Citation
-```bibtex
-@misc{thornton2026securecode,
-  title={SecureCode: A Production-Grade Multi-Turn Dataset for Training Security-Aware Code Generation Models},
-  author={Thornton, Scott},
-  year={2026},
-  publisher={perfecXion.ai},
-  url={https://huggingface.co/datasets/scthornton/securecode},
-  note={arXiv:2512.18542}
-}
-```
-## Links
-- **Dataset**: [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode)
-- **Research Paper**: [arXiv:2512.18542](https://arxiv.org/abs/2512.18542)
-- **Model Collection**: [huggingface.co/collections/scthornton/securecode](https://huggingface.co/collections/scthornton/securecode)
-- **Author**: [perfecXion.ai](https://perfecxion.ai)
-## License
-This model inherits the base model's license, which is tagged `other` in the card metadata. The training dataset ([SecureCode](https://huggingface.co/datasets/scthornton/securecode)) is licensed under **CC BY-NC-SA 4.0**.
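
The hyperparameter table removed above gives a per-device batch size of 2, 8x gradient accumulation, an effective batch size of 16, 3 epochs, and a 100-step warmup. The sketch below works through the arithmetic connecting those numbers. It is an illustration, not the author's training script: `TrainingSetup` is a hypothetical helper, and it assumes a single GPU, that all 2,185 examples are used for training with no held-out split, and that a partial final batch counts as a step. Real trainers may round differently, so treat the step counts as approximate.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container: values from the removed table plus labeled assumptions."""

    num_examples: int = 2185              # dataset size stated in the card
    per_device_batch_size: int = 2
    gradient_accumulation_steps: int = 8
    epochs: int = 3
    warmup_steps: int = 100
    lora_rank: int = 16
    lora_alpha: int = 32
    num_devices: int = 1                  # assumption: one GPU (the card lists a single A100)

    @property
    def effective_batch_size(self) -> int:
        # Examples consumed per optimizer step.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )

    @property
    def steps_per_epoch(self) -> int:
        # Assumption: no held-out split, and a partial last batch still counts.
        return math.ceil(self.num_examples / self.effective_batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank


setup = TrainingSetup()
print(f"effective batch size: {setup.effective_batch_size}")
print(f"optimizer steps per epoch (approx.): {setup.steps_per_epoch}")
print(f"total optimizer steps (approx.): {setup.total_steps}")
print(f"warmup share of training: {setup.warmup_steps / setup.total_steps:.0%}")
print(f"LoRA scaling factor: {setup.lora_scaling}")
```

Under these assumptions the 100 warmup steps cover roughly a quarter of the run, which is worth keeping in mind when reusing the schedule on a larger or smaller dataset.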

 ---
 base_model: deepseek-ai/deepseek-coder-6.7b-instruct
 library_name: peft
 pipeline_tag: text-generation
+tags:
+- base_model:adapter:deepseek-ai/deepseek-coder-6.7b-instruct
+- lora
+- transformers
 ---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
 ## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.18.1