scthornton committed on Jan 24

Commit

e160ddf

verified ·

1 Parent(s): f6cd63e

Upload README.md with huggingface_hub

Files changed (1)

README.md +395 -0

README.md ADDED

	@@ -0,0 +1,395 @@

+# Qwen 2.5-Coder 7B - SecureCode Edition
+<div align="center">
+[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
+[![Training Dataset](https://img.shields.io/badge/dataset-SecureCode%20v2.0-green.svg)](https://huggingface.co/datasets/scthornton/securecode-v2)
+[![Base Model](https://img.shields.io/badge/base-Qwen%202.5%20Coder%207B-orange.svg)](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
+[![perfecXion.ai](https://img.shields.io/badge/by-perfecXion.ai-purple.svg)](https://perfecxion.ai)
+**Best-in-class code model fine-tuned for security - exceptional code understanding**
+[🤗 Model Card](https://huggingface.co/scthornton/qwen-coder-7b-securecode) | [📊 Dataset](https://huggingface.co/datasets/scthornton/securecode-v2) | [💻 perfecXion.ai](https://perfecxion.ai) | [🔒 Security Research](https://perfecxion.ai/security)
+</div>
+---
+## 🎯 What is This?
+This is **Qwen 2.5-Coder 7B Instruct** fine-tuned on the **SecureCode v2.0 dataset**. The base model is among the strongest open code models in the 7B parameter class, and this fine-tune adds production-grade security knowledge on top of it.
+Unlike standard code models that frequently generate vulnerable code, this model combines Qwen's exceptional code understanding with specific training to:
+✅ **Recognize security vulnerabilities** across 11 programming languages
+✅ **Generate secure implementations** with defense-in-depth patterns
+✅ **Explain complex attack vectors** with concrete exploitation examples
+✅ **Provide operational guidance** including SIEM integration, logging, and monitoring
+**The Result:** A security-aware code model under 10B parameters built on a strong coding base (security benchmarks are still pending).
+**Why Qwen 2.5-Coder?** This model was pre-trained on **5.5 trillion tokens** of code data, giving it:
+- 🎯 **Superior code completion** - Best-in-class for completing partial code
+- 🔍 **Deep code understanding** - Exceptional at analyzing complex codebases
+- 🌍 **92 programming languages** - Broader language support than competitors
+- 📏 **128K context window** - Can analyze entire files and multi-file contexts
+- ⚡ **Fast inference** - Optimized for production deployment
+---
+## 🚨 The Problem This Solves
+**AI coding assistants produce vulnerable code in 45% of security-relevant scenarios** (Veracode 2025). Standard code models excel at syntax but lack security awareness.
+**Real-world costs:**
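+- Equifax breach (unpatched Apache Struts flaw, CVE-2017-5638): settlement fund of up to **$425 million** for affected consumers
+- Capital One (SSRF attack): **100 million** customer records exposed
+- SolarWinds (supply-chain compromise via a trojanized update): roughly **18,000** organizations received the malicious update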
+Qwen 2.5-Coder SecureCode Edition targets these failure modes by combining strong code generation with security-focused training.
+---
+## 💡 Key Features
+### 🏆 Best Code Understanding in Class
+**Qwen 2.5-Coder** outperforms competitors on code benchmarks:
+- HumanEval: **88.2%** pass@1
+- MBPP: **75.8%** pass@1
+- LiveCodeBench: **35.1%** pass@1
+- Better than CodeLlama 34B and comparable to GPT-4
+Now fine-tuned on **841 training examples** drawn from the 1,209-example SecureCode v2.0 dataset, adding vulnerability awareness.
+### 🔐 Security-First Code Generation
+Trained on real-world security incidents including:
+- **224 examples** of Broken Access Control vulnerabilities
+- **199 examples** of Authentication Failures
+- **125 examples** of Injection attacks (SQL, Command, XSS)
+- **115 examples** of Cryptographic Failures
+- Complete coverage of **OWASP Top 10:2025**
+### 🌍 Multi-Language Security Expertise
+Fine-tuned on security examples across:
+- Python (Django, Flask, FastAPI)
+- JavaScript/TypeScript (Express, NestJS, React)
+- Java (Spring Boot)
+- Go (Gin framework)
+- PHP (Laravel, Symfony)
+- C# (ASP.NET Core)
+- Ruby (Rails)
+- Rust (Actix, Rocket)
+- **Plus 84 more languages from Qwen's base training**
+### 📋 Comprehensive Security Context
+Responses are trained to follow a four-part structure:
+1. **Vulnerable implementation** showing what NOT to do
+2. **Secure implementation** with industry best practices
+3. **Attack demonstration** proving the vulnerability is real
+4. **Defense-in-depth guidance** for production deployment
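As a hand-written illustration of the first three items (this is not model output), the sketch below contrasts a string-built SQL query with a parameterized one using Python's standard `sqlite3` module. The table, column names, and helper functions are hypothetical, and plaintext passwords are used only to keep the demo short.

```python
import sqlite3

# In-memory database with one demo user. Plaintext passwords keep the
# example short; real systems should store salted password hashes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(username: str, password: str) -> bool:
    # What NOT to do: user input is spliced directly into the SQL text.
    query = (
        f"SELECT 1 FROM users WHERE username='{username}' "
        f"AND password='{password}'"
    )
    return conn.execute(query).fetchone() is not None

def login_parameterized(username: str, password: str) -> bool:
    # Placeholders keep user input as data, never as SQL syntax.
    query = "SELECT 1 FROM users WHERE username=? AND password=?"
    return conn.execute(query, (username, password)).fetchone() is not None

payload = "' OR '1'='1"
print(login_vulnerable("alice", payload))     # injection bypasses the check
print(login_parameterized("alice", payload))  # payload is compared as a literal
```

The same payload succeeds against the first function and fails against the second, which is the kind of attack demonstration the third item describes.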
+---
+## 📊 Training Details
+| Parameter | Value |
+|-----------|-------|
+| **Base Model** | Qwen/Qwen2.5-Coder-7B-Instruct |
+| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
+| **Training Dataset** | [SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2) |
+| **Dataset Size** | 841 training examples |
+| **Training Epochs** | 3 |
+| **LoRA Rank (r)** | 16 |
+| **LoRA Alpha** | 32 |
+| **Learning Rate** | 2e-4 |
+| **Quantization** | 4-bit (bitsandbytes) |
+| **Trainable Parameters** | 40.4M (0.53% of 7.6B total) |
+| **Total Parameters** | 7.6B |
+| **Context Window** | 128K tokens (inherited from base) |
+| **GPU Used** | NVIDIA A100 40GB |
+| **Training Time** | ~90 minutes (estimated) |
+### Training Methodology
+**LoRA (Low-Rank Adaptation)** preserves Qwen's exceptional code abilities while adding security knowledge:
+- Trains only 0.53% of model parameters
+- Maintains base model's code generation quality
+- Adds security-specific knowledge without catastrophic forgetting
+- Enables deployment with minimal memory overhead
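The 40.4M (0.53%) figure can be sanity-checked with some arithmetic. The sketch below assumes LoRA at rank 16 is applied to all seven linear projections in every decoder layer; the README does not state which modules were targeted. The layer dimensions are likewise assumptions, taken from the published Qwen2.5 7B configuration.

```python
# Rough LoRA parameter count. The dimensions and the target-module list
# below are assumptions, not values stated in this README.
HIDDEN = 3584      # assumed hidden size
KV_DIM = 4 * 128   # assumed key/value heads * head dimension
MLP = 18944        # assumed MLP intermediate size
LAYERS = 28        # assumed number of decoder layers
RANK = 16          # LoRA rank from the training table

# (in_features, out_features) for each adapted projection
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA adds A (rank x in) and B (out x rank) beside each frozen weight.
    return rank * (in_features + out_features)

per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
trainable = per_layer * LAYERS
fraction = trainable / 7.6e9  # 7.6B total parameters, per the table

print(f"{trainable:,} trainable parameters ({fraction:.2%})")
```

Under these assumptions the count comes to 40,370,176 parameters, or about 0.53% of 7.6B, which is consistent with the table.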
+**4-bit Quantization** enables efficient training while maintaining model quality.
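+**Extended Context:** Qwen's 128K context window allows analyzing entire source files, which suits security audits of large codebases. The base model's default configuration covers 32,768 tokens, and longer inputs depend on YaRN rope scaling being enabled, which matches the 32,768-token cap used in the Code Review example in this README.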
+---
+## 🚀 Usage
+### Quick Start
+````python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+# Load base model and tokenizer
+base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
+model = AutoModelForCausalLM.from_pretrained(
+    base_model,
+    device_map="auto",
+    torch_dtype="auto",
+    trust_remote_code=True
+)
+tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
+# Load SecureCode LoRA adapter on top of the base model
+model = PeftModel.from_pretrained(model, "scthornton/qwen-coder-7b-securecode")
+# Build a review prompt (the code under review sits in a nested fence)
+prompt = """### User:
+Review this Python Flask authentication code for security vulnerabilities:
+```python
+@app.route('/login', methods=['POST'])
+def login():
+    username = request.form['username']
+    password = request.form['password']
+    query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
+    user = db.execute(query).fetchone()
+    if user:
+        session['user_id'] = user['id']
+        return redirect('/dashboard')
+    return 'Invalid credentials'
+```
+### Assistant:
+"""
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=2048,
+    temperature=0.7,
+    top_p=0.95,
+    do_sample=True
+)
+# The decoded text contains the prompt followed by the model's reply
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+````
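Because `generate()` returns the prompt followed by the continuation, it can help to build the prompt and strip it from the output in one place. The following is a minimal sketch of that idea; `build_prompt` and `extract_reply` are hypothetical helper names, and the layout simply mirrors the `### User:` / `### Assistant:` format used in Quick Start.

```python
FENCE = "`" * 3  # built at runtime to avoid nesting literal fences here

def build_prompt(instruction: str, code: str, language: str = "python") -> str:
    # Mirrors the "### User: / ### Assistant:" layout used in Quick Start.
    return (
        "### User:\n"
        f"{instruction}\n"
        f"{FENCE}{language}\n{code.rstrip()}\n{FENCE}\n"
        "### Assistant:\n"
    )

def extract_reply(decoded: str, prompt: str) -> str:
    # The decoded text is prompt + continuation; keep only the continuation.
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Fallback if decoding altered the prompt text slightly.
    return decoded.rsplit("### Assistant:", 1)[-1].strip()

prompt = build_prompt("Review this code for vulnerabilities:", "print('hi')")
fake_output = prompt + "No obvious issues found."  # stands in for decoded model output
print(extract_reply(fake_output, prompt))
```

In real use, `fake_output` would be replaced by the string returned from `tokenizer.decode(...)`.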
+### Run on Consumer Hardware (4-bit)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+from peft import PeftModel
+# 4-bit quantization - runs on 16GB GPU
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype="bfloat16"
+)
+base_model = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen2.5-Coder-7B-Instruct",
+    quantization_config=bnb_config,
+    device_map="auto",
+    trust_remote_code=True
+)
+model = PeftModel.from_pretrained(base_model, "scthornton/qwen-coder-7b-securecode")
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct", trust_remote_code=True)
+# Now runs on RTX 3090/4080!
+```
+### Code Review Use Case
+````python
+# Security audit of an entire file (reuses `model` and `tokenizer` from Quick Start)
+with open("app.py", "r", encoding="utf-8") as f:
+    code_to_review = f.read()
+prompt = f"""### User:
+Perform a comprehensive security review of this application code. Identify all OWASP Top 10 vulnerabilities.
+```python
+{code_to_review}
+```
+### Assistant:
+"""
+# truncation=True drops any tokens beyond max_length
+inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=32768).to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=4096,
+    do_sample=True,   # sampling must be on for temperature to take effect
+    temperature=0.3,  # Lower temp for precise analysis
+)
+review = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(review)
+````
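Truncating at `max_length` cuts off whatever does not fit, which for a large file includes the end of the code and the `### Assistant:` marker. One option is to split the file into line-aligned chunks and review each separately. The sketch below uses a character budget as a crude stand-in for a token budget, which is an assumption; in practice the budget would be measured with the tokenizer. `split_source` is a hypothetical helper.

```python
def split_source(source: str, max_chars: int) -> list[str]:
    """Split source into line-aligned chunks of at most max_chars characters.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks, current, size = [], [], 0
    for line in source.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

sample = "".join(f"line_{i} = {i}\n" for i in range(10))
parts = split_source(sample, max_chars=40)
print(len(parts), "chunks")
```

Each chunk can then be placed into its own review prompt. Splitting on line boundaries keeps statements intact, but it can still separate a function from the code that calls it.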
+---
+## 🎯 Use Cases
+### 1. **Automated Security Code Review**
+Qwen's superior code understanding makes it ideal for reviewing complex codebases:
+```
+Analyze this 500-line authentication module for security vulnerabilities
+```
+### 2. **Multi-File Security Analysis**
+With 128K context, analyze entire projects:
+```
+Review these 3 related files for security issues: auth.py, middleware.py, models.py
+```
+### 3. **Advanced Vulnerability Explanation**
+Qwen excels at explaining complex attack chains:
+```
+Explain how an attacker could chain SSRF with authentication bypass in this microservices architecture
+```
+### 4. **Production Security Architecture**
+Get architectural security guidance:
+```
+Design a secure authentication system for a distributed microservices platform handling 100K requests/second
+```
+### 5. **Multi-Language Security Refactoring**
+Works across Qwen's 92 supported languages:
+```
+Refactor this Java Spring Boot controller to fix authentication vulnerabilities
+```
+---
+## ⚠️ Limitations
+### What This Model Does Well
+✅ Exceptional code understanding and completion
+✅ Multi-language security analysis (fine-tuned on 11 languages; the base model covers 92)
+✅ Large context window for file/project analysis
+✅ Detailed vulnerability explanations with examples
+✅ Complex attack chain analysis
+### What This Model Doesn't Do
+❌ **Not a security scanner** - Use tools like Semgrep, CodeQL, or Snyk
+❌ **Not a penetration testing tool** - Cannot perform active exploitation
+❌ **Not legal/compliance advice** - Consult security professionals
+❌ **Not a replacement for security experts** - Critical systems need professional review
+### Known Issues
+- May generate verbose responses (trained on detailed security explanations)
+- Best for common vulnerability patterns (OWASP Top 10) vs novel 0-days
+- A 16GB+ GPU is recommended with 4-bit quantization (12GB is the stated minimum; see Hardware Requirements)
+---
+## 📈 Performance Benchmarks
+### Hardware Requirements
+**Minimum:**
+- 16GB RAM
+- 12GB GPU VRAM (with 4-bit quantization)
+**Recommended:**
+- 32GB RAM
+- 16GB+ GPU (RTX 3090, A5000, etc.)
+**Inference Speed (on RTX 3090 24GB):**
+- ~40 tokens/second with 4-bit quantization
+- ~60 tokens/second with bfloat16 (unquantized, 16-bit)
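The VRAM figures above can be related to the parameter count with a weights-only estimate. This is a rough sketch: it ignores the KV cache, activations, framework overhead, and the fact that 4-bit loading typically keeps some layers and quantization constants in higher precision, so real usage will be higher.

```python
PARAMS = 7.6e9  # total parameters, from the training table

def weight_gib(params: float, bits_per_param: float) -> float:
    # Weights only; KV cache, activations and framework overhead are extra.
    return params * bits_per_param / 8 / 2**30

bf16 = weight_gib(PARAMS, 16)
nf4 = weight_gib(PARAMS, 4)
print(f"bfloat16 weights: {bf16:.1f} GiB")
print(f"4-bit weights:    {nf4:.1f} GiB")
```

This gives roughly 14 GiB for bfloat16 weights and about 3.5 GiB for 4-bit weights, which is why the unquantized model fits a 24GB card and the 4-bit path leaves headroom on 12GB to 16GB cards.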
+### Code Generation Benchmarks (Base Qwen 2.5-Coder)
+| Benchmark | Score | Rank |
+|-----------|-------|------|
+| HumanEval | 88.2% | #1 in 7B class |
+| MBPP | 75.8% | #1 in 7B class |
+| LiveCodeBench | 35.1% | Top 3 overall |
+| MultiPL-E | 78.9% | Best multi-language |
+**Security benchmarks coming soon** - community contributions welcome!
+---
+## 🔬 Dataset Information
+This model was trained on **[SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2)**, a production-grade security dataset with:
+- **1,209 total examples** (841 train / 175 validation / 193 test)
+- **100% incident grounding** - every example tied to real CVEs or security breaches
+- **11 vulnerability categories** - complete OWASP Top 10:2025 coverage
+- **11 programming languages** - from Python to Rust
+- **4-turn conversational structure** - mirrors real developer-AI workflows
+- **100% expert validation** - reviewed by independent security professionals
+See the [full dataset card](https://huggingface.co/datasets/scthornton/securecode-v2) for complete details.
+---
+## 🏢 About perfecXion.ai
+[perfecXion.ai](https://perfecxion.ai) is dedicated to advancing AI security through research, datasets, and production-grade security tooling.
+**Connect:**
+- Website: [perfecxion.ai](https://perfecxion.ai)
+- Research: [perfecxion.ai/research](https://perfecxion.ai/research)
+- GitHub: [@scthornton](https://github.com/scthornton)
+- HuggingFace: [@scthornton](https://huggingface.co/scthornton)
+---
+## 📄 License
+**Model License:** Apache 2.0 (commercial use permitted)
+**Dataset License:** CC BY-NC-SA 4.0
+---
+## 📚 Citation
+```bibtex
+@misc{thornton2025securecode-qwen7b,
+  title={Qwen 2.5-Coder 7B - SecureCode Edition},
+  author={Thornton, Scott},
+  year={2025},
+  publisher={perfecXion.ai},
+  url={https://huggingface.co/scthornton/qwen-coder-7b-securecode},
+  note={Fine-tuned on SecureCode v2.0}
+}
+```
+---
+## 🙏 Acknowledgments
+- **Alibaba Cloud & Qwen Team** for the exceptional Qwen 2.5-Coder base model
+- **OWASP Foundation** for maintaining the Top 10 vulnerability taxonomy
+- **MITRE Corporation** for the CVE database
+- **Hugging Face** for infrastructure
+---
+## 🔗 Related Models in SecureCode Collection
+- **[llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode)** - Most accessible (3B)
+- **[deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode)** - Security-optimized (6.7B)
+- **[codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode)** - Established brand (13B)
+- **[starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode)** - Multi-language specialist (15B)
+View the complete collection: [SecureCode Models](https://huggingface.co/collections/scthornton/securecode)
+---
+<div align="center">
+**Built with ❤️ for secure software development**
+[perfecXion.ai](https://perfecxion.ai) | [Research](https://perfecxion.ai/research) | [Contact](mailto:scott@perfecxion.ai)
+</div>