---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- security
- code-review
- vulnerability-detection
- sast
- false-positive-reduction
- gguf
- qwen2
- ollama
language:
- en
pipeline_tag: text-generation
model-index:
- name: kon-security-v5
  results:
  - task:
      type: text-generation
      name: Security Code Review
    metrics:
    - name: Accuracy
      type: accuracy
      value: 98.1
    - name: F1 Score
      type: f1
      value: 0.99
    - name: False Positive Rate
      type: custom
      value: 0.0
    - name: JSON Compliance
      type: custom
      value: 100.0
---

# kon-security-v5

**Expert Security Code Reviewer** - A fine-tuned Qwen2.5-Coder-7B model specialized for security vulnerability detection and false positive reduction in SAST (Static Application Security Testing) pipelines.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
| Fine-tuning | QLoRA (4-bit quantization) |
| Quantization | Q4_K_M (GGUF) |
| Parameters | 7.6B |
| Context Length | 32,768 tokens |
| File Size | ~4.7 GB |
| Format | GGUF (Ollama-compatible) |

## Performance

| Metric | Score |
|--------|-------|
| Overall Accuracy | **98.1%** |
| F1 Score | **0.99** |
| False Positive Rate | **0.0%** |
| JSON Compliance | **100%** |
| Avg Response Time | **2.8s** |

## Capabilities

- Identifies true security vulnerabilities across 20+ vulnerability categories
- Eliminates false positives from SAST tools (SQL injection, XSS, command injection, etc.)
- Provides structured JSON output with verdict, confidence, CWE IDs, severity, and remediation
- Understands framework-specific safe patterns (React, Django, Express, Rails, etc.)
- Supports taint analysis reasoning (source-to-sink tracking)

## Vulnerability Categories

- CRITICAL: SQL Injection (CWE-89), Command Injection (CWE-78), Deserialization (CWE-502), Hardcoded Secrets (CWE-798), Code Injection (CWE-94)
- HIGH: XSS (CWE-79), Path Traversal (CWE-22), SSRF (CWE-918), Timing Attacks (CWE-208), Buffer Overflow (CWE-120)
- MEDIUM: Weak Crypto (CWE-327), Insecure Random (CWE-330), Information Disclosure (CWE-200), Missing Auth (CWE-306)

## Usage with Ollama

```bash
# Pull the model
ollama pull kon-security/kon-security-v5

# Or create from GGUF
ollama create kon-security-v5 -f Modelfile

# Run (use kon-security/kon-security-v5 if you pulled the model instead of creating it)
ollama run kon-security-v5
```

### Example Prompt

```
<|im_start|>system
You are an expert security code reviewer...
<|im_end|>
<|im_start|>user
Analyze this code for SQL injection:
query = f"SELECT * FROM users WHERE id = {user_id}"
<|im_end|>
<|im_start|>assistant
```

### Example Response

```json
{
  "verdict": "TRUE_POSITIVE",
  "is_vulnerable": true,
  "confidence": 0.97,
  "cwe_ids": ["CWE-89"],
  "severity": "CRITICAL",
  "reasoning": "f-string interpolates user_id directly into SQL query without parameterization",
  "remediation": "cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
}
```

## System Prompt

The model is fine-tuned with the following system prompt baked in:

```
You are an expert security code reviewer specializing in identifying true vulnerabilities and eliminating false positives. You analyze code with deep understanding of security patterns across all languages and frameworks.

CRITICAL RULES:
1. Parameterized queries (?, $1, %s, :param) = SAFE from SQL injection
2. textContent, createTextNode = SAFE from XSS
3. React JSX {variable} = SAFE from XSS (React auto-escapes)
4. subprocess.run([list, args]) without shell=True = SAFE from command injection
5. json.loads/JSON.parse = SAFE (cannot execute code)
6. secure_filename() from werkzeug = SAFE from path traversal
7. bcrypt/argon2/scrypt for password hashing = SAFE
8. HMAC.compare_digest/timingSafeEqual = SAFE from timing attacks
9. DOMPurify.sanitize() = SAFE from XSS
10. MD5/SHA1 for non-security purposes (checksums, cache keys) = SAFE
11. Test files testing security scanners = SAFE
12. Environment variables for secrets = SAFE (not hardcoded)
13. ORM methods (Django .filter(), Rails .where(hash), SQLAlchemy) = SAFE from SQLi
14. Content-Security-Policy, helmet(), CORS allowlists = SAFE
```

## Integration with Kon Security Scanner

This model is the default LLM for the [Kon Security Scanner](https://github.com/kon-security/kon), providing:

- SAST finding validation and FP reduction
- CWE ID mapping
- Severity assessment
- Remediation suggestions

```python
from kon.core.ollama_analyzer import OllamaAnalyzer

analyzer = OllamaAnalyzer(model="kon-security-v5:latest")

# Ask the model to validate a single SAST finding
result = analyzer.analyze_finding_enhanced(
    code_snippet="query = f'SELECT * FROM users WHERE id = {user_id}'",
    vulnerability_type="SQL Injection",
    file_path="app/db.py",
    line_number=42
)
print(result.verdict)  # TRUE_POSITIVE
```

## Training Details

- **Method**: QLoRA (LoRA adapters trained on top of a 4-bit quantized base model)
- **Base**: Qwen2.5-Coder-7B-Instruct
- **Dataset**: Curated security code review examples covering 20+ CWE categories
- **Hardware**: NVIDIA GPU with CUDA support
- **Quantization**: Q4_K_M via llama.cpp

## License

Apache 2.0 (same as base model)
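
## Example: Validating a Response

Since the model is meant to sit inside a SAST pipeline, it may help to check each response before acting on it. The following is a minimal sketch, not part of the Kon scanner: the field names are taken from the single example response in this README, while `Review`, `parse_review`, `should_suppress`, and the `0.9` threshold are hypothetical names and values chosen for illustration. The full response schema is not documented here, so the checks are deliberately limited to what the example shows.

```python
import json
import re
from dataclasses import dataclass, fields


@dataclass
class Review:
    """Fields mirror the example response above (assumed to be the full schema)."""
    verdict: str
    is_vulnerable: bool
    confidence: float
    cwe_ids: list
    severity: str
    reasoning: str
    remediation: str


def parse_review(raw: str) -> Review:
    """Parse raw model output; raise ValueError if it is malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")

    names = [f.name for f in fields(Review)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    review = Review(**{n: data[n] for n in names})

    if not isinstance(review.is_vulnerable, bool):
        raise ValueError("is_vulnerable must be a boolean")

    conf = review.confidence
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= conf <= 1.0:
        raise ValueError("confidence must be in [0, 1]")

    # CWE IDs in this README all have the form "CWE-<digits>"
    for cwe in review.cwe_ids:
        if not isinstance(cwe, str) or not re.fullmatch(r"CWE-\d+", cwe):
            raise ValueError(f"unexpected CWE id: {cwe!r}")

    return review


def should_suppress(review: Review, min_confidence: float = 0.9) -> bool:
    """Hypothetical triage rule: drop a finding only when the model says it is
    not vulnerable AND is at least `min_confidence` sure."""
    return (not review.is_vulnerable) and review.confidence >= min_confidence


EXAMPLE_RESPONSE = """
{
  "verdict": "TRUE_POSITIVE",
  "is_vulnerable": true,
  "confidence": 0.97,
  "cwe_ids": ["CWE-89"],
  "severity": "CRITICAL",
  "reasoning": "f-string interpolates user_id directly into SQL query without parameterization",
  "remediation": "cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
}
"""

review = parse_review(EXAMPLE_RESPONSE)
print(review.verdict, should_suppress(review))  # TRUE_POSITIVE False
```

Treating any parse or validation failure as "keep the finding" is the conservative choice here: a malformed response then never causes a real vulnerability to be silently dropped.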