kon-security-v5 / README.md

Upload folder using huggingface_hub

1ac2d5e verified 1 day ago

5.29 kB

	---
	license: apache-2.0
	base_model: Qwen/Qwen2.5-Coder-7B-Instruct
	tags:
	- security
	- code-review
	- vulnerability-detection
	- sast
	- false-positive-reduction
	- gguf
	- qwen2
	- ollama
	language:
	- en
	pipeline_tag: text-generation
	model-index:
	- name: kon-security-v5
	results:
	- task:
	type: text-generation
	name: Security Code Review
	metrics:
	- name: Accuracy
	type: accuracy
	value: 98.1
	- name: F1 Score
	type: f1
	value: 0.99
	- name: False Positive Rate
	type: custom
	value: 0.0
	- name: JSON Compliance
	type: custom
	value: 100.0
	---

	# kon-security-v5

	Expert Security Code Reviewer - A fine-tuned Qwen2.5-Coder-7B model specialized for security vulnerability detection and false positive reduction in SAST (Static Application Security Testing) pipelines.

	## Model Details

	\| Property \| Value \|
	\|----------\|-------\|
	\| Base Model \| [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) \|
	\| Fine-tuning \| QLoRA (4-bit quantization) \|
	\| Quantization \| Q4_K_M (GGUF) \|
	\| Parameters \| 7.6B \|
	\| Context Length \| 32,768 tokens \|
	\| File Size \| ~4.7 GB \|
	\| Format \| GGUF (Ollama-compatible) \|

	## Performance

	\| Metric \| Score \|
	\|--------\|-------\|
	\| Overall Accuracy \| 98.1% \|
	\| F1 Score \| 0.99 \|
	\| False Positive Rate \| 0.0% \|
	\| JSON Compliance \| 100% \|
	\| Avg Response Time \| 2.8s \|

	## Capabilities

	- Identifies true security vulnerabilities across 20+ vulnerability categories
	- Eliminates false positives from SAST tools (SQL injection, XSS, command injection, etc.)
	- Provides structured JSON output with verdict, confidence, CWE IDs, severity, and remediation
	- Understands framework-specific safe patterns (React, Django, Express, Rails, etc.)
	- Supports taint analysis reasoning (source-to-sink tracking)

	## Vulnerability Categories

	CRITICAL: SQL Injection (CWE-89), Command Injection (CWE-78), Deserialization (CWE-502), Hardcoded Secrets (CWE-798), Code Injection (CWE-94)

	HIGH: XSS (CWE-79), Path Traversal (CWE-22), SSRF (CWE-918), Timing Attacks (CWE-208), Buffer Overflow (CWE-120)

	MEDIUM: Weak Crypto (CWE-327), Insecure Random (CWE-330), Information Disclosure (CWE-200), Missing Auth (CWE-306)

	## Usage with Ollama

	```bash
	# Pull the model
	ollama pull kon-security/kon-security-v5

	# Or create from GGUF
	ollama create kon-security-v5 -f Modelfile

	# Run
	ollama run kon-security-v5
	```

	### Example Prompt

	```
	<\|im_start\|>system
	You are an expert security code reviewer...
	<\|im_end\|>
	<\|im_start\|>user
	Analyze this code for SQL injection:
	query = f"SELECT * FROM users WHERE id = {user_id}"
	<\|im_end\|>
	<\|im_start\|>assistant
	```

	### Example Response

	```json
	{
	"verdict": "TRUE_POSITIVE",
	"is_vulnerable": true,
	"confidence": 0.97,
	"cwe_ids": ["CWE-89"],
	"severity": "CRITICAL",
	"reasoning": "f-string interpolates user_id directly into SQL query without parameterization",
	"remediation": "cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))"
	}
	```

	## System Prompt

	The model is fine-tuned with the following system prompt baked in:

	```
	You are an expert security code reviewer specializing in identifying true
	vulnerabilities and eliminating false positives. You analyze code with deep
	understanding of security patterns across all languages and frameworks.

	CRITICAL RULES:
	1. Parameterized queries (?, $1, %s, :param) = SAFE from SQL injection
	2. textContent, createTextNode = SAFE from XSS
	3. React JSX {variable} = SAFE from XSS (React auto-escapes)
	4. subprocess.run([list, args]) without shell=True = SAFE from command injection
	5. json.loads/JSON.parse = SAFE (cannot execute code)
	6. secure_filename() from werkzeug = SAFE from path traversal
	7. bcrypt/argon2/scrypt for password hashing = SAFE
	8. HMAC.compare_digest/timingSafeEqual = SAFE from timing attacks
	9. DOMPurify.sanitize() = SAFE from XSS
	10. MD5/SHA1 for non-security purposes (checksums, cache keys) = SAFE
	11. Test files testing security scanners = SAFE
	12. Environment variables for secrets = SAFE (not hardcoded)
	13. ORM methods (Django .filter(), Rails .where(hash), SQLAlchemy) = SAFE from SQLi
	14. Content-Security-Policy, helmet(), CORS allowlists = SAFE
	```

	## Integration with Kon Security Scanner

	This model is the default LLM for the [Kon Security Scanner](https://github.com/kon-security/kon), providing:

	- SAST finding validation and FP reduction
	- CWE ID mapping
	- Severity assessment
	- Remediation suggestions

	```python
	from kon.core.ollama_analyzer import OllamaAnalyzer

	analyzer = OllamaAnalyzer(model="kon-security-v5:latest")
	result = analyzer.analyze_finding_enhanced(
	code_snippet="query = f'SELECT * FROM users WHERE id = {user_id}'",
	vulnerability_type="SQL Injection",
	file_path="app/db.py",
	line_number=42
	)
	print(result.verdict) # TRUE_POSITIVE
	```

	## Training Details

	- Method: QLoRA (4-bit quantization-aware fine-tuning)
	- Base: Qwen2.5-Coder-7B-Instruct
	- Dataset: Curated security code review examples covering 20+ CWE categories
	- Hardware: NVIDIA GPU with CUDA support
	- Quantization: Q4_K_M via llama.cpp

	## License

	Apache 2.0 (same as base model)