	# CyberCoder-7B-v1 🛡️

	A cybersecurity-focused code model fine-tuned from [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) for:

	- CVE vulnerability analysis with structured JSON output
	- AST-based code security review
	- GDB crash trace analysis and exploitability assessment
	- ROP chain construction and binary exploitation
	- MITRE ATT&CK mapping and threat intelligence
	- Code reasoning with chain-of-thought

	## Training Recipe

	Based on the [CyberPal 2.0](https://arxiv.org/abs/2510.14113) methodology:

	| Parameter | Value |
	|-----------|-------|
	| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
	| Method | SFT with LoRA (r=64, α=128) |
	| Learning rate | 4e-5 |
	| Warmup ratio | 0.15 |
	| Epochs | 2 |
	| Max seq length | 4096 |
	| Optimizer | AdamW + cosine schedule |
	| Dataset | moro72842/cybersecurity-sft-dataset (20K examples) |
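The learning-rate settings in the table (peak 4e-5, warmup ratio 0.15, cosine schedule) can be sketched as a function of the training step. This is a minimal illustration, assuming linear warmup and cosine decay to zero, which the card does not spell out. `lr_at` and `TOTAL_STEPS` are hypothetical names; the card does not state the batch size or the real step count.

```python
import math

PEAK_LR = 4e-5       # "Learning rate" row of the table
WARMUP_RATIO = 0.15  # "Warmup ratio" row of the table


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count, chosen only to make the printout readable.
TOTAL_STEPS = 1000
for step in (0, 75, 150, 575, 1000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

With these assumed numbers the rate climbs for the first 150 steps, reaches the peak of 4e-5, and then falls along a half cosine to zero at the final step.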

	## Dataset Composition

	| Source | Count | Description |
	|--------|-------|-------------|
	| CVE Records | 10,000 | Multi-turn CVE analysis from 297K records |
	| Code Feedback | 5,000 | Code reasoning with iterative refinement |
	| OpenCodeReasoning | 5,000 | Chain-of-thought code problem solving |
	| Synthetic Security | 8 | JSON-structured CVE, AST, GDB, ROP examples |
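The share of each source in the training mix follows from the counts in the table. The sketch below does that arithmetic, assuming every example is seen once per epoch with no re-weighting or upsampling (the card does not say whether any is applied).

```python
# Counts copied from the Dataset Composition table.
counts = {
    "CVE Records": 10_000,
    "Code Feedback": 5_000,
    "OpenCodeReasoning": 5_000,
    "Synthetic Security": 8,
}

total = sum(counts.values())
# Fraction of the mix contributed by each source (assumes uniform sampling).
shares = {name: n / total for name, n in counts.items()}

print(f"total examples: {total}")
for name, share in shares.items():
    print(f"{name:<20} {share:7.2%}")
```

Under that assumption the four sources sum to 20,008 examples, and the synthetic security examples account for about 0.04% of the mix.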

	## Capabilities

	### JSON Structured Output
	The training examples pair a `<reasoning>` block with a fenced JSON object, so the model is expected to respond in this pattern:
	````
	<reasoning>
	Step-by-step analysis...
	</reasoning>

	```json
	{...structured output...}
	```
	````
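Because the format is learned through SFT rather than enforced at decode time, downstream code should validate it. The following is a minimal sketch of a parser for the pattern above. `parse_response` is a hypothetical helper, not part of the model or of `transformers`, and the sample response and its field names (`cve_id`, `severity`) are made up for illustration.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to avoid a literal fence in the source

REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)
JSON_FENCE_RE = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def parse_response(text):
    """Return (reasoning, payload); each is None if missing or invalid."""
    reasoning_match = REASONING_RE.search(text)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else None

    payload = None
    json_match = JSON_FENCE_RE.search(text)
    if json_match:
        try:
            payload = json.loads(json_match.group(1))
        except json.JSONDecodeError:
            payload = None  # the model emitted malformed JSON
    return reasoning, payload


# Made-up response, used only to exercise the parser.
sample = (
    "<reasoning>\nThe lookup feature allows remote class loading.\n</reasoning>\n\n"
    + FENCE + "json\n"
    + '{"cve_id": "CVE-2021-44228", "severity": "critical"}\n'
    + FENCE
)

reasoning, payload = parse_response(sample)
print(reasoning)
print(payload["severity"])
```

Returning `None` instead of raising lets the caller decide whether to retry the generation or fall back to the raw text.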

	### Cybersecurity Domains
	- Vulnerability analysis (CVE/CWE)
	- Static code analysis with AST parsing
	- Binary exploitation (ROP chains, buffer overflows)
	- Crash dump / GDB trace analysis
	- Threat intelligence (MITRE ATT&CK mapping)
	- Malware behavior classification
	- Network intrusion detection

	## Usage

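	```python
	from transformers import pipeline

	# Load the model; dtype and device placement are selected automatically.
	pipe = pipeline(
	    "text-generation",
	    model="moro72842/CyberCoder-7B-v1",
	    torch_dtype="auto",
	    device_map="auto",
	)

	messages = [
	    {"role": "system", "content": "You are a cybersecurity expert. Provide detailed analysis with structured JSON output."},
	    {"role": "user", "content": "Analyze CVE-2021-44228 and provide the analysis as JSON."},
	]

	# do_sample=True makes the temperature setting take effect explicitly;
	# a low temperature keeps the structured output close to deterministic.
	response = pipe(messages, max_new_tokens=2048, do_sample=True, temperature=0.1)

	# The pipeline returns the whole chat; the last message is the model's reply.
	print(response[0]["generated_text"][-1]["content"])
	```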

	## Architecture & Efficiency Considerations

	This model is a dense 7B LoRA fine-tune. The notes below summarize design considerations from the training documentation; the first two concern larger models and are not used in this one:

	- MoE consideration: For production models at 100B+ parameters, a sparse MoE design (DeepSeek-V3 style, with 64-128 experts) reduces the parameters active per token; DeepSeek-V3 itself activates ~37B of its 671B parameters
	- MLA attention: Multi-Head Latent Attention compresses the KV cache, lowering memory use for long-context inference
	- LoRA efficiency: This 7B model uses LoRA (r=64), training only ~2% of parameters while achieving strong domain performance
	- Structured output: JSON structured output is trained via SFT examples rather than enforced with constrained decoding (per RL-Struct findings)
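The ~2% figure for LoRA can be checked with a short calculation. The sketch below rests on two assumptions: the layer dimensions are those of the Qwen2.5-7B base architecture (verify them against the base model's `config.json`), and adapters are attached to all seven linear projections in every decoder layer. The card does not list the actual target modules.

```python
# Qwen2.5-7B architecture dimensions (assumed; check the base model's config.json).
HIDDEN = 3584
INTERMEDIATE = 18944
LAYERS = 28
KV_HEADS = 4
HEAD_DIM = 128
BASE_PARAMS = 7.61e9  # approximate total parameter count of the base model

RANK = 64  # LoRA rank from the training recipe

# Assumption: adapters on every attention and MLP projection.
kv_out = KV_HEADS * HEAD_DIM
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, kv_out),
    "v_proj": (HIDDEN, kv_out),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in target_shapes.values())
trainable = per_layer * LAYERS
fraction = trainable / BASE_PARAMS

print(f"trainable LoRA parameters: {trainable:,}")
print(f"fraction of base model:    {fraction:.2%}")
```

Under these assumptions the adapters hold about 161M parameters, roughly 2.1% of the base model, which is consistent with the ~2% stated above. Targeting only the attention projections would give a smaller fraction.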

	## License

	Apache 2.0 (inherited from Qwen2.5-Coder)