moro72842/CyberCoder-7B-v1


moro72842 committed on Apr 23

Commit e30e2da · verified · 1 Parent(s): 35d7ba8

Upload README.md

Files changed (1)

README.md +86 -0

README.md ADDED

	@@ -0,0 +1,86 @@

+# CyberCoder-7B-v1 🛡️
+A cybersecurity-focused code model fine-tuned from [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) for:
+- **CVE vulnerability analysis** with structured JSON output
+- **AST-based code security review**
+- **GDB crash trace analysis** and exploitability assessment
+- **ROP chain construction** and binary exploitation
+- **MITRE ATT&CK mapping** and threat intelligence
+- **Code reasoning** with chain-of-thought
+## Training Recipe
+Based on [CyberPal 2.0](https://arxiv.org/abs/2510.14113) methodology:
+| Parameter | Value |
+|-----------|-------|
+| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
+| Method | SFT with LoRA (r=64, α=128) |
+| Learning rate | 4e-5 |
+| Warmup ratio | 0.15 |
+| Epochs | 2 |
+| Max seq length | 4096 |
+| Optimizer | AdamW + cosine schedule |
+| Dataset | moro72842/cybersecurity-sft-dataset (~20K examples) |
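
The learning-rate settings in the table above (peak 4e-5, warmup ratio 0.15, cosine schedule) can be sketched as a plain function of the step count. This is an illustration only: the name `lr_at_step`, the decay to zero, and the effective batch size of 16 are assumptions, since the README states neither the batch size nor the trainer used.

```python
import math


def lr_at_step(step, total_steps, peak_lr=4e-5, warmup_ratio=0.15):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed floor)."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak rate
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0 (end of warmup) to 1 (last step)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length: 20,000 examples x 2 epochs at an ASSUMED
# effective batch size of 16 (not stated in the README).
total_steps = (20_000 * 2) // 16
warmup_end = round(total_steps * 0.15)

for step in (0, warmup_end, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {lr_at_step(step, total_steps):.2e}")
```

With a different batch size only `total_steps` changes; the shape of the curve stays the same because warmup is expressed as a ratio.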
+## Dataset Composition
+| Source | Count | Description |
+|--------|-------|-------------|
+| CVE Records | 10,000 | Multi-turn CVE analysis from 297K records |
+| Code Feedback | 5,000 | Code reasoning with iterative refinement |
+| OpenCodeReasoning | 5,000 | Chain-of-thought code problem solving |
+| Synthetic Security | 8 | JSON-structured CVE, AST, GDB, ROP examples |
+## Capabilities
+### JSON Structured Output
+The model was trained on examples whose responses contain a `<reasoning>` block followed by a fenced JSON object. Pattern:
+````
+<reasoning>
+Step-by-step analysis...
+</reasoning>
+```json
+{...structured output...}
+```
+````
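
A response in the pattern above can be split into its two parts with the standard library. The sketch below is one possible approach, not part of the model's tooling: `parse_response` is a hypothetical helper, and the JSON keys in the sample (`cve_id`, `severity`) are illustrative, since the README does not define a schema.

```python
import json
import re

# Literal triple backtick, built this way so it does not end this code fence
FENCE = "`" * 3


def parse_response(text):
    """Return (reasoning, payload); either is None if missing or invalid."""
    reasoning_match = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else None

    # Look for the first fenced block tagged as json
    json_match = re.search(FENCE + r"json\s*(.*?)" + FENCE, text, re.DOTALL)
    payload = None
    if json_match:
        try:
            payload = json.loads(json_match.group(1))
        except json.JSONDecodeError:
            payload = None  # the model emitted malformed JSON
    return reasoning, payload


# Hand-written sample reply (not real model output; keys are hypothetical)
sample = (
    "<reasoning>\nLog4j performs JNDI lookups on logged strings.\n</reasoning>\n"
    + FENCE + "json\n"
    + '{"cve_id": "CVE-2021-44228", "severity": "critical"}\n'
    + FENCE
)

reasoning, payload = parse_response(sample)
print(reasoning)
print(payload)
```

Because the format is learned through SFT rather than enforced at decode time, a caller should expect occasional replies where the payload comes back as `None` and handle them, for example by retrying.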
+### Cybersecurity Domains
+- Vulnerability analysis (CVE/CWE)
+- Static code analysis with AST parsing
+- Binary exploitation (ROP chains, buffer overflows)
+- Crash dump / GDB trace analysis
+- Threat intelligence (MITRE ATT&CK mapping)
+- Malware behavior classification
+- Network intrusion detection
+## Usage
+```python
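+from transformers import pipeline
+# Load the fine-tuned model; device_map="auto" requires the accelerate package
+pipe = pipeline("text-generation", model="moro72842/CyberCoder-7B-v1", torch_dtype="auto", device_map="auto")
+# Chat-format input: the system prompt asks for structured JSON output
+messages = [
+    {"role": "system", "content": "You are a cybersecurity expert. Provide detailed analysis with structured JSON output."},
+    {"role": "user", "content": "Analyze CVE-2021-44228 and provide the analysis as JSON."}
+]
+# A low temperature keeps sampling close to deterministic
+response = pipe(messages, max_new_tokens=2048, temperature=0.1)
+# generated_text holds the full conversation; the last message is the model's reply
+print(response[0]["generated_text"][-1]["content"])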
+```
+## Architecture & Efficiency Considerations
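+The notes below summarize design considerations from the training documentation for building cybersecurity-capable models. The first two concern larger production systems and do not describe this dense 7B model; the last two apply to it directly: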
+- **MoE consideration**: For production 100B+ models, sparse MoE (DeepSeek-V3 style) with 64-128 experts reduces active params to ~37B
+- **MLA attention**: Multi-Head Latent Attention compresses KV cache for long-context inference
+- **LoRA efficiency**: This 7B model uses LoRA (r=64), training only ~2% of parameters while achieving strong domain performance
+- **Structured output**: JSON structured output trained via SFT examples rather than constrained decoding (per RL-Struct findings)
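
The "~2% of parameters" figure in the LoRA bullet can be checked with a quick count. The sketch below is a rough estimate under stated assumptions: the layer dimensions are those of the public Qwen2.5-7B configuration, adapters are assumed to sit on all seven linear projections in each layer (the README does not list target modules), and the base size is rounded to 7.6B.

```python
# Dimensions ASSUMED from the public Qwen2.5-7B configuration
HIDDEN = 3584         # model width
KV_DIM = 512          # 4 key/value heads x 128 head dim (grouped-query attention)
INTERMEDIATE = 18944  # MLP inner width
LAYERS = 28
RANK = 64             # LoRA rank from the training recipe
BASE_PARAMS = 7.6e9   # approximate base model size


def lora_params(d_in, d_out, r=RANK):
    """A rank-r adapter adds A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# ASSUMED targets: every attention and MLP projection in each layer
per_layer = (
    lora_params(HIDDEN, HIDDEN)          # q_proj
    + lora_params(HIDDEN, KV_DIM)        # k_proj
    + lora_params(HIDDEN, KV_DIM)        # v_proj
    + lora_params(HIDDEN, HIDDEN)        # o_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # gate_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # up_proj
    + lora_params(INTERMEDIATE, HIDDEN)  # down_proj
)
total = per_layer * LAYERS
fraction = total / BASE_PARAMS

print(f"trainable LoRA parameters: {total:,}")
print(f"share of base model: {fraction:.2%}")
```

Targeting only the attention projections would give a noticeably smaller share, so the estimate matches the stated figure only under the all-projections assumption.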
+## License
+Apache 2.0 (inherited from Qwen2.5-Coder)