
# CyberCoder-7B-v1 🛡️

A cybersecurity-focused code model fine-tuned from Qwen/Qwen2.5-Coder-7B-Instruct for:

- CVE vulnerability analysis with structured JSON output
- AST-based code security review
- GDB crash trace analysis and exploitability assessment
- ROP chain construction and binary exploitation
- MITRE ATT&CK mapping and threat intelligence
- Code reasoning with chain-of-thought

## Training Recipe

Based on the CyberPal 2.0 methodology:

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Method | SFT with LoRA (r=64, α=128) |
| Learning rate | 4e-5 |
| Warmup ratio | 0.15 |
| Epochs | 2 |
| Max seq length | 4096 |
| Optimizer | AdamW + cosine schedule |
| Dataset | moro72842/cybersecurity-sft-dataset (20K examples) |
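
The exact training script is not released; below is a minimal sketch of the recipe using TRL and PEFT, under the assumption that a standard `SFTTrainer` run reproduces the table. Argument names follow recent `trl`/`peft` releases and may differ across versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# 20K-example cybersecurity SFT mixture from the table above.
dataset = load_dataset("moro72842/cybersecurity-sft-dataset", split="train")

peft_config = LoraConfig(r=64, lora_alpha=128, task_type="CAUSAL_LM")

args = SFTConfig(
    output_dir="CyberCoder-7B-v1",
    learning_rate=4e-5,
    warmup_ratio=0.15,
    num_train_epochs=2,
    max_seq_length=4096,          # renamed to max_length in recent trl releases
    optim="adamw_torch",
    lr_scheduler_type="cosine",   # AdamW + cosine schedule
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```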

## Dataset Composition

| Source | Count | Description |
|---|---|---|
| CVE Records | 10,000 | Multi-turn CVE analysis from 297K records |
| Code Feedback | 5,000 | Code reasoning with iterative refinement |
| OpenCodeReasoning | 5,000 | Chain-of-thought code problem solving |
| Synthetic Security | 8 | JSON-structured CVE, AST, GDB, ROP examples |
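
The mixture can be inspected directly; its schema is not documented in this card, so the sketch below only prints what the dataset exposes:

```python
from datasets import load_dataset

# Pull the SFT mixture from the Hub and take a quick look at its shape.
ds = load_dataset("moro72842/cybersecurity-sft-dataset", split="train")
print(len(ds))          # on the order of 20K examples
print(ds.column_names)  # schema is undocumented here, so inspect it
```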

## Capabilities

### JSON Structured Output

The model was trained on examples that pair a `<reasoning>` block with a fenced JSON payload, following this pattern:

```
<reasoning>
Step-by-step analysis...
</reasoning>
```

```json
{...structured output...}
```
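
A small helper (not part of the model card; the function name is illustrative) can split a completion into its reasoning text and JSON payload:

```python
import json
import re

def parse_response(text: str) -> tuple[str, dict]:
    """Split a completion into its <reasoning> text and parsed JSON payload."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    payload = re.search(r"```json\s*(.*?)```", text, re.DOTALL)
    if payload is None:
        raise ValueError("no ```json block found in the model output")
    return (
        reasoning.group(1).strip() if reasoning else "",
        json.loads(payload.group(1)),
    )
```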

### Cybersecurity Domains
- Vulnerability analysis (CVE/CWE)
- Static code analysis with AST parsing
- Binary exploitation (ROP chains, buffer overflows)
- Crash dump / GDB trace analysis
- Threat intelligence (MITRE ATT&CK mapping)
- Malware behavior classification
- Network intrusion detection

## Usage

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="moro72842/CyberCoder-7B-v1",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a cybersecurity expert. Provide detailed analysis with structured JSON output."},
    {"role": "user", "content": "Analyze CVE-2021-44228 and provide the analysis as JSON."},
]

# Sampling must be enabled for the low temperature to take effect.
response = pipe(messages, max_new_tokens=2048, do_sample=True, temperature=0.1)
print(response[0]["generated_text"][-1]["content"])
```
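
The completion follows the `<reasoning>` + JSON pattern described above, so it can be post-processed with a parser like the sketch under JSON Structured Output.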

## Architecture & Efficiency Considerations

This model demonstrates the approach described in the training documentation for building cybersecurity-capable models:

- MoE consideration: For production 100B+ models, sparse MoE (DeepSeek-V3 style) with 64-128 experts reduces active parameters to ~37B
- MLA attention: Multi-Head Latent Attention compresses the KV cache for long-context inference
- LoRA efficiency: This 7B model uses LoRA (r=64), training only ~2% of parameters while achieving strong domain performance (a back-of-the-envelope check follows below)
- Structured output: JSON structured output is trained via SFT examples rather than constrained decoding (per RL-Struct findings)
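
The ~2% figure can be sanity-checked with a quick calculation, assuming LoRA adapters (r=64) on every attention and MLP projection; the layer dimensions below are Qwen2.5-7B's published config values, while the exact set of target modules used for this model is an assumption:

```python
r = 64
hidden, inter, layers = 3584, 18944, 28
kv_dim = 4 * 128  # 4 KV heads x head_dim 128 (grouped-query attention)

# LoRA adds r * (d_in + d_out) parameters per adapted weight matrix.
attn = 2 * r * (hidden + hidden) + 2 * r * (hidden + kv_dim)  # q,o + k,v
mlp = 2 * r * (hidden + inter) + r * (inter + hidden)         # gate,up + down
trainable = layers * (attn + mlp)

print(f"{trainable / 1e6:.0f}M LoRA params -> {trainable / 7.6e9:.1%} of 7.6B")
# ~161M -> ~2.1%, consistent with the ~2% figure above
```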

## License

Apache 2.0 (inherited from Qwen2.5-Coder)
