---
base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
library_name: transformers
model_name: security-auditor-grpo
tags:
- generated_from_trainer
- grpo
- trl
- security
- smart-contracts
- solidity
- audit
- web3
license: apache-2.0
datasets:
- oxdev/smart-contract-security-sft
- oxdev/smart-contract-security-audit-v2
pipeline_tag: text-generation
language:
- en
---

# Smart Contract Security Auditor (GRPO)

A specialized **smart contract security auditor** built on [Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct) and fine-tuned with **Group Relative Policy Optimization (GRPO)**. The current V1 weights were trained on 327 synthetic samples; V2 training on real-world audit findings is pending (see Training Details).

## 🎯 What It Does

Given a Solidity smart contract, this model identifies security vulnerabilities and produces structured audit findings with:
- Vulnerability classification (reentrancy, access control, oracle manipulation, etc.)
- Severity assessment (Critical/High/Medium/Low)
- Detailed description of the vulnerability
- Impact analysis
- Proof of concept exploit code
- Recommended fixes
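
The fields listed above can be pictured as a simple record. The sketch below is only an illustration: the class and field names are hypothetical, the example values are made up for the `SimpleBank` contract used in Quick Start, and the model itself emits free text rather than this exact structure.

```python
from dataclasses import dataclass, asdict
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Finding:
    # Hypothetical field names that mirror the bullet list above.
    category: str
    severity: Severity
    description: str
    impact: str
    proof_of_concept: str
    recommendation: str


# Illustrative values only, not real model output.
finding = Finding(
    category="Reentrancy",
    severity=Severity.HIGH,
    description="withdraw() sends ETH before updating balances.",
    impact="An attacker can re-enter withdraw() and drain the contract.",
    proof_of_concept="// attacker's receive() calls bank.withdraw(amount) again",
    recommendation="Update balances before the external call.",
)
record = asdict(finding)  # plain dict, convenient for JSON export or scoring
```

A record like this makes it easier to compare outputs across runs, but any parser that fills it from the model's text would need to be written against the actual output format.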

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    "oxdev/security-auditor-grpo",
    use_cache=True,  # Important: config has use_cache=False from training
)
tokenizer = AutoTokenizer.from_pretrained("oxdev/security-auditor-grpo")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="cuda")

messages = [
    {"role": "system", "content": "You are an expert smart contract security auditor. Analyze the provided Solidity code for vulnerabilities."},
    {"role": "user", "content": """Audit this contract:
```solidity
contract SimpleBank {
    mapping(address => uint256) public balances;
    function deposit() public payable { balances[msg.sender] += msg.value; }
    function withdraw(uint256 amount) public {
        require(balances[msg.sender] >= amount);
        (bool success, ) = msg.sender.call{value: amount}("");
        require(success);
        balances[msg.sender] -= amount;
    }
}
```"""},
]

result = pipe(messages, max_new_tokens=512, do_sample=False, return_full_text=False)
output = result[0]["generated_text"]
if isinstance(output, list):
    output = output[-1]["content"]
print(output)
```

## 🔗 Try It Live

**Interactive Demo:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo). Side-by-side comparison with the base model, 7 test cases with known vulnerabilities, and automated scoring.

## Training Details

### V1 (Current Model)
- **Method:** GRPO (Group Relative Policy Optimization)
- **Base Model:** Qwen2.5-Coder-0.5B-Instruct
- **Dataset:** [oxdev/smart-contract-security-sft](https://huggingface.co/datasets/oxdev/smart-contract-security-sft) (327 synthetic samples)
- **Hardware:** NVIDIA T4 (16GB)
- **Epochs:** 2
- **Reward Functions:** Format compliance, finding rate
- **Results:**
  - Format reward: 0.025 → 0.40 (**16× improvement**)
  - Finding rate: 0% → 50-75%
  - Mean reward: -0.34 → -0.006
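
For readers new to GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to its own group. The following is a minimal sketch of that normalization step only, not the training script in this repository. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its own group's mean and standard deviation."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # assumption: population std; libraries may differ
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt.
rewards = [-0.5, 0.0, 0.5, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean get a positive advantage and are reinforced; those below the mean get a negative one.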

### V2 (Pending: Colab Notebook Ready)
- **Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2) (50,902 real audit findings)
- **Sources:** SkywardNomad92/smart-contract-audit-findings, samscrack/cyfrin-audit-findings, Solodit API
- **4 Reward Functions:** Format (0.25), Severity matching (0.25), Category matching (0.25), Quality (0.25)
- **Train on Colab:** Open [`train_grpo_v2_colab.ipynb`](https://huggingface.co/oxdev/security-auditor-grpo/blob/main/train_grpo_v2_colab.ipynb) in Google Colab with a free T4 GPU
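
One way to read the four reward components is as an equally weighted sum. The sketch below assumes the `0.25` values are weights and that each component returns a score in `[0, 1]`; neither is confirmed here, and the names are hypothetical rather than taken from `train_grpo_v2.py`.

```python
# Assumed weights, taken from the (0.25) values listed for each reward function.
WEIGHTS = {"format": 0.25, "severity": 0.25, "category": 0.25, "quality": 0.25}


def combined_reward(scores):
    """Weighted sum of per-component scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one completion: well formatted, correct severity,
# wrong category, middling quality.
scores = {"format": 1.0, "severity": 1.0, "category": 0.0, "quality": 0.5}
total = combined_reward(scores)
```

With equal weights, no single component can contribute more than a quarter of the maximum reward.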

## Vulnerability Categories Covered

| Category | Keywords |
|----------|----------|
| Reentrancy | reentrancy, reentrant, callback |
| Access Control | unauthorized, permission, onlyowner |
| Oracle Manipulation | price feed, chainlink, twap |
| Flash Loan | flash loan, flashloan |
| Overflow/Underflow | overflow, underflow, arithmetic |
| Front-running | front-run, sandwich, MEV |
| DoS | denial of service, gas limit, unbounded |
| Token Issues | fee-on-transfer, rebasing, ERC20 |
| Storage | storage collision, delegatecall, proxy |
| Cross-chain | bridge, relay, message passing |
| Liquidation | liquidation, collateral, health factor |
| Signature | ecrecover, replay, nonce, EIP712 |
| Initialization | uninitialized, constructor |
| Rounding | precision, truncation, decimal |
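
The keyword table can be used as a rough category tagger for audit text. This is a minimal sketch using case-insensitive substring matching; the function name is hypothetical, and it is an assumption that the training or scoring code matches keywords in this way.

```python
# Keywords copied from the table above, lowercased for matching.
CATEGORY_KEYWORDS = {
    "Reentrancy": ["reentrancy", "reentrant", "callback"],
    "Access Control": ["unauthorized", "permission", "onlyowner"],
    "Oracle Manipulation": ["price feed", "chainlink", "twap"],
    "Flash Loan": ["flash loan", "flashloan"],
    "Overflow/Underflow": ["overflow", "underflow", "arithmetic"],
    "Front-running": ["front-run", "sandwich", "mev"],
    "DoS": ["denial of service", "gas limit", "unbounded"],
    "Token Issues": ["fee-on-transfer", "rebasing", "erc20"],
    "Storage": ["storage collision", "delegatecall", "proxy"],
    "Cross-chain": ["bridge", "relay", "message passing"],
    "Liquidation": ["liquidation", "collateral", "health factor"],
    "Signature": ["ecrecover", "replay", "nonce", "eip712"],
    "Initialization": ["uninitialized", "constructor"],
    "Rounding": ["precision", "truncation", "decimal"],
}


def match_categories(text):
    """Return every category with at least one keyword in the text."""
    lowered = text.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


tags = match_categories("withdraw() is vulnerable to reentrancy via a callback")
```

Plain substring matching is crude: a short keyword such as `mev` can match inside unrelated words, so word-boundary matching may be a better fit for real use.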

## Architecture

- **Model:** Qwen2ForCausalLM
- **Parameters:** 0.5B
- **Hidden Size:** 896
- **Layers:** 24
- **Attention Heads:** 14 (2 KV heads)
- **Context Length:** 32,768 tokens
- **Chat Template:** ChatML (`<|im_start|>` / `<|im_end|>`)
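
Two of these details can be made concrete with a short sketch: the per-head dimension implied by the numbers above, and the ChatML layout of a prompt. The helper `to_chatml` is hypothetical and simplified. In practice `tokenizer.apply_chat_template` should be preferred, since the bundled template may add details this sketch omits, such as a default system message.

```python
HIDDEN_SIZE = 896
NUM_HEADS = 14
NUM_KV_HEADS = 2

head_dim = HIDDEN_SIZE // NUM_HEADS             # dimension of each attention head
queries_per_kv = NUM_HEADS // NUM_KV_HEADS      # query heads sharing one KV head


def to_chatml(messages, add_generation_prompt=True):
    """Render chat messages in ChatML (simplified illustration)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are an expert smart contract security auditor."},
    {"role": "user", "content": "Audit this contract: ..."},
])
```

With 14 query heads and 2 KV heads, seven query heads share each key/value head, which keeps the KV cache small relative to full multi-head attention.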

## ⚠️ Important Notes

1. **Set `use_cache=True`** when loading for inference. The saved config has `use_cache=False` from training, which makes generation 10-20× slower.
2. **This is a 0.5B model.** It is fast but less capable than larger models. Use it for quick triage, not as a replacement for professional audits.
3. **V1 was trained on 327 samples.** V2 training on roughly 50K real findings is expected to improve quality.

## Files

| File | Description |
|------|-------------|
| `model.safetensors` | V1 trained model weights (1.8GB) |
| `train_grpo_job.py` | V1 training script |
| `train_grpo_v2.py` | V2 training script (4 reward functions) |
| `train_grpo_v2_colab.ipynb` | V2 Colab notebook (free T4 GPU) |
| `checkpoint-300/` | V1 training checkpoint |
| `checkpoint-326/` | V1 final checkpoint |

## Related Resources

- **GitHub:** [0xedev/skills](https://github.com/0xedev/skills): Pashov Audit Group AI-powered security skills
- **V2 Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2)
- **Demo Space:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo)

## Framework Versions

- TRL: 1.2.0
- Transformers: 5.6.2
- PyTorch: 2.6.0+cu126
- Datasets: 4.8.4

## Citations

```bibtex
@article{shao2024deepseekmath,
    title   = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author  = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and others},
    year    = 2024,
    eprint  = {arXiv:2402.03300},
}
```