---
base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
library_name: transformers
model_name: security-auditor-grpo
tags:
- generated_from_trainer
- grpo
- trl
- security
- smart-contracts
- solidity
- audit
- web3
license: apache-2.0
datasets:
- oxdev/smart-contract-security-sft
- oxdev/smart-contract-security-audit-v2
pipeline_tag: text-generation
language:
- en
---

# Smart Contract Security Auditor (GRPO)

A specialized **smart contract security auditor** built on [Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct), fine-tuned using **Group Relative Policy Optimization (GRPO)**. The current V1 weights were trained on 327 synthetic samples; a V2 run on real-world audit findings is pending (see Training Details).

## 🎯 What It Does

Given a Solidity smart contract, this model identifies security vulnerabilities and produces structured audit findings with:
- Vulnerability classification (reentrancy, access control, oracle manipulation, etc.)
- Severity assessment (Critical/High/Medium/Low)
- Detailed description of the vulnerability
- Impact analysis
- Proof-of-concept exploit code
- Recommended fixes

## Quick Start

````python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    "oxdev/security-auditor-grpo",
    use_cache=True,  # Important: config has use_cache=False from training
)
tokenizer = AutoTokenizer.from_pretrained("oxdev/security-auditor-grpo")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="cuda")

messages = [
    {"role": "system", "content": "You are an expert smart contract security auditor. Analyze the provided Solidity code for vulnerabilities."},
    {"role": "user", "content": """Audit this contract:
```solidity
contract SimpleBank {
    mapping(address => uint256) public balances;
    function deposit() public payable { balances[msg.sender] += msg.value; }
    function withdraw(uint256 amount) public {
        require(balances[msg.sender] >= amount);
        (bool success, ) = msg.sender.call{value: amount}("");
        require(success);
        balances[msg.sender] -= amount;
    }
}
```"""},
]

result = pipe(messages, max_new_tokens=512, do_sample=False, return_full_text=False)
output = result[0]["generated_text"]
# The pipeline may return either a plain string or a list of chat messages.
if isinstance(output, list):
    output = output[-1]["content"]
print(output)
````

## Try It Live

**Interactive Demo:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo): side-by-side comparison with the base model, 7 test cases with known vulnerabilities, and automated scoring.

## Training Details

### V1 (Current Model)
- **Method:** GRPO (Group Relative Policy Optimization)
- **Base Model:** Qwen2.5-Coder-0.5B-Instruct
- **Dataset:** [oxdev/smart-contract-security-sft](https://huggingface.co/datasets/oxdev/smart-contract-security-sft) (327 synthetic samples)
- **Hardware:** NVIDIA T4 (16GB)
- **Epochs:** 2
- **Reward Functions:** Format compliance, finding rate
- **Results:**
  - Format reward: 0.025 → 0.40 (**16× improvement**)
  - Finding rate: 0% → 50-75%
  - Mean reward: -0.34 → -0.006
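
GRPO scores each sampled completion relative to the other completions drawn for the same prompt, instead of against a learned value function. The snippet below is a minimal sketch of that normalization as described in the DeepSeekMath paper, not the training script from this repository. The helper name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the sample rewards are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize one prompt's group of rewards to zero mean and unit scale.

    Hypothetical helper for illustration only. `eps` guards against a zero
    standard deviation when every completion receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions of the same audit prompt.
advantages = group_relative_advantages([-0.5, 0.0, 0.25, 0.25])
print([round(a, 3) for a in advantages])
```

Completions that beat their group's mean get a positive advantage and are reinforced; those below the mean are pushed down. When every completion in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.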

### V2 (Pending, Colab Notebook Ready)
- **Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2) (50,902 real audit findings)
- **Sources:** SkywardNomad92/smart-contract-audit-findings, samscrack/cyfrin-audit-findings, Solodit API
- **4 Reward Functions:** Format (0.25), Severity matching (0.25), Category matching (0.25), Quality (0.25)
- **Train on Colab:** Open [`train_grpo_v2_colab.ipynb`](https://huggingface.co/oxdev/security-auditor-grpo/blob/main/train_grpo_v2_colab.ipynb) in Google Colab with a free T4 GPU
| ## Vulnerability Categories Covered |
| |
| | Category | Keywords | |
| |----------|----------| |
| | Reentrancy | reentrancy, reentrant, callback | |
| | Access Control | unauthorized, permission, onlyowner | |
| | Oracle Manipulation | price feed, chainlink, twap | |
| | Flash Loan | flash loan, flashloan | |
| | Overflow/Underflow | overflow, underflow, arithmetic | |
| | Front-running | front-run, sandwich, MEV | |
| | DoS | denial of service, gas limit, unbounded | |
| | Token Issues | fee-on-transfer, rebasing, ERC20 | |
| | Storage | storage collision, delegatecall, proxy | |
| | Cross-chain | bridge, relay, message passing | |
| | Liquidation | liquidation, collateral, health factor | |
| | Signature | ecrecover, replay, nonce, EIP712 | |
| | Initialization | uninitialized, constructor | |
| | Rounding | precision, truncation, decimal | |
| |
| ## Architecture |
| |
| - **Model:** Qwen2ForCausalLM |
| - **Parameters:** 0.5B |
| - **Hidden Size:** 896 |
| - **Layers:** 24 |
| - **Attention Heads:** 14 (2 KV heads) |
| - **Context Length:** 32,768 tokens |
| - **Chat Template:** ChatML (`<|im_start|>` / `<|im_end|>`) |
| |
| ## β οΈ Important Notes |
| |
| 1. **Set `use_cache=True`** when loading for inference β the saved config has `use_cache=False` from training, which makes generation 10-20Γ slower |
| 2. **This is a 0.5B model** β it's fast but not as capable as larger models. Use it for quick triage, not as a replacement for professional audits |
| 3. **V1 was trained on 327 samples** β V2 training on 50K real findings will significantly improve quality |

## Files

| File | Description |
|------|-------------|
| `model.safetensors` | V1 trained model weights (1.8GB) |
| `train_grpo_job.py` | V1 training script |
| `train_grpo_v2.py` | V2 training script (4 reward functions) |
| `train_grpo_v2_colab.ipynb` | V2 Colab notebook (free T4 GPU) |
| `checkpoint-300/` | V1 training checkpoint |
| `checkpoint-326/` | V1 final checkpoint |

## Related Resources

- **GitHub:** [0xedev/skills](https://github.com/0xedev/skills): Pashov Audit Group AI-powered security skills
- **V2 Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2)
- **Demo Space:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo)

## Framework Versions

- TRL: 1.2.0
- Transformers: 5.6.2
- PyTorch: 2.6.0+cu126
- Datasets: 4.8.4

## Citations

```bibtex
@article{shao2024deepseekmath,
  title  = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
  author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and others},
  year   = 2024,
  eprint = {arXiv:2402.03300},
}
```