---
base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
library_name: transformers
model_name: security-auditor-grpo
tags:
- generated_from_trainer
- grpo
- trl
- security
- smart-contracts
- solidity
- audit
- web3
license: apache-2.0
datasets:
- oxdev/smart-contract-security-sft
- oxdev/smart-contract-security-audit-v2
pipeline_tag: text-generation
language:
- en
---

# 🔐 Smart Contract Security Auditor (GRPO)

A specialized **smart contract security auditor** built on [Qwen2.5-Coder-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct), fine-tuned using **Group Relative Policy Optimization (GRPO)**. The current V1 weights were trained on 327 synthetic samples; a V2 run on 50,902 real-world audit findings is prepared but has not been trained yet (see Training Details).

## 🎯 What It Does

Given a Solidity smart contract, this model identifies security vulnerabilities and produces structured audit findings with:

- Vulnerability classification (reentrancy, access control, oracle manipulation, etc.)
- Severity assessment (Critical/High/Medium/Low)
- Detailed description of the vulnerability
- Impact analysis
- Proof of concept exploit code
- Recommended fixes

## Quick Start

````python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    "oxdev/security-auditor-grpo",
    use_cache=True,  # Important: config has use_cache=False from training
)
tokenizer = AutoTokenizer.from_pretrained("oxdev/security-auditor-grpo")

# Use device="cpu" if no GPU is available.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="cuda")

messages = [
    {"role": "system", "content": "You are an expert smart contract security auditor. Analyze the provided Solidity code for vulnerabilities."},
    {"role": "user", "content": """Audit this contract:
```solidity
contract SimpleBank {
    mapping(address => uint256) public balances;

    function deposit() public payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) public {
        require(balances[msg.sender] >= amount);
        (bool success, ) = msg.sender.call{value: amount}("");
        require(success);
        balances[msg.sender] -= amount;
    }
}
```"""},
]

# Greedy decoding; return only the newly generated text.
result = pipe(messages, max_new_tokens=512, do_sample=False, return_full_text=False)
output = result[0]["generated_text"]

# Some pipeline versions return the chat as a list of messages; take the last one.
if isinstance(output, list):
    output = output[-1]["content"]
print(output)
````

## 🔗 Try It Live

**Interactive Demo:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo): side-by-side comparison with the base model, 7 test cases with known vulnerabilities, and automated scoring.

## Training Details

### V1 (Current Model)

- **Method:** GRPO (Group Relative Policy Optimization)
- **Base Model:** Qwen2.5-Coder-0.5B-Instruct
- **Dataset:** [oxdev/smart-contract-security-sft](https://huggingface.co/datasets/oxdev/smart-contract-security-sft) (327 synthetic samples)
- **Hardware:** NVIDIA T4 (16GB)
- **Epochs:** 2
- **Reward Functions:** Format compliance, finding rate
- **Results:**
  - Format reward: 0.025 → 0.40 (**16× improvement**)
  - Finding rate: 0% → 50-75%
  - Mean reward: -0.34 → -0.006

### V2 (Pending, Colab Notebook Ready)

- **Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2) (50,902 real audit findings)
- **Sources:** SkywardNomad92/smart-contract-audit-findings, samscrack/cyfrin-audit-findings, Solodit API
- **4 Reward Functions:** Format (0.25), Severity matching (0.25), Category matching (0.25), Quality (0.25)
- **Train on Colab:** Open [`train_grpo_v2_colab.ipynb`](https://huggingface.co/oxdev/security-auditor-grpo/blob/main/train_grpo_v2_colab.ipynb) in Google Colab with a free T4 GPU

## Vulnerability Categories Covered

| Category | Keywords |
|----------|----------|
| Reentrancy | reentrancy, reentrant, callback |
| Access Control | unauthorized, permission, onlyowner |
| Oracle Manipulation | price feed, chainlink, twap |
| Flash Loan | flash loan, flashloan |
| Overflow/Underflow | overflow, underflow, arithmetic |
| Front-running | front-run, sandwich, MEV |
| DoS | denial of service, gas limit, unbounded |
| Token Issues | fee-on-transfer, rebasing, ERC20 |
| Storage | storage collision, delegatecall, proxy |
| Cross-chain | bridge, relay, message passing |
| Liquidation | liquidation, collateral, health factor |
| Signature | ecrecover, replay, nonce, EIP712 |
| Initialization | uninitialized, constructor |
| Rounding | precision, truncation, decimal |

## Architecture

- **Model:** Qwen2ForCausalLM
- **Parameters:** 0.5B
- **Hidden Size:** 896
- **Layers:** 24
- **Attention Heads:** 14 (2 KV heads)
- **Context Length:** 32,768 tokens
- **Chat Template:** ChatML (`<|im_start|>` / `<|im_end|>`)

## ⚠️ Important Notes

1. **Set `use_cache=True`** when loading for inference. The saved config has `use_cache=False` from training, which makes generation 10-20× slower.
2. **This is a 0.5B model.** It is fast but less capable than larger models. Use it for quick triage, not as a replacement for professional audits.
3. **V1 was trained on 327 samples.** V2 training on roughly 50K real findings is expected to improve quality, but it has not been run yet.

## Files

| File | Description |
|------|-------------|
| `model.safetensors` | V1 trained model weights (1.8GB) |
| `train_grpo_job.py` | V1 training script |
| `train_grpo_v2.py` | V2 training script (4 reward functions) |
| `train_grpo_v2_colab.ipynb` | V2 Colab notebook (free T4 GPU) |
| `checkpoint-300/` | V1 training checkpoint |
| `checkpoint-326/` | V1 final checkpoint |

## Related Resources

- **GitHub:** [0xedev/skills](https://github.com/0xedev/skills): Pashov Audit Group AI-powered security skills
- **V2 Dataset:** [oxdev/smart-contract-security-audit-v2](https://huggingface.co/datasets/oxdev/smart-contract-security-audit-v2)
- **Demo Space:** [oxdev/security-auditor-demo](https://huggingface.co/spaces/oxdev/security-auditor-demo)

## Framework Versions

- TRL: 1.2.0
- Transformers: 5.6.2
- PyTorch: 2.6.0+cu126
- Datasets: 4.8.4

## Citations

```bibtex
@article{shao2024deepseekmath,
  title  = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
  author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and others},
  year   = 2024,
  eprint = {arXiv:2402.03300},
}
```
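
## Example: Keyword-Based Category Lookup

The keyword table under "Vulnerability Categories Covered" can be read as a simple lookup from keywords to categories. The sketch below is one possible way to apply it to a generated finding; it is an illustration only and is not taken from `train_grpo_job.py` or `train_grpo_v2.py`. The names `CATEGORY_KEYWORDS` and `detect_categories` are hypothetical, and plain case-insensitive substring matching is an assumption about how the keywords might be used.

```python
# Hypothetical sketch: keyword lists copied from the category table in this
# model card, lowercased. Substring matching is an assumption, not the
# documented behavior of the training scripts.
CATEGORY_KEYWORDS = {
    "Reentrancy": ["reentrancy", "reentrant", "callback"],
    "Access Control": ["unauthorized", "permission", "onlyowner"],
    "Oracle Manipulation": ["price feed", "chainlink", "twap"],
    "Flash Loan": ["flash loan", "flashloan"],
    "Overflow/Underflow": ["overflow", "underflow", "arithmetic"],
    "Front-running": ["front-run", "sandwich", "mev"],
    "DoS": ["denial of service", "gas limit", "unbounded"],
    "Token Issues": ["fee-on-transfer", "rebasing", "erc20"],
    "Storage": ["storage collision", "delegatecall", "proxy"],
    "Cross-chain": ["bridge", "relay", "message passing"],
    "Liquidation": ["liquidation", "collateral", "health factor"],
    "Signature": ["ecrecover", "replay", "nonce", "eip712"],
    "Initialization": ["uninitialized", "constructor"],
    "Rounding": ["precision", "truncation", "decimal"],
}


def detect_categories(finding_text: str) -> list[str]:
    """Return every category whose keywords appear in the finding text."""
    text = finding_text.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


finding = "The withdraw function is vulnerable to reentrancy through an external callback."
print(detect_categories(finding))  # ['Reentrancy']
```

Substring matching is cheap but coarse: short keywords such as `mev` or `proxy` can match unrelated words, so a stricter variant might match on word boundaries instead.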