---
license: mit
language:
- en
tags:
- security
- exploit-generation
- code-generation
- cybersecurity
- peft
- lora
base_model: codellama/CodeLlama-7b-hf
---
# PoCSmith - AI-Powered Proof-of-Concept Generator
A fine-tuned CodeLlama-7B model that generates security exploits and shellcode for defensive security research.
## Model Description
PoCSmith is a LoRA-adapted CodeLlama-7B model trained on 1,472 CVE-exploit pairs and shellcode examples. It generates proof-of-concept exploits and multi-platform shellcode for authorized security testing.
- **Author:** Regaan
- **GitHub:** [noobforanonymous/PoCSmith](https://github.com/noobforanonymous/PoCSmith)
## Training Details
- **Base Model:** CodeLlama-7B
- **Method:** QLoRA (4-bit quantization with LoRA adapters)
- **Dataset:** 1,472 samples (CVE-exploit pairs + shellcode)
- **Training Time:** 3 h 17 min on an RTX 4050 (6 GB VRAM)
- **Final Loss:** 0.84 (a 30% reduction over training)
- **Token Accuracy:** 78.4%
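The reported token accuracy is presumably the fraction of next-token predictions whose token id matches the reference sequence. A minimal sketch of that metric (the function name and toy token ids are illustrative, not from the training code):

```python
def token_accuracy(predicted, reference):
    """Fraction of positions where the predicted token id matches the reference."""
    assert len(predicted) == len(reference), "sequences must be aligned"
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)

# Toy example with made-up token ids: 3 of 4 positions match
print(token_accuracy([5, 12, 7, 9], [5, 12, 3, 9]))  # 0.75
```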
### Training Configuration
```yaml
lora_rank: 64
lora_alpha: 16
learning_rate: 2e-4
epochs: 3
quantization: 4-bit (nf4)
batch_size: 1        # effective batch size 4 via gradient accumulation x4
```
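Assuming the standard Hugging Face QLoRA stack (`transformers` + `peft` + `bitsandbytes`), the hyperparameters above correspond roughly to a configuration like the following sketch. This is not the author's actual training script, and the dropout value is an assumption:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization used for QLoRA fine-tuning (as listed above)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter settings matching the reported rank and alpha
lora_config = LoraConfig(
    r=64,               # LoRA rank
    lora_alpha=16,      # LoRA alpha
    lora_dropout=0.05,  # assumption: not stated in the card
    task_type="CAUSAL_LM",
)
```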
## Usage
### Installation
```bash
pip install torch transformers peft bitsandbytes accelerate
```
### Loading the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit NF4 quantization (matches the training setup).
# Note: passing load_in_4bit=True directly is deprecated in recent
# transformers releases; use a BitsAndBytesConfig instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

# Attach the PoCSmith LoRA adapters
model = PeftModel.from_pretrained(base_model, "regaan/pocsmith")

# Generate
prompt = "Generate a reverse shell for Linux x64"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Using the Full Framework
For a complete CLI tool with CVE parsing and shellcode generation:
```bash
git clone https://github.com/noobforanonymous/PoCSmith.git
cd PoCSmith
pip install -e .
# Generate exploit from CVE
python src/cli/main.py cve CVE-2024-1234
# Generate shellcode
python src/cli/main.py shellcode --platform linux_x64 --type reverse_shell
```
## Capabilities
- **CVE-based Exploit Generation:** Generate PoCs from CVE descriptions
- **Multi-platform Shellcode:** x86, x64, ARM support
- **Multiple Payload Types:** Reverse shells, bind shells, exec
- **Clean Output:** Properly formatted code with comments
## Limitations
- Requires 6GB+ VRAM for inference
- May generate non-working code for complex vulnerabilities
- Should not be relied on as the sole source of production-ready exploits
- All output requires manual review and testing
## Ethical Use
This model is designed exclusively for:
- Authorized penetration testing
- Security research
- Educational purposes
- CTF competitions
**NOT for:**
- Unauthorized system access
- Malicious attacks
- Illegal activities
By using this model, you agree to:
1. Only test systems you own or have written permission to test
2. Follow responsible disclosure practices
3. Comply with all applicable laws
## Citation
```bibtex
@software{pocsmith2025,
  author = {Regaan},
  title  = {PoCSmith: AI-Powered Proof-of-Concept Generator},
  year   = {2025},
  url    = {https://github.com/noobforanonymous/PoCSmith}
}
```
## License
MIT License. See the [LICENSE](https://github.com/noobforanonymous/PoCSmith/blob/main/LICENSE) file.
---
- **Version:** 1.0
- **Model Size:** 343 MB (LoRA adapters)
- **Base Model Size:** 13 GB (CodeLlama-7B)