---
license: mit
language:
- en
tags:
- security
- exploit-generation
- code-generation
- cybersecurity
- peft
- lora
base_model: codellama/CodeLlama-7b-hf
---

# PoCSmith - AI-Powered Proof-of-Concept Generator

Fine-tuned CodeLlama-7B model for generating security exploits and shellcode for defensive security research.

## Model Description

PoCSmith is a LoRA-adapted CodeLlama-7B model trained on 1,472 CVE-exploit pairs and shellcode examples. It generates proof-of-concept exploits and multi-platform shellcode for authorized security testing.

**Author:** Regaan
**GitHub:** [noobforanonymous/PoCSmith](https://github.com/noobforanonymous/PoCSmith)

## Training Details

- **Base Model:** CodeLlama-7B
- **Method:** QLoRA (LoRA adapters trained on a 4-bit quantized base model)
- **Dataset:** 1,472 samples (CVE-exploit pairs + shellcode)
- **Training Time:** 3h 17min on an RTX 4050 (6GB VRAM)
- **Final Loss:** 0.84 (30% reduction)
- **Token Accuracy:** 78.4%
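
As a quick sanity check on the loss figure: if the 30% reduction is measured relative to the loss at the start of training (an assumption; the card does not state the baseline), the starting loss would have been about 1.2. A minimal sketch of that arithmetic:

```python
# Assumption: "30% reduction" is relative to the loss at the start of training.
final_loss = 0.84
relative_reduction = 0.30

# final = initial * (1 - reduction)  =>  initial = final / (1 - reduction)
implied_initial_loss = final_loss / (1 - relative_reduction)
print(f"Implied initial loss: {implied_initial_loss:.2f}")
```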

### Training Configuration

- **LoRA Rank:** 64
- **LoRA Alpha:** 16
- **Learning Rate:** 2e-4
- **Epochs:** 3
- **Quantization:** 4-bit (NF4)
- **Batch Size:** 1 (gradient accumulation x4)
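
A few derived quantities follow from these settings. The sketch below is illustrative arithmetic only: it assumes standard LoRA scaling (`alpha / rank`), a single GPU, and that all 1,472 samples were used for training with no held-out split, none of which the card states explicitly.

```python
import math

# Values taken from the training configuration above.
lora_rank = 64
lora_alpha = 16
per_device_batch_size = 1
gradient_accumulation_steps = 4
epochs = 3

# Assumption: every one of the 1,472 samples is used for training.
num_samples = 1472

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Number of samples consumed per optimizer step on a single GPU.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(lora_scaling, effective_batch_size, steps_per_epoch, total_steps)
```

Under these assumptions the adapter update is scaled by 0.25 and training runs for roughly 1,100 optimizer steps.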

## Usage

### Installation

```bash
pip install torch transformers peft bitsandbytes accelerate
```

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model with 4-bit NF4 quantization (requires bitsandbytes and a CUDA GPU).
# The float16 compute dtype is an assumption; adjust it for your hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

# Load LoRA adapters on top of the quantized base model
model = PeftModel.from_pretrained(base_model, "regaan/pocsmith")
model.eval()

# Generate
prompt = "Generate a reverse shell for Linux x64"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
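
For a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string repeats the prompt. A minimal sketch of trimming it off is shown below; `continuation_ids` is a hypothetical helper name, demonstrated on a plain list so that it runs without a GPU.

```python
def continuation_ids(output_ids, prompt_length):
    """Return only the tokens generated after the prompt.

    Hypothetical helper: works on any sliceable sequence (list or 1-D tensor).
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


# Toy example: the first three IDs stand in for the prompt tokens.
print(continuation_ids([101, 102, 103, 7, 8, 9], 3))
```

With the loading code above, the same idea would be applied as `continuation_ids(outputs[0], inputs["input_ids"].shape[1])` before calling `tokenizer.decode`.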

### Using the Full Framework

For a complete CLI tool with CVE parsing and shellcode generation:

```bash
git clone https://github.com/noobforanonymous/PoCSmith.git
cd PoCSmith
pip install -e .

# Generate exploit from CVE
python src/cli/main.py cve CVE-2024-1234

# Generate shellcode
python src/cli/main.py shellcode --platform linux_x64 --type reverse_shell
```
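
The `cve` subcommand takes a CVE identifier as its argument. A minimal sketch of checking the identifier's format before invoking the CLI is shown below. `is_valid_cve_id` is a hypothetical helper and not part of PoCSmith; it checks only the `CVE-YYYY-NNNN` syntax (four-digit year, sequence number of four or more digits), not whether the entry exists.

```python
import re

# CVE ID syntax: "CVE-", a four-digit year, "-", then four or more digits.
CVE_ID_PATTERN = re.compile(r"CVE-[0-9]{4}-[0-9]{4,}")


def is_valid_cve_id(cve_id: str) -> bool:
    """Return True if the string is a syntactically valid CVE identifier."""
    return CVE_ID_PATTERN.fullmatch(cve_id) is not None


print(is_valid_cve_id("CVE-2024-1234"))
print(is_valid_cve_id("CVE-24-1234"))
```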

## Capabilities

- **CVE-based Exploit Generation:** Generate PoCs from CVE descriptions
- **Multi-platform Shellcode:** x86, x64, ARM support
- **Multiple Payload Types:** Reverse shells, bind shells, exec
- **Clean Output:** Properly formatted code with comments

## Limitations

- Requires a GPU with 6GB+ VRAM for inference
- May generate non-working code for complex vulnerabilities
- Should not be relied on as the sole source for production exploits
- Output requires manual review and testing
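
The 6GB figure is consistent with a back-of-the-envelope estimate of weight memory. The sketch below assumes roughly 6.7 billion parameters for CodeLlama-7B (an approximation) and counts only the weights; activations, the KV cache, quantization constants, and the LoRA adapters add overhead on top.

```python
# Assumption: approximate parameter count of CodeLlama-7B.
APPROX_PARAMS = 6.7e9


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(APPROX_PARAMS, 16)
nf4_gb = weight_memory_gb(APPROX_PARAMS, 4)
print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.2f} GB")
```

The fp16 estimate lines up with the 13GB base model size listed at the end of this card, and the 4-bit estimate leaves some headroom on a 6GB GPU.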

## Ethical Use

This model is designed exclusively for:
- Authorized penetration testing
- Security research
- Educational purposes
- CTF competitions

**NOT for:**
- Unauthorized system access
- Malicious attacks
- Illegal activities

By using this model, you agree to:
1. Only test systems you own or have written permission to test
2. Follow responsible disclosure practices
3. Comply with all applicable laws

## Citation

```bibtex
@software{pocsmith2024,
  author = {Regaan},
  title = {PoCSmith: AI-Powered Proof-of-Concept Generator},
  year = {2025},
  url = {https://github.com/noobforanonymous/PoCSmith}
}
```

## License

MIT License. See the [LICENSE](https://github.com/noobforanonymous/PoCSmith/blob/main/LICENSE) file.

---

**Version:** 1.0
**Model Size:** 343MB (LoRA adapters)
**Base Model Size:** 13GB (CodeLlama-7B)