File size: 1,750 Bytes
4e1a594 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3c015d5 3ce6cd4 8c5da4b 3ce6cd4 4e1a594 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 8c5da4b 3ce6cd4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | ---
title: SENTINEL Autonomous Pentesting Agent
emoji: 🛡️
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.36.1
app_file: app.py
pinned: false
python_version: 3.10.13
license: apache-2.0
short_description: Fine-tuned Llama-3-8B that autonomously exploits web vulns
tags:
- security
- llama-3
- autonomous-agent
- web-pentesting
- sql-injection
- cybersecurity
---
# 🛡️ SENTINEL — Autonomous Web Pentesting Agent
**SENTINEL** is a fine-tuned **Llama-3-8B-Instruct** model trained via SFT+GRPO to autonomously reason about web application vulnerabilities and generate exploit payloads.
## What it does
Given a **goal** (e.g. `AUTHENTICATED`, `DATA_EXFILTRATED`) and an **HTML snippet** (the current page DOM), SENTINEL outputs a single structured JSON action — exactly like a human pentester would decide their next move.
```json
{
"Thought": "Login form with username/password fields on a .php endpoint — classic SQLi target.",
"Action": "SQL_INJECT",
"Action_Input": {
"target_url": "http://target/login.php",
"method": "POST",
"parameters": {"username": "admin'--", "password": "x"},
"rationale": "OR-tautology bypass on username field"
}
}
```
## Model Details
- **Base model:** `meta-llama/Meta-Llama-3-8B-Instruct`
- **Fine-tuning:** SFT on curated web-exploit trajectories + GRPO reward shaping
- **Quantization:** Q5_K_M GGUF (~5.7 GB), served via `llama-cpp-python`
- **The GGUF weights** are hosted in a separate model repo and downloaded at runtime to bypass the Space 1 GB git limit.
> ⚠️ **Authorized testing only.** SENTINEL is designed for use against intentionally vulnerable targets (DVWA, Juice Shop, HackTheBox, etc.). Do not use against systems you do not own.
|