---
language:
- en
- it
tags:
- cybersecurity
- red-team
- ctf
- penetration-testing
- offensive-security
- uncensored
- llama-cpp
- gguf
base_model: Qwen/Qwen3.5-9B
license: apache-2.0
---
<p align="center">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/banner.PNG" alt="Piccolo Gorgone Banner"/>
</p>
# πŸ™ Piccolo Gorgone
![Base Model](https://img.shields.io/badge/Base-Qwen3.5--9B-blue?style=flat-square)
![Fine-tuning](https://img.shields.io/badge/Fine--tuned-Unsloth%20%2B%20QLoRA-orange?style=flat-square)
![Domain](https://img.shields.io/badge/Domain-Offensive%20Security-red?style=flat-square)
![Censorship](https://img.shields.io/badge/Censorship-None-black?style=flat-square)
![Format](https://img.shields.io/badge/Format-GGUF%20Q4__K__M-green?style=flat-square)
<p align="center">
Developed by <strong>CorryL</strong> β€” Penetration Tester & Ethical Hacker
</p>
---
**Piccolo Gorgone** is a Large Language Model fine-tuned for **red team operations**, **CTF competitions**, and **offensive cybersecurity**. Built on Qwen 3.5 9B and trained on a curated dataset of over **16,000 real-world offensive security examples**, it delivers technically precise and direct responses without the safety restrictions that limit general-purpose models. Piccolo Gorgone is fully **agentic** and natively integrates with the most widely used offensive security frameworks, enabling automated and orchestrated workflows directly from your existing toolchain.
---
## Local Execution & Privacy
Piccolo Gorgone was designed from the ground up to run on **local consumer hardware**, with no dependency on cloud APIs or external services. The choice of a 9B parameter model is deliberate: it represents the optimal balance between **technical capability** and **accessible hardware requirements**, enabling execution on a single consumer GPU with Q4_K_M quantization.
This approach ensures that all sensitive information — penetration test reports, vulnerability details, client data — stays **exclusively on your machine**, without ever passing through third-party servers.
---
## Intended Use
This model is designed for:
- Professional **penetration testers** and red teamers operating in authorized environments
- **CTF competitors** (HackTheBox, CTFtime, and similar platforms)
- Offensive security **researchers and instructors**
- Security teams performing **threat modeling and attack simulation**
> ⚠️ **Disclaimer:** This model is intended exclusively for ethical and professional use in authorized environments. The author bears no responsibility for illegal or unauthorized use.
---
## Agentic Integration
Piccolo Gorgone supports **agentic workflows** and is designed to operate as an autonomous reasoning engine within offensive security pipelines. It is compatible with the following frameworks and tools:
| Framework | Use Case |
|-----------|----------|
| **CAI (Cybersecurity AI)** | Autonomous red team agents and attack orchestration |
| **Roo Code** | AI-assisted code generation and vulnerability research |
| **LangChain / LlamaIndex** | Custom agentic pipelines and tool-calling workflows |
| **OpenAI-compatible APIs** | Drop-in integration via llama-server OpenAI-compatible endpoint |
> Since llama-server exposes an **OpenAI-compatible REST API**, Piccolo Gorgone can be used as a local drop-in replacement for any framework that supports custom endpoints β€” no code changes required.
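As a minimal sketch of the drop-in integration described above, the snippet below builds a standard OpenAI-style chat completion request aimed at a local llama-server instance (port `8081`, matching the launch command in the Inference section). The prompt and the `model` name are illustrative assumptions; llama-server typically ignores the `model` field and serves whatever GGUF it was started with.

```python
import json
from urllib import request

# Assumption: llama-server is running locally on port 8081, as in the
# launch command shown in the Inference section of this README.
BASE_URL = "http://localhost:8081/v1"

def build_chat_request(prompt: str, temperature: float = 1.0) -> request.Request:
    """Build an OpenAI-compatible chat completion request for llama-server."""
    body = json.dumps({
        "model": "piccolo-gorgone",  # arbitrary: llama-server serves its loaded GGUF
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize common Log4Shell exploitation steps.")
# To send (with the server running):
#   with request.urlopen(req) as r:
#       print(json.load(r)["choices"][0]["message"]["content"])
```

Because the request shape is plain OpenAI Chat Completions JSON, pointing any framework's `base_url` at `http://localhost:8081/v1` is all the integration work required.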
---
## Model Details
| Property | Value |
|----------|-------|
| **Base Model** | Qwen 3.5 9B |
| **Fine-tuning Method** | QLoRA via Unsloth |
| **Format** | GGUF (Q4_K_M) |
| **Context Length** | 128,000 tokens |
---
## Training Dataset
The model was trained on a dataset of **16,272 examples** assembled from the following categories:
| Category | Description |
|----------|-------------|
| πŸ“– **Offensive Knowledge Bases** | Technical guides and offensive techniques from authoritative open sources |
| 🏴 **CTF Writeups & Solutions** | Real competition writeups and walkthroughs from platforms and academic datasets |
| πŸ”΄ **Red Team TTPs** | Tactics, Techniques, and Procedures aligned with adversarial frameworks |
| πŸ—‘οΈ **Exploits & Payloads** | Real-world payloads, shellcode, and proof-of-concept exploits |
| πŸ› **CVE Database (up to 2025)** | Comprehensive vulnerability data including the most recent 2025 CVEs |
| πŸ”¬ **Research Papers** | Academic papers on offensive security and adversarial techniques |
> The dataset underwent rigorous deduplication to ensure training quality and stability.
---
## Benchmark
> πŸ“Š Comparative benchmark between **Qwen 3.5 9B (base)** and **Piccolo Gorgone** on offensive security tasks.
**Qwen 3.5 9B 8.3%**
<p align="left">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/BM_Qween.png" alt="Qwen 3.5 9B benchmark result"/>
</p>
**Piccolo Gorgone 77.1%**
<p align="left">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/BM_PG.png" alt="Piccolo Gorgone benchmark result"/>
</p>
---
## Inference
### llama-server (recommended)
```bash
llama-server \
-m Qwen3.5-9B_Piccolo_Gorgone.Q4_K_M.gguf \
--host 0.0.0.0 \
--port 8081 \
-ngl 99 \
-c 32768 \
-fa on \
--cache-reuse 256 \
-ctk q8_0 \
-ctv q8_0 \
-b 512 -ub 512 \
--temp 1.0 \
--top-p 0.95 \
--top-k 20 \
--min-p 0.0 \
--presence-penalty 1.5 \
--repeat-penalty 1.0 \
--repeat-last-n 64 \
--chat-template-kwargs '{"enable_thinking":false}'
```
> The `-c 32768` value defines the active context window. You can increase it up to `131072` to leverage the model's full context, or reduce it based on the available VRAM on your machine. A larger context requires more memory but enables longer conversations and deeper analysis sessions.
> `--chat-template-kwargs '{"enable_thinking":false}'` disables Qwen3.5's internal chain-of-thought reasoning, producing faster and more direct responses β€” ideal for operational use.
### Inference Parameters
| Parameter | Value | Notes |
|-----------|-------|-------|
| `--temp` | `1.0` | Creativity/coherence balance |
| `--top-p` | `0.95` | Nucleus sampling |
| `--top-k` | `20` | Vocabulary filtering |
| `--min-p` | `0.0` | Minimum probability threshold |
| `--presence-penalty` | `1.5` | Reduces topic repetition |
| `--repeat-last-n` | `64` | Repetition penalty window |
| `-ngl` | `99` | Full GPU offload |
| `-c` | `32768` | Context window (adjustable) |
> **Tip:** For analytical tasks such as CVE analysis or code review, lower `--temp` to `0.4–0.6` for more deterministic output.
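Since the OpenAI-compatible endpoint accepts sampling parameters per request, the lower temperature suggested above can be set for a single analytical query without restarting the server. The request body below is a hedged example; the `model` name is arbitrary for llama-server and the prompt is illustrative.

```python
import json

# Per-request sampling override: a "temperature" field in the request body
# takes precedence over the server-wide --temp default for that request.
analytical_request = {
    "model": "piccolo-gorgone",  # arbitrary: llama-server serves its loaded GGUF
    "messages": [
        {"role": "user", "content": "Review this C function for memory-safety bugs: ..."},
    ],
    "temperature": 0.5,  # deterministic end of the suggested 0.4-0.6 range
    "top_p": 0.95,
}
print(json.dumps(analytical_request, indent=2))
```

POST this body to `http://localhost:8081/v1/chat/completions` for a more deterministic code-review or CVE-analysis pass, while interactive sessions keep the server defaults.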
---
## Author
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-0077B5?style=flat-square&logo=linkedin)](https://www.linkedin.com/in/corrado-liotta-6111a821/)
[![Hugging Face](https://img.shields.io/badge/HuggingFace-Profile-yellow?style=flat-square)](https://huggingface.co/CorryL)