---
language:
- en
- it
tags:
- cybersecurity
- red-team
- ctf
- penetration-testing
- offensive-security
- uncensored
- llama-cpp
- gguf
base_model: Qwen/Qwen3.5-9B
license: apache-2.0
---

<p align="center">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/banner.PNG" alt="Piccolo Gorgone Banner"/>
</p>

# Piccolo Gorgone

<p align="center">
Developed by <strong>CorryL</strong> – Penetration Tester & Ethical Hacker
</p>

---

**Piccolo Gorgone** is a large language model fine-tuned for **red team operations**, **CTF competitions**, and **offensive cybersecurity**. Built on Qwen 3.5 9B and trained on a curated dataset of over **16,000 real-world offensive security examples**, it delivers technically precise, direct responses without the safety restrictions that limit general-purpose models. Piccolo Gorgone is fully **agentic** and integrates natively with widely used offensive security frameworks, enabling automated, orchestrated workflows from your existing toolchain.

---

## Local Execution & Privacy

Piccolo Gorgone was designed from the ground up to run on **local consumer hardware**, with no dependency on cloud APIs or external services. The 9B parameter size is deliberate: it balances **technical capability** against **accessible hardware requirements**, allowing execution on a single consumer GPU with Q4_K_M quantization.
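A rough back-of-the-envelope estimate illustrates why this fits on a single consumer GPU. The ~4.85 bits-per-weight figure below is an assumption (a commonly quoted average for Q4_K_M); the actual file size depends on the exact tensor mix.

```python
# Back-of-the-envelope VRAM estimate for the quantized weights.
# ASSUMPTION: ~4.85 bits/weight as a typical Q4_K_M average.
params = 9e9                 # 9B parameters
bits_per_weight = 4.85
weight_bytes = params * bits_per_weight / 8
print(f"~{weight_bytes / 1e9:.1f} GB for weights alone")
# KV cache and runtime buffers add more on top of this.
```

At roughly 5.5 GB for the weights, plus KV cache, the model sits comfortably within an 8–12 GB consumer GPU.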

This ensures that all sensitive information (penetration test reports, vulnerability details, client data) stays **exclusively on your machine** and never transits third-party servers.

---

## Intended Use

This model is designed for:

- Professional **penetration testers** and red teamers operating in authorized environments
- **CTF competitors** (HackTheBox, CTFtime, and similar platforms)
- Offensive security **researchers and instructors**
- Security teams performing **threat modeling and attack simulation**

> ⚠️ **Disclaimer:** This model is intended exclusively for ethical and professional use in authorized environments. The author bears no responsibility for illegal or unauthorized use.

---

## Agentic Integration

Piccolo Gorgone supports **agentic workflows** and is designed to operate as an autonomous reasoning engine within offensive security pipelines. It is compatible with the following frameworks and tools:

| Framework | Use Case |
|-----------|----------|
| **CAI (Cybersecurity AI)** | Autonomous red team agents and attack orchestration |
| **Roo Code** | AI-assisted code generation and vulnerability research |
| **LangChain / LlamaIndex** | Custom agentic pipelines and tool-calling workflows |
| **OpenAI-compatible APIs** | Drop-in integration via the llama-server OpenAI-compatible endpoint |

> Since llama-server exposes an **OpenAI-compatible REST API**, Piccolo Gorgone can be used as a local drop-in replacement in any framework that supports custom endpoints, with no code changes required.
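A minimal sketch of that drop-in usage, using only the Python standard library. The port and path match the llama-server launch command in the Inference section; the `"model"` value is a placeholder (llama-server largely ignores it, but the OpenAI chat schema expects the field).

```python
import json
import urllib.request

# Build an OpenAI-compatible chat request for a local llama-server
# instance (assumed to be listening on http://localhost:8081).
def build_chat_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": "piccolo-gorgone",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "http://localhost:8081/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Enumerate common SMB misconfigurations.")
# To actually send it (requires a running server):
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

Any framework that accepts a custom `base_url` (LangChain, the official OpenAI client, etc.) talks to the same endpoint without modification.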

---

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | Qwen 3.5 9B |
| **Fine-tuning Method** | QLoRA via Unsloth |
| **Format** | GGUF (Q4_K_M) |
| **Context Length** | 128,000 tokens |

---

## Training Dataset

The model was trained on a dataset of **16,272 examples** assembled from the following categories:

| Category | Description |
|----------|-------------|
| **Offensive Knowledge Bases** | Technical guides and offensive techniques from authoritative open sources |
| **CTF Writeups & Solutions** | Real competition writeups and walkthroughs from platforms and academic datasets |
| **Red Team TTPs** | Tactics, Techniques, and Procedures aligned with adversarial frameworks |
| **Exploits & Payloads** | Real-world payloads, shellcode, and proof-of-concept exploits |
| **CVE Database (up to 2025)** | Comprehensive vulnerability data including the most recent 2025 CVEs |
| **Research Papers** | Academic papers on offensive security and adversarial techniques |

> The dataset underwent rigorous deduplication to ensure training quality and stability.
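For illustration only (this is not the actual curation pipeline), the simplest form of such deduplication, exact-match removal after normalisation, can be sketched as:

```python
import hashlib

# Illustrative exact-match deduplication: normalise whitespace and
# case, hash each example, and keep only the first occurrence.
def dedupe(examples):
    seen, unique = set(), []
    for text in examples:
        key = hashlib.sha256(" ".join(text.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

corpus = ["nmap -sV target", "Nmap  -sV   target", "whoami /priv"]
print(dedupe(corpus))  # the second entry collapses into the first
```

Real pipelines typically add near-duplicate detection (e.g. MinHash) on top of exact matching.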

---

## Benchmark

> Comparative benchmark between **Qwen 3.5 9B (base)** and **Piccolo Gorgone** on offensive security tasks.

**Qwen 3.5 9B (base): 8.3%**
<p align="left">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/BM_Qween.png"/>
</p>

**Piccolo Gorgone: 77.1%**
<p align="left">
<img src="https://huggingface.co/CorryL/piccolo_gorgone/resolve/main/BM_PG.png"/>
</p>

---

## Inference

### llama-server (recommended)

```bash
llama-server \
  -m Qwen3.5-9B_Piccolo_Gorgone.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8081 \
  -ngl 99 \
  -c 32768 \
  -fa on \
  --cache-reuse 256 \
  -ctk q8_0 \
  -ctv q8_0 \
  -b 512 -ub 512 \
  --temp 1.0 \
  --top-p 0.95 \
  --top-k 20 \
  --min-p 0.0 \
  --presence-penalty 1.5 \
  --repeat-penalty 1.0 \
  --repeat-last-n 64 \
  --chat-template-kwargs '{"enable_thinking":false}'
```

> The `-c 32768` value defines the active context window. You can increase it up to `131072` to use the model's full context, or reduce it to fit the VRAM available on your machine. A larger context requires more memory but enables longer conversations and deeper analysis sessions.

> `--chat-template-kwargs '{"enable_thinking":false}'` disables Qwen3.5's internal chain-of-thought reasoning, producing faster, more direct responses, which is ideal for operational use.

### Inference Parameters

| Parameter | Value | Notes |
|-----------|-------|-------|
| `--temp` | `1.0` | Creativity/coherence balance |
| `--top-p` | `0.95` | Nucleus sampling |
| `--top-k` | `20` | Vocabulary filtering |
| `--min-p` | `0.0` | Minimum probability threshold |
| `--presence-penalty` | `1.5` | Reduces topic repetition |
| `--repeat-last-n` | `64` | Repetition penalty window |
| `-ngl` | `99` | Full GPU offload |
| `-c` | `32768` | Context window (adjustable) |

> **Tip:** For analytical tasks such as CVE analysis or code review, lower `--temp` to `0.4–0.6` for more deterministic output.
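Per the usual OpenAI-compatible behaviour, sampling parameters can also be overridden per request, so a lower temperature for analytical work does not require restarting the server. A sketch of such a request body (field names follow the OpenAI chat schema; the model name is a placeholder):

```python
import json

# Per-request sampling override via the OpenAI-compatible endpoint:
# a "temperature" field in the request body takes precedence over the
# server-side --temp default for that request only.
analysis_request = {
    "model": "piccolo-gorgone",  # placeholder, largely ignored by llama-server
    "messages": [
        {"role": "user", "content": "Analyse the impact of CVE-2024-3094."}
    ],
    "temperature": 0.5,          # deterministic end of the 0.4-0.6 range
}
body = json.dumps(analysis_request)
```

POST this body to `/v1/chat/completions` as in any OpenAI-compatible client.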

---

## Author

[LinkedIn](https://www.linkedin.com/in/corrado-liotta-6111a821/)
[Hugging Face](https://huggingface.co/CorryL)