Qwen3-30B-A3B-Instruct Injection Generator
A fine-tuned Qwen/Qwen3-30B-A3B-Instruct model trained to generate prompt injection attacks for red-teaming and security research.
Model Description
This model was trained using reinforcement learning to craft prompt injections that attempt to extract secrets from target LLMs. It is designed for defensive security research - helping teams test and improve their AI systems' robustness against prompt injection attacks.
Training Objective
The model learns to generate injections that trick a target LLM into revealing a secret (format: SECRET-XXXXXXXXXXXXXXXX) stored in its system prompt. The reward signal is based on whether the target actually leaks the secret.
Attack Techniques
The model has learned various injection techniques including:
- Authority override: Fake system/admin commands
- Role play: Convincing the target to adopt a permissive persona
- Context manipulation: Framing requests as debugging/testing
- Social engineering: Impersonating developers or support staff
- Encoding tricks: Requesting output in base64, hex, or other formats
- Obfuscation: Unicode tricks, homoglyphs, whitespace manipulation
- Multi-step attacks: Building trust before the extraction attempt
Output Format
The model outputs injections wrapped in XML tags:
<injection>
[crafted prompt injection text]
</injection>
Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-30B-A3B-Instruct |
| Training Method | Reinforcement Learning (GRPO) |
| Framework | verifiers |
| Attack Mode | Memory (secret in system prompt) |
| Defense Level | Medium |
| Max Turns | 6 |
Limitations
- Model is specialized for the training scenario (memory-based attacks)
- May not generalize to all prompt injection contexts
- Effectiveness varies based on target model's defenses
License
This model inherits the license from the base Qwen3 model. Please refer to the Qwen3 model card for licensing details.
Model tree for wambosec/Qwen3-30B-A3B-Instruct-injection-generator
Base model
Qwen/Qwen3-30B-A3B-Instruct-2507