Qwen3-30B-A3B-Instruct Injection Generator

A fine-tuned Qwen/Qwen3-30B-A3B-Instruct model trained to generate prompt injection attacks for red-teaming and security research.

Model Description

This model was trained using reinforcement learning to craft prompt injections that attempt to extract secrets from target LLMs. It is designed for defensive security research - helping teams test and improve their AI systems' robustness against prompt injection attacks.

Training Objective

The model learns to generate injections that trick a target LLM into revealing a secret (format: SECRET-XXXXXXXXXXXXXXXX) stored in its system prompt. The reward signal is based on whether the target actually leaks the secret.

Attack Techniques

The model has learned various injection techniques including:

  • Authority override: Fake system/admin commands
  • Role play: Convincing the target to adopt a permissive persona
  • Context manipulation: Framing requests as debugging/testing
  • Social engineering: Impersonating developers or support staff
  • Encoding tricks: Requesting output in base64, hex, or other formats
  • Obfuscation: Unicode tricks, homoglyphs, whitespace manipulation
  • Multi-step attacks: Building trust before the extraction attempt

Output Format

The model outputs injections wrapped in XML tags:

<injection>
[crafted prompt injection text]
</injection>

Training Details

Parameter Value
Base Model Qwen/Qwen3-30B-A3B-Instruct
Training Method Reinforcement Learning (GRPO)
Framework verifiers
Attack Mode Memory (secret in system prompt)
Defense Level Medium
Max Turns 6

Limitations

  • Model is specialized for the training scenario (memory-based attacks)
  • May not generalize to all prompt injection contexts
  • Effectiveness varies based on target model's defenses

License

This model inherits the license from the base Qwen3 model. Please refer to the Qwen3 model card for licensing details.

Downloads last month

-

Downloads are not tracked for this model. How to track
Video Preview
loading

Model tree for wambosec/Qwen3-30B-A3B-Instruct-injection-generator

Finetuned
(72)
this model