Behavioral Control Attack Methods
| Attack Method ID | Name | Description |
|---|---|---|
| AcrosticPoem | Acrostic Poem | Hide instructions using acrostic poem format |
| AsciiDrawing | ASCII Drawing | Hide malicious instructions using ASCII art |
| CharacterSplit | Character Split | Split instructions into individual characters |
| CodeAttack | Code Attack | Attack through code execution context |
| ContextPoisoning | Context Poisoning | Poison model context to influence behavior |
| Contradictory | Contradictory | Confuse model using contradictory instructions |
| DRAttack | DR Attack | Rewrite instructions to carry out the attack |
| GoalRedirection | Goal Redirection | Redirect model's original goal |
| GrayBox | Gray Box | Attack with only partial knowledge of the target |
| ICRTJailbreak | ICRT Jailbreak | Instruction context rewriting technique |
| InputBypass | Input Bypass | Bypass input filtering mechanisms |
| LanternRiddle | Lantern Riddle | Hide instructions using riddle format |
| LinguisticConfusion | Linguistic Confusion | Use language confusion techniques |
| LongText | Long Text | Overwhelm security detection with long text |
| MathProblem | Math Problem | Attack through math problem context |
| Multilingual | Multilingual | Bypass detection using multilingual mixing |
| Opposing | Opposing | Create conflict using opposing instructions |
| PROMISQROUTE | PROMISQROUTE | Specific type of prompt injection attack |
| PermissionEscalation | Permission Escalation | Elevate model execution permissions |
| PromptInjection | Prompt Injection | Directly inject malicious prompts |
| PromptProbing | Prompt Probing | Probe the model's system and prompts |
| Roleplay | Roleplay | Bypass restrictions through role-playing |
| ScriptTemplate | Script Template | Inject script templates to execute code |
| Stego | Stego | Hide instructions using steganography |
| SystemOverride | System Override | Override system instructions and settings |