---
license: apache-2.0
language:
- en
base_model:
- protectai/deberta-v3-base-prompt-injection-v2
pipeline_tag: text-classification
tags:
- security
- prompt
- cyber-security
- llm-security
- prompt-injection
- command-injection
library_name: transformers
---

# Command Injection Detector

A fine-tuned DeBERTa model for detecting command injection attacks in prompts before they reach an LLM.

## Overview

This model is part of [PromptWAF](https://github.com/edaerer/promptwaf), a multi-layered, ML-based Web Application Firewall designed to detect and block prompt injection attacks.

The model identifies prompts containing shell command execution patterns (`; rm -rf`, `| cat /etc/passwd`, `$(whoami)`, backtick execution, etc.) commonly used in command injection attacks.

## Model Details

- **Architecture**: DeBERTa-v3 (base), fine-tuned from `protectai/deberta-v3-base-prompt-injection-v2`
- **Task**: Binary Sequence Classification
- **Training Data**: A custom, internally curated command injection dataset
- **Labels**:
  - `0` → Safe/Benign
  - `1` → Command Injection Attack

## Usage

### With PromptWAF

```bash
# Automatically used in PromptWAF via .env configuration
CMD_INJECTION_MODEL_DIR=edaerer/promptwaf-command-injection
```

### Standalone

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "edaerer/promptwaf-command-injection"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

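text = "List files; rm -rf / --no-preserve-root"
# Truncate to the 256-token input limit listed under Performance
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Softmax over the two logits; index 1 is the command injection class
probabilities = torch.softmax(outputs.logits, dim=-1)
score = probabilities[0][1].item()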

print(f"Command Injection Risk: {score:.2%}")
```

## Performance

- **Threshold**: 0.5 on the label `1` probability (adjustable in PromptWAF)
- **Input**: Maximum of 256 tokens
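
As a rough illustration of how the threshold turns the model's two logits into a decision, here is a minimal sketch in plain Python. The helper names `softmax` and `is_command_injection` are hypothetical and not part of PromptWAF, and treating a score exactly equal to the threshold as an attack is an assumption; PromptWAF's own comparison may differ.

```python
import math
from typing import List, Sequence

THRESHOLD = 0.5  # default listed in this card; adjustable in PromptWAF


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtract the largest logit first for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_command_injection(logits: Sequence[float], threshold: float = THRESHOLD) -> bool:
    """Return True when the label-1 (attack) probability reaches the threshold.

    Assumption: a score equal to the threshold is flagged as an attack.
    """
    attack_probability = softmax(logits)[1]
    return attack_probability >= threshold
```

Raising `threshold` trades fewer false positives for more missed attacks, so the right value likely depends on how the firewall is deployed.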

## Integration

This model works with:
- **PromptWAF** - The main security orchestrator
- **HuggingFace Transformers** - For inference
- Any standard sequence classification pipeline
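
For the generic pipeline route, the sketch below uses the Transformers `pipeline` API with `top_k=None` so that scores for both labels are returned. The helper names (`build_classifier`, `injection_score`, `score_prompts`) are hypothetical, and the label string for class `1` is read from the model config at run time because this card does not state how the labels are named there.

```python
from typing import Dict, List

MODEL_ID = "edaerer/promptwaf-command-injection"
MAX_TOKENS = 256  # input limit listed in this card


def build_classifier():
    """Create a text-classification pipeline that returns scores for every label."""
    # Imported lazily: building the pipeline downloads the model weights
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID, top_k=None)


def injection_score(label_scores: List[Dict], positive_label: str) -> float:
    """Pick the attack-label score out of one pipeline result."""
    for entry in label_scores:
        if entry["label"] == positive_label:
            return float(entry["score"])
    raise KeyError(f"label {positive_label!r} not found in pipeline output")


def score_prompts(classifier, prompts: List[str]) -> List[float]:
    """Return the label-1 probability for each prompt."""
    # Look up the label name for class 1 instead of assuming its spelling
    positive_label = classifier.model.config.id2label[1]
    results = classifier(prompts, truncation=True, max_length=MAX_TOKENS)
    return [injection_score(result, positive_label) for result in results]


# Example usage (downloads the model on first run):
# classifier = build_classifier()
# print(score_prompts(classifier, ["List files; rm -rf / --no-preserve-root"]))
```

Passing `truncation=True` and `max_length` at call time keeps long prompts within the 256-token limit instead of raising an error or silently exceeding it.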

## Citation

```bibtex
@software{promptwaf2026,
  author = {Erer, Eda and Odabasi, Talha},
  title  = {PromptWAF: A Multi-Layered ML Defense for LLM Prompt Security},
  year   = {2026},
  url    = {https://github.com/edaerer/promptwaf}
}
```

## License

Apache License 2.0

---

For more information, visit the [PromptWAF GitHub repository](https://github.com/edaerer/promptwaf).