# API Reference — `scanners/`

## `scanners/__init__.py` — public exports

```python
from scanners import (
    agent_audit,
    bandit,
    detect_secrets,
    forbidden_files,
    gitleaks,
    hadolint,
    pip_audit,
    ruff_perf,
    semgrep_pack,
)
```

---

## Common return types

Most runners return a 2-tuple:

```python
Tuple[List[dict], str]  # (findings, log_message)
```

`semgrep_pack` is the exception — it returns `List[dict]` directly (no log message) because it is called once per rule pack and the caller aggregates messages.

All finding dicts conform to the schema defined in [`core/models.py`](core.md#coremodespy).

---

## `scanners/bandit_runner.py`

### `bandit(work)`

```python
def bandit(work: str) -> Tuple[List[dict], str]
```

Run [Bandit](https://bandit.readthedocs.io/) Python security linter against the work directory.

- **Tool**: `bandit -r -f json -q --exclude .git,.venv,...`
- **Timeout**: 300 s
- **Confidence mapping**: `HIGH`/`MEDIUM` severity → `"likely"`, otherwise → `"possible"`.
- **OWASP**: `["UNMAPPED"]` (Bandit rule IDs map across multiple categories; prefer reviewing `rule`).
- **Skips**: `.git`, `.venv`, `venv`, `env`, `node_modules`, `__pycache__`.

---

## `scanners/semgrep_runner.py`

### `semgrep_pack(rules_path, work, tool_label, category)`

```python
def semgrep_pack(
    rules_path: Path,
    work: str,
    tool_label: str,
    category: str = "security",
) -> List[dict]
```

Run Semgrep with a specific YAML rule pack. Returns findings list directly (no log message).

- **Tool**: `semgrep --config --json --quiet --metrics=off --no-git-ignore `
- **Timeout**: 600 s
- **`tool_label`**: Used as the `tool` field in finding dicts (e.g., `"Semgrep:Core"`).
- **Metadata extracted**: `owasp`, `confidence` from Semgrep result `extra.metadata`.

### `detect_secrets(work)`

```python
def detect_secrets(work: str) -> Tuple[List[dict], str]
```

Run [detect-secrets](https://github.com/Yelp/detect-secrets) against all files in `work`.

- **Tool**: `detect-secrets scan --all-files `
- **Timeout**: 300 s
- **Severity**: always `"ERROR"`.
- **Confidence**: `"possible"` for entropy/high-entropy types, otherwise `"likely"`.
- **OWASP**: `["A02:2021-Cryptographic_Failures"]`.
- **Note**: Requires a git-initialized directory for best results.

---

## `scanners/ruff_runner.py`

### `ruff_perf(work)`

```python
def ruff_perf(work: str) -> Tuple[List[dict], str]
```

Run [Ruff](https://docs.astral.sh/ruff/) with PERF rule selection.

- **Tool**: `ruff check --select PERF --output-format json `
- **Timeout**: 120 s
- **Severity**: always `"WARNING"` (performance suggestions are non-critical).
- **Confidence**: `"likely"` (Ruff PERF rules have very few false positives).
- **OWASP**: `["UNMAPPED"]` (performance — not a security concern).
- **Category**: `"performance"`.

---

## `scanners/pip_audit_runner.py`

### `pip_audit(work)`

```python
def pip_audit(work: str) -> Tuple[List[dict], str]
```

Scan all `requirements*.txt` files in `work` for known CVEs using [pip-audit](https://github.com/pypa/pip-audit).

- **Tool**: `pip-audit -r -f json --strict --progress-spinner off`
- **Timeout**: 90 s per requirements file
- **Severity**: always `"ERROR"` (any known CVE is critical).
- **Confidence**: `"confirmed"` (CVE database entries are verified).
- **OWASP**: `["A06:2021-Vulnerable_and_Outdated_Components"]`.
- **Remediation**: auto-populated from `fix_versions` field.
- **Skips**: files under `.git/`.

---

## `scanners/gitleaks_runner.py`

### `gitleaks(work)`

```python
def gitleaks(work: str) -> Tuple[List[dict], str]
```

Run [gitleaks](https://github.com/gitleaks/gitleaks) to detect secrets committed to git history.

- **Requires**: `gitleaks` binary on PATH (auto-downloaded by `core.bootstrap`).
- **Tool**: `gitleaks detect --source --report-path --report-format json --no-banner --exit-code 0`
- **Timeout**: 600 s
- **Severity**: always `"ERROR"`.
- **Confidence**: `"confirmed"`.
- **OWASP**: `["A02:2021-Cryptographic_Failures"]`.
- **Note**: Only added to the task list when `deep_history=True` in `scan_repo()`.

---

## `scanners/hadolint_runner.py`

### `hadolint(work)`

```python
def hadolint(work: str) -> Tuple[List[dict], str]
```

Run [hadolint](https://github.com/hadolint/hadolint) against all `Dockerfile*` files found recursively in `work`.

- **Requires**: `hadolint` binary on PATH (auto-downloaded by `core.bootstrap`).
- **Tool**: `hadolint -f json `
- **Timeout**: 60 s per Dockerfile
- **Severity**: from hadolint's `level` field (uppercased).
- **Confidence**: `"likely"` (TOOL_DEFAULT_CONFIDENCE).
- **OWASP**: `["A05:2021-Security_Misconfiguration"]`.
- **Skips**: Dockerfiles under `.git/`.

---

## `scanners/forbidden_files.py`

### `forbidden_files(work)`

```python
def forbidden_files(work: str) -> Tuple[List[dict], str]
```

Walk `work` and flag any file whose **basename** matches the `FORBIDDEN_FILES` list in `core/models.py`.

- **No external tool required** — pure Python.
- **Severity**: `"ERROR"`.
- **Confidence**: `"confirmed"`.
- **OWASP**: `["A02:2021-Cryptographic_Failures"]`.

**Detected file names include:** `.env`, `.env.local`, `.env.production`, `id_rsa`, `id_dsa`, `id_ecdsa`, `id_ed25519`, `.git-credentials`, `.npmrc`, `.pypirc`, `credentials`, `credentials.json`, `service-account.json`, `serviceAccountKey.json`, `wp-config.php`.

---

## `scanners/agent_audit_runner.py`

### `agent_audit(work)`

```python
def agent_audit(work: str) -> Tuple[List[dict], str]
```

Run [agent-audit](https://github.com/dreadnode/agent-audit) for OWASP Agentic Top 10 (2026) scanning of AI agent code.

- **Requires**: `agent-audit` binary on PATH (`pip install agent-audit`).
- **Tool**: `agent-audit scan --format json`
- **Timeout**: 300 s
- **Severity mapping**: `critical → ERROR`, `high → HIGH`, `medium → WARNING`, `low → INFO`.
- **Confidence mapping**: score ≥ 0.9 → `"confirmed"`, ≥ 0.7 → `"likely"`, else → `"possible"`.
- **OWASP**: uses `asi_categories` field from agent-audit output (e.g., `["LLM01"]`).
- **Suppressed findings**: automatically excluded.

#### Internal helpers

##### `_confidence(score)`

```python
def _confidence(score: float) -> str
```

Map agent-audit float confidence (0–1) to `"confirmed"` / `"likely"` / `"possible"`.

##### `_severity(s)`

```python
def _severity(s: str) -> str
```

Normalize agent-audit severity string to scanner convention (`"ERROR"`, `"HIGH"`, `"WARNING"`, `"INFO"`).

---

## New runners — Sprint 6 (Tasks 01–23)

### `scanners/modelscan_runner.py`

### `modelscan(work)`

```python
def modelscan(work: str) -> Tuple[List[dict], str]
```

Run [ModelScan](https://github.com/protectai/modelscan) (Palo Alto) to detect serialisation RCE in ML model files (`.pkl`, `.pt`, `.h5`, `.onnx`, `.npy`).

- **Requires**: `pip install modelscan` (Python <3.13 only)
- **API**: Python — `modelscan.scanner.ModelScan().scan(path=work)`
- **Severity**: `"CRITICAL"/"HIGH"` → `"ERROR"`, `"MEDIUM"` → `"WARNING"`, else `"INFO"`
- **OWASP**: `["A08:2021-Software_and_Data_Integrity_Failures"]`
- **Category**: `"security"`

---

### `scanners/picklescan_runner.py`

### `picklescan(work)`

```python
def picklescan(work: str) -> Tuple[List[dict], str]
```

Run [PickleScan](https://github.com/mmaitre314/picklescan) to detect unsafe opcodes in pickle and PyTorch model files.

- **Requires**: `pip install picklescan`
- **API**: Python — `picklescan.scanner.scan_file_path(work)`
- **Rule IDs**: `PickleUnsafeOpcodesScanner`, `PickleSafetyScanner`
- **OWASP**: `["A08:2021-Software_and_Data_Integrity_Failures"]`
- **Category**: `"security"`

---

### `scanners/fickling_runner.py`

### `fickling(work)`

```python
def fickling(work: str) -> Tuple[List[dict], str]
```

Run [Fickling](https://github.com/trailofbits/fickling) (Trail of Bits) allowlist scanner on all pickle-format files.

- **Requires**: `pip install fickling`
- **API**: Python — `fickling.analysis.run_checks(pkl)`
- **Rule IDs**: `FICKLING-DANGEROUS-IMPORTS`, `FICKLING-ARBITRARY-CODE`
- **OWASP**: `["A08:2021-Software_and_Data_Integrity_Failures"]`

---

### `scanners/trivy_runner.py`

### `trivy(work)`

```python
def trivy(work: str) -> Tuple[List[dict], str]
```

Run [Trivy](https://github.com/aquasecurity/trivy) (Aqua Security) for CVE, secret, and IaC scanning.

- **Requires**: `trivy` binary on PATH (auto-downloaded by `core.bootstrap`)
- **Tool**: `trivy fs --format json --output --quiet`
- **Timeout**: 600 s
- **Output**: reads JSON from tempfile (avoids mixing with tool output on stdout)
- **Severity**: Trivy's `Severity` field mapped to scanner convention
- **OWASP**: `["A06:2021-Vulnerable_and_Outdated_Components"]`

---

### `scanners/trufflehog_runner.py`

### `trufflehog(work, deep_history=False)`

```python
def trufflehog(work: str, deep_history: bool = False) -> Tuple[List[dict], str]
```

Run [TruffleHog](https://github.com/trufflesecurity/trufflehog) for verified secret detection in git history and filesystem.

- **Requires**: `trufflehog` binary on PATH (auto-downloaded by `core.bootstrap`)
- **Tool**: `trufflehog filesystem --json --only-verified`
- **Output**: JSONL (one JSON object per line)
- **Helper**: `_parse_cvss_base_score()` from CVSSv3 string
- **OWASP**: `["A02:2021-Cryptographic_Failures"]`
- **Category**: `"security"`

---

### `scanners/osv_runner.py`

### `osv_scanner(work)`

```python
def osv_scanner(work: str) -> Tuple[List[dict], str]
```

Run [OSV-Scanner](https://github.com/google/osv-scanner) (Google) for dependency CVE scanning.

- **Requires**: `osv-scanner` binary on PATH
- **Tool**: `osv-scanner scan dir: --format json`
- **OWASP**: `["A06:2021-Vulnerable_and_Outdated_Components"]`

---

### `scanners/checkov_runner.py`

### `checkov(work)`

```python
def checkov(work: str) -> Tuple[List[dict], str]
```

Run [Checkov](https://www.checkov.io/) (Bridgecrew) for IaC misconfigurations, GitHub Actions, Terraform.

- **Requires**: `pip install checkov`
- **Tool**: `checkov -d --output json --compact --quiet`
- **Output format**: handles both list-of-result-blocks and single result dict
- **`_CRITICAL_IDS`**: frozenset of known high-risk Checkov rule IDs that escalate to `"ERROR"`
- **OWASP**: `["A05:2021-Security_Misconfiguration"]`

---

### `scanners/grype_runner.py`

### `grype(work)`

```python
def grype(work: str) -> Tuple[List[dict], str]
```

Run [Grype](https://github.com/anchore/grype) (Anchore) for container/package CVE scanning via a syft→grype pipeline.

- **Requires**: `syft` and `grype` binaries on PATH (both auto-downloaded by `core.bootstrap`)
- **Tool**: `syft -o json | grype --add-cpes-if-none -o json --file `
- **Output**: reads from tempfile (syft pipes stdout to grype, grype writes to file)
- **Severity**: Grype's capitalized severity string → scanner convention
- **OWASP**: `["A06:2021-Vulnerable_and_Outdated_Components"]`

---

### `scanners/socket_runner.py`

### `socket_scanner(work)`

```python
def socket_scanner(work: str) -> Tuple[List[dict], str]
```

Run [Socket](https://socket.dev/) CLI for supply-chain security (malicious packages, typosquatting).

- **Requires**: `socket` binary on PATH (`npm install -g @socketsecurity/cli`)
- **Tool**: `socket scan --json`
- **Category**: `"supply-chain"`
- **OWASP**: `["A08:2021-Software_and_Data_Integrity_Failures"]`

---

### `scanners/safety_runner.py`

### `safety_check(work)`

```python
def safety_check(work: str) -> Tuple[List[dict], str]
```

Run [Safety](https://github.com/pyupio/safety) for Python dependency CVE checking.

- **Requires**: `pip install safety`
- **Tool**: `safety check --json`
- **Output format**: handles Safety v2 list format and v3 dict format
- **OWASP**: `["A06:2021-Vulnerable_and_Outdated_Components"]`

---

### `scanners/llmguard_runner.py`

### `llm_guard(work)`

```python
def llm_guard(work: str) -> Tuple[List[dict], str]
```

Run [LLM Guard](https://github.com/protectai/llm-guard) PromptInjection scanner on Python files.

- **Requires**: `pip install llm-guard` (Python <3.13 only)
- **API**: Python — extracts prompt strings via regex, scans with `llm_guard.input_scanners.PromptInjection`
- **Rule ID**: `LLM-PROMPT-INJECTION`
- **OWASP**: `["LLM01:2025-Prompt_Injection"]`
- **Category**: `"llm"`

---

### `scanners/garak_runner.py`

### `garak(target_url, probes=None)`

```python
def garak(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]
```

Run [Garak](https://github.com/NVIDIA/garak) LLM vulnerability probes against a live endpoint.

- **Requires**: `pip install garak` ; endpoint URL via `GARAK_TARGET_URL` env var
- **Tool**: `garak --model_type rest --model_name --probes --report_prefix `
- **Output**: parses `.report.jsonl` files from garak's output directory
- **Default probes**: `"dan,knownbadsignatures,packagehallucination"`
- **Category**: `"llm"`; **OWASP**: `["LLM01:2025-Prompt_Injection"]`

---

### `scanners/deepteam_runner.py`

### `deepteam(target_url, target_purpose="")`

```python
def deepteam(target_url: str, target_purpose: str = "") -> Tuple[List[dict], str]
```

Run [DeepTeam](https://github.com/confident-ai/deepteam) (Confident AI) red-team evaluation.

- **Requires**: `pip install deepeval` ; endpoint via `DEEPTEAM_TARGET_URL`
- **Approach**: writes a YAML config and runs `deepteam run --config `
- **Category**: `"llm"` ; **OWASP**: `["LLM02:2025-Sensitive_Information_Disclosure"]`

---

### `scanners/promptfoo_runner.py`

### `promptfoo(target_url, target_purpose="", plugins=None)`

```python
def promptfoo(target_url: str, target_purpose: str = "", plugins: list[str] | None = None) -> Tuple[List[dict], str]
```

Run [Promptfoo](https://github.com/promptfoo/promptfoo) red-team evaluation via npx.

- **Requires**: Node.js + `npx promptfoo` ; endpoint via `PROMPTFOO_TARGET_URL`
- **Tool**: `npx promptfoo@latest redteam run --config --output `
- **Default plugins**: `["harmful", "injection", "jailbreak", "pii"]`
- **Category**: `"llm"`

---

### `scanners/azure_redteam_runner.py`

### `azure_redteam(target_url, risk_categories=None, attack_strategies=None)`

```python
def azure_redteam(
    target_url: str,
    risk_categories: list[str] | None = None,
    attack_strategies: list[str] | None = None,
) -> Tuple[List[dict], str]
```

Run [Azure AI Evaluation](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/evaluate-red-teaming) red-team pipeline.

- **Requires**: `pip install azure-ai-evaluation` ; endpoint via `AZURE_REDTEAM_TARGET_URL`
- **API**: `azure.ai.evaluation.red_team.RedTeamOrchestrator` in local mode via `asyncio.run()`
- **Default risk categories**: `["violence", "sexual", "hate_unfairness", "self_harm"]`
- **Category**: `"llm"`

---

### `scanners/pyrit_runner.py`

### `pyrit(target_url, objectives=None)`

```python
def pyrit(target_url: str, objectives: list[str] | None = None) -> Tuple[List[dict], str]
```

Run [PyRIT](https://github.com/Azure/PyRIT) (Microsoft) crescendo multi-turn attack orchestration.

- **Requires**: `pip install pyrit` ; endpoint via `PYRIT_TARGET_URL`
- **Binary guard**: `have_binary("pyrit_scan")` (optional CLI wrapper)
- **Category**: `"llm"` ; **OWASP**: `["LLM01:2025-Prompt_Injection"]`

---

### `scanners/augustus_runner.py`

### `augustus(target_url, probes=None)`

```python
def augustus(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]
```

Run [Augustus](https://github.com/mozilla/augustus) Go-based LLM red-team scanner.

- **Requires**: `augustus` binary on PATH ; endpoint via `AUGUSTUS_TARGET_URL`
- **Tool**: `augustus --target --probes --output json`
- **Category**: `"llm"`

---

### `scanners/fuzzyai_runner.py`

### `fuzzyai(target_url, attacks=None)`

```python
def fuzzyai(target_url: str, attacks: list[str] | None = None) -> Tuple[List[dict], str]
```

Run [FuzzyAI](https://github.com/cyberark/FuzzyAI) adversarial fuzzing framework.

- **Requires**: `pip install fuzzy-ai` ; endpoint via `FUZZYAI_TARGET_URL`
- **API**: Python — `import fuzzy_ai`
- **Category**: `"llm"`

---

### `scanners/giskard_runner.py`

### `giskard_scan(target_url, model_description="")`

```python
def giskard_scan(target_url: str, model_description: str = "") -> Tuple[List[dict], str]
```

Run [Giskard](https://github.com/Giskard-AI/giskard) model testing and red-teaming.

- **Requires**: `pip install giskard` ; endpoint via `GISKARD_TARGET_URL`
- **API**: Python — wraps endpoint in a REST model, calls `giskard.scan(model)`
- **Category**: `"llm"`

---

### `scanners/vigil_runner.py`

### `vigil(work)`

```python
def vigil(work: str) -> Tuple[List[dict], str]
```

Run [Vigil](https://github.com/deadbits/vigil-llm) YARA-based prompt injection / jailbreak scanner on Python files.

- **Requires**: `pip install vigil-llm` (Python <3.13 only)
- **API**: Python — `Vigil.from_config_dict(…)`, scans prompt strings extracted by regex
- **Rule ID**: `LLM-PROMPT-INJECTION-VIGIL`
- **Category**: `"llm"` ; **OWASP**: `["LLM01:2025-Prompt_Injection"]`

---

### `scanners/nemo_runner.py`

### `nemo_guardrails(work)`

```python
def nemo_guardrails(work: str) -> Tuple[List[dict], str]
```

Static analysis of [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) `.co` config files.

- **Requires**: `pip install nemoguardrails` (Python <3.13 only)
- **Pattern**: globs `**/*.co` and checks for missing `input rails` / `output rails` definitions
- **Rule IDs**: `NEMO-MISSING-INPUT-RAIL`, `NEMO-MISSING-OUTPUT-RAIL`
- **Category**: `"llm"` ; **OWASP**: `["LLM01:2025-Prompt_Injection"]`

---

### `scanners/guardrailsai_runner.py`

### `guardrails_ai(work)`

```python
def guardrails_ai(work: str) -> Tuple[List[dict], str]
```

Static analysis of [Guardrails AI](https://github.com/guardrails-ai/guardrails) configuration in Python source files.

- **Requires**: `pip install guardrails-ai` (Python <3.13 only)
- **Patterns checked**:
  - `GUARD-ON-FAIL-NOOP` — `on_fail="noop"` disables error handling
  - `GUARD-THRESHOLD-TOO-HIGH` — a threshold ≥ 0.95 means the guard almost never triggers
  - `GUARD-MISSING-TOXICITY` — no toxicity/content guard is applied
- **Category**: `"llm"` ; **OWASP**: `["LLM01:2025-Prompt_Injection"]`
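
The three `guardrails_ai` pattern checks can be approximated with plain regular expressions. The following is a minimal sketch under stated assumptions, not the runner's actual implementation: the function name `check_guardrails_source`, the regexes, the "file mentions `Guard`" heuristic, and the finding keys (`rule`, `line`, `message`) are hypothetical. Only the rule IDs and the 0.95 cut-off come from the list above.

```python
import re
from typing import List

# Hypothetical patterns for illustration; the real runner may match differently.
_ON_FAIL_NOOP = re.compile(r"""on_fail\s*=\s*["']noop["']""")
_THRESHOLD = re.compile(r"threshold\s*=\s*([0-9]*\.?[0-9]+)")
_TOXICITY_HINT = re.compile(r"toxic", re.IGNORECASE)


def check_guardrails_source(source: str) -> List[dict]:
    """Return illustrative findings for the text of one Python source file."""
    findings: List[dict] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # on_fail="noop" means a failed validation is silently ignored.
        if _ON_FAIL_NOOP.search(line):
            findings.append({
                "rule": "GUARD-ON-FAIL-NOOP",
                "line": lineno,
                "message": 'on_fail="noop" disables error handling',
            })
        # A threshold at or above 0.95 makes the guard fire very rarely.
        match = _THRESHOLD.search(line)
        if match and float(match.group(1)) >= 0.95:
            findings.append({
                "rule": "GUARD-THRESHOLD-TOO-HIGH",
                "line": lineno,
                "message": f"threshold {match.group(1)} is >= 0.95",
            })
    # Assumed heuristic: only report a missing toxicity guard when the file
    # appears to use Guardrails at all (it mentions "Guard").
    if "Guard" in source and not _TOXICITY_HINT.search(source):
        findings.append({
            "rule": "GUARD-MISSING-TOXICITY",
            "line": 1,
            "message": "no toxicity/content guard found",
        })
    return findings


if __name__ == "__main__":
    sample = (
        "from guardrails import Guard\n"
        'guard = Guard().use(Check, threshold=0.99, on_fail="noop")\n'
    )
    for finding in check_guardrails_source(sample):
        print(finding["rule"], "at line", finding["line"])
```

A line-based regex pass like this is cheap and needs no import of the scanned code, but it misses multi-line calls and values assigned through variables; an AST-based check would be stricter at the cost of more code.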