API Reference — scanners/

scanners/__init__.py — public exports

from scanners import (
    agent_audit,
    bandit,
    detect_secrets,
    forbidden_files,
    gitleaks,
    hadolint,
    pip_audit,
    ruff_perf,
    semgrep_pack,
)

Common return types

Most runners return a 2-tuple:

Tuple[List[dict], str]   # (findings, log_message)

semgrep_pack is the exception — it returns List[dict] directly (no log message) because it is called once per rule pack and the caller aggregates messages.

All finding dicts conform to the schema defined in core/models.py.
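
Because the two return shapes differ, a caller that treats runners uniformly needs a small adapter. The sketch below shows one way to coerce both shapes into `(findings, log_message)`. `normalize_result` and the synthesized message format are hypothetical and are not part of the scanners package.

```python
from typing import List, Tuple, Union

Finding = dict
RunnerResult = Union[Tuple[List[Finding], str], List[Finding]]


def normalize_result(result: RunnerResult, tool_label: str) -> Tuple[List[Finding], str]:
    """Coerce either documented return shape into (findings, log_message)."""
    if isinstance(result, tuple):
        # Standard runner shape: already (findings, log_message).
        findings, message = result
        return list(findings), message
    # Bare list (the semgrep_pack shape): synthesize a message for the caller.
    findings = list(result)
    return findings, f"{tool_label}: {len(findings)} finding(s)"
```

With this adapter, an aggregation loop can unpack every runner result the same way, whichever shape the runner returned.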


scanners/bandit_runner.py

bandit(work)

def bandit(work: str) -> Tuple[List[dict], str]

Run the Bandit Python security linter against the work directory.

  • Tool: bandit -r <work> -f json -q --exclude .git,.venv,...
  • Timeout: 300 s
  • Confidence mapping: HIGH/MEDIUM severity → "likely", otherwise → "possible".
  • OWASP: ["UNMAPPED"] (Bandit rule IDs span multiple OWASP categories, so review the individual rule instead).
  • Skips: .git, .venv, venv, env, node_modules, __pycache__.
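
The confidence mapping above can be sketched as a one-line function. `bandit_confidence` is a hypothetical name, and the sketch assumes the input is the `issue_severity` string from Bandit's JSON report.

```python
def bandit_confidence(issue_severity: str) -> str:
    """Map a Bandit severity string to the scanner confidence label."""
    # HIGH and MEDIUM become "likely"; anything else (LOW, UNDEFINED, ...) is "possible".
    return "likely" if issue_severity.upper() in {"HIGH", "MEDIUM"} else "possible"
```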

scanners/semgrep_runner.py

semgrep_pack(rules_path, work, tool_label, category)

def semgrep_pack(
    rules_path: Path,
    work: str,
    tool_label: str,
    category: str = "security",
) -> List[dict]

Run Semgrep with a specific YAML rule pack. Returns the findings list directly (no log message).

  • Tool: semgrep --config <rules_path> --json --quiet --metrics=off --no-git-ignore <work>
  • Timeout: 600 s
  • tool_label: Used as the tool field in finding dicts (e.g., "Semgrep:Core").
  • Metadata extracted: owasp, confidence from Semgrep result extra.metadata.
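
As an illustration of the metadata extraction, the sketch below reads `owasp` and `confidence` from a single Semgrep result object. `extract_metadata` is a hypothetical helper. The empty-list and `None` fallbacks are assumptions, and the real runner may map the confidence value further.

```python
from typing import List, Optional, Tuple


def extract_metadata(result: dict) -> Tuple[List[str], Optional[str]]:
    """Pull owasp and confidence out of one Semgrep result's extra.metadata."""
    metadata = (result.get("extra") or {}).get("metadata") or {}
    owasp = metadata.get("owasp") or []
    if isinstance(owasp, str):
        # Rule packs may give owasp as a single string or as a list.
        owasp = [owasp]
    # Returned unchanged; None when the rule declares no confidence.
    confidence = metadata.get("confidence")
    return list(owasp), confidence
```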

detect_secrets(work)

def detect_secrets(work: str) -> Tuple[List[dict], str]

Run detect-secrets against all files in work.

  • Tool: detect-secrets scan --all-files <work>
  • Timeout: 300 s
  • Severity: always "ERROR".
  • Confidence: "possible" for entropy/high-entropy types, otherwise "likely".
  • OWASP: ["A02:2021-Cryptographic_Failures"].
  • Note: Requires a git-initialized directory for best results.
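
A minimal sketch of turning a detect-secrets report into findings under the rules above. It assumes the report's `results` object maps each filename to a list of entries carrying `type` and `line_number`. The finding keys (`tool`, `file`, `line`, `rule_id`) are illustrative assumptions; the authoritative schema is in core/models.py.

```python
from typing import List


def secrets_to_findings(report: dict) -> List[dict]:
    """Flatten a detect-secrets report into a list of finding dicts."""
    findings = []
    for filename, secrets in report.get("results", {}).items():
        for secret in secrets:
            secret_type = secret.get("type", "")
            findings.append({
                "tool": "detect-secrets",          # assumed key names
                "file": filename,
                "line": secret.get("line_number"),
                "rule_id": secret_type,
                "severity": "ERROR",               # always ERROR
                # Entropy-based detectors are noisier, so they rank lower.
                "confidence": "possible" if "entropy" in secret_type.lower() else "likely",
                "owasp": ["A02:2021-Cryptographic_Failures"],
            })
    return findings
```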

scanners/ruff_runner.py

ruff_perf(work)

def ruff_perf(work: str) -> Tuple[List[dict], str]

Run Ruff with PERF rule selection.

  • Tool: ruff check --select PERF --output-format json <work>
  • Timeout: 120 s
  • Severity: always "WARNING" (performance suggestions are non-critical).
  • Confidence: "likely" (Ruff PERF rules have very few false positives).
  • OWASP: ["UNMAPPED"] (performance — not a security concern).
  • Category: "performance".

scanners/pip_audit_runner.py

pip_audit(work)

def pip_audit(work: str) -> Tuple[List[dict], str]

Scan all requirements*.txt files in work for known CVEs using pip-audit.

  • Tool: pip-audit -r <req_file> -f json --strict --progress-spinner off
  • Timeout: 90 s per requirements file
  • Severity: always "ERROR" (any known CVE is critical).
  • Confidence: "confirmed" (CVE database entries are verified).
  • OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"].
  • Remediation: auto-populated from fix_versions field.
  • Skips: files under .git/.
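
The file discovery and remediation steps can be sketched as follows. `find_requirements` and `remediation` are hypothetical helpers, and the remediation wording is an assumption rather than the runner's actual message.

```python
from pathlib import Path
from typing import List


def find_requirements(work: str) -> List[Path]:
    """Return every requirements*.txt under work, skipping anything in .git/."""
    root = Path(work)
    return sorted(
        p for p in root.rglob("requirements*.txt")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    )


def remediation(package: str, fix_versions: List[str]) -> str:
    """Build remediation text from a pip-audit fix_versions list."""
    if not fix_versions:
        return f"No fixed version of {package} is available yet."
    return f"Upgrade {package} to {', '.join(fix_versions)}."
```

Each path returned by `find_requirements` would then be passed to pip-audit as the `-r` argument, one invocation per file.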

scanners/gitleaks_runner.py

gitleaks(work)

def gitleaks(work: str) -> Tuple[List[dict], str]

Run gitleaks to detect secrets committed to git history.

  • Requires: gitleaks binary on PATH (auto-downloaded by core.bootstrap).
  • Tool: gitleaks detect --source <work> --report-path <tmp.json> --report-format json --no-banner --exit-code 0
  • Timeout: 600 s
  • Severity: always "ERROR".
  • Confidence: "confirmed".
  • OWASP: ["A02:2021-Cryptographic_Failures"].
  • Note: Only added to the task list when deep_history=True in scan_repo().

scanners/hadolint_runner.py

hadolint(work)

def hadolint(work: str) -> Tuple[List[dict], str]

Run hadolint against all Dockerfile* files found recursively in work.

  • Requires: hadolint binary on PATH (auto-downloaded by core.bootstrap).
  • Tool: hadolint -f json <Dockerfile>
  • Timeout: 60 s per Dockerfile
  • Severity: from hadolint's level field (uppercased).
  • Confidence: "likely" (TOOL_DEFAULT_CONFIDENCE).
  • OWASP: ["A05:2021-Security_Misconfiguration"].
  • Skips: Dockerfiles under .git/.

scanners/forbidden_files.py

forbidden_files(work)

def forbidden_files(work: str) -> Tuple[List[dict], str]

Walk work and flag any file whose basename matches the FORBIDDEN_FILES list in core/models.py.

  • No external tool required — pure Python.
  • Severity: "ERROR".
  • Confidence: "confirmed".
  • OWASP: ["A02:2021-Cryptographic_Failures"].

Detected file names include: .env, .env.local, .env.production, id_rsa, id_dsa, id_ecdsa, id_ed25519, .git-credentials, .npmrc, .pypirc, credentials, credentials.json, service-account.json, serviceAccountKey.json, wp-config.php.
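
A minimal sketch of the basename walk, using a small subset of the names listed above. `find_forbidden` is a hypothetical helper. The real runner reads the full FORBIDDEN_FILES list from core/models.py and returns finding dicts rather than paths.

```python
import os
from typing import List

# Illustrative subset only; the full list is defined in core/models.py.
FORBIDDEN_FILES = frozenset({".env", "id_rsa", ".npmrc", "credentials.json"})


def find_forbidden(work: str) -> List[str]:
    """Return relative paths of files whose basename is on the forbidden list."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(work):
        for name in filenames:
            # Exact basename match: ".env.example" does not match ".env".
            if name in FORBIDDEN_FILES:
                rel = os.path.relpath(os.path.join(dirpath, name), work)
                hits.append(rel.replace(os.sep, "/"))
    return sorted(hits)
```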


scanners/agent_audit_runner.py

agent_audit(work)

def agent_audit(work: str) -> Tuple[List[dict], str]

Run agent-audit for OWASP Agentic Top 10 (2026) scanning of AI agent code.

  • Requires: agent-audit binary on PATH (pip install agent-audit).
  • Tool: agent-audit scan <work> --format json
  • Timeout: 300 s
  • Severity mapping: critical → ERROR, high → HIGH, medium → WARNING, low → INFO.
  • Confidence mapping: score ≥ 0.9 → "confirmed", ≥ 0.7 → "likely", else → "possible".
  • OWASP: uses asi_categories field from agent-audit output (e.g., ["LLM01"]).
  • Suppressed findings: automatically excluded.

Internal helpers

_confidence(score)
def _confidence(score: float) -> str

Map agent-audit float confidence (0–1) to "confirmed" / "likely" / "possible".

_severity(s)
def _severity(s: str) -> str

Normalize agent-audit severity string to scanner convention ("ERROR", "HIGH", "WARNING", "INFO").
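
The two mappings can be sketched as below. `confidence_from_score` and `normalize_severity` are hypothetical stand-ins for the internal helpers, and the `"INFO"` fallback for unrecognized severity strings is an assumption.

```python
def confidence_from_score(score: float) -> str:
    """Map a 0-1 confidence score to the scanner confidence label."""
    if score >= 0.9:
        return "confirmed"
    if score >= 0.7:
        return "likely"
    return "possible"


_SEVERITY_MAP = {"critical": "ERROR", "high": "HIGH", "medium": "WARNING", "low": "INFO"}


def normalize_severity(s: str) -> str:
    """Normalize an agent-audit severity string to the scanner convention."""
    # Unknown values fall back to INFO (assumption).
    return _SEVERITY_MAP.get(s.strip().lower(), "INFO")
```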


New runners — Sprint 6 (Tasks 01–23)

scanners/modelscan_runner.py

modelscan(work)

def modelscan(work: str) -> Tuple[List[dict], str]

Run ModelScan (Palo Alto) to detect serialization attacks that lead to remote code execution in ML model files (.pkl, .pt, .h5, .onnx, .npy).

  • Requires: pip install modelscan (Python <3.13 only)
  • API: Python — modelscan.scanner.ModelScan().scan(path=work)
  • Severity: "CRITICAL"/"HIGH" → "ERROR", "MEDIUM" → "WARNING", else → "INFO"
  • OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]
  • Category: "security"

scanners/picklescan_runner.py

picklescan(work)

def picklescan(work: str) -> Tuple[List[dict], str]

Run PickleScan to detect unsafe opcodes in pickle and PyTorch model files.

  • Requires: pip install picklescan
  • API: Python — picklescan.scanner.scan_file_path(work)
  • Rule IDs: PickleUnsafeOpcodesScanner, PickleSafetyScanner
  • OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]
  • Category: "security"

scanners/fickling_runner.py

fickling(work)

def fickling(work: str) -> Tuple[List[dict], str]

Run Fickling (Trail of Bits) allowlist scanner on all pickle-format files.

  • Requires: pip install fickling
  • API: Python — fickling.analysis.run_checks(pkl)
  • Rule IDs: FICKLING-DANGEROUS-IMPORTS, FICKLING-ARBITRARY-CODE
  • OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]

scanners/trivy_runner.py

trivy(work)

def trivy(work: str) -> Tuple[List[dict], str]

Run Trivy (Aqua Security) for CVE, secret, and IaC scanning.

  • Requires: trivy binary on PATH (auto-downloaded by core.bootstrap)
  • Tool: trivy fs <work> --format json --output <tmpfile> --quiet
  • Timeout: 600 s
  • Output: reads JSON from tempfile (avoids mixing with tool output on stdout)
  • Severity: Trivy's Severity field mapped to scanner convention
  • OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

scanners/trufflehog_runner.py

trufflehog(work, deep_history=False)

def trufflehog(work: str, deep_history: bool = False) -> Tuple[List[dict], str]

Run TruffleHog for verified secret detection in git history and filesystem.

  • Requires: trufflehog binary on PATH (auto-downloaded by core.bootstrap)
  • Tool: trufflehog filesystem <work> --json --only-verified
  • Output: JSONL (one JSON object per line)
  • Helper: _parse_cvss_base_score() from CVSSv3 string
  • OWASP: ["A02:2021-Cryptographic_Failures"]
  • Category: "security"
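
Since the output is JSONL, each line is parsed independently. The sketch below shows one tolerant way to do that. `iter_jsonl` is a hypothetical helper, and silently skipping lines that are not valid JSON is an assumption about how interleaved log lines might be handled.

```python
import json
from typing import Iterator


def iter_jsonl(text: str) -> Iterator[dict]:
    """Yield one dict per non-empty JSON line, skipping anything unparsable."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a plain-text log line mixed into the stream
        if isinstance(obj, dict):
            yield obj
```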

scanners/osv_runner.py

osv_scanner(work)

def osv_scanner(work: str) -> Tuple[List[dict], str]

Run OSV-Scanner (Google) for dependency CVE scanning.

  • Requires: osv-scanner binary on PATH
  • Tool: osv-scanner scan dir:<work> --format json
  • OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

scanners/checkov_runner.py

checkov(work)

def checkov(work: str) -> Tuple[List[dict], str]

Run Checkov (Bridgecrew) to find misconfigurations in IaC, including Terraform and GitHub Actions workflows.

  • Requires: pip install checkov
  • Tool: checkov -d <work> --output json --compact --quiet
  • Output format: handles both list-of-result-blocks and single result dict
  • _CRITICAL_IDS: frozenset of known high-risk Checkov rule IDs that escalate to "ERROR"
  • OWASP: ["A05:2021-Security_Misconfiguration"]
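
A sketch of handling both output shapes and the escalation rule. It assumes each result block stores failures under `results.failed_checks` with a `check_id` per entry. The contents of `_CRITICAL_IDS` here are placeholders, and the `"WARNING"` default for non-critical checks is an assumption.

```python
from typing import Any, List

# Placeholder ID for illustration; the real set lives in the runner module.
_CRITICAL_IDS = frozenset({"CKV_EXAMPLE_1"})


def result_blocks(parsed: Any) -> List[dict]:
    """Normalize Checkov output to a list of result blocks."""
    if isinstance(parsed, dict):
        return [parsed]                      # single-framework run
    if isinstance(parsed, list):
        return [b for b in parsed if isinstance(b, dict)]
    return []


def failed_checks(parsed: Any) -> List[dict]:
    """Collect failed checks across all result blocks."""
    checks: List[dict] = []
    for block in result_blocks(parsed):
        checks.extend((block.get("results") or {}).get("failed_checks", []))
    return checks


def severity_for(check_id: str) -> str:
    """Escalate known high-risk rule IDs to ERROR."""
    return "ERROR" if check_id in _CRITICAL_IDS else "WARNING"
```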

scanners/grype_runner.py

grype(work)

def grype(work: str) -> Tuple[List[dict], str]

Run Grype (Anchore) for container/package CVE scanning via a syft→grype pipeline.

  • Requires: syft and grype binaries on PATH (both auto-downloaded by core.bootstrap)
  • Tool: syft <work> -o json | grype --add-cpes-if-none -o json --file <tmpfile>
  • Output: reads from tempfile (syft pipes stdout to grype, grype writes to file)
  • Severity: Grype's capitalized severity string → scanner convention
  • OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

scanners/socket_runner.py

socket_scanner(work)

def socket_scanner(work: str) -> Tuple[List[dict], str]

Run Socket CLI for supply-chain security (malicious packages, typosquatting).

  • Requires: socket binary on PATH (npm install -g @socketsecurity/cli)
  • Tool: socket scan <work> --json
  • Category: "supply-chain"
  • OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]

scanners/safety_runner.py

safety_check(work)

def safety_check(work: str) -> Tuple[List[dict], str]

Run Safety for Python dependency CVE checking.

  • Requires: pip install safety
  • Tool: safety check --json
  • Output format: handles Safety v2 list format and v3 dict format
  • OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

scanners/llmguard_runner.py

llm_guard(work)

def llm_guard(work: str) -> Tuple[List[dict], str]

Run LLM Guard PromptInjection scanner on Python files.

  • Requires: pip install llm-guard (Python <3.13 only)
  • API: Python — extracts prompt strings via regex, scans with llm_guard.input_scanners.PromptInjection
  • Rule ID: LLM-PROMPT-INJECTION
  • OWASP: ["LLM01:2025-Prompt_Injection"]
  • Category: "llm"

scanners/garak_runner.py

garak(target_url, probes=None)

def garak(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]

Run Garak LLM vulnerability probes against a live endpoint.

  • Requires: pip install garak ; endpoint URL via GARAK_TARGET_URL env var
  • Tool: garak --model_type rest --model_name <url> --probes <probes> --report_prefix <tmpdir>
  • Output: parses .report.jsonl files from garak's output directory
  • Default probes: "dan,knownbadsignatures,packagehallucination"
  • Category: "llm"; OWASP: ["LLM01:2025-Prompt_Injection"]
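
The command line above can be assembled as an argument list. This sketch only builds the list and does not run garak. `build_garak_cmd` and its `report_prefix` parameter are hypothetical names; the flags are the ones listed above.

```python
from typing import List, Optional

DEFAULT_PROBES = "dan,knownbadsignatures,packagehallucination"


def build_garak_cmd(
    target_url: str,
    report_prefix: str,
    probes: Optional[List[str]] = None,
) -> List[str]:
    """Build the garak argv, falling back to the default probe set."""
    probe_arg = ",".join(probes) if probes else DEFAULT_PROBES
    return [
        "garak",
        "--model_type", "rest",
        "--model_name", target_url,
        "--probes", probe_arg,
        "--report_prefix", report_prefix,
    ]
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems when the URL contains special characters.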

scanners/deepteam_runner.py

deepteam(target_url, target_purpose="")

def deepteam(target_url: str, target_purpose: str = "") -> Tuple[List[dict], str]

Run DeepTeam (Confident AI) red-team evaluation.

  • Requires: pip install deepeval ; endpoint via DEEPTEAM_TARGET_URL
  • Approach: writes a YAML config and runs deepteam run --config <yaml>
  • Category: "llm" ; OWASP: ["LLM02:2025-Sensitive_Information_Disclosure"]

scanners/promptfoo_runner.py

promptfoo(target_url, target_purpose="", plugins=None)

def promptfoo(target_url: str, target_purpose: str = "", plugins: list[str] | None = None) -> Tuple[List[dict], str]

Run Promptfoo red-team evaluation via npx.

  • Requires: Node.js + npx promptfoo ; endpoint via PROMPTFOO_TARGET_URL
  • Tool: npx promptfoo@latest redteam run --config <yaml> --output <json>
  • Default plugins: ["harmful", "injection", "jailbreak", "pii"]
  • Category: "llm"

scanners/azure_redteam_runner.py

azure_redteam(target_url, risk_categories=None, attack_strategies=None)

def azure_redteam(
    target_url: str,
    risk_categories: list[str] | None = None,
    attack_strategies: list[str] | None = None,
) -> Tuple[List[dict], str]

Run Azure AI Evaluation red-team pipeline.

  • Requires: pip install azure-ai-evaluation ; endpoint via AZURE_REDTEAM_TARGET_URL
  • API: azure.ai.evaluation.red_team.RedTeamOrchestrator in local mode via asyncio.run()
  • Default risk categories: ["violence", "sexual", "hate_unfairness", "self_harm"]
  • Category: "llm"

scanners/pyrit_runner.py

pyrit(target_url, objectives=None)

def pyrit(target_url: str, objectives: list[str] | None = None) -> Tuple[List[dict], str]

Run PyRIT (Microsoft) crescendo multi-turn attack orchestration.

  • Requires: pip install pyrit ; endpoint via PYRIT_TARGET_URL
  • Binary guard: have_binary("pyrit_scan") (optional CLI wrapper)
  • Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]

scanners/augustus_runner.py

augustus(target_url, probes=None)

def augustus(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]

Run Augustus Go-based LLM red-team scanner.

  • Requires: augustus binary on PATH ; endpoint via AUGUSTUS_TARGET_URL
  • Tool: augustus --target <url> --probes <probes> --output json
  • Category: "llm"

scanners/fuzzyai_runner.py

fuzzyai(target_url, attacks=None)

def fuzzyai(target_url: str, attacks: list[str] | None = None) -> Tuple[List[dict], str]

Run FuzzyAI adversarial fuzzing framework.

  • Requires: pip install fuzzy-ai ; endpoint via FUZZYAI_TARGET_URL
  • API: Python — import fuzzy_ai
  • Category: "llm"

scanners/giskard_runner.py

giskard_scan(target_url, model_description="")

def giskard_scan(target_url: str, model_description: str = "") -> Tuple[List[dict], str]

Run Giskard model testing and red-teaming.

  • Requires: pip install giskard ; endpoint via GISKARD_TARGET_URL
  • API: Python — wraps endpoint in a REST model, calls giskard.scan(model)
  • Category: "llm"

scanners/vigil_runner.py

vigil(work)

def vigil(work: str) -> Tuple[List[dict], str]

Run Vigil YARA-based prompt injection / jailbreak scanner on Python files.

  • Requires: pip install vigil-llm (Python <3.13 only)
  • API: Python — Vigil.from_config_dict(…), scans prompt strings extracted by regex
  • Rule ID: LLM-PROMPT-INJECTION-VIGIL
  • Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]

scanners/nemo_runner.py

nemo_guardrails(work)

def nemo_guardrails(work: str) -> Tuple[List[dict], str]

Static analysis of NeMo Guardrails .co config files.

  • Requires: pip install nemoguardrails (Python <3.13 only)
  • Pattern: globs **/*.co and checks for missing input rails / output rails definitions
  • Rule IDs: NEMO-MISSING-INPUT-RAIL, NEMO-MISSING-OUTPUT-RAIL
  • Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]
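
A rough sketch of the glob-and-check pattern. It assumes a rail definition can be recognized by the literal phrases "input rails" and "output rails" in the file text; the real detection logic may be stricter. `check_colang_file` and `scan_colang` are hypothetical helpers.

```python
from pathlib import Path
from typing import Dict, List


def check_colang_file(path: Path) -> List[str]:
    """Return rule IDs for rails that appear to be missing from one .co file."""
    text = path.read_text(encoding="utf-8", errors="replace").lower()
    rule_ids = []
    if "input rails" not in text:       # assumed marker phrase
        rule_ids.append("NEMO-MISSING-INPUT-RAIL")
    if "output rails" not in text:      # assumed marker phrase
        rule_ids.append("NEMO-MISSING-OUTPUT-RAIL")
    return rule_ids


def scan_colang(work: str) -> Dict[str, List[str]]:
    """Check every .co file under work, keyed by relative path."""
    root = Path(work)
    return {
        p.relative_to(root).as_posix(): check_colang_file(p)
        for p in sorted(root.glob("**/*.co"))
    }
```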

scanners/guardrailsai_runner.py

guardrails_ai(work)

def guardrails_ai(work: str) -> Tuple[List[dict], str]

Static analysis of Guardrails AI configuration in Python source files.

  • Requires: pip install guardrails-ai (Python <3.13 only)
  • Patterns checked:
    • GUARD-ON-FAIL-NOOP: on_fail="noop" disables error handling
    • GUARD-THRESHOLD-TOO-HIGH: a threshold ≥ 0.95 means the guard almost never triggers
    • GUARD-MISSING-TOXICITY: no toxicity/content guard is applied
  • Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]
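
The three checks can be approximated with regular expressions over the source text. The patterns below are assumptions chosen for illustration, not the runner's actual expressions, and `guard_rule_ids` is a hypothetical helper.

```python
import re
from typing import List

# Assumed patterns; a real implementation might parse the AST instead.
_NOOP_RE = re.compile(r"""on_fail\s*=\s*["']noop["']""")
_THRESHOLD_RE = re.compile(r"threshold\s*=\s*([0-9]*\.?[0-9]+)")
_TOXICITY_RE = re.compile(r"toxic", re.IGNORECASE)


def guard_rule_ids(source: str) -> List[str]:
    """Return the rule IDs triggered by one Python source string."""
    ids = []
    if _NOOP_RE.search(source):
        ids.append("GUARD-ON-FAIL-NOOP")
    if any(float(value) >= 0.95 for value in _THRESHOLD_RE.findall(source)):
        ids.append("GUARD-THRESHOLD-TOO-HIGH")
    if not _TOXICITY_RE.search(source):
        ids.append("GUARD-MISSING-TOXICITY")
    return ids
```

A text-level check like this is cheap but can misfire on comments and string literals, which is one reason to treat its findings as prompts for manual review.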