API Reference — `scanners/`

`scanners/__init__.py` — public exports

from scanners import (
    agent_audit,
    bandit,
    detect_secrets,
    forbidden_files,
    gitleaks,
    hadolint,
    pip_audit,
    ruff_perf,
    semgrep_pack,
)

Common return types

Most runners return a 2-tuple:

Tuple[List[dict], str]   # (findings, log_message)

semgrep_pack is the exception: it returns List[dict] directly, with no log message, because it is called once per rule pack and the caller aggregates the messages.

All finding dicts conform to the schema defined in core/models.py.
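
As an illustration of how a caller might combine the two return shapes, here is a minimal sketch. The `fake_bandit`, `fake_semgrep_pack`, and `collect` names are hypothetical stand-ins rather than part of the `scanners` package, and the `path` key in the sample findings is an assumption (only the `tool` field is named in this reference).

```python
from typing import List, Tuple

Finding = dict
RunnerResult = Tuple[List[Finding], str]


def fake_bandit(work: str) -> RunnerResult:
    """Hypothetical stand-in for a runner that returns (findings, log_message)."""
    return [{"tool": "Bandit", "path": f"{work}/app.py"}], "bandit: 1 finding"


def fake_semgrep_pack(work: str) -> List[Finding]:
    """Hypothetical stand-in for semgrep_pack, which returns findings only."""
    return [{"tool": "Semgrep:Core", "path": f"{work}/app.py"}]


def collect(work: str) -> Tuple[List[Finding], List[str]]:
    findings: List[Finding] = []
    logs: List[str] = []

    # Tuple-returning runners: unpack the findings and the log message.
    runner_findings, message = fake_bandit(work)
    findings.extend(runner_findings)
    logs.append(message)

    # semgrep_pack-style runners: the caller builds its own log message.
    pack_findings = fake_semgrep_pack(work)
    findings.extend(pack_findings)
    logs.append(f"semgrep: {len(pack_findings)} finding(s)")

    return findings, logs


all_findings, all_logs = collect("/tmp/repo")
```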

`scanners/bandit_runner.py`

`bandit(work)`

def bandit(work: str) -> Tuple[List[dict], str]

Run the Bandit Python security linter against the work directory.

Tool: bandit -r <work> -f json -q --exclude .git,.venv,...
Timeout: 300 s
Confidence mapping: HIGH/MEDIUM severity → "likely", otherwise → "possible".
OWASP: ["UNMAPPED"] (Bandit rule IDs map across multiple categories; review the individual rule to assign one).
Skips: .git, .venv, venv, env, node_modules, __pycache__.
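
The confidence mapping described above can be sketched as follows. The `bandit_confidence` helper name is hypothetical; the `issue_severity` and `test_id` keys are the ones Bandit emits in the `results` array of its JSON report.

```python
def bandit_confidence(issue_severity: str) -> str:
    """Map a Bandit severity to the confidence labels used in this reference."""
    return "likely" if issue_severity.upper() in {"HIGH", "MEDIUM"} else "possible"


# Two trimmed-down entries shaped like items in Bandit's JSON "results" list.
sample_results = [
    {"test_id": "B602", "issue_severity": "HIGH"},
    {"test_id": "B101", "issue_severity": "LOW"},
]

mapped = {r["test_id"]: bandit_confidence(r["issue_severity"]) for r in sample_results}
```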

`scanners/semgrep_runner.py`

`semgrep_pack(rules_path, work, tool_label, category)`

def semgrep_pack(
    rules_path: Path,
    work: str,
    tool_label: str,
    category: str = "security",
) -> List[dict]

Run Semgrep with a specific YAML rule pack. Returns findings list directly (no log message).

Tool: semgrep --config <rules_path> --json --quiet --metrics=off --no-git-ignore <work>
Timeout: 600 s
tool_label: Used as the tool field in finding dicts (e.g., "Semgrep:Core").
Metadata extracted: owasp, confidence from Semgrep result extra.metadata.

`detect_secrets(work)`

def detect_secrets(work: str) -> Tuple[List[dict], str]

Run detect-secrets against all files in work.

Tool: detect-secrets scan --all-files <work>
Timeout: 300 s
Severity: always "ERROR".
Confidence: "possible" for entropy/high-entropy types, otherwise "likely".
OWASP: ["A02:2021-Cryptographic_Failures"].
Note: Requires a git-initialized directory for best results.
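
A rough sketch of how a detect-secrets report could be flattened under the rules above. The report shape (`results` keyed by file path, each entry carrying `type` and `line_number`) follows detect-secrets' JSON output; the output keys `path`, `line`, and `rule` are assumptions for illustration and may not match the schema in core/models.py.

```python
import json
from typing import List

SAMPLE_REPORT = json.loads("""
{"results": {"config/settings.py": [
    {"type": "Base64 High Entropy String", "line_number": 12},
    {"type": "AWS Access Key", "line_number": 30}
]}}
""")


def confidence_for(secret_type: str) -> str:
    # Entropy-based detectors are noisier, so they get the weaker label.
    return "possible" if "entropy" in secret_type.lower() else "likely"


def flatten(report: dict) -> List[dict]:
    rows = []
    for path, secrets in report.get("results", {}).items():
        for secret in secrets:
            secret_type = secret.get("type", "")
            rows.append({
                "path": path,                      # assumed key name
                "line": secret.get("line_number"),  # assumed key name
                "rule": secret_type,               # assumed key name
                "severity": "ERROR",
                "confidence": confidence_for(secret_type),
                "owasp": ["A02:2021-Cryptographic_Failures"],
            })
    return rows


rows = flatten(SAMPLE_REPORT)
```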

`scanners/ruff_runner.py`

`ruff_perf(work)`

def ruff_perf(work: str) -> Tuple[List[dict], str]

Run Ruff with PERF rule selection.

Tool: ruff check --select PERF --output-format json <work>
Timeout: 120 s
Severity: always "WARNING" (performance suggestions are non-critical).
Confidence: "likely" (Ruff PERF rules have very few false positives).
OWASP: ["UNMAPPED"] (performance — not a security concern).
Category: "performance".

`scanners/pip_audit_runner.py`

`pip_audit(work)`

def pip_audit(work: str) -> Tuple[List[dict], str]

Scan all requirements*.txt files in work for known CVEs using pip-audit.

Tool: pip-audit -r <req_file> -f json --strict --progress-spinner off
Timeout: 90 s per requirements file
Severity: always "ERROR" (any known CVE is treated as critical).
Confidence: "confirmed" (CVE database entries are verified).
OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"].
Remediation: auto-populated from fix_versions field.
Skips: files under .git/.
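
The file discovery and remediation steps might look roughly like this. Both helper names are hypothetical, and the remediation wording is an assumption; only the `requirements*.txt` pattern, the `.git/` skip, and the use of `fix_versions` come from the description above.

```python
import tempfile
from pathlib import Path
from typing import List


def find_requirement_files(work: str) -> List[Path]:
    """Collect requirements*.txt files under work, ignoring anything in .git/."""
    root = Path(work)
    return sorted(
        p for p in root.rglob("requirements*.txt")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    )


def remediation(package: str, fix_versions: List[str]) -> str:
    """Build a remediation hint from a vulnerability's fix_versions list."""
    if not fix_versions:
        return f"No fixed version of {package} is listed yet."
    return f"Upgrade {package} to one of: {', '.join(fix_versions)}"


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "requirements.txt").write_text("flask==0.5\n")
    (root / ".git").mkdir()
    (root / ".git" / "requirements-dev.txt").write_text("ignored\n")
    found = [p.name for p in find_requirement_files(tmp)]
```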

`scanners/gitleaks_runner.py`

`gitleaks(work)`

def gitleaks(work: str) -> Tuple[List[dict], str]

Run gitleaks to detect secrets committed to git history.

Requires: gitleaks binary on PATH (auto-downloaded by core.bootstrap).
Tool: gitleaks detect --source <work> --report-path <tmp.json> --report-format json --no-banner --exit-code 0
Timeout: 600 s
Severity: always "ERROR".
Confidence: "confirmed".
OWASP: ["A02:2021-Cryptographic_Failures"].
Note: Only added to the task list when deep_history=True in scan_repo().

`scanners/hadolint_runner.py`

`hadolint(work)`

def hadolint(work: str) -> Tuple[List[dict], str]

Run hadolint against all Dockerfile* files found recursively in work.

Requires: hadolint binary on PATH (auto-downloaded by core.bootstrap).
Tool: hadolint -f json <Dockerfile>
Timeout: 60 s per Dockerfile
Severity: from hadolint's level field (uppercased).
Confidence: "likely" (TOOL_DEFAULT_CONFIDENCE).
OWASP: ["A05:2021-Security_Misconfiguration"].
Skips: Dockerfiles under .git/.

`scanners/forbidden_files.py`

`forbidden_files(work)`

def forbidden_files(work: str) -> Tuple[List[dict], str]

Walk work and flag any file whose basename matches the FORBIDDEN_FILES list in core/models.py.

No external tool required — pure Python.
Severity: "ERROR".
Confidence: "confirmed".
OWASP: ["A02:2021-Cryptographic_Failures"].

Detected file names include: .env, .env.local, .env.production, id_rsa, id_dsa, id_ecdsa, id_ed25519, .git-credentials, .npmrc, .pypirc, credentials, credentials.json, service-account.json, serviceAccountKey.json, wp-config.php.
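
Because this runner is pure Python, the core idea fits in a few lines. The sketch below assumes an exact basename match and uses only a subset of the names listed above; `FORBIDDEN_SUBSET` and `find_forbidden` are hypothetical names, not the contents of core/models.py.

```python
import os
import tempfile
from typing import List

# Illustrative subset of the documented names; the real list is FORBIDDEN_FILES.
FORBIDDEN_SUBSET = frozenset({".env", "id_rsa", ".npmrc", "credentials.json"})


def find_forbidden(work: str) -> List[str]:
    """Return paths (relative to work) whose basename is in the forbidden set."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(work):
        for name in filenames:
            if name in FORBIDDEN_SUBSET:
                hits.append(os.path.relpath(os.path.join(dirpath, name), work))
    return sorted(hits)


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "config"))
    for rel in (".env", os.path.join("config", "id_rsa"), "README.md"):
        with open(os.path.join(tmp, rel), "w") as fh:
            fh.write("x")
    hits = find_forbidden(tmp)
```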

`scanners/agent_audit_runner.py`

`agent_audit(work)`

def agent_audit(work: str) -> Tuple[List[dict], str]

Run agent-audit for OWASP Agentic Top 10 (2026) scanning of AI agent code.

Requires: agent-audit binary on PATH (pip install agent-audit).
Tool: agent-audit scan <work> --format json
Timeout: 300 s
Severity mapping: critical → ERROR, high → HIGH, medium → WARNING, low → INFO.
Confidence mapping: score ≥ 0.9 → "confirmed", ≥ 0.7 → "likely", else → "possible".
OWASP: uses asi_categories field from agent-audit output (e.g., ["LLM01"]).
Suppressed findings: automatically excluded.

Internal helpers

`_confidence(score)`

def _confidence(score: float) -> str

Map agent-audit float confidence (0–1) to "confirmed" / "likely" / "possible".

`_severity(s)`

def _severity(s: str) -> str

Normalize agent-audit severity string to scanner convention ("ERROR", "HIGH", "WARNING", "INFO").
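
Taken together, the two mappings documented for agent-audit can be sketched as below. The function names are hypothetical stand-ins for the internal helpers, and the fallback to "INFO" for unrecognized severities is an assumption, since the reference does not say what happens in that case.

```python
def confidence_from_score(score: float) -> str:
    """Map a 0-1 confidence score to the three confidence labels."""
    if score >= 0.9:
        return "confirmed"
    if score >= 0.7:
        return "likely"
    return "possible"


_SEVERITY_MAP = {
    "critical": "ERROR",
    "high": "HIGH",
    "medium": "WARNING",
    "low": "INFO",
}


def normalize_severity(raw: str) -> str:
    """Normalize a severity string; unknown values fall back to INFO (assumed)."""
    return _SEVERITY_MAP.get(raw.strip().lower(), "INFO")
```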

New runners — Sprint 6 (Tasks 01–23)

`scanners/modelscan_runner.py`

`modelscan(work)`

def modelscan(work: str) -> Tuple[List[dict], str]

Run ModelScan (Palo Alto) to detect serialisation RCE in ML model files (.pkl, .pt, .h5, .onnx, .npy).

Requires: pip install modelscan (Python <3.13 only)
API: Python — modelscan.scanner.ModelScan().scan(path=work)
Severity: "CRITICAL"/"HIGH" → "ERROR", "MEDIUM" → "WARNING", else "INFO"
OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]
Category: "security"

`scanners/picklescan_runner.py`

`picklescan(work)`

def picklescan(work: str) -> Tuple[List[dict], str]

Run PickleScan to detect unsafe opcodes in pickle and PyTorch model files.

Requires: pip install picklescan
API: Python — picklescan.scanner.scan_file_path(work)
Rule IDs: PickleUnsafeOpcodesScanner, PickleSafetyScanner
OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]
Category: "security"

`scanners/fickling_runner.py`

`fickling(work)`

def fickling(work: str) -> Tuple[List[dict], str]

Run Fickling (Trail of Bits) allowlist scanner on all pickle-format files.

Requires: pip install fickling
API: Python — fickling.analysis.run_checks(pkl)
Rule IDs: FICKLING-DANGEROUS-IMPORTS, FICKLING-ARBITRARY-CODE
OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]

`scanners/trivy_runner.py`

`trivy(work)`

def trivy(work: str) -> Tuple[List[dict], str]

Run Trivy (Aqua Security) for CVE, secret, and IaC scanning.

Requires: trivy binary on PATH (auto-downloaded by core.bootstrap)
Tool: trivy fs <work> --format json --output <tmpfile> --quiet
Timeout: 600 s
Output: reads JSON from tempfile (avoids mixing with tool output on stdout)
Severity: Trivy's Severity field mapped to scanner convention
OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

`scanners/trufflehog_runner.py`

`trufflehog(work, deep_history=False)`

def trufflehog(work: str, deep_history: bool = False) -> Tuple[List[dict], str]

Run TruffleHog for verified secret detection in git history and filesystem.

Requires: trufflehog binary on PATH (auto-downloaded by core.bootstrap)
Tool: trufflehog filesystem <work> --json --only-verified
Output: JSONL (one JSON object per line)
Helper: _parse_cvss_base_score() extracts the base score from a CVSSv3 string
OWASP: ["A02:2021-Cryptographic_Failures"]
Category: "security"
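
Since the output is JSONL, each line has to be decoded on its own. A minimal, tolerant parser might look like this; `parse_jsonl` is a hypothetical name, and the `DetectorName` and `Verified` fields in the sample reflect TruffleHog v3 output as commonly seen, so they should be checked against the installed version.

```python
import json
from typing import Iterator


def parse_jsonl(stdout: str) -> Iterator[dict]:
    """Yield one dict per valid JSON line, skipping blanks and non-JSON noise."""
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stray log line mixed into the stream
        if isinstance(record, dict):
            yield record


sample_stdout = "\n".join([
    '{"DetectorName": "AWS", "Verified": true}',
    "",
    "not json at all",
    '{"DetectorName": "Github", "Verified": true}',
])

records = list(parse_jsonl(sample_stdout))
```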

`scanners/osv_runner.py`

`osv_scanner(work)`

def osv_scanner(work: str) -> Tuple[List[dict], str]

Run OSV-Scanner (Google) for dependency CVE scanning.

Requires: osv-scanner binary on PATH
Tool: osv-scanner scan dir:<work> --format json
OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

`scanners/checkov_runner.py`

`checkov(work)`

def checkov(work: str) -> Tuple[List[dict], str]

Run Checkov (Bridgecrew) for IaC misconfigurations, GitHub Actions, Terraform.

Requires: pip install checkov
Tool: checkov -d <work> --output json --compact --quiet
Output format: handles both list-of-result-blocks and single result dict
_CRITICAL_IDS: frozenset of known high-risk Checkov rule IDs that escalate to "ERROR"
OWASP: ["A05:2021-Security_Misconfiguration"]
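
The two output shapes and the escalation rule can be handled roughly as follows. `CRITICAL_IDS_EXAMPLE` is a hypothetical placeholder for `_CRITICAL_IDS` (its real contents are not listed here), the rule IDs in the sample are made up, and the "WARNING" default for non-escalated checks is an assumption.

```python
from typing import Any, List

# Placeholder set; the real _CRITICAL_IDS lives in checkov_runner.py.
CRITICAL_IDS_EXAMPLE = frozenset({"CKV_EXAMPLE_1"})


def result_blocks(payload: Any) -> List[dict]:
    """Normalize Checkov JSON to a list of result blocks."""
    if isinstance(payload, dict):
        return [payload]
    if isinstance(payload, list):
        return [block for block in payload if isinstance(block, dict)]
    return []


def failed_checks(payload: Any) -> List[dict]:
    rows = []
    for block in result_blocks(payload):
        for check in block.get("results", {}).get("failed_checks", []):
            check_id = check.get("check_id", "")
            rows.append({
                "rule": check_id,
                # Escalate known high-risk IDs; the default is assumed.
                "severity": "ERROR" if check_id in CRITICAL_IDS_EXAMPLE else "WARNING",
            })
    return rows


single = {"results": {"failed_checks": [{"check_id": "CKV_EXAMPLE_1"}]}}
multi = [single, {"results": {"failed_checks": [{"check_id": "CKV_EXAMPLE_2"}]}}]

from_single = failed_checks(single)
from_multi = failed_checks(multi)
```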

`scanners/grype_runner.py`

`grype(work)`

def grype(work: str) -> Tuple[List[dict], str]

Run Grype (Anchore) for container/package CVE scanning via a syft→grype pipeline.

Requires: syft and grype binaries on PATH (both auto-downloaded by core.bootstrap)
Tool: syft <work> -o json | grype --add-cpes-if-none -o json --file <tmpfile>
Output: reads from tempfile (syft pipes stdout to grype, grype writes to file)
Severity: Grype's capitalized severity string → scanner convention
OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]
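
A two-process pipeline like the one above can be wired up with `subprocess.Popen`. This is only a sketch of the piping technique: `run_pipeline` is a hypothetical helper, and the demonstration substitutes two small Python child processes for syft and grype so that it runs without either binary installed.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def run_pipeline(producer: List[str], consumer: List[str], timeout: float) -> int:
    """Pipe producer's stdout into consumer's stdin; return consumer's exit code."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(consumer, stdin=first.stdout)
    # Drop our copy of the pipe so the producer sees a closed pipe
    # if the consumer exits early.
    first.stdout.close()
    try:
        code = second.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        first.kill()
        second.kill()
        second.wait()
        raise
    finally:
        first.wait()
    return code


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.json"
    # Stand-ins for: ["syft", work, "-o", "json"] and
    # ["grype", "--add-cpes-if-none", "-o", "json", "--file", str(report)]
    producer = [sys.executable, "-c", "print('{\"artifacts\": []}')"]
    consumer = [
        sys.executable,
        "-c",
        "import sys; open(sys.argv[1], 'w').write(sys.stdin.read())",
        str(report),
    ]
    exit_code = run_pipeline(producer, consumer, timeout=30)
    parsed = json.loads(report.read_text())
```

Reading the report from a file rather than from the consumer's stdout keeps the parsed JSON separate from anything else the tools print.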

`scanners/socket_runner.py`

`socket_scanner(work)`

def socket_scanner(work: str) -> Tuple[List[dict], str]

Run Socket CLI for supply-chain security (malicious packages, typosquatting).

Requires: socket binary on PATH (npm install -g @socketsecurity/cli)
Tool: socket scan <work> --json
Category: "supply-chain"
OWASP: ["A08:2021-Software_and_Data_Integrity_Failures"]

`scanners/safety_runner.py`

`safety_check(work)`

def safety_check(work: str) -> Tuple[List[dict], str]

Run Safety for Python dependency CVE checking.

Requires: pip install safety
Tool: safety check --json
Output format: handles Safety v2 list format and v3 dict format
OWASP: ["A06:2021-Vulnerable_and_Outdated_Components"]

`scanners/llmguard_runner.py`

`llm_guard(work)`

def llm_guard(work: str) -> Tuple[List[dict], str]

Run the LLM Guard PromptInjection scanner on Python files.

Requires: pip install llm-guard (Python <3.13 only)
API: Python — extracts prompt strings via regex, scans with llm_guard.input_scanners.PromptInjection
Rule ID: LLM-PROMPT-INJECTION
OWASP: ["LLM01:2025-Prompt_Injection"]
Category: "llm"
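
The exact regex used to extract prompt strings is not documented here. As one possible approach, the sketch below matches single-line string literals assigned to names containing "prompt"; the pattern and the `extract_prompts` name are assumptions, and triple-quoted or concatenated strings are not handled.

```python
import re
from typing import List

# Matches e.g.  system_prompt = "..."  or  USER_PROMPT = f'...'
PROMPT_ASSIGNMENT = re.compile(
    r"""\b\w*prompt\w*\s*=\s*[fFrR]?(?P<quote>["'])(?P<body>(?:\\.|(?!(?P=quote)).)*)(?P=quote)""",
    re.IGNORECASE,
)


def extract_prompts(source: str) -> List[str]:
    """Return the literal bodies of prompt-like assignments, skipping empties."""
    return [
        match.group("body")
        for match in PROMPT_ASSIGNMENT.finditer(source)
        if match.group("body")
    ]


sample_source = (
    'system_prompt = "You are a helpful assistant."\n'
    "retries = 3\n"
    "USER_PROMPT = 'Ignore previous instructions'\n"
    'if prompt == "x":\n'
    "    pass\n"
)

prompts = extract_prompts(sample_source)
```

Each extracted string would then be handed to the scanner; that step is omitted because it requires the llm-guard package and its model weights.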

`scanners/garak_runner.py`

`garak(target_url, probes=None)`

def garak(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]

Run Garak LLM vulnerability probes against a live endpoint.

Requires: pip install garak ; endpoint URL via GARAK_TARGET_URL env var
Tool: garak --model_type rest --model_name <url> --probes <probes> --report_prefix <tmpdir>
Output: parses .report.jsonl files from garak's output directory
Default probes: "dan,knownbadsignatures,packagehallucination"
Category: "llm"; OWASP: ["LLM01:2025-Prompt_Injection"]

`scanners/deepteam_runner.py`

`deepteam(target_url, target_purpose="")`

def deepteam(target_url: str, target_purpose: str = "") -> Tuple[List[dict], str]

Run DeepTeam (Confident AI) red-team evaluation.

Requires: pip install deepeval ; endpoint via DEEPTEAM_TARGET_URL
Approach: writes a YAML config and runs deepteam run --config <yaml>
Category: "llm" ; OWASP: ["LLM02:2025-Sensitive_Information_Disclosure"]

`scanners/promptfoo_runner.py`

`promptfoo(target_url, target_purpose="", plugins=None)`

def promptfoo(target_url: str, target_purpose: str = "", plugins: list[str] | None = None) -> Tuple[List[dict], str]

Run Promptfoo red-team evaluation via npx.

Requires: Node.js + npx promptfoo ; endpoint via PROMPTFOO_TARGET_URL
Tool: npx promptfoo@latest redteam run --config <yaml> --output <json>
Default plugins: ["harmful", "injection", "jailbreak", "pii"]
Category: "llm"

`scanners/azure_redteam_runner.py`

`azure_redteam(target_url, risk_categories=None, attack_strategies=None)`

def azure_redteam(
    target_url: str,
    risk_categories: list[str] | None = None,
    attack_strategies: list[str] | None = None,
) -> Tuple[List[dict], str]

Run Azure AI Evaluation red-team pipeline.

Requires: pip install azure-ai-evaluation ; endpoint via AZURE_REDTEAM_TARGET_URL
API: azure.ai.evaluation.red_team.RedTeamOrchestrator in local mode via asyncio.run()
Default risk categories: ["violence", "sexual", "hate_unfairness", "self_harm"]
Category: "llm"

`scanners/pyrit_runner.py`

`pyrit(target_url, objectives=None)`

def pyrit(target_url: str, objectives: list[str] | None = None) -> Tuple[List[dict], str]

Run PyRIT (Microsoft) crescendo multi-turn attack orchestration.

Requires: pip install pyrit ; endpoint via PYRIT_TARGET_URL
Binary guard: have_binary("pyrit_scan") (optional CLI wrapper)
Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]

`scanners/augustus_runner.py`

`augustus(target_url, probes=None)`

def augustus(target_url: str, probes: list[str] | None = None) -> Tuple[List[dict], str]

Run Augustus, a Go-based LLM red-team scanner.

Requires: augustus binary on PATH ; endpoint via AUGUSTUS_TARGET_URL
Tool: augustus --target <url> --probes <probes> --output json
Category: "llm"

`scanners/fuzzyai_runner.py`

`fuzzyai(target_url, attacks=None)`

def fuzzyai(target_url: str, attacks: list[str] | None = None) -> Tuple[List[dict], str]

Run the FuzzyAI adversarial fuzzing framework.

Requires: pip install fuzzy-ai ; endpoint via FUZZYAI_TARGET_URL
API: Python — import fuzzy_ai
Category: "llm"

`scanners/giskard_runner.py`

`giskard_scan(target_url, model_description="")`

def giskard_scan(target_url: str, model_description: str = "") -> Tuple[List[dict], str]

Run Giskard model testing and red-teaming.

Requires: pip install giskard ; endpoint via GISKARD_TARGET_URL
API: Python — wraps endpoint in a REST model, calls giskard.scan(model)
Category: "llm"

`scanners/vigil_runner.py`

`vigil(work)`

def vigil(work: str) -> Tuple[List[dict], str]

Run Vigil YARA-based prompt injection / jailbreak scanner on Python files.

Requires: pip install vigil-llm (Python <3.13 only)
API: Python — Vigil.from_config_dict(…), scans prompt strings extracted by regex
Rule ID: LLM-PROMPT-INJECTION-VIGIL
Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]

`scanners/nemo_runner.py`

`nemo_guardrails(work)`

def nemo_guardrails(work: str) -> Tuple[List[dict], str]

Static analysis of NeMo Guardrails .co config files.

Requires: pip install nemoguardrails (Python <3.13 only)
Pattern: globs **/*.co and checks for missing input rails / output rails definitions
Rule IDs: NEMO-MISSING-INPUT-RAIL, NEMO-MISSING-OUTPUT-RAIL
Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]
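
A simplified version of the glob-and-check pattern is sketched below. It assumes a per-file check for the literal phrases "input rails" and "output rails"; the actual matching logic is not specified in this reference, and `missing_rails` is a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def missing_rails(work: str) -> List[Tuple[str, str]]:
    """Return (relative_path, rule_id) pairs for .co files lacking a rail phrase."""
    findings = []
    root = Path(work)
    for co_file in sorted(root.glob("**/*.co")):
        text = co_file.read_text(encoding="utf-8", errors="replace").lower()
        rel = co_file.relative_to(root).as_posix()
        if "input rails" not in text:
            findings.append((rel, "NEMO-MISSING-INPUT-RAIL"))
        if "output rails" not in text:
            findings.append((rel, "NEMO-MISSING-OUTPUT-RAIL"))
    return findings


# Small demonstration: one config file that defines only an input rail.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp) / "config"
    config_dir.mkdir()
    (config_dir / "rails.co").write_text("flow input rails $input_text\n  pass\n")
    rail_findings = missing_rails(tmp)
```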

`scanners/guardrailsai_runner.py`

`guardrails_ai(work)`

def guardrails_ai(work: str) -> Tuple[List[dict], str]

Static analysis of Guardrails AI configuration in Python source files.

Requires: pip install guardrails-ai (Python <3.13 only)
Patterns checked:
- GUARD-ON-FAIL-NOOP — on_fail="noop" disables error handling
- GUARD-THRESHOLD-TOO-HIGH — threshold ≥ 0.95 means the guard almost never triggers
- GUARD-MISSING-TOXICITY — no toxicity/content guard applied
Category: "llm" ; OWASP: ["LLM01:2025-Prompt_Injection"]
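
The three pattern checks can be approximated with plain regular expressions, as in the sketch below. The regexes, the `check_guard_source` name, and the heuristic of searching for "toxic" only when `Guard(` appears are all assumptions; the runner's real detection logic may differ.

```python
import re
from typing import List

ON_FAIL_NOOP = re.compile(r"""on_fail\s*=\s*["']noop["']""")
THRESHOLD = re.compile(r"threshold\s*=\s*([0-9]*\.?[0-9]+)")


def check_guard_source(source: str) -> List[str]:
    """Return the rule IDs triggered by one Python source string."""
    rule_ids = []
    if ON_FAIL_NOOP.search(source):
        rule_ids.append("GUARD-ON-FAIL-NOOP")
    if any(float(m.group(1)) >= 0.95 for m in THRESHOLD.finditer(source)):
        rule_ids.append("GUARD-THRESHOLD-TOO-HIGH")
    # Assumed heuristic: a Guard is built but nothing mentions toxicity.
    if "Guard(" in source and "toxic" not in source.lower():
        rule_ids.append("GUARD-MISSING-TOXICITY")
    return rule_ids


risky = 'guard = Guard().use(ToxicLanguage, threshold=0.97, on_fail="noop")'
no_toxicity = 'guard = Guard().use(RegexMatch, regex="a", on_fail="exception")'

risky_ids = check_guard_source(risky)
no_toxicity_ids = check_guard_source(no_toxicity)
```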
