# API Reference: hf-scanner CLI

The hf-scanner CLI is built with Typer. Install it as a package entrypoint:

```bash
pip install -e .
hf-scanner --help
```

Or run directly from the source tree:

```bash
python cli.py --help
```

## Global options

```bash
hf-scanner [OPTIONS] COMMAND [ARGS]...
```

| Option | Description |
|---|---|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |

## `hf-scanner version`

```bash
hf-scanner version
```

Print the scanner version string.

Output:

```
hf-scanner 4.0.0
```

## `hf-scanner list-rules`

```bash
hf-scanner list-rules
```

List all bundled Semgrep rule packs with their category and file path.

Output:

```
Pack          Category  Path
Semgrep:Core  security  /path/to/core.yaml
Semgrep:Web   security  /path/to/web.yaml
...
```

---

## `hf-scanner self-test`

```bash
hf-scanner self-test
```

Check that all external tools are available on PATH. Auto-downloads gitleaks and hadolint binaries if missing.

Output:

```
Tool            Status     Description
semgrep         ✓ ok       Static analysis (Python, JS, …)
bandit          ✓ ok       Python security linter
detect-secrets  ✓ ok       Secret detection
pip-audit       ✓ ok       Dependency CVE scanner
ruff            ✓ ok       Fast Python linter (perf rules)
gitleaks        ✗ MISSING  Git history secret scanner
hadolint        ✗ MISSING  Dockerfile linter
agent-audit     ✗ MISSING  OWASP Agentic Top 10 scanner

[bootstrap] gitleaks=ok, hadolint=ok
```

**Exit codes:** `0` = all tools available, `2` = some tools missing.
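
The PATH check can be approximated in a few lines of Python. This is only a sketch of the idea, not the scanner's actual implementation: `check_tools` and `self_test_exit_code` are hypothetical helpers, the tool names are copied from the sample output, and the auto-download step is not modelled.

```python
import shutil

# Tool names copied from the sample self-test output.
TOOLS = [
    "semgrep", "bandit", "detect-secrets", "pip-audit",
    "ruff", "gitleaks", "hadolint", "agent-audit",
]


def check_tools(names):
    """Map each tool name to True if an executable with that name is on PATH."""
    return {name: shutil.which(name) is not None for name in names}


def self_test_exit_code(status):
    """Mirror the documented codes: 0 if every tool is present, otherwise 2."""
    return 0 if all(status.values()) else 2


# Print a small status table similar in spirit to the CLI output.
for tool, found in check_tools(TOOLS).items():
    print(f"{tool:<16}{'ok' if found else 'MISSING'}")
```

`shutil.which` performs the same lookup the shell would, so a tool reported as missing here would also fail when invoked by name.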

---

## `hf-scanner scan`

```bash
hf-scanner scan TARGET [OPTIONS]
```

Scan a repository or local directory for security and performance issues.

### Arguments

| Argument | Description |
|---|---|
| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |

### Options

#### Output

| Option | Default | Description |
|---|---|---|
| `--format`, `-f` | `both` | Output format: `html`, `sarif`, `json`, or `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | - | Suppress all output except the findings count |
| `--verbose` | - | Show per-finding details on stdout |

Output paths when `--out results/scan` is used:

| Format | Path |
|---|---|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |
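
A CI job may want to know which report files to expect before uploading artifacts. The sketch below encodes the path mapping listed above; `expected_outputs` is a hypothetical helper written for illustration and is not part of hf-scanner.

```python
from pathlib import Path

# Extensions produced for each --format value, per the documented path mapping.
FORMAT_EXTENSIONS = {
    "html": [".html"],
    "sarif": [".sarif"],
    "json": [".json"],
    "both": [".html", ".sarif"],
}


def expected_outputs(stem, fmt="both"):
    """Return the report paths expected for a given --out stem and --format."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unknown format: {fmt!r}")
    return [Path(stem + ext) for ext in FORMAT_EXTENSIONS[fmt]]


for path in expected_outputs("results/scan", "both"):
    print(path.as_posix())
```

After a scan, `all(p.exists() for p in expected_outputs(stem, fmt))` is one way to confirm the reports were written.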

#### Scan scope

| Option | Default | Description |
|---|---|---|
| `--security` / `--no-security` | `True` | Enable/disable security scanners |
| `--llm` / `--no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance` / `--no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |
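
When the CLI is driven from a script, the scope flags can be assembled from booleans. The following is a minimal sketch using only the flags documented here; `build_scan_command` is a hypothetical wrapper, not something the package provides.

```python
def build_scan_command(target, security=True, llm=True,
                       performance=True, deep_history=False):
    """Build an argv list for `hf-scanner scan` from scope switches.

    Flags are only added when they differ from the documented defaults.
    """
    cmd = ["hf-scanner", "scan", target]
    if not security:
        cmd.append("--no-security")
    if not llm:
        cmd.append("--no-llm")
    if not performance:
        cmd.append("--no-performance")
    if deep_history:
        cmd.append("--deep-history")
    return cmd


# Secrets-focused scan: security scanners only, with full history.
print(build_scan_command(".", llm=False, performance=False, deep_history=True))
```

The resulting list can be passed to `subprocess.run(cmd)`; using a list rather than a shell string avoids quoting problems with paths that contain spaces.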

#### Suppression

| Option | Default | Description |
|---|---|---|
| `--baseline PATH` | - | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | - | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | - | `.hfscanignore`-style suppression file |
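
Conceptually, a baseline is a set of fingerprints, and a later scan reports only findings whose fingerprint is not in that set. The sketch below illustrates the idea under loud assumptions: the fingerprint recipe (a hash of rule, path, and message), the finding keys, and the JSON layout (a flat list of strings) are all hypothetical and may differ from what hf-scanner actually writes.

```python
import hashlib
import json


def fingerprint(finding):
    """Hypothetical fingerprint: hash of rule, path and message.

    Line numbers are left out so that unrelated edits to a file do not
    make an old finding look new.
    """
    key = "|".join([finding["rule"], finding["path"], finding["message"]])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def create_baseline(findings):
    """Serialise the fingerprints of the current findings as a JSON list."""
    return json.dumps(sorted({fingerprint(f) for f in findings}))


def apply_baseline(findings, baseline_json):
    """Keep only findings whose fingerprint is absent from the baseline."""
    known = set(json.loads(baseline_json))
    return [f for f in findings if fingerprint(f) not in known]


old = {"rule": "hardcoded-secret", "path": "app.py", "message": "API key in source"}
new = {"rule": "eval-use", "path": "utils.py", "message": "eval() on user input"}
baseline = create_baseline([old])
print(apply_baseline([old, new], baseline))  # only the new finding remains
```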

#### Threshold and exit

| Option | Default | Description |
|---|---|---|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR`, `WARNING`, or `INFO`) |

#### Authentication

| Option | Default | Description |
|---|---|---|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |

### Exit codes

| Code | Meaning |
|---|---|
| `0` | Clean: no findings at or above `--severity-threshold` |
| `1` | Findings found at or above threshold |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |
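
The relationship between `--severity-threshold` and exit codes `0` and `1` can be sketched as a comparison against a severity ranking. This is an illustration only: the ordering `INFO < WARNING < ERROR` is an assumption inferred from the option's description, and `scan_exit_code` is a hypothetical function, not the scanner's own code.

```python
# Assumed ordering, from least to most severe.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def scan_exit_code(severities, threshold="WARNING"):
    """Return 1 if any finding is at or above the threshold, otherwise 0."""
    limit = SEVERITY_RANK[threshold]
    hit = any(SEVERITY_RANK[s] >= limit for s in severities)
    return 1 if hit else 0


# A WARNING fails the default threshold but passes when only ERROR counts.
print(scan_exit_code(["INFO", "WARNING"]))
print(scan_exit_code(["INFO", "WARNING"], threshold="ERROR"))
```

Codes `2` and `3` are not modelled here because they describe runtime and usage failures rather than scan results.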

### Examples

```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
  --format sarif --out ./ci/results/scan \
  --severity-threshold ERROR

# Security-only, no LLM rules, output to specific dir
hf-scanner scan . --no-llm --no-performance \
  --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run: only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output to stdout for custom processing
hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
```
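
Where `jq` is not available, the same filter can be written in Python. The sketch assumes what the `jq` expression implies, namely that the JSON report is a top-level array of objects with a `severity` key; `errors_only` is a hypothetical helper and the sample data is invented for illustration.

```python
import json


def errors_only(report_text):
    """Return the findings whose severity is exactly "ERROR"."""
    findings = json.loads(report_text)
    return [f for f in findings if f.get("severity") == "ERROR"]


# Invented sample data shaped like the array the jq filter expects.
sample = json.dumps([
    {"severity": "ERROR", "message": "hardcoded credential"},
    {"severity": "INFO", "message": "unused import"},
])
for finding in errors_only(sample):
    print(finding["message"])
```

To use it in a pipeline, read the report with `sys.stdin.read()` and pass the text to `errors_only`.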

## Programmatic use from Python

The CLI command functions are importable, but for programmatic scanning prefer calling `core.scanner.scan_repo()` directly:

```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

# Scan a local directory with the LLM/agent scanners disabled.
# progress_cb receives a completion fraction and a description string.
findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)

print(log[0])   # e.g. "OK (42 unique findings)"

# Write the HTML and SARIF reports next to the script.
Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```