# API Reference — `hf-scanner` CLI

The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install it as a package entrypoint:

```bash
pip install -e .
hf-scanner --help
```

Or run directly from the source tree:

```bash
python cli.py --help
```

---

## Global options

```
hf-scanner [OPTIONS] COMMAND [ARGS]...
```

| Option | Description |
|--------|-------------|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |

---

## `hf-scanner version`

```bash
hf-scanner version
```

Print the scanner version string.

**Output:**

```
hf-scanner 4.0.0
```

---

## `hf-scanner list-rules`

```bash
hf-scanner list-rules
```

List all bundled Semgrep rule packs with their category and file path.

**Output:**

```
Pack                 Category     Path
----------------------------------------------------------------------
Semgrep:Core         security     /path/to/core.yaml
Semgrep:Web          security     /path/to/web.yaml
...
```

---

## `hf-scanner self-test`

```bash
hf-scanner self-test
```

Check that all external tools are available on PATH. Auto-downloads gitleaks and hadolint binaries if missing.

**Output:**

```
Tool              Status      Description
------------------------------------------------------------
semgrep           ✓ ok        Static analysis (Python, JS, …)
bandit            ✓ ok        Python security linter
detect-secrets    ✓ ok        Secret detection
pip-audit         ✓ ok        Dependency CVE scanner
ruff              ✓ ok        Fast Python linter (perf rules)
gitleaks          ✗ MISSING   Git history secret scanner
hadolint          ✗ MISSING   Dockerfile linter
agent-audit       ✗ MISSING   OWASP Agentic Top 10 scanner

[bootstrap] gitleaks=ok, hadolint=ok
```

**Exit codes:** `0` = all tools available, `2` = some tools missing.

---

## `hf-scanner scan`

```bash
hf-scanner scan TARGET [OPTIONS]
```

Scan a repository or local directory for security and performance issues.
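
A scan relies on the external tools that `self-test` reports on, so a wrapper script may want a quick pre-flight check before calling `scan`. The following is a minimal sketch, not the scanner's own implementation. It assumes that the tool names shown in the `self-test` output are also the executable names on PATH; `missing_tools` is a hypothetical helper introduced here for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Tool names taken from the `self-test` output table.
# Assumption: each name is also the executable name on PATH.
EXPECTED_TOOLS = [
    "semgrep",
    "bandit",
    "detect-secrets",
    "pip-audit",
    "ruff",
    "gitleaks",
    "hadolint",
    "agent-audit",
]


def missing_tools(
    tools: Iterable[str] = EXPECTED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be resolved on PATH, in input order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("Missing tools:", ", ".join(absent))
    else:
        print("All expected tools found on PATH")
```

The `which` parameter is injectable so the lookup can be stubbed in tests. `hf-scanner self-test` remains the authoritative check, since it can also download gitleaks and hadolint when they are missing.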

### Arguments

| Argument | Description |
|----------|-------------|
| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |

### Options

#### Output

| Option | Default | Description |
|--------|---------|-------------|
| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | — | Suppress all output except findings count |
| `--verbose` | — | Show per-finding details on stdout |

**Output paths when `--out results/scan` is used:**

| Format | Path |
|--------|------|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |

#### Scan scope

| Option | Default | Description |
|--------|---------|-------------|
| `--security / --no-security` | `True` | Enable/disable security scanners |
| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |

#### Suppression

| Option | Default | Description |
|--------|---------|-------------|
| `--baseline PATH` | — | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | — | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | — | `.hfscanignore`-style suppression file |

#### Threshold and exit

| Option | Default | Description |
|--------|---------|-------------|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |

#### Authentication

| Option | Default | Description |
|--------|---------|-------------|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |

### Exit codes

| Code | Meaning |
|------|---------|
| `0` | Clean — no findings at or above `--severity-threshold` |
| `1` | Findings found at or above threshold |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |

### Examples

```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
  --format sarif --out ./ci/results/scan \
  --severity-threshold ERROR

# Security-only, no LLM rules, output to specific dir
hf-scanner scan . --no-llm --no-performance \
  --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run — only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output to stdout for custom processing
hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
```

---

## Programmatic use from Python

The CLI command functions are importable, but for programmatic scanning prefer `core.scanner.scan_repo()` directly:

```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)

print(log[0])  # "OK (42 unique findings)"

Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```
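
When the CLI itself has to be driven from a script (for example in CI, where the exit code decides whether a job fails), the documented `scan` exit codes can be mapped to readable labels. This is a rough sketch under stated assumptions: `run_scan` and `describe_exit` are hypothetical helper names, the labels paraphrase the exit code table for `hf-scanner scan`, and `hf-scanner` is assumed to be installed and on PATH.

```python
from __future__ import annotations

import subprocess
from typing import Sequence

# Labels paraphrased from the documented `hf-scanner scan` exit codes.
SCAN_EXIT_CODES = {
    0: "clean",
    1: "findings at or above threshold",
    2: "runtime error",
    3: "usage error",
}


def describe_exit(code: int) -> str:
    """Translate a scan exit code into a short human-readable label."""
    return SCAN_EXIT_CODES.get(code, f"unexpected exit code {code}")


def run_scan(target: str, extra_args: Sequence[str] = ()) -> tuple[int, str]:
    """Run `hf-scanner scan TARGET [OPTIONS]` and return (exit code, label).

    Assumes the `hf-scanner` entrypoint is on PATH. check=False is used
    because exit code 1 (findings) is an expected outcome, not a crash.
    """
    cmd = ["hf-scanner", "scan", target, *extra_args]
    proc = subprocess.run(cmd, check=False)
    return proc.returncode, describe_exit(proc.returncode)


if __name__ == "__main__":
    code, label = run_scan(".", ["--format", "sarif", "--severity-threshold", "ERROR"])
    print(f"scan finished: {label}")
    raise SystemExit(code)
```

Re-raising the original exit code keeps the wrapper transparent to CI systems that already key off the scanner's own codes.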