# API Reference: `hf-scanner` CLI
The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install it as a package entrypoint:
```bash
pip install -e .
hf-scanner --help
```
Or run directly from the source tree:
```bash
python cli.py --help
```
---
## Global options
```
hf-scanner [OPTIONS] COMMAND [ARGS]...
```
| Option | Description |
|--------|-------------|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |
---
## `hf-scanner version`
```bash
hf-scanner version
```
Print the scanner version string.
**Output:**
```
hf-scanner 4.0.0
```
---
## `hf-scanner list-rules`
```bash
hf-scanner list-rules
```
List all bundled Semgrep rule packs with their category and file path.
**Output:**
```
Pack            Category   Path
----------------------------------------------------------------------
Semgrep:Core    security   /path/to/core.yaml
Semgrep:Web     security   /path/to/web.yaml
...
```
---
## `hf-scanner self-test`
```bash
hf-scanner self-test
```
Check that all external tools are available on PATH. Auto-downloads gitleaks and hadolint binaries if missing.
**Output:**
```
Tool            Status      Description
------------------------------------------------------------
semgrep         ✓ ok        Static analysis (Python, JS, …)
bandit          ✓ ok        Python security linter
detect-secrets  ✓ ok        Secret detection
pip-audit       ✓ ok        Dependency CVE scanner
ruff            ✓ ok        Fast Python linter (perf rules)
gitleaks        ✗ MISSING   Git history secret scanner
hadolint        ✗ MISSING   Dockerfile linter
agent-audit     ✗ MISSING   OWASP Agentic Top 10 scanner
[bootstrap] gitleaks=ok, hadolint=ok
```
**Exit codes:** `0` = all tools available, `2` = some tools missing.
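
A similar availability check can be approximated from Python, for example as a CI pre-flight step. The sketch below is not the scanner's own implementation: it only looks up the tool names from the sample output with `shutil.which`, and `missing_tools` and `self_test_exit_code` are hypothetical helper names. It does not attempt the automatic download of gitleaks or hadolint.

```python
import shutil

# Tool names copied from the sample self-test output above.
TOOLS = [
    "semgrep", "bandit", "detect-secrets", "pip-audit",
    "ruff", "gitleaks", "hadolint", "agent-audit",
]


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be resolved to an executable on PATH."""
    return [name for name in names if which(name) is None]


def self_test_exit_code(names, which=shutil.which):
    """Mirror the documented convention: 0 = all available, 2 = some missing."""
    return 2 if missing_tools(names, which) else 0
```

Passing `which` as a parameter keeps the lookup replaceable, which makes the helper easy to exercise without the real tools installed.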
---
## `hf-scanner scan`
```bash
hf-scanner scan TARGET [OPTIONS]
```
Scan a repository or local directory for security and performance issues.
### Arguments
| Argument | Description |
|----------|-------------|
| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |
### Options
#### Output
| Option | Default | Description |
|--------|---------|-------------|
| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | - | Suppress all output except findings count |
| `--verbose` | - | Show per-finding details on stdout |
**Output paths when `--out results/scan` is used:**
| Format | Path |
|--------|------|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |
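
The stem-to-path mapping in the table above can be written down as a small helper, for instance to let a CI script know which files to upload. This is a sketch based only on that table; `expected_outputs` is a hypothetical name, and the case where `--out` points at a directory rather than a file stem is not covered.

```python
from pathlib import Path

# Extensions written for each --format value, per the table above.
EXTENSIONS = {
    "html": ["html"],
    "sarif": ["sarif"],
    "json": ["json"],
    "both": ["html", "sarif"],
}


def expected_outputs(stem: str, fmt: str = "both") -> list[Path]:
    """Return the report paths expected for a given --out stem and --format."""
    return [Path(f"{stem}.{ext}") for ext in EXTENSIONS[fmt]]


paths = expected_outputs("results/scan", "both")
print([p.as_posix() for p in paths])
```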
#### Scan scope
| Option | Default | Description |
|--------|---------|-------------|
| `--security / --no-security` | `True` | Enable/disable security scanners |
| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |
#### Suppression
| Option | Default | Description |
|--------|---------|-------------|
| `--baseline PATH` | - | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | - | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | - | `.hfscanignore`-style suppression file |
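
To illustrate the idea behind fingerprint baselines, the sketch below hashes a few identifying fields of each finding and filters out anything already recorded. Everything here is an assumption made for illustration: the field names (`rule_id`, `path`, `message`), the hashing scheme, and the baseline layout (a JSON list of hashes) are hypothetical and may differ from what `hf-scanner` actually writes.

```python
import hashlib
import json


def fingerprint(finding: dict) -> str:
    """Hash hypothetical identifying fields into a stable fingerprint."""
    key = "|".join(str(finding.get(k, "")) for k in ("rule_id", "path", "message"))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def make_baseline(findings: list[dict]) -> str:
    """Serialize the fingerprints of the current findings as a JSON list."""
    return json.dumps(sorted({fingerprint(f) for f in findings}), indent=2)


def new_findings(findings: list[dict], baseline_json: str) -> list[dict]:
    """Keep only findings whose fingerprint is absent from the baseline."""
    known = set(json.loads(baseline_json))
    return [f for f in findings if fingerprint(f) not in known]
```

Leaving line numbers out of the fingerprint is a deliberate choice in this sketch, so that unrelated edits that shift code up or down do not resurrect suppressed findings.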
#### Threshold and exit
| Option | Default | Description |
|--------|---------|-------------|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |
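
One way to read the threshold is as a comparison over an ordered severity scale. The sketch below assumes the ordering `INFO` < `WARNING` < `ERROR`; the helper name `fails_threshold` is hypothetical and the scanner's internal logic may be organized differently.

```python
# Assumed ordering of the three documented severity levels.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def fails_threshold(severities, threshold="WARNING"):
    """Return True if any severity is at or above the given threshold."""
    limit = SEVERITY_RANK[threshold]
    return any(SEVERITY_RANK[s] >= limit for s in severities)


# With the default threshold, a single WARNING is enough to fail the run.
print(fails_threshold(["INFO", "WARNING"]))
# Raising the threshold to ERROR lets the same findings pass.
print(fails_threshold(["INFO", "WARNING"], threshold="ERROR"))
```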
#### Authentication
| Option | Default | Description |
|--------|---------|-------------|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |
### Exit codes
| Code | Meaning |
|------|---------|
| `0` | Clean: no findings at or above `--severity-threshold` |
| `1` | Findings found at or above threshold |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |
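
A wrapper script can branch on these codes instead of treating every non-zero exit as the same failure. The following is a minimal sketch that uses only flags documented on this page; `run_scan` and `describe` are hypothetical helper names, and `hf-scanner` is assumed to be on PATH when `run_scan` is called.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "clean",
    1: "findings at or above threshold",
    2: "runtime error",
    3: "usage error",
}


def run_scan(target: str, threshold: str = "WARNING") -> int:
    """Run `hf-scanner scan` on target and return its exit code."""
    result = subprocess.run(
        ["hf-scanner", "scan", target, "--severity-threshold", threshold],
        check=False,  # inspect the return code ourselves instead of raising
    )
    return result.returncode


def describe(code: int) -> str:
    """Translate an exit code into the documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")
```

In CI, distinguishing `1` from `2` and `3` lets a pipeline report "scan found issues" separately from "scan did not run".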
### Examples
```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
  --format sarif --out ./ci/results/scan \
  --severity-threshold ERROR

# Security-only, no LLM rules, output to specific dir
hf-scanner scan . --no-llm --no-performance \
  --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run: only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output to stdout for custom processing
hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
```
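
The `jq` filter in the last example has a direct Python equivalent. The sketch below assumes, as that filter implies, that the JSON report is a top-level array of objects with a `severity` key; `errors_only` is a hypothetical helper name and the sample data is made up for illustration.

```python
import json


def errors_only(report_text: str) -> list[dict]:
    """Return the findings whose severity is ERROR from a JSON report string."""
    findings = json.loads(report_text)
    return [f for f in findings if f.get("severity") == "ERROR"]


# Illustrative input only; real reports will carry more fields.
sample = json.dumps([
    {"severity": "ERROR", "message": "hardcoded secret"},
    {"severity": "INFO", "message": "unused import"},
])
print(errors_only(sample))
```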
---
## Programmatic use from Python
The CLI command functions are importable, but for programmatic scanning prefer `core.scanner.scan_repo()` directly:
```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

# Scan a local directory with run_llm disabled, printing progress as it goes.
findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)
print(log[0])  # e.g. "OK (42 unique findings)"

# Write the HTML and SARIF reports next to the script.
Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```