	# API Reference — `hf-scanner` CLI

	The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install the package in editable mode to get the `hf-scanner` entry point:

	```bash
	pip install -e .
	hf-scanner --help
	```

	Or run directly from the source tree:

	```bash
	python cli.py --help
	```

	---

	## Global options

	```
	hf-scanner [OPTIONS] COMMAND [ARGS]...
	```

	| Option | Description |
	|--------|-------------|
	| `--help` | Show help and exit |
	| `--install-completion` | Install shell completion |
	| `--show-completion` | Print completion script |

	---

	## `hf-scanner version`

	```bash
	hf-scanner version
	```

	Print the scanner version string.

	Output:
	```
	hf-scanner 4.0.0
	```

	---

	## `hf-scanner list-rules`

	```bash
	hf-scanner list-rules
	```

	List all bundled Semgrep rule packs with their category and file path.

	Output:
	```
	Pack             Category    Path
	----------------------------------------------------------------------
	Semgrep:Core     security    /path/to/core.yaml
	Semgrep:Web      security    /path/to/web.yaml
	...
	```

	---

	## `hf-scanner self-test`

	```bash
	hf-scanner self-test
	```

	Check that all external tools are available on `PATH`. The gitleaks and hadolint binaries are downloaded automatically if they are missing.

	Output:
	```
	Tool             Status      Description
	------------------------------------------------------------
	semgrep          ✓ ok        Static analysis (Python, JS, …)
	bandit           ✓ ok        Python security linter
	detect-secrets   ✓ ok        Secret detection
	pip-audit        ✓ ok        Dependency CVE scanner
	ruff             ✓ ok        Fast Python linter (perf rules)
	gitleaks         ✗ MISSING   Git history secret scanner
	hadolint         ✗ MISSING   Dockerfile linter
	agent-audit      ✗ MISSING   OWASP Agentic Top 10 scanner

	[bootstrap] gitleaks=ok, hadolint=ok
	```

	Exit codes: `0` = all tools available, `2` = some tools missing.
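
The `PATH` check can be approximated in a few lines of Python. This is a minimal sketch, not the scanner's actual implementation: `missing_tools` and `self_test_exit_code` are hypothetical helpers, the tool names are taken from the sample output above and may differ from the executables the scanner really looks for, and the auto-download step is left out.

```python
import shutil

# Tool names as they appear in the sample output above (an assumption:
# the real scanner may probe different executable names).
TOOLS = [
    "semgrep", "bandit", "detect-secrets", "pip-audit",
    "ruff", "gitleaks", "hadolint", "agent-audit",
]


def missing_tools(tools=TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def self_test_exit_code(missing):
    """Mirror the documented exit codes: 0 = all available, 2 = some missing."""
    return 2 if missing else 0
```

A CI step could call `missing_tools()` before a scan and stop early when the list is not empty.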

	---

	## `hf-scanner scan`

	```bash
	hf-scanner scan TARGET [OPTIONS]
	```

	Scan a repository or local directory for security and performance issues.

	### Arguments

	| Argument | Description |
	|----------|-------------|
	| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |

	### Options

	#### Output

	| Option | Default | Description |
	|--------|---------|-------------|
	| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
	| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
	| `--quiet` | — | Suppress all output except findings count |
	| `--verbose` | — | Show per-finding details on stdout |

	Output paths when `--out results/scan` is used:

	| Format | Path |
	|--------|------|
	| `html` | `results/scan.html` |
	| `sarif` | `results/scan.sarif` |
	| `json` | `results/scan.json` |
	| `both` | `results/scan.html` + `results/scan.sarif` |
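
The mapping in this table can be written out as a small helper, which is handy when a CI script needs to know which files to upload. This is a sketch under stated assumptions: `output_paths` and `FORMAT_EXTENSIONS` are hypothetical names, and only the file-stem form of `--out` is covered, not the directory form.

```python
from pathlib import Path

# Extensions per --format value, as listed in the table above.
FORMAT_EXTENSIONS = {
    "html": [".html"],
    "sarif": [".sarif"],
    "json": [".json"],
    "both": [".html", ".sarif"],
}


def output_paths(stem, fmt="both"):
    """Return the report paths for a file stem passed via --out."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unknown format: {fmt!r}")
    # Append the extension instead of using Path.with_suffix(), so a
    # stem such as "scan.v2" keeps its dot-separated part.
    return [Path(f"{stem}{ext}") for ext in FORMAT_EXTENSIONS[fmt]]
```

For example, `output_paths("results/scan", "both")` yields the two paths from the last row of the table.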

	#### Scan scope

	| Option | Default | Description |
	|--------|---------|-------------|
	| `--security / --no-security` | `True` | Enable/disable security scanners |
	| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
	| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
	| `--deep-history` | `False` | Full git clone + gitleaks history scan |

	#### Suppression

	| Option | Default | Description |
	|--------|---------|-------------|
	| `--baseline PATH` | — | JSON baseline file; suppress known findings |
	| `--create-baseline PATH` | — | Save current fingerprints to JSON baseline |
	| `--ignore-file PATH` | — | `.hfscanignore`-style suppression file |

	#### Threshold and exit

	| Option | Default | Description |
	|--------|---------|-------------|
	| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |

	#### Authentication

	| Option | Default | Description |
	|--------|---------|-------------|
	| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |
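
A wrapper script that needs the same token can follow the fallback described in the table. The sketch below is an illustration only: `resolve_hf_token` and `auth_headers` are hypothetical helpers, and both the precedence (explicit value first, then `$HF_TOKEN`) and the `Authorization: Bearer` header shape are assumptions about how the scanner uses the token.

```python
import os


def resolve_hf_token(cli_value=None, env=None):
    """Prefer an explicit --hf-token value, else fall back to $HF_TOKEN."""
    env = os.environ if env is None else env
    token = cli_value or env.get("HF_TOKEN")
    return token or None


def auth_headers(token):
    """Build a Bearer Authorization header, or no headers without a token."""
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.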

	### Exit codes

	| Code | Meaning |
	|------|---------|
	| `0` | Clean — no findings at or above `--severity-threshold` |
	| `1` | Findings found at or above threshold |
	| `2` | Runtime error (clone failed, file write error, …) |
	| `3` | Usage error (invalid argument combination) |

	### Examples

	```bash
	# Basic scan, both HTML + SARIF output
	hf-scanner scan ./my-project

	# HF Space, SARIF only, fail only on ERROR
	hf-scanner scan https://huggingface.co/spaces/org/app \
	  --format sarif --out ./ci/results/scan \
	  --severity-threshold ERROR

	# Security-only, no LLM rules, output to specific dir
	hf-scanner scan . --no-llm --no-performance \
	  --format html --out reports/security-only

	# Create a baseline to suppress existing issues
	hf-scanner scan . --create-baseline .scan-baseline.json

	# Subsequent run — only new findings cause failure
	hf-scanner scan . --baseline .scan-baseline.json

	# Full history scan for committed secrets
	hf-scanner scan . --deep-history --no-performance --no-llm

	# Suppress known-safe paths via ignore file
	hf-scanner scan . --ignore-file .hfscanignore

	# JSON output to stdout for custom processing
	hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
	```
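
The `jq` filter in the last example has a direct Python equivalent. The sketch assumes, as the `jq` expression implies, that the JSON report is an array of objects with a `severity` field. `select_errors` is a hypothetical helper and the sample data is made up for illustration.

```python
import json


def select_errors(report_json):
    """Return the findings whose severity is ERROR."""
    findings = json.loads(report_json)
    return [f for f in findings if f.get("severity") == "ERROR"]


# Hypothetical sample; a real report will carry more fields per finding.
sample = '[{"severity": "ERROR", "rule": "a"}, {"severity": "INFO", "rule": "b"}]'
print(select_errors(sample))
```

Using `f.get("severity")` rather than `f["severity"]` means entries without that field are skipped instead of raising `KeyError`.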

	---

	## Programmatic use from Python

	The CLI command functions are importable, but for programmatic scanning it is simpler to call `core.scanner.scan_repo()` directly:

	```python
	from core.scanner import scan_repo
	from report import generate_html_report, generate_sarif
	import json
	from pathlib import Path

	findings, log = scan_repo(
	    "./my-project",
	    run_llm=False,
	    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
	)

	print(log[0])  # "OK (42 unique findings)"

	Path("report.html").write_text(
	    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
	)
	Path("report.sarif").write_text(
	    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
	)
	```