# API Reference: `hf-scanner` CLI
The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install it as a package entrypoint:
```bash
pip install -e .
hf-scanner --help
```
Or run directly from the source tree:
```bash
python cli.py --help
```
---
## Global options
```
hf-scanner [OPTIONS] COMMAND [ARGS]...
```
| Option | Description |
|--------|-------------|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |
---
## `hf-scanner version`
```bash
hf-scanner version
```
Print the scanner version string.
**Output:**
```
hf-scanner 4.0.0
```
---
## `hf-scanner list-rules`
```bash
hf-scanner list-rules
```
List all bundled Semgrep rule packs with their category and file path.
**Output:**
```
Pack             Category   Path
----------------------------------------------------------------------
Semgrep:Core     security   /path/to/core.yaml
Semgrep:Web      security   /path/to/web.yaml
...
```
---
## `hf-scanner self-test`
```bash
hf-scanner self-test
```
Check that all external tools are available on PATH. Auto-downloads gitleaks and hadolint binaries if missing.
**Output:**
```
Tool             Status      Description
------------------------------------------------------------
semgrep          ✓ ok        Static analysis (Python, JS, …)
bandit           ✓ ok        Python security linter
detect-secrets   ✓ ok        Secret detection
pip-audit        ✓ ok        Dependency CVE scanner
ruff             ✓ ok        Fast Python linter (perf rules)
gitleaks         ✗ MISSING   Git history secret scanner
hadolint         ✗ MISSING   Dockerfile linter
agent-audit      ✗ MISSING   OWASP Agentic Top 10 scanner
[bootstrap] gitleaks=ok, hadolint=ok
```
**Exit codes:** `0` = all tools available, `2` = some tools missing.
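The PATH check that `self-test` describes can be approximated with the standard library. The sketch below is not the scanner's own implementation: `check_tools` and `exit_code` are hypothetical helpers, the tool names are copied from the sample output above, and the auto-download step is left out. It only mirrors the documented exit codes.

```python
import shutil

# Tool names copied from the sample output above.
TOOLS = ["semgrep", "bandit", "detect-secrets", "pip-audit",
         "ruff", "gitleaks", "hadolint", "agent-audit"]


def check_tools(tools, which=shutil.which):
    """Return {tool: bool}, True when the tool resolves on PATH."""
    return {tool: which(tool) is not None for tool in tools}


def exit_code(status):
    """Mirror the documented codes: 0 if every tool is present, else 2."""
    return 0 if all(status.values()) else 2


# Print a small status table for the current machine.
status = check_tools(TOOLS)
for tool, found in status.items():
    print(f"{tool:<16}{'ok' if found else 'MISSING'}")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it is `shutil.which`.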
---
## `hf-scanner scan`
```bash
hf-scanner scan TARGET [OPTIONS]
```
Scan a repository or local directory for security and performance issues.
### Arguments
| Argument | Description |
|----------|-------------|
| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |
### Options
#### Output
| Option | Default | Description |
|--------|---------|-------------|
| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | - | Suppress all output except findings count |
| `--verbose` | - | Show per-finding details on stdout |
**Output paths when `--out results/scan` is used:**
| Format | Path |
|--------|------|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |
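The mapping in the table above amounts to appending one extension per format to the stem. A minimal sketch, assuming `--out` is given as a file stem (the directory case is not covered); `output_paths` is a hypothetical helper, not part of the scanner:

```python
from pathlib import Path

# Extensions per format, as listed in the table above.
EXTENSIONS = {
    "html": [".html"],
    "sarif": [".sarif"],
    "json": [".json"],
    "both": [".html", ".sarif"],
}


def output_paths(stem, fmt="both"):
    """Append each format's extension to the stem passed to --out."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unknown format: {fmt!r}")
    # Concatenate instead of using with_suffix() so stems containing dots survive.
    return [Path(f"{stem}{ext}") for ext in EXTENSIONS[fmt]]


print([p.as_posix() for p in output_paths("results/scan", "both")])
# ['results/scan.html', 'results/scan.sarif']
```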
#### Scan scope
| Option | Default | Description |
|--------|---------|-------------|
| `--security / --no-security` | `True` | Enable/disable security scanners |
| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |
#### Suppression
| Option | Default | Description |
|--------|---------|-------------|
| `--baseline PATH` | - | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | - | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | - | `.hfscanignore`-style suppression file |
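The baseline workflow (save fingerprints once, then suppress them on later runs) can be illustrated as follows. The on-disk format and the shape of a finding are not documented here, so this sketch assumes a JSON list of fingerprint strings and findings represented as dicts with a `fingerprint` key; both are assumptions, and the helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def create_baseline(findings, path):
    """Save the sorted, de-duplicated fingerprints of the current findings."""
    fingerprints = sorted({f["fingerprint"] for f in findings})
    Path(path).write_text(json.dumps(fingerprints, indent=2), encoding="utf-8")


def apply_baseline(findings, path):
    """Drop findings whose fingerprint is already recorded in the baseline."""
    known = set(json.loads(Path(path).read_text(encoding="utf-8")))
    return [f for f in findings if f["fingerprint"] not in known]


# Demo with made-up fingerprints: "a1" is known, "c3" is new.
with tempfile.TemporaryDirectory() as tmp:
    baseline = Path(tmp) / "baseline.json"
    create_baseline([{"fingerprint": "a1"}, {"fingerprint": "b2"}], baseline)
    later = [{"fingerprint": "a1"}, {"fingerprint": "c3"}]
    new_findings = apply_baseline(later, baseline)

print(new_findings)  # [{'fingerprint': 'c3'}]
```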
#### Threshold and exit
| Option | Default | Description |
|--------|---------|-------------|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |
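The threshold rule can be read as a comparison on an ordered scale. The sketch below assumes the conventional ordering `INFO` < `WARNING` < `ERROR` and returns the documented codes `0` (clean) and `1` (findings at or above the threshold); `exit_code_for` is a hypothetical helper, not the scanner's code.

```python
# Severities from least to most severe (assumed ordering).
SEVERITY_ORDER = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def exit_code_for(severities, threshold="WARNING"):
    """Return 1 if any severity is at or above the threshold, else 0."""
    floor = SEVERITY_ORDER[threshold]
    return 1 if any(SEVERITY_ORDER[s] >= floor for s in severities) else 0


print(exit_code_for(["INFO", "WARNING"], threshold="ERROR"))  # 0
print(exit_code_for(["INFO", "WARNING"]))  # 1
```

With the default `WARNING` threshold, a run that reports only `INFO` findings still exits `0`.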
#### Authentication
| Option | Default | Description |
|--------|---------|-------------|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |
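Since the option defaults to the `HF_TOKEN` environment variable, token resolution presumably prefers an explicit value and falls back to the environment. The sketch below shows that order and the standard Bearer header shape; how the scanner actually sends the token is an assumption, and `auth_headers` is a hypothetical helper.

```python
import os


def auth_headers(cli_token=None, env=os.environ):
    """Build request headers, preferring an explicit token over $HF_TOKEN."""
    token = cli_token or env.get("HF_TOKEN")
    # No token means an anonymous request, so no Authorization header.
    return {"Authorization": f"Bearer {token}"} if token else {}


print(auth_headers("example-token", env={}))
# {'Authorization': 'Bearer example-token'}
```

Setting `HF_TOKEN` in the environment rather than passing `--hf-token` on the command line keeps the token out of shell history.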
### Exit codes
| Code | Meaning |
|------|---------|
| `0` | Clean: no findings at or above `--severity-threshold` |
| `1` | Findings found at or above threshold |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |
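In a CI wrapper it can help to translate these codes into readable messages. A minimal sketch, assuming `hf-scanner` is installed and on PATH; `describe_exit` and `run_scan` are hypothetical helpers, and the meanings are copied from the table above.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "clean",
    1: "findings at or above threshold",
    2: "runtime error",
    3: "usage error",
}


def describe_exit(code):
    """Translate a scan exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(target, *extra_args):
    """Run `hf-scanner scan` and return (exit code, documented meaning)."""
    proc = subprocess.run(["hf-scanner", "scan", target, *extra_args])
    return proc.returncode, describe_exit(proc.returncode)
```

For example, `code, meaning = run_scan(".", "--severity-threshold", "ERROR")` runs the scan and leaves the decision to fail the build (for instance only when `code` is `1`) to the caller.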
### Examples
```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
    --format sarif --out ./ci/results/scan \
    --severity-threshold ERROR

# Security-only, no LLM rules, output to specific dir
hf-scanner scan . --no-llm --no-performance \
    --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run: only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output to stdout for custom processing
hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
```
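The `jq` filter in the last example has a direct Python equivalent. As that filter implies, the sketch assumes the JSON report is a top-level array of objects with a `severity` field; the sample data and the `rule` key are made up for illustration.

```python
import json


def findings_with_severity(report_text, severity="ERROR"):
    """Python equivalent of: jq '.[] | select(.severity == "ERROR")'."""
    return [f for f in json.loads(report_text) if f.get("severity") == severity]


# Made-up sample standing in for the scanner's JSON output.
sample = json.dumps([
    {"severity": "ERROR", "rule": "example-rule-1"},
    {"severity": "INFO", "rule": "example-rule-2"},
])
errors = findings_with_severity(sample)
print(len(errors))  # 1
```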
---
## Programmatic use from Python
The CLI command functions are importable, but for programmatic scanning prefer calling `core.scanner.scan_repo()` directly:
```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

# Scan a local directory with the LLM/agent scanners disabled.
findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    # f is formatted as a percentage, d is printed next to it.
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)
print(log[0])  # "OK (42 unique findings)"

# Write the HTML and SARIF reports next to the script.
Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```