# API Reference: `hf-scanner` CLI

The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install it as a package entrypoint:

```bash
pip install -e .
hf-scanner --help
```

Or run directly from the source tree:

```bash
python cli.py --help
```

---

## Global options

```
hf-scanner [OPTIONS] COMMAND [ARGS]...
```

| Option | Description |
|--------|-------------|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |

---

## `hf-scanner version`

```bash
hf-scanner version
```

Print the scanner version string.

**Output:**
```
hf-scanner 4.0.0
```

---

## `hf-scanner list-rules`

```bash
hf-scanner list-rules
```

List all bundled Semgrep rule packs with their category and file path.

**Output:**
```
Pack                            Category        Path
----------------------------------------------------------------------
Semgrep:Core                    security        /path/to/core.yaml
Semgrep:Web                     security        /path/to/web.yaml
...
```

---

## `hf-scanner self-test`

```bash
hf-scanner self-test
```

Check that all external tools are available on `PATH`. If the gitleaks or hadolint binaries are missing, they are downloaded automatically.

**Output:**
```
Tool                Status    Description
------------------------------------------------------------
semgrep             ✓  ok     Static analysis (Python, JS, …)
bandit              ✓  ok     Python security linter
detect-secrets      ✓  ok     Secret detection
pip-audit           ✓  ok     Dependency CVE scanner
ruff                ✓  ok     Fast Python linter (perf rules)
gitleaks            ✗  MISSING  Git history secret scanner
hadolint            ✗  MISSING  Dockerfile linter
agent-audit         ✗  MISSING  OWASP Agentic Top 10 scanner

[bootstrap] gitleaks=ok,  hadolint=ok
```

**Exit codes:** `0` = all tools available, `2` = some tools missing.
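
The PATH check described above can be approximated in a few lines of Python. This is a minimal sketch, not the scanner's own implementation: `check_tools` and `self_test_exit_code` are hypothetical helpers, the tool list is copied from the sample output, and the auto-download step is left out.

```python
import shutil

# Tool names copied from the sample output above; the real scanner may
# track a different list (assumption).
TOOLS = [
    "semgrep", "bandit", "detect-secrets", "pip-audit",
    "ruff", "gitleaks", "hadolint", "agent-audit",
]


def check_tools(names):
    """Map each tool name to True if an executable of that name is on PATH."""
    return {name: shutil.which(name) is not None for name in names}


def self_test_exit_code(status):
    """Return 0 when every tool is available, 2 otherwise (the documented codes)."""
    return 0 if all(status.values()) else 2


# Print a small status table similar in spirit to the CLI output.
status = check_tools(TOOLS)
for name, ok in status.items():
    print(f"{name:<20}{'ok' if ok else 'MISSING'}")
```

Passing `status` to `self_test_exit_code` gives a value suitable for `sys.exit()` in a wrapper script.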

---

## `hf-scanner scan`

```bash
hf-scanner scan TARGET [OPTIONS]
```

Scan a repository or local directory for security and performance issues.

### Arguments

| Argument | Description |
|----------|-------------|
| `TARGET` | HTTPS URL (HF Space or git repo) or absolute/relative local path |

### Options

#### Output

| Option | Default | Description |
|--------|---------|-------------|
| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | - | Suppress all output except findings count |
| `--verbose` | - | Show per-finding details on stdout |

**Output paths when `--out results/scan` is used:**

| Format | Path |
|--------|------|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |

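For scripts that need to locate the reports afterwards, the mapping in the table above can be expressed as a small helper. This is a sketch only: `expected_outputs` is a hypothetical function, not part of `hf-scanner`, and it covers the file-stem form of `--out` (how files are named when `--out` points at a directory is not specified here).

```python
from pathlib import Path

# Extensions written for each --format value, taken from the table above.
FORMAT_EXTENSIONS = {
    "html": [".html"],
    "sarif": [".sarif"],
    "json": [".json"],
    "both": [".html", ".sarif"],
}


def expected_outputs(stem, fmt="both"):
    """Return the report paths expected for a given --out stem and --format."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unknown format: {fmt!r}")
    base = Path(stem)
    # Append the extension to the name instead of using with_suffix(), so a
    # stem that already contains a dot (e.g. "scan.v2") is left intact.
    return [base.parent / (base.name + ext) for ext in FORMAT_EXTENSIONS[fmt]]


for path in expected_outputs("results/scan", "both"):
    print(path)
```
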
#### Scan scope

| Option | Default | Description |
|--------|---------|-------------|
| `--security / --no-security` | `True` | Enable/disable security scanners |
| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |

#### Suppression

| Option | Default | Description |
|--------|---------|-------------|
| `--baseline PATH` | - | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | - | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | - | `.hfscanignore`-style suppression file |

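Conceptually, a baseline is a saved set of finding fingerprints that later runs filter against. The sketch below illustrates that idea under loud assumptions: the fingerprint recipe (a hash of rule id, path, and message), the finding keys, and the `{"fingerprints": [...]}` file layout are all invented for illustration and may not match what `hf-scanner` actually writes.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(finding):
    """Hypothetical fingerprint: SHA-256 over rule id, file path and message."""
    key = "|".join(str(finding.get(k, "")) for k in ("rule_id", "path", "message"))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def save_baseline(findings, path):
    """Write the fingerprints of the current findings to a JSON file."""
    fingerprints = sorted({fingerprint(f) for f in findings})
    Path(path).write_text(
        json.dumps({"fingerprints": fingerprints}, indent=2), encoding="utf-8"
    )


def filter_new(findings, path):
    """Return only the findings whose fingerprint is absent from the baseline."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    known = set(data["fingerprints"])
    return [f for f in findings if fingerprint(f) not in known]
```

With this model, `--create-baseline` corresponds to `save_baseline` and `--baseline` to `filter_new`: findings recorded earlier are suppressed, and only new ones remain.
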
#### Threshold and exit

| Option | Default | Description |
|--------|---------|-------------|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |

#### Authentication

| Option | Default | Description |
|--------|---------|-------------|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |

### Exit codes

| Code | Meaning |
|------|---------|
| `0` | Clean: no findings at or above `--severity-threshold` |
| `1` | Findings found at or above threshold |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |

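The interaction between `--severity-threshold` and exit codes `0` and `1` can be pictured as a rank comparison. This is a sketch assuming the ordering `INFO` < `WARNING` < `ERROR`; `exit_code_for` is a hypothetical helper, not the scanner's code.

```python
# Assumed severity ordering, lowest to highest.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def exit_code_for(severities, threshold="WARNING"):
    """Return 1 if any finding is at or above the threshold, else 0."""
    limit = SEVERITY_RANK[threshold]
    return 1 if any(SEVERITY_RANK[s] >= limit for s in severities) else 0


print(exit_code_for(["INFO", "WARNING"]))           # default threshold WARNING
print(exit_code_for(["INFO", "WARNING"], "ERROR"))  # stricter threshold
```

Under this model, raising the threshold to `ERROR` lets a run containing only warnings pass, which is the behaviour the CI example in the next section relies on.
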
### Examples

```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
  --format sarif --out ./ci/results/scan \
  --severity-threshold ERROR

# Security-only, no LLM rules, output to specific dir
hf-scanner scan . --no-llm --no-performance \
  --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run: only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output to stdout for custom processing
hf-scanner scan . --format json | jq '.[] | select(.severity == "ERROR")'
```
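
When the CLI is driven from a Python-based CI script rather than a shell, the same invocations can be assembled with `subprocess`. The sketch below uses only the flags and exit codes documented above; `build_scan_command` and `run_scan` are hypothetical helper names, and `hf-scanner` is assumed to be installed and on `PATH`.

```python
import subprocess

# Meanings of the documented exit codes for `hf-scanner scan`.
EXIT_MEANINGS = {
    0: "clean",
    1: "findings at or above threshold",
    2: "runtime error",
    3: "usage error",
}


def build_scan_command(target, fmt="sarif", out=None, threshold=None):
    """Assemble the argument list for `hf-scanner scan`."""
    cmd = ["hf-scanner", "scan", target, "--format", fmt]
    if out:
        cmd += ["--out", out]
    if threshold:
        cmd += ["--severity-threshold", threshold]
    return cmd


def run_scan(target, **options):
    """Run the scan and return (exit_code, meaning) without raising on failure."""
    result = subprocess.run(build_scan_command(target, **options), check=False)
    return result.returncode, EXIT_MEANINGS.get(result.returncode, "unknown")
```

A CI step could call `run_scan(".", out="ci/results/scan", threshold="ERROR")` and fail the job only when the returned code is non-zero.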

---

## Programmatic use from Python

The CLI command functions are importable, but for programmatic scanning it is simpler to call `core.scanner.scan_repo()` directly:

```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

# Scan a local directory with the LLM/agent scanners disabled.
# progress_cb receives a completion fraction and a description string.
findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)

print(log[0])   # "OK (42 unique findings)"

# Write the HTML report.
Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)

# Write the SARIF report as pretty-printed JSON.
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```