# API Reference — `hf-scanner` CLI

The `hf-scanner` CLI is built with [Typer](https://typer.tiangolo.com/). Install the package to get the `hf-scanner` entry point:

```bash
pip install -e .
hf-scanner --help
```

Or run directly from the source tree:

```bash
python cli.py --help
```

---

## Global options

```
hf-scanner [OPTIONS] COMMAND [ARGS]...
```

| Option | Description |
|--------|-------------|
| `--help` | Show help and exit |
| `--install-completion` | Install shell completion |
| `--show-completion` | Print completion script |

---

## `hf-scanner version`

```bash
hf-scanner version
```

Print the scanner version string.

**Output:**
```
hf-scanner 4.0.0
```

---

## `hf-scanner list-rules`

```bash
hf-scanner list-rules
```

List all bundled Semgrep rule packs with their category and file path.

**Output:**
```
Pack                            Category        Path
----------------------------------------------------------------------
Semgrep:Core                    security        /path/to/core.yaml
Semgrep:Web                     security        /path/to/web.yaml
...
```

---

## `hf-scanner self-test`

```bash
hf-scanner self-test
```

Check that all external tools are available on `PATH`. If the `gitleaks` or `hadolint` binaries are missing, they are downloaded automatically.

**Output:**
```
Tool                Status    Description
------------------------------------------------------------
semgrep             ✓  ok     Static analysis (Python, JS, …)
bandit              ✓  ok     Python security linter
detect-secrets      ✓  ok     Secret detection
pip-audit           ✓  ok     Dependency CVE scanner
ruff                ✓  ok     Fast Python linter (perf rules)
gitleaks            ✗  MISSING  Git history secret scanner
hadolint            ✗  MISSING  Dockerfile linter
agent-audit         ✗  MISSING  OWASP Agentic Top 10 scanner

[bootstrap] gitleaks=ok,  hadolint=ok
```

**Exit codes:** `0` = all tools available, `2` = some tools missing.
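
A similar pre-flight check can be approximated from Python with the standard library. This is a minimal sketch, not the scanner's own implementation: the tool names are copied from the sample output above and are assumed to match the executable names, `missing_tools` is a hypothetical helper, and the auto-download step is not reproduced.

```python
import shutil

# Tool names as shown in the self-test output; assumed to match the
# executable names on PATH.
EXPECTED_TOOLS = [
    "semgrep", "bandit", "detect-secrets", "pip-audit",
    "ruff", "gitleaks", "hadolint", "agent-audit",
]


def missing_tools(names):
    """Return the subset of `names` that shutil.which cannot find on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```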

---

## `hf-scanner scan`

```bash
hf-scanner scan TARGET [OPTIONS]
```

Scan a repository or local directory for security and performance issues.

### Arguments

| Argument | Description |
|----------|-------------|
| `TARGET` | HTTPS URL (Hugging Face Space or git repository), or an absolute or relative local path |

### Options

#### Output

| Option | Default | Description |
|--------|---------|-------------|
| `--format`, `-f` | `both` | Output format: `html` \| `sarif` \| `json` \| `both` |
| `--out`, `-o` | temp dir | Output directory or file stem (without extension) |
| `--quiet` | — | Suppress all output except findings count |
| `--verbose` | — | Show per-finding details on stdout |

**Output paths when `--out results/scan` is used:**

| Format | Path |
|--------|------|
| `html` | `results/scan.html` |
| `sarif` | `results/scan.sarif` |
| `json` | `results/scan.json` |
| `both` | `results/scan.html` + `results/scan.sarif` |
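
When a CI step needs to know which files to collect, the mapping in the table above can be mirrored in a few lines of Python. This is a sketch that assumes `--out` is given as a file stem; `expected_outputs` is a hypothetical helper and not part of the scanner.

```python
from pathlib import Path

# Extensions per --format value, as listed in the table above.
EXTENSIONS = {
    "html": [".html"],
    "sarif": [".sarif"],
    "json": [".json"],
    "both": [".html", ".sarif"],
}


def expected_outputs(stem, fmt="both"):
    """Return the report paths expected for a given --out stem and --format."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unknown format: {fmt!r}")
    # Append the extension instead of using Path.with_suffix, so a stem
    # such as "scan.v2" keeps its dot-separated part.
    return [Path(f"{stem}{ext}") for ext in EXTENSIONS[fmt]]
```

For example, `expected_outputs("results/scan", "both")` yields the two paths shown in the last row of the table.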

#### Scan scope

| Option | Default | Description |
|--------|---------|-------------|
| `--security / --no-security` | `True` | Enable/disable security scanners |
| `--llm / --no-llm` | `True` | Enable/disable LLM/agent scanners |
| `--performance / --no-performance` | `True` | Enable/disable performance scanners |
| `--deep-history` | `False` | Full git clone + gitleaks history scan |
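
If the scan is launched from a Python script, the scope flags can be assembled from booleans. The helper below is a hypothetical sketch (`build_scan_args` is not part of the scanner); it only emits flags that appear in the table above and relies on the documented defaults for everything else.

```python
def build_scan_args(target, *, security=True, llm=True,
                    performance=True, deep_history=False):
    """Build an argv list for `hf-scanner scan` from Python booleans."""
    args = ["hf-scanner", "scan", target]
    # Each on/off pair defaults to True, so only the --no-* form is needed.
    for enabled, name in ((security, "security"), (llm, "llm"),
                          (performance, "performance")):
        if not enabled:
            args.append(f"--no-{name}")
    # --deep-history is a plain switch that defaults to off.
    if deep_history:
        args.append("--deep-history")
    return args
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues with paths or URLs.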

#### Suppression

| Option | Default | Description |
|--------|---------|-------------|
| `--baseline PATH` | — | JSON baseline file; suppress known findings |
| `--create-baseline PATH` | — | Save current fingerprints to JSON baseline |
| `--ignore-file PATH` | — | `.hfscanignore`-style suppression file |
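
The idea behind baseline suppression can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the finding fields (`rule_id`, `path`, `line`), the SHA-256 fingerprint, and the baseline being a JSON list of fingerprints. The scanner's actual fingerprint scheme and file layout are not described in this reference.

```python
import hashlib
import json


def fingerprint(finding):
    """Hash the fields assumed to identify a finding (hypothetical scheme)."""
    key = "|".join(str(finding.get(k, "")) for k in ("rule_id", "path", "line"))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def create_baseline(findings):
    """Serialise the fingerprints of the current findings as a JSON list."""
    return json.dumps(sorted({fingerprint(f) for f in findings}))


def apply_baseline(findings, baseline_json):
    """Drop findings whose fingerprint is already recorded in the baseline."""
    known = set(json.loads(baseline_json))
    return [f for f in findings if fingerprint(f) not in known]
```

With this model, a later run reports only findings that were absent when the baseline was created, which matches the intent of pairing `--create-baseline` with `--baseline`.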

#### Threshold and exit

| Option | Default | Description |
|--------|---------|-------------|
| `--severity-threshold` | `WARNING` | Minimum severity for non-zero exit (`ERROR` \| `WARNING` \| `INFO`) |
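
The threshold rule can be read as a simple rank comparison. The sketch below assumes the ordering `INFO` < `WARNING` < `ERROR`, which this reference implies but does not state outright; `exit_code_for` is a hypothetical helper, not the scanner's code.

```python
# Assumed ordering, lowest to highest.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def exit_code_for(severities, threshold="WARNING"):
    """Return 1 if any severity is at or above the threshold, else 0."""
    limit = SEVERITY_RANK[threshold]
    return 1 if any(SEVERITY_RANK[s] >= limit for s in severities) else 0
```

Under this reading, a run with only `WARNING` findings fails with the default threshold but passes with `--severity-threshold ERROR`.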

#### Authentication

| Option | Default | Description |
|--------|---------|-------------|
| `--hf-token` | `$HF_TOKEN` env | Hugging Face Bearer token for private repos |

### Exit codes

| Code | Meaning |
|------|---------|
| `0` | Clean — no findings at or above `--severity-threshold` |
| `1` | One or more findings at or above `--severity-threshold` |
| `2` | Runtime error (clone failed, file write error, …) |
| `3` | Usage error (invalid argument combination) |
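
A wrapper script can branch on these codes instead of treating every non-zero exit as a crash. The following is a minimal sketch using `subprocess.run`; `describe_exit` and `run_scan` are hypothetical helper names, and the labels are shortened from the table above.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "clean",
    1: "findings at or above threshold",
    2: "runtime error",
    3: "usage error",
}


def describe_exit(code):
    """Map a scan exit code to a short label; unknown codes are flagged."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(args):
    """Run `hf-scanner scan` with extra args and return (code, label)."""
    # check=False: a non-zero exit is an expected outcome here, not an exception.
    proc = subprocess.run(["hf-scanner", "scan", *args], check=False)
    return proc.returncode, describe_exit(proc.returncode)
```

Distinguishing code `1` from `2` and `3` lets a pipeline report "findings present" separately from "the scan itself failed".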

### Examples

```bash
# Basic scan, both HTML + SARIF output
hf-scanner scan ./my-project

# HF Space, SARIF only, fail only on ERROR
hf-scanner scan https://huggingface.co/spaces/org/app \
  --format sarif --out ./ci/results/scan \
  --severity-threshold ERROR

# Security scanners only (LLM and performance disabled), HTML report under reports/
hf-scanner scan . --no-llm --no-performance \
  --format html --out reports/security-only

# Create a baseline to suppress existing issues
hf-scanner scan . --create-baseline .scan-baseline.json

# Subsequent run — only new findings cause failure
hf-scanner scan . --baseline .scan-baseline.json

# Full history scan for committed secrets
hf-scanner scan . --deep-history --no-performance --no-llm

# Suppress known-safe paths via ignore file
hf-scanner scan . --ignore-file .hfscanignore

# JSON output for custom processing
hf-scanner scan . --format json --out results/scan
jq '.[] | select(.severity == "ERROR")' results/scan.json
```
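
The `jq` filter in the last example has a direct Python equivalent. This sketch assumes what the filter itself implies, namely that the JSON report is a top-level array of objects with a `severity` field; `select_by_severity` is a hypothetical helper.

```python
import json


def select_by_severity(report_text, severity="ERROR"):
    """Return findings whose "severity" field equals the given value."""
    findings = json.loads(report_text)
    return [f for f in findings if f.get("severity") == severity]
```

A typical call would read the report first, for example `select_by_severity(Path("results/scan.json").read_text(encoding="utf-8"))` with `Path` imported from `pathlib`.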

---

## Programmatic use from Python

The CLI command functions are importable, but for programmatic scanning it is simpler to call `core.scanner.scan_repo()` directly:

```python
import json
from pathlib import Path

from core.scanner import scan_repo
from report import generate_html_report, generate_sarif

# Scan a local directory with the LLM/agent scanners disabled.
# progress_cb receives a completion fraction and a description.
findings, log = scan_repo(
    "./my-project",
    run_llm=False,
    progress_cb=lambda f, d: print(f"{f:.0%} {d}"),
)

print(log[0])  # e.g. "OK (42 unique findings)"

# Write the HTML and SARIF reports to the current working directory.
Path("report.html").write_text(
    generate_html_report(findings, {"title": "My Scan"}), encoding="utf-8"
)
Path("report.sarif").write_text(
    json.dumps(generate_sarif(findings, {}), indent=2), encoding="utf-8"
)
```
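
When printing progress is not wanted, for instance in a test or a web handler, the callback can collect updates instead. This is a sketch under the assumption, suggested by the lambda in the example, that the callback receives a fraction between 0 and 1 and a description string; `ProgressLog` is a hypothetical class.

```python
class ProgressLog:
    """Callable that records progress updates as formatted lines."""

    def __init__(self):
        self.lines = []

    def __call__(self, fraction, description):
        # Clamp defensively in case a value falls outside 0..1.
        fraction = min(max(fraction, 0.0), 1.0)
        self.lines.append(f"{fraction:4.0%} {description}")
```

An instance can be passed as `progress_cb=ProgressLog()` and its `lines` attribute inspected after the scan returns.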