# Roadmap
This document tracks planned features, improvements, and known technical debt for **autoscan / SENTINEL**.
Items are grouped by priority tier. The "Done" section is a record of completed milestones.
---
## Done ✅ (v5.0)
| Feature | Notes |
|---------|-------|
| FastAPI web application (Sentinel) | Replaces Gradio-only UI; multi-user ready |
| HuggingFace Space discovery | Search, filter by stage / hardware / framework / MCP |
| Parallel scan execution | `ThreadPoolExecutor`, SSE live progress stream |
| Per-tool scanner selection | Individual tools selectable in Discover UI and API |
| html2pdf.js export | Client-side PDF, no server dependency |
| Notifications panel | Bell icon, mark-read, delete |
| Bootstrap binaries | Auto-download gitleaks + hadolint on startup |
| Share links | Time-limited read-only scan URLs |
| Insights page | Severity breakdown, 14-day trend, top targets |
| Knowledge Base | Searchable remediation articles |
| Schedules (APScheduler) | Cron-based automated scans |
| AI Explainer | Ollama / OpenAI per-finding annotations |
| CVE data externalization (T15) | `cve_data.json` + `cve_data_schema.py`; runner loads from JSON; backward-compat `CVE_TRIGGERS` dict preserved |
| CVE feed refresh job (T16+21) | OSV.dev + GitHub Advisories fetch for 26 packages; weekly APScheduler job (Mon 06:00); startup stale-check; `POST /api/cve-feed/refresh`; Notification row on new CVEs |
| Confidence scoring layer (T17) | `core/scoring.py` — 0–10 risk score per finding; wired into `scan_repo()`; `score` + `h1_draft` DB columns; Alembic migration `c3d4e5f6a7b8` |
| H1 auto-draft (T18) | `sentinel/services/h1_draft.py`; LLM generates HackerOne-style report for score≥7 findings; collapsible panel in findings table |
| Score badge UI | Color-coded 0–10 risk badge in findings table (red≥9, orange≥7, yellow≥4, gray<4) |
| AI explainer prompt docs (T19/T20) | `docs/weekly_update_prompt.md`, `docs/quarterly_research_prompt.md` |
| Self-scan fixes (self-improvement) | Fixed `openai-no-max-tokens` LLM10 bug; simplified redundant except tuples; added 6× `# noqa: BLE001` FP annotations; extended `.hfscanignore` + `.agent-audit.yaml` |
| Alembic migrations | Schema versioning, Sprint 6 indexes + ShareLinks |
| Test suite | 422 tests; `test_cve_data_schema.py` (16), `test_cve_feed.py` (10 async), `test_scoring.py` (16), `test_h1_draft.py` (12) |
| SARIF 2.1.0 output | GitHub code-scanning compatible |
| `.hfscanignore` suppression | Path / rule / severity filters |
| Baseline workflow | Fingerprint-based new-findings-only mode |
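
The score thresholds listed in the table (badge colors and the score≥7 cutoff for H1 drafts) can be illustrated with a minimal sketch. The function names `badge_color` and `qualifies_for_h1_draft` are hypothetical and are not taken from `core/scoring.py`; only the threshold values come from the table above.

```python
H1_DRAFT_THRESHOLD = 7  # from the table: H1 drafts are generated for score >= 7


def badge_color(score: float) -> str:
    """Map a 0-10 risk score to a badge color (hypothetical helper)."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be within 0-10, got {score}")
    if score >= 9:
        return "red"
    if score >= 7:
        return "orange"
    if score >= 4:
        return "yellow"
    return "gray"


def qualifies_for_h1_draft(score: float) -> bool:
    """Return True when a finding is scored high enough for an H1 draft."""
    return score >= H1_DRAFT_THRESHOLD
```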
---
## Near-term (v5.1) 🔜
### Authentication & multi-user
- [ ] Session-based login with password hashing (bcrypt)
- [ ] Per-user targets, scans, and notifications (currently hardcoded `user_id=1`)
- [ ] Role-based access: admin, analyst, read-only
- [ ] API tokens for CI/CD integrations
### CI/CD integration improvements
- [ ] Webhook trigger: POST to `/api/scan/webhook` to start a scan from GitHub Actions
- [ ] Document the existing status badge endpoint (`/badge/{target_id}`) in the README
- [ ] PR comment integration: post findings summary to GitHub/GitLab PR via API
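
As a rough illustration of the planned webhook trigger, a CI job might build a request like the one below. The endpoint path comes from the list above, but the endpoint is not implemented yet: the JSON payload shape (`{"target": ...}`), the bearer-token header, and the helper name `build_webhook_request` are all assumptions made for this sketch.

```python
import json
import urllib.request


def build_webhook_request(base_url: str, target: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the planned scan webhook.

    The payload fields and the Authorization scheme are assumptions.
    """
    payload = json.dumps({"target": target}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/scan/webhook",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )
```

In a GitHub Actions step, the request could then be sent with `urllib.request.urlopen(req, timeout=30)`, with the token read from a repository secret.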
### Scanner coverage
- [ ] **Trivy** — container image and IaC scanning (Dockerfile + SBOM)
- [ ] **OSV-Scanner** — open-source vulnerability database (alternative to pip-audit)
- [ ] **Checkov** — Terraform / K8s / Dockerfile policy checks
- [ ] **truffleHog** — deep git history secret scan (alternative to gitleaks)
### Reporting
- [ ] CSV and XLSX export of findings
- [ ] SBOM (Software Bill of Materials) generation (CycloneDX / SPDX)
- [ ] Finding diff between two scans (regression view)
- [ ] Email report on scan completion (SMTP already wired, needs template)
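
One possible shape for the scan diff is a set comparison over finding fingerprints, reusing the idea behind the existing baseline workflow. This is a sketch only: the `Finding` fields and the `diff_scans` function are hypothetical and do not reflect the project's actual models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    fingerprint: str  # stable identifier for a finding across scans
    rule_id: str
    path: str


def diff_scans(old: list[Finding], new: list[Finding]) -> dict[str, list[Finding]]:
    """Classify findings as introduced, fixed, or unchanged between two scans."""
    old_by_fp = {f.fingerprint: f for f in old}
    new_by_fp = {f.fingerprint: f for f in new}
    return {
        # Present only in the newer scan: regressions.
        "introduced": [f for fp, f in new_by_fp.items() if fp not in old_by_fp],
        # Present only in the older scan: resolved since then.
        "fixed": [f for fp, f in old_by_fp.items() if fp not in new_by_fp],
        # Present in both scans.
        "unchanged": [f for fp, f in new_by_fp.items() if fp in old_by_fp],
    }
```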
---
## Medium-term (v5.2) 📅
### Performance & scalability
- [ ] Replace in-process `ThreadPoolExecutor` with a proper task queue (Celery + Redis or ARQ)
- [ ] PostgreSQL support (already parameterised via `DATABASE_URL`, needs integration test)
- [ ] Horizontal scaling: multiple Uvicorn workers with shared task queue
- [ ] Caching layer for HuggingFace API responses (reduce rate-limit hits)
### UI improvements
- [ ] Dark mode persistence (Alpine.js localStorage — partial)
- [ ] Bulk triage: apply status change to all selected findings
- [ ] Findings diff view: compare two scans side-by-side
- [ ] Target groups / tags for organising many monitored spaces
- [ ] Paginated findings table (currently loads all findings in one query)
- [ ] Keyboard shortcuts (e.g. `n`/`p` for next/prev finding, `x` to triage)
### AI Explainer
- [ ] Anthropic (Claude) backend
- [ ] Batch mode: explain all findings in a scan in one request (reduce API calls)
- [ ] Store explanations in DB; don't re-explain the same fingerprint twice
- [ ] Quality feedback button (👍 / 👎) to improve prompt tuning
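
The "store explanations in DB" item could look roughly like the following lookup-before-call cache keyed by fingerprint. The table name, column names, and the `ExplanationCache` class are assumptions for illustration; the sketch uses the standard-library `sqlite3` module rather than the project's actual database layer.

```python
import sqlite3
from typing import Callable


class ExplanationCache:
    """Cache LLM explanations by finding fingerprint (hypothetical schema)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS explanations ("
            "fingerprint TEXT PRIMARY KEY, text TEXT NOT NULL)"
        )

    def get_or_explain(self, fingerprint: str, explain: Callable[[], str]) -> str:
        """Return the stored explanation, calling `explain` only on a miss."""
        row = self.conn.execute(
            "SELECT text FROM explanations WHERE fingerprint = ?", (fingerprint,)
        ).fetchone()
        if row is not None:
            return row[0]
        text = explain()  # the only place the LLM backend would be called
        self.conn.execute(
            "INSERT INTO explanations (fingerprint, text) VALUES (?, ?)",
            (fingerprint, text),
        )
        self.conn.commit()
        return text
```

Passing the LLM call as a callable keeps the cache independent of the backend, so the same code would work for Ollama, OpenAI, or a future Anthropic client.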
### Onboarding
- [ ] Add a "skip and seed demo data" button to the first-run wizard (the step-by-step flow itself is complete)
- [ ] Demo scan against a known-vulnerable HF space for new users
---
## Long-term (v6.0) 🔮
### ML-powered triage
- [ ] ML model trained on triage decisions to auto-suggest status
- [ ] Anomaly detection: flag repos whose risk score changes sharply between scans
- [ ] Cluster similar findings (same rule, same file pattern) across all targets
### Policy engine
- [ ] Define organisational policies (e.g. "no ERROR findings in production spaces")
- [ ] Block HF Space deployment if policy violations found (via HF Spaces API)
- [ ] Policy-as-code: YAML-defined rules stored in the repo
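
To make the example policy ("no ERROR findings in production spaces") concrete, here is a minimal evaluation sketch. The `Policy` fields mirror what a YAML-defined rule might contain once loaded, but the field names, the `violations` function, and the finding dictionary shape are all assumptions; no policy schema exists yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    stage: str               # which spaces the policy applies to, e.g. "production"
    forbidden_severity: str  # severity that must not appear, e.g. "ERROR"


def violations(policy: Policy, space_stage: str, findings: list[dict]) -> list[dict]:
    """Return the findings that violate the policy for a space in `space_stage`."""
    if space_stage != policy.stage:
        return []  # the policy does not apply to this space
    return [f for f in findings if f.get("severity") == policy.forbidden_severity]
```

A deployment gate would then block the release whenever `violations(...)` returns a non-empty list.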
### Integrations
- [ ] Slack / Teams alert webhook on high-severity findings
- [ ] Jira / Linear ticket creation from findings
- [ ] OPA (Open Policy Agent) for fine-grained authorization rules
- [ ] SCIM / SSO (Okta, Azure AD) for enterprise deployments
### Distributed scanning
- [ ] Agent model: lightweight scanner agents deployed close to target repos
- [ ] Central SENTINEL server aggregates results from multiple agents
- [ ] Support GitHub, GitLab, Bitbucket repos (not only HuggingFace)
---
## Technical debt 🧹
| Item | Severity | Notes |
|------|----------|-------|
| `user_id=1` hardcoded throughout sentinel/ | High | Blocks multi-user |
| `sentinel/services/scanner.py` test coverage at 22% | High | Core async worker needs deep async mock tests |
| `sentinel/routes/scan.py` test coverage at 36% | High | SSE + PDF export + triage routes uncovered |
| `sentinel/services/ai_explain.py` test coverage at 26% | Medium | Mock LLM client tests needed |
| `sentinel/jobs/scheduler.py` test coverage at 44% | Medium | Scheduler logic needs async mock tests |
| `sentinel/routes/kb.py` test coverage at 52% | Medium | KB CRUD (create/update/delete) untested |
| `sentinel/routes/share.py` test coverage at 50% | Medium | Share-view handler body not reached (importlib.reload issue) |
| **Coverage tracking note** | — | `importlib.reload()` in test fixtures prevents pytest-cov from tracking route handler bodies; effective coverage is higher than shown |
| `detect-secrets` JSON format fragile | Low | Pin the version to guard against upstream API changes |
| E2E Playwright tests require live server | Low | Improve fixture isolation |
| `pyproject.toml` and `pytest.ini` both define pytest config | Low | Consolidate into `pyproject.toml` |
| Gradio `app.py` is legacy | Low | Remove or move to `legacy/` once v5 is confirmed stable |
---
## Version history
| Version | Date | Highlights |
|---------|------|------------|
| v5.0 | May 2026 | Sentinel FastAPI app, per-tool selection, html2pdf export, bootstrap binaries |
| v4.0 | 2025 | Gradio UI, SARIF output, CLI, Semgrep rule packs, baseline workflow |
| v3.x | 2025 | Multi-tool parallel scanning, `ThreadPoolExecutor` |
| v1–v2 | 2024 | Initial single-tool scanner, Bandit only |