Roadmap

This document tracks planned features, improvements, and known technical debt for autoscan / SENTINEL.

Items are grouped by priority tier. The "Done" section is a record of completed milestones.


Done ✅ (v5.0)

| Feature | Notes |
|---|---|
| FastAPI web application (Sentinel) | Replaces Gradio-only UI; multi-user ready |
| HuggingFace Space discovery | Search, filter by stage / hardware / framework / MCP |
| Parallel scan execution | ThreadPoolExecutor, SSE live progress stream |
| Per-tool scanner selection | Individual tools selectable in Discover UI and API |
| html2pdf.js export | Client-side PDF, no server dependency |
| Notifications panel | Bell icon, mark-read, delete |
| Bootstrap binaries | Auto-download gitleaks + hadolint on startup |
| Share links | Time-limited read-only scan URLs |
| Insights page | Severity breakdown, 14-day trend, top targets |
| Knowledge Base | Searchable remediation articles |
| Schedules (APScheduler) | Cron-based automated scans |
| AI Explainer | Ollama / OpenAI per-finding annotations |
| CVE data externalization (T15) | cve_data.json + cve_data_schema.py; runner loads from JSON; backward-compat CVE_TRIGGERS dict preserved |
| CVE feed refresh job (T16+21) | OSV.dev + GitHub Advisories fetch for 26 packages; weekly APScheduler job (Mon 06:00); startup stale-check; POST /api/cve-feed/refresh; Notification row on new CVEs |
| Confidence scoring layer (T17) | core/scoring.py: 0–10 risk score per finding; wired into scan_repo(); score + h1_draft DB columns; Alembic migration c3d4e5f6a7b8 |
| H1 auto-draft (T18) | sentinel/services/h1_draft.py; LLM generates HackerOne-style report for findings with score ≥ 7; collapsible panel in findings table |
| Score badge UI | Color-coded 0–10 risk badge in findings table (red ≥ 9, orange ≥ 7, yellow ≥ 4, gray < 4) |
| AI explainer prompt docs (T19/T20) | docs/weekly_update_prompt.md, docs/quarterly_research_prompt.md |
| Self-scan fixes (self-improvement) | Fixed openai-no-max-tokens LLM10 bug; simplified redundant except tuples; added 6× # noqa: BLE001 FP annotations; extended .hfscanignore + .agent-audit.yaml |
| Alembic migrations | Schema versioning, Sprint 6 indexes + ShareLinks |
| Test suite | 422 tests; test_cve_data_schema.py (16), test_cve_feed.py (10 async), test_scoring.py (16), test_h1_draft.py (12) |
| SARIF 2.1.0 output | GitHub code-scanning compatible |
| .hfscanignore suppression | Path / rule / severity filters |
| Baseline workflow | Fingerprint-based new-findings-only mode |
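
The score badge thresholds listed above (red ≥ 9, orange ≥ 7, yellow ≥ 4, gray < 4) can be read as a simple mapping. The following is a minimal sketch only; `badge_color` is a hypothetical helper name, and the actual UI code is not shown in this roadmap.

```python
def badge_color(score: float) -> str:
    """Map a 0-10 risk score to a badge color (hypothetical helper).

    Thresholds follow the roadmap: red >= 9, orange >= 7, yellow >= 4, gray < 4.
    """
    if not 0 <= score <= 10:
        raise ValueError(f"score must be within 0-10, got {score}")
    if score >= 9:
        return "red"
    if score >= 7:
        return "orange"
    if score >= 4:
        return "yellow"
    return "gray"
```

Checking the highest threshold first keeps each branch a single comparison and makes the boundaries inclusive at 9, 7, and 4.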

Near-term (v5.1) 🔜

Authentication & multi-user

  • Session-based login with password hashing (bcrypt)
  • Per-user targets, scans, and notifications (currently hardcoded user_id=1)
  • Role-based access: admin, analyst, read-only
  • API tokens for CI/CD integrations
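
One possible shape for the planned API tokens is sketched below. It assumes tokens are random strings shown to the user once, with only a SHA-256 digest stored server-side. The function names `issue_token` and `verify_token` are hypothetical, and the roadmap does not specify the actual design.

```python
import hashlib
import hmac
import secrets


def issue_token() -> tuple[str, str]:
    """Create a new API token (hypothetical helper).

    Returns (plaintext_token, sha256_hex_digest). The plaintext would be shown
    to the user once; only the digest would be persisted.
    """
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, digest


def verify_token(presented: str, stored_digest: str) -> bool:
    """Check a presented token against the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

A fast hash is acceptable here only because the token is high-entropy random data. User passwords are a different case and would still go through bcrypt as listed above.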

CI/CD integration improvements

  • Webhook trigger: POST to /api/scan/webhook to start a scan from GitHub Actions
  • Status badge endpoint (already exists at /badge/{target_id}): document in README
  • PR comment integration: post findings summary to GitHub/GitLab PR via API
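
To illustrate the planned webhook trigger, the sketch below builds the POST request a CI job might send. Only the `/api/scan/webhook` path comes from this roadmap. The JSON field `target_id`, the bearer-token header, and the function name `build_webhook_request` are assumptions made for illustration.

```python
import json
import urllib.request


def build_webhook_request(
    base_url: str, target_id: int, api_token: str
) -> urllib.request.Request:
    """Build (but do not send) a scan-trigger request.

    The payload shape and the Authorization scheme are hypothetical.
    """
    payload = json.dumps({"target_id": target_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/scan/webhook",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )
```

A CI step could then send it with `urllib.request.urlopen(request, timeout=10)` and fail the job on a non-2xx response.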

Scanner coverage

  • Trivy: container image and IaC scanning (Dockerfile + SBOM)
  • OSV-Scanner: open-source vulnerability database (alternative to pip-audit)
  • Checkov: Terraform / K8s / Dockerfile policy checks
  • truffleHog: deep git history secret scan (alternative to gitleaks)

Reporting

  • CSV and XLSX export of findings
  • SBOM (Software Bill of Materials) generation (CycloneDX / SPDX)
  • Finding diff between two scans (regression view)
  • Email report on scan completion (SMTP already wired, needs template)
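
The finding diff between two scans could reuse the fingerprints that the baseline workflow already relies on. The sketch below assumes each scan is available as a mapping from fingerprint to finding; `diff_findings` and the dictionary shape are hypothetical.

```python
def diff_findings(
    previous: dict[str, dict], current: dict[str, dict]
) -> dict[str, list[dict]]:
    """Compare two scans keyed by finding fingerprint (hypothetical helper).

    Returns findings that are new in `current`, fixed since `previous`,
    and unchanged between the two scans.
    """
    prev_keys, curr_keys = set(previous), set(current)
    return {
        "new": [current[k] for k in sorted(curr_keys - prev_keys)],
        "fixed": [previous[k] for k in sorted(prev_keys - curr_keys)],
        "unchanged": [current[k] for k in sorted(curr_keys & prev_keys)],
    }
```

Sorting the fingerprints keeps the output order stable between runs, which helps when the diff is exported or rendered in a regression view.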

Medium-term (v5.2) 📅

Performance & scalability

  • Replace in-process ThreadPoolExecutor with a proper task queue (Celery + Redis or ARQ)
  • PostgreSQL support (already parameterised via DATABASE_URL, needs integration test)
  • Horizontal scaling: multiple Uvicorn workers with shared task queue
  • Caching layer for HuggingFace API responses (reduce rate-limit hits)
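
A caching layer for HuggingFace API responses could start as a small in-process time-to-live cache before moving to a shared store. This is a sketch under that assumption; `TTLCache` and its method names are hypothetical and not part of the current codebase.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Minimal in-process cache with per-entry expiry (hypothetical)."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `fetch` only on a miss or expiry."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

With multiple Uvicorn workers each process would hold its own copy, so a shared backend such as Redis would be the natural next step once the task queue work lands.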

UI improvements

  • Dark mode persistence (Alpine.js localStorage, partial)
  • Bulk triage: apply status change to all selected findings
  • Findings diff view: compare two scans side-by-side
  • Target groups / tags for organising many monitored spaces
  • Paginated findings table (currently loads all findings in one query)
  • Keyboard shortcuts (e.g. n/p for next/prev finding, x to triage)
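
For the paginated findings table, the core change is to bound the query instead of loading every row. The sketch below uses the standard-library `sqlite3` module to stay self-contained. The table name `findings`, its columns, and `fetch_findings_page` are assumptions, and the real application may issue the equivalent query through its ORM.

```python
import sqlite3


def fetch_findings_page(
    conn: sqlite3.Connection, page: int, page_size: int = 50
) -> list[tuple]:
    """Return one page of findings, ordered by id (hypothetical schema)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size
    cursor = conn.execute(
        "SELECT id, rule, severity FROM findings ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cursor.fetchall()
```

A stable `ORDER BY` is what makes page boundaries deterministic; without it, rows can repeat or go missing between pages.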

AI Explainer

  • Anthropic (Claude) backend
  • Batch mode: explain all findings in a scan in one request (reduce API calls)
  • Store explanations in DB; don't re-explain the same fingerprint twice
  • Quality feedback button (👍 / 👎) to improve prompt tuning
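
Batch mode and stored explanations combine naturally: before calling the LLM, select only the fingerprints that have no stored explanation and send those in one request. A minimal sketch, assuming findings are dictionaries with a `fingerprint` key and stored explanations are keyed by fingerprint (`pending_fingerprints` is a hypothetical name):

```python
def pending_fingerprints(
    findings: list[dict], stored: dict[str, str]
) -> list[str]:
    """Return fingerprints that still need an explanation, in first-seen order.

    Skips fingerprints already present in `stored` and de-duplicates repeats
    within the same scan.
    """
    seen: set[str] = set()
    pending: list[str] = []
    for finding in findings:
        fingerprint = finding["fingerprint"]
        if fingerprint in stored or fingerprint in seen:
            continue
        seen.add(fingerprint)
        pending.append(fingerprint)
    return pending
```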

Onboarding

  • Step-by-step first-run wizard is complete, but needs a "skip and seed demo data" button
  • Demo scan against a known-vulnerable HF space for new users

Long-term (v6.0) 🔮

ML-powered triage

  • ML model trained on triage decisions to auto-suggest status
  • Anomaly detection: flag repos whose risk score changes sharply between scans
  • Cluster similar findings (same rule, same file pattern) across all targets
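
Before any trained model exists, the anomaly-detection item could be prototyped with a plain threshold on the change in risk score between two scans. The sketch below assumes one aggregate 0–10 score per repo. The default threshold of 3.0 and the name `sharp_changes` are illustrative assumptions, not values from the project.

```python
def sharp_changes(
    previous: dict[str, float],
    current: dict[str, float],
    threshold: float = 3.0,
) -> dict[str, float]:
    """Return {repo: score_delta} for repos whose score moved by >= threshold.

    Repos that appear in only one of the two scans are ignored, since there
    is no earlier or later score to compare against.
    """
    flagged: dict[str, float] = {}
    for repo, score in current.items():
        if repo not in previous:
            continue
        delta = score - previous[repo]
        if abs(delta) >= threshold:
            flagged[repo] = delta
    return flagged
```

The sign of the delta distinguishes regressions (positive) from sharp improvements (negative), either of which may deserve a second look.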

Policy engine

  • Define organisational policies (e.g. "no ERROR findings in production spaces")
  • Block HF Space deployment if policy violations found (via HF Spaces API)
  • Policy-as-code: YAML-defined rules stored in the repo
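
The example policy "no ERROR findings in production spaces" can be sketched as data plus a small evaluation function. The `Policy` fields, the `environment` label on a space, and the `violations` helper are all hypothetical. In a policy-as-code setup the `Policy` instance would be built from a YAML file rather than written inline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One organisational rule (hypothetical shape)."""

    name: str
    environment: str  # which spaces the policy applies to, e.g. "production"
    forbidden_severities: frozenset[str]


def violations(
    policy: Policy, space_environment: str, findings: list[dict]
) -> list[dict]:
    """Return the findings that violate `policy` for a space in the given environment."""
    if space_environment != policy.environment:
        return []
    return [f for f in findings if f["severity"] in policy.forbidden_severities]
```

A deployment gate would then block the release whenever `violations(...)` returns a non-empty list.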

Integrations

  • Slack / Teams alert webhook on high-severity findings
  • Jira / Linear ticket creation from findings
  • OPA (Open Policy Agent) for fine-grained authorization rules
  • SCIM / SSO (Okta, Azure AD) for enterprise deployments

Distributed scanning

  • Agent model: lightweight scanner agents deployed close to target repos
  • Central SENTINEL server aggregates results from multiple agents
  • Support GitHub, GitLab, Bitbucket repos (not only HuggingFace)

Technical debt 🧹

| Item | Severity | Notes |
|---|---|---|
| user_id=1 hardcoded throughout sentinel/ | High | Blocks multi-user |
| sentinel/services/scanner.py test coverage at 22% | High | Core async worker needs deep async mock tests |
| sentinel/routes/scan.py test coverage at 36% | High | SSE + PDF export + triage routes uncovered |
| sentinel/services/ai_explain.py test coverage at 26% | Medium | Mock LLM client tests needed |
| sentinel/jobs/scheduler.py test coverage at 44% | Medium | Scheduler logic needs async mock tests |
| sentinel/routes/kb.py test coverage at 52% | Medium | KB CRUD (create/update/delete) untested |
| sentinel/routes/share.py test coverage at 50% | Medium | Share-view handler body not reached (importlib.reload issue) |
| Coverage tracking note | - | importlib.reload() in test fixtures prevents pytest-cov from tracking route handler bodies; effective coverage is higher than shown |
| detect-secrets JSON format fragile | Low | Pin version; upstream API changes |
| E2E Playwright tests require live server | Low | Improve fixture isolation |
| pyproject.toml and pytest.ini both define pytest config | Low | Consolidate into pyproject.toml |
| Gradio app.py is legacy | Low | Remove or move to legacy/ once v5 is confirmed stable |

Version history

| Version | Date | Highlights |
|---|---|---|
| v5.0 | May 2026 | Sentinel FastAPI app, per-tool selection, html2pdf export, bootstrap binaries |
| v4.0 | 2025 | Gradio UI, SARIF output, CLI, Semgrep rule packs, baseline workflow |
| v3.x | 2025 | Multi-tool parallel scanning, ThreadPoolExecutor |
| v1–v2 | 2024 | Initial single-tool scanner, Bandit only |