Roadmap
This document tracks planned features, improvements, and known technical debt for autoscan / SENTINEL.
Items are grouped by priority tier. The "Done" section is a record of completed milestones.
Done (v5.0)
| Feature | Notes |
|---|---|
| FastAPI web application (Sentinel) | Replaces Gradio-only UI; multi-user ready |
| HuggingFace Space discovery | Search, filter by stage / hardware / framework / MCP |
| Parallel scan execution | ThreadPoolExecutor, SSE live progress stream |
| Per-tool scanner selection | Individual tools selectable in Discover UI and API |
| html2pdf.js export | Client-side PDF, no server dependency |
| Notifications panel | Bell icon, mark-read, delete |
| Bootstrap binaries | Auto-download gitleaks + hadolint on startup |
| Share links | Time-limited read-only scan URLs |
| Insights page | Severity breakdown, 14-day trend, top targets |
| Knowledge Base | Searchable remediation articles |
| Schedules (APScheduler) | Cron-based automated scans |
| AI Explainer | Ollama / OpenAI per-finding annotations |
| CVE data externalization (T15) | cve_data.json + cve_data_schema.py; runner loads from JSON; backward-compat CVE_TRIGGERS dict preserved |
| CVE feed refresh job (T16+21) | OSV.dev + GitHub Advisories fetch for 26 packages; weekly APScheduler job (Mon 06:00); startup stale-check; POST /api/cve-feed/refresh; Notification row on new CVEs |
| Confidence scoring layer (T17) | core/scoring.py: 0–10 risk score per finding; wired into scan_repo(); score + h1_draft DB columns; Alembic migration c3d4e5f6a7b8 |
| H1 auto-draft (T18) | sentinel/services/h1_draft.py; LLM generates HackerOne-style report for findings with score ≥ 7; collapsible panel in findings table |
| Score badge UI | Color-coded 0–10 risk badge in findings table (red ≥ 9, orange ≥ 7, yellow ≥ 4, gray < 4) |
| AI explainer prompt docs (T19/T20) | docs/weekly_update_prompt.md, docs/quarterly_research_prompt.md |
| Self-scan fixes (self-improvement) | Fixed openai-no-max-tokens LLM10 bug; simplified redundant except tuples; added 6× # noqa: BLE001 FP annotations; extended .hfscanignore + .agent-audit.yaml |
| Alembic migrations | Schema versioning, Sprint 6 indexes + ShareLinks |
| Test suite | 422 tests; test_cve_data_schema.py (16), test_cve_feed.py (10 async), test_scoring.py (16), test_h1_draft.py (12) |
| SARIF 2.1.0 output | GitHub code-scanning compatible |
| .hfscanignore suppression | Path / rule / severity filters |
| Baseline workflow | Fingerprint-based new-findings-only mode |
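
The score badge thresholds listed in the table above can be sketched as a small mapping function. This is only an illustration of the stated cut-offs; the function name `badge_color` and the range check are assumptions, not the code in core/scoring.py.

```python
def badge_color(score: float) -> str:
    """Map a 0-10 risk score to a badge color using the roadmap's thresholds."""
    if not 0 <= score <= 10:
        # Assumption: out-of-range scores are rejected rather than clamped.
        raise ValueError(f"score out of range: {score}")
    if score >= 9:
        return "red"
    if score >= 7:
        return "orange"
    if score >= 4:
        return "yellow"
    return "gray"
```

Checking the highest threshold first keeps each branch a single comparison, so a boundary value such as 7 always lands in the higher band.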
Near-term (v5.1)
Authentication & multi-user
- Session-based login with password hashing (bcrypt)
- Per-user targets, scans, and notifications (currently hardcoded user_id=1)
- Role-based access: admin, analyst, read-only
- API tokens for CI/CD integrations
CI/CD integration improvements
- Webhook trigger: POST to /api/scan/webhook to start a scan from GitHub Actions
- Status badge endpoint (already exists at /badge/{target_id}): document in README
- PR comment integration: post findings summary to GitHub/GitLab PR via API
Scanner coverage
- Trivy: container image and IaC scanning (Dockerfile + SBOM)
- OSV-Scanner: open-source vulnerability database (alternative to pip-audit)
- Checkov: Terraform / K8s / Dockerfile policy checks
- truffleHog: deep git history secret scan (alternative to gitleaks)
Reporting
- CSV and XLSX export of findings
- SBOM (Software Bill of Materials) generation (CycloneDX / SPDX)
- Finding diff between two scans (regression view)
- Email report on scan completion (SMTP already wired, needs template)
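
The planned finding diff (regression view) could build on the same fingerprints the baseline workflow already uses. The following is a minimal sketch under that assumption; the `Finding` fields and the `diff_scans` helper are hypothetical names, not part of the current codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    fingerprint: str  # assumed: stable identifier that survives across scans
    rule_id: str
    severity: str


def diff_scans(old: list[Finding], new: list[Finding]) -> dict[str, list[Finding]]:
    """Split findings into introduced, fixed, and unchanged by fingerprint."""
    old_fps = {f.fingerprint for f in old}
    new_fps = {f.fingerprint for f in new}
    return {
        "introduced": [f for f in new if f.fingerprint not in old_fps],
        "fixed": [f for f in old if f.fingerprint not in new_fps],
        "unchanged": [f for f in new if f.fingerprint in old_fps],
    }
```

Comparing on fingerprints rather than on file and line keeps a finding "unchanged" when unrelated edits shift its position, provided the fingerprint itself is position-independent.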
Medium-term (v5.2)
Performance & scalability
- Replace in-process ThreadPoolExecutor with a proper task queue (Celery + Redis or ARQ)
- PostgreSQL support (already parameterised via DATABASE_URL, needs integration test)
- Horizontal scaling: multiple Uvicorn workers with shared task queue
- Caching layer for HuggingFace API responses (reduce rate-limit hits)
UI improvements
- Dark mode persistence (Alpine.js localStorage, partial)
- Bulk triage: apply status change to all selected findings
- Findings diff view: compare two scans side-by-side
- Target groups / tags for organising many monitored spaces
- Paginated findings table (currently loads all findings in one query)
- Keyboard shortcuts (e.g. n/p for next/prev finding, x to triage)
AI Explainer
- Anthropic (Claude) backend
- Batch mode: explain all findings in a scan in one request (reduce API calls)
- Store explanations in DB; don't re-explain the same fingerprint twice
- Quality feedback button (👍 / 👎) to improve prompt tuning
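
The "don't re-explain the same fingerprint twice" item amounts to a cache keyed by finding fingerprint. A rough sketch follows, with an in-memory dict standing in for the DB table; the class name `ExplanationCache` and the `explain` callable signature are assumptions made for illustration.

```python
from typing import Callable


class ExplanationCache:
    """Return a stored explanation when the fingerprint has been seen before."""

    def __init__(self, explain: Callable[[str], str]) -> None:
        self._explain = explain           # hypothetical LLM backend call
        self._store: dict[str, str] = {}  # stand-in for a DB table
        self.backend_calls = 0            # counts how often the backend was hit

    def get(self, fingerprint: str, finding_text: str) -> str:
        if fingerprint not in self._store:
            self.backend_calls += 1
            self._store[fingerprint] = self._explain(finding_text)
        return self._store[fingerprint]
```

With this shape, two scans that surface the same finding trigger a single backend request, which is the saving the roadmap item is after.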
Onboarding
- Step-by-step first-run wizard is complete, but needs a "skip and seed demo data" button
- Demo scan against a known-vulnerable HF space for new users
Long-term (v6.0) 🔮
ML-powered triage
- ML model trained on triage decisions to auto-suggest status
- Anomaly detection: flag repos whose risk score changes sharply between scans
- Cluster similar findings (same rule, same file pattern) across all targets
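
Before any ML model exists, the "risk score changes sharply between scans" check could start as a plain fixed-threshold comparison. The sketch below shows that simpler technique, not anomaly detection proper; the per-repo score dicts, the function name `sharp_changes`, and the default threshold of 2.0 are all assumptions.

```python
def sharp_changes(
    previous: dict[str, float],
    current: dict[str, float],
    threshold: float = 2.0,  # assumed cut-off on the 0-10 score scale
) -> dict[str, float]:
    """Return the score delta for each repo whose score moved by >= threshold."""
    flagged: dict[str, float] = {}
    for repo, score in current.items():
        if repo not in previous:
            continue  # no earlier scan to compare against
        delta = score - previous[repo]
        if abs(delta) >= threshold:
            flagged[repo] = delta
    return flagged
```

A signed delta is returned so that a sharp drop (for example after a cleanup) can be told apart from a sharp rise.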
Policy engine
- Define organisational policies (e.g. "no ERROR findings in production spaces")
- Block HF Space deployment if policy violations found (via HF Spaces API)
- Policy-as-code: YAML-defined rules stored in the repo
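
To make the policy idea concrete, here is one way the example policy "no ERROR findings in production spaces" might be evaluated once its YAML has been parsed into a dict. The key names (`applies_to_tag`, `forbid_severities`) and the `violations` helper are hypothetical; no policy schema has been defined yet.

```python
# Assumed shape of a policy after loading it from a YAML file in the repo.
NO_ERRORS_IN_PROD = {
    "name": "no-error-findings-in-production",
    "applies_to_tag": "production",
    "forbid_severities": ["ERROR"],
}


def violations(policy: dict, target_tags: set[str], findings: list[dict]) -> list[dict]:
    """Return the findings that break the policy for a target with the given tags."""
    if policy["applies_to_tag"] not in target_tags:
        return []  # policy does not apply to this target
    forbidden = set(policy["forbid_severities"])
    return [f for f in findings if f["severity"] in forbidden]
```

A deployment gate would then block when `violations(...)` returns a non-empty list for the target being deployed.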
Integrations
- Slack / Teams alert webhook on high-severity findings
- Jira / Linear ticket creation from findings
- OPA (Open Policy Agent) for fine-grained authorization rules
- SCIM / SSO (Okta, Azure AD) for enterprise deployments
Distributed scanning
- Agent model: lightweight scanner agents deployed close to target repos
- Central SENTINEL server aggregates results from multiple agents
- Support GitHub, GitLab, Bitbucket repos (not only HuggingFace)
Technical debt 🧹
| Item | Severity | Notes |
|---|---|---|
| user_id=1 hardcoded throughout sentinel/ | High | Blocks multi-user |
| sentinel/services/scanner.py test coverage at 22% | High | Core async worker needs deep async mock tests |
| sentinel/routes/scan.py test coverage at 36% | High | SSE + PDF export + triage routes uncovered |
| sentinel/services/ai_explain.py test coverage at 26% | Medium | Mock LLM client tests needed |
| sentinel/jobs/scheduler.py test coverage at 44% | Medium | Scheduler logic needs async mock tests |
| sentinel/routes/kb.py test coverage at 52% | Medium | KB CRUD (create/update/delete) untested |
| sentinel/routes/share.py test coverage at 50% | Medium | Share-view handler body not reached (importlib.reload issue) |
| Coverage tracking note | n/a | importlib.reload() in test fixtures prevents pytest-cov from tracking route handler bodies; effective coverage is higher than shown |
| detect-secrets JSON format fragile | Low | Pin version; upstream API changes |
| E2E Playwright tests require live server | Low | Improve fixture isolation |
| pyproject.toml and pytest.ini both define pytest config | Low | Consolidate into pyproject.toml |
| Gradio app.py is legacy | Low | Remove or move to legacy/ once v5 is confirmed stable |
Version history
| Version | Date | Highlights |
|---|---|---|
| v5.0 | May 2026 | Sentinel FastAPI app, per-tool selection, html2pdf export, bootstrap binaries |
| v4.0 | 2025 | Gradio UI, SARIF output, CLI, Semgrep rule packs, baseline workflow |
| v3.x | 2025 | Multi-tool parallel scanning, ThreadPoolExecutor |
| v1–v2 | 2024 | Initial single-tool scanner, Bandit only |