---
title: SentinelAI
emoji: πŸƒ
colorFrom: red
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: SentinelAI — Autonomous Multi-Agent AI SOC
---
**This Hugging Face Space** serves the SOC dashboard at **`/ui/`** (static Next.js behind FastAPI) on port **7860**. Opening the Space URL redirects `/` to the deck; API docs stay at **`/docs`**. The container sets `SKIP_DB=1` (no bundled PostgreSQL/Redis).
---
# SentinelAI — Autonomous Multi-Agent AI SOC
SentinelAI is a hackathon-grade, production-shaped **autonomous Security Operations Center**. It continuously ingests telemetry through collector agents, normalizes and enriches events, runs multi-modal detection (rules + heuristics + optional LLM reasoning on AMD ROCm), correlates attack chains, scores risk, drafts analyst narratives, emits remediation, and fans out alerts, while a **Next.js 15** command deck visualizes live operations.
## Powered by AMD ROCm compute
- **Local open models**: wire `OLLAMA_HOST` to an Ollama instance backed by **AMD ROCm** on Linux (`ollama/ollama:rocm` in `docker/docker-compose.yml` comments).
- **Parallel agents**: FastAPI + `asyncio` execute enrichment, detection, correlation, and analyst tasks concurrently; GPU inference accelerates the analyst LLM path without shipping prompts to a proprietary SaaS.
- **Throughput**: ROCm lowers per-token latency for Llama 3, Qwen 2.5, Mistral, or DeepSeek-class models so multiple agents can reason on overlapping incidents.
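The parallel-agent claim can be sketched with plain `asyncio`. The agent functions below are illustrative stand-ins for the real enrichment/detection/correlation tasks, not the project's actual code:

```python
import asyncio

# Hypothetical stand-ins for the real agents: each receives a batch of
# normalized events and returns its verdict independently.
async def enrich(events):
    await asyncio.sleep(0)  # stands in for TI lookups / GPU inference
    return {"agent": "enrichment", "count": len(events)}

async def detect(events):
    await asyncio.sleep(0)
    return {"agent": "detection", "count": len(events)}

async def correlate(events):
    await asyncio.sleep(0)
    return {"agent": "correlation", "count": len(events)}

async def run_pipeline(events):
    # The agents run concurrently rather than sequentially; with a ROCm-backed
    # LLM behind one of them, the others are not blocked on GPU inference.
    return await asyncio.gather(enrich(events), detect(events), correlate(events))

results = asyncio.run(run_pipeline([{"src_ip": "10.0.0.5"}] * 3))
```

`asyncio.gather` preserves submission order, so downstream code can rely on which agent produced which result.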
## Architecture
```mermaid
flowchart TB
subgraph Infra[Infrastructure]
L[Linux auth/syslog]
D[Docker / K8s / Cloud mocks]
end
C[Collector Agent]
P[Parser Agent]
N[Normalization Agent]
E[Threat Enrichment Agent]
T[Threat Detection Agent]
G[LangGraph orchestration]
X[Incident Correlation Agent]
R[Risk Scoring Agent]
A[AI Analyst Agent]
M[Remediation Agent]
AL[Alerting Agent]
DB[(PostgreSQL)]
RD[(Redis)]
V[(Chroma optional)]
UI[Next.js 15 Dashboard]
Infra --> C
C --> P --> N --> E --> T
T --> G
G --> X --> R --> A --> M
E -.intel.-> V
T --> DB
X --> DB
AL --> RD
A --> UI
T --> UI
```
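Read top to bottom, the diagram is a staged pipeline passing a shared state object from agent to agent. A toy sketch of that hand-off (stage names mirror the diagram; the real orchestration lives in `workflows/`):

```python
# Toy staged pipeline mirroring the diagram; each stage reads and extends a
# shared state dict, the way a LangGraph node would. All logic is illustrative.
def collector(state):
    state["raw"] = ["Failed password for root from 10.0.0.5"]
    return state

def parser(state):
    state["events"] = [{"msg": line, "type": "auth"} for line in state["raw"]]
    return state

def detector(state):
    state["detections"] = [e for e in state["events"] if "Failed password" in e["msg"]]
    return state

def risk(state):
    state["risk_score"] = min(100, 40 * len(state["detections"]))
    return state

STAGES = [collector, parser, detector, risk]

state = {}
for stage in STAGES:
    state = stage(state)
```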
## Repository layout
| Path | Role |
| --- | --- |
| `frontend/` | Next.js 15 + Tailwind + shadcn + Framer Motion SOC deck |
| `backend/app/main.py` | FastAPI control plane + WebSockets |
| `agents/` | Threat, risk, analyst, remediation, alerting logic |
| `collectors/` | Autonomous async tailing collectors |
| `parsers/` | Log β†’ structured `SecurityEvent` |
| `workflows/` | LangGraph multi-agent DAG |
| `database/` | SQLAlchemy models + async session |
| `models/` | Shared Pydantic schemas |
| `services/` | Pipeline, hub, metrics, optional Chroma |
| `docker/` | Compose + GPU-ready notes |
| `scripts/` | Demo attack replay |
## Quick start (local)
```bash
cd SentinelAI
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# optional: start postgres + redis, or export SKIP_DB=1 for demo-only persistence
export PYTHONPATH=$PWD
export SKIP_DB=1 # remove when PostgreSQL is available
./scripts/run-backend-dev.sh
```
Use `./scripts/run-backend-dev.sh` instead of `uvicorn ... --reload` from the repo root: reloading the whole tree also watches `.venv/site-packages` and can restart endlessly. The script scopes `--reload-dir` to Python source folders only.
```bash
cd frontend
npm install
export NEXT_PUBLIC_API_URL=http://127.0.0.1:8000
npm run dev:22
```
Use `npm run dev:22` (Node 22) if `npm run dev` fails with a Next.js `semver` error on newer Node versions.
Replay the scripted attack chain:
```bash
python scripts/demo_attack.py
```
**Continuous demo stream** (keeps generating traffic for judges):
```bash
python scripts/continuous_demo.py
```
**Linux auth.log (production-style):** set `COLLECT_AUTH_LOG=1` (and optionally `AUTH_LOG_PATH`) or add paths to `COLLECTOR_FILE_PATHS`. The collector waits until the file exists and tails new lines asynchronously.
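A stripped-down sketch of that tailing behavior (the real collector lives in `collectors/`; this version simply polls for appended lines, with illustrative names throughout):

```python
import asyncio
import os
import tempfile

async def tail_file(path, out, poll_interval=0.05, max_lines=2):
    """Wait for `path` to exist, then append new lines to `out` as they arrive."""
    while not os.path.exists(path):
        await asyncio.sleep(poll_interval)
    with open(path) as fh:
        fh.seek(0, 2)  # start at end-of-file, like `tail -f`
        while len(out) < max_lines:
            line = fh.readline()
            if line:
                out.append(line.rstrip("\n"))
            else:
                await asyncio.sleep(poll_interval)

async def demo():
    path = os.path.join(tempfile.mkdtemp(), "auth_demo.log")
    open(path, "w").close()        # create the empty log up front
    collected = []
    task = asyncio.create_task(tail_file(path, collected))
    await asyncio.sleep(0.1)       # let the tailer open and seek to end-of-file
    with open(path, "a") as fh:    # now append, as sshd would
        fh.write("Failed password for root from 10.0.0.5\n")
        fh.write("Accepted password for deploy from 10.0.0.9\n")
    await task
    return collected

lines = asyncio.run(demo())
```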
**Attack replay (WOW):** after traffic has populated the buffer, call `POST /replay/start` with `{"delay_ms": 420}` or use the dashboard **Replay last chain** button to re-broadcast buffered detections/incidents over WebSockets.
**vLLM / OpenAI-compatible inference:** set `VLLM_BASE_URL` (or `OPENAI_BASE_URL`) and `SENTINEL_LLM_MODEL` to your served model; analyst reports use `/v1/chat/completions` before falling back to Ollama.
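The analyst request is a standard OpenAI-style chat payload. A hedged sketch of what a backend might send to `${VLLM_BASE_URL}/v1/chat/completions`; the prompt text, incident fields, and default model tag here are illustrative, not the project's exact strings:

```python
import json
import os
import urllib.request

# Illustrative payload for an OpenAI-compatible /v1/chat/completions endpoint.
payload = {
    "model": os.environ.get("SENTINEL_LLM_MODEL", "llama3"),
    "messages": [
        {"role": "system", "content": "You are a SOC analyst. Summarize the incident."},
        {"role": "user", "content": json.dumps({"incident_id": "INC-1", "detections": 3})},
    ],
    "temperature": 0.2,
}
body = json.dumps(payload).encode()

# Build (but do not send) the request; urllib infers POST from `data`.
base = os.environ.get("VLLM_BASE_URL", "http://localhost:8000")
req = urllib.request.Request(
    f"{base}/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
```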
The UI listens on `NEXT_PUBLIC_API_URL` and opens a WebSocket to `/live-events`.
## Docker Compose
```bash
docker compose -f docker/docker-compose.yml up --build
```
- API: `http://localhost:8000`
- UI: `http://localhost:3000`
- Uncomment the `ollama` service on ROCm hosts and point `OLLAMA_HOST` at it.
Install optional vector memory:
```bash
pip install -r requirements-optional.txt
```
## Required API surface
| Endpoint | Description |
| --- | --- |
| `POST /ingest-logs` | Push raw logs / JSON events |
| `WS /live-events` | Real-time detections + incidents |
| `POST /detect-threats` | Parser β†’ enrich β†’ detect |
| `POST /correlate-incidents` | Recompute chains |
| `POST /generate-summary` | Body: `{ "incident_id": "..." }` |
| `POST /remediation` | Body: `{ "incident_id": "..." }` |
| `POST /send-alert` | Slack / Discord / Teams / webhook |
| `GET /dashboard-metrics` | KPIs for the deck |
| `POST /replay/start` | Re-stream buffered threat frames to WebSocket clients |
| `GET /replay-buffer` | Inspect replay buffer (debug) |
| `GET /rocm-panel` | AMD ROCm story + simulated GPU/agent load for the UI |
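The idea behind `POST /replay/start` can be sketched with a bounded buffer of frames that were already broadcast, re-emitted with a configurable delay. Names, sizes, and the list-based "sink" below are illustrative, not the backend's actual implementation:

```python
import asyncio
from collections import deque

BUFFER = deque(maxlen=500)  # bounded history of frames already broadcast

def broadcast(frame, sink):
    BUFFER.append(frame)
    sink.append(frame)  # stands in for a WebSocket fan-out

async def replay(sink, delay_ms=420):
    # Re-send the buffered frames in order, pausing delay_ms between frames,
    # the way the dashboard's "Replay last chain" button would.
    for frame in list(BUFFER):
        sink.append(frame)
        await asyncio.sleep(delay_ms / 1000)

live, replayed = [], []
for i in range(3):
    broadcast({"type": "detection", "seq": i}, live)
asyncio.run(replay(replayed, delay_ms=1))
```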
## Open-source model matrix
| Role | Suggested weights |
| --- | --- |
| Reasoning | Llama 3, Qwen 2.5, DeepSeek, Mistral |
| Vision (future) | Qwen-VL, LLaVA for phishing/malware screenshots |
| Embeddings | BGE, E5 (plug into Chroma ingestion) |
Set `SENTINEL_LLM_MODEL` to the tag served by your ROCm Ollama runtime.
## Live demo script (judges)
1. **Start stack** — Docker Compose, or local `./scripts/run-backend-dev.sh` + `npm run dev`.
2. **Show autonomous collection** β€” tail `demo_logs/auth_demo.log` without manual uploads.
3. **Fire demo** β€” `python scripts/demo_attack.py` or the in-UI **Simulate attack chain** button.
4. **Narrate agents** β€” Collector β†’ Parser β†’ Normalization β†’ Enrichment β†’ Detection β†’ LangGraph hop β†’ Correlation β†’ Risk β†’ (optional) Analyst LLM on ROCm.
5. **Pivot to response** β€” call `/remediation` + `/send-alert` with a webhook sink.
6. **Close with differentiation** β€” autonomous agents, not a chatbot; on-prem models on AMD GPUs; evidence in PostgreSQL.
## Pitch deck outline (copy into Slides / Gamma)
1. **Problem** β€” SOC teams drown in telemetry; correlation is manual; cloud-only AI breaks data residency.
2. **Solution** β€” SentinelAI fuses autonomous collectors, graph-based correlation, and open-weight LLMs.
3. **Why now** β€” AMD ROCm makes on-prem inference cost-viable; LangGraph standardizes agent choreography.
4. **Demo** β€” live WebSocket feed + incident graph + analyst summary.
5. **Moat** β€” modular agents, MITRE mapping, optional TI hooks, Terraform-ready remediation stubs.
6. **Ask** β€” design partners for managed SOC + on-prem appliance.
## Demo & pitch (read before presenting)
- **Exact demo steps:** [docs/DEMO_SCRIPT.md](docs/DEMO_SCRIPT.md)
- **One-line pitch:** [docs/PITCH.md](docs/PITCH.md)
- **Backup recording:** [docs/RECORDING_CHECKLIST.md](docs/RECORDING_CHECKLIST.md)
- **AMD panel API:** `GET /rocm-panel` (drives the "Powered by AMD ROCm" dashboard section)
## Judge explanation notes
- **Autonomy**: collectors run continuously; pipeline executes without human prompts.
- **Multi-agent**: LangGraph DAG + discrete services per concern (enrichment vs detection vs correlation).
- **Enterprise UX**: glassmorphism SOC deck, severity analytics, world heatmap, terminal channel.
- **Honest scope**: optional APIs (AbuseIPDB, VT, OTX) degrade gracefully; LLM path falls back to deterministic narratives if Ollama is offline.
## Security notice
This repository ships **defensive** tooling and demo payloads. Only run against systems you own or have permission to test.