	---
	title: SentinelOps Arena
	emoji: "\U0001F6E1\uFE0F"
	colorFrom: green
	colorTo: red
	sdk: gradio
	sdk_version: 6.9.0
	app_file: app.py
	pinned: false
	---

	# SentinelOps Arena

	Multi-agent self-play RL environment for enterprise security training, built on [OpenEnv](https://github.com/meta-pytorch/OpenEnv) for the [OpenEnv Hackathon SF](https://cerebralvalley.ai/e/openenv-hackathon-sf) (March 7-8, 2026).

	Three AI agents compete in a simulated enterprise environment:
	- RED TEAM (Attacker) — Launches schema drift, policy drift, social engineering, and rate limiting attacks
	- BLUE TEAM (Worker) — Handles customer requests across CRM, Billing, and Ticketing systems
	- AUDITOR (Oversight) — Monitors worker actions and flags policy violations

	All three agents are trained together through adversarial self-play with GRPO, so each one learns against the current behavior of the other two.

	## Quick Start

	```bash
	# Setup
	python3 -m venv .venv
	source .venv/bin/activate
	pip install -r requirements.txt

	# Run Gradio demo
	python app.py

	# Run HTTP server
	python -m sentinelops_arena.server --port 8000

	# Run demo script
	python -m sentinelops_arena.demo
	```

	## Project Structure

	```
	NexusEnv/
	├── sentinelops_arena/
	│ ├── models.py # Action, Observation, State, data models
	│ ├── environment.py # SentinelOpsArena (MCPEnvironment) — core env
	│ ├── systems/
	│ │ ├── crm.py # CRM simulator
	│ │ ├── billing.py # Billing simulator
	│ │ └── ticketing.py # Ticketing simulator
	│ ├── attacks.py # 4 attack types (schema/policy drift, social eng, rate limit)
	│ ├── rewards.py # Reward functions for all 3 agents
	│ ├── task_generator.py # Customer task generation
	│ ├── demo.py # Heuristic agents + episode runner
	│ ├── server.py # HTTP/WebSocket server
	│ ├── test_phase1.py # Unit tests
	│ └── test_environment.py # Integration tests
	├── app.py # Gradio UI (HuggingFace Spaces)
	├── train.py # GRPO training script (Unsloth + TRL)
	├── requirements.txt
	├── pyproject.toml
	└── README.md
	```

	## Architecture

	3 Agents, 3 Systems, 30 Ticks per Episode

	Each tick: Attacker acts → Worker acts → Oversight acts
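
The turn order above can be sketched as a plain loop. This is a minimal illustration, not the code in `environment.py`: the `run_episode` function, the `EpisodeLog` container, and the placeholder policies are all hypothetical names, and only the tick count and the role order come from this README.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

TICKS_PER_EPISODE = 30  # from the README: 30 ticks per episode
TURN_ORDER = ("attacker", "worker", "oversight")  # from the README: fixed order within a tick


@dataclass
class EpisodeLog:
    """Hypothetical container recording (tick, role, action) for every turn."""
    turns: List[Tuple[int, str, str]] = field(default_factory=list)


def run_episode(policies: Dict[str, Callable[[int], str]]) -> EpisodeLog:
    """Run one episode, letting each role act once per tick in the fixed order."""
    log = EpisodeLog()
    for tick in range(TICKS_PER_EPISODE):
        for role in TURN_ORDER:
            action = policies[role](tick)  # each policy maps a tick to an action label
            log.turns.append((tick, role, action))
    return log


# Placeholder policies (assumptions); the action labels are illustrative only.
policies = {
    "attacker": lambda tick: "launch_attack" if tick % 10 == 0 else "pass",
    "worker": lambda tick: "handle_request",
    "oversight": lambda tick: "review",
}

log = run_episode(policies)
print(len(log.turns))  # 30 ticks x 3 roles
print(log.turns[:3])
```

Keeping the order fixed means the Worker always sees the effect of the current tick's attack before acting, and Oversight always reviews the Worker's latest action.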

	### Attack Types
	1. Schema Drift — Renames fields across all records. Worker must detect KeyError, call `get_schema()`, and adapt.
	2. Policy Drift — Changes business rules (refund windows, approval requirements). Worker must call `get_current_policy()`.
	3. Social Engineering — Injects fake authority messages. Worker must resist manipulation.
	4. Rate Limiting — Throttles API calls. Worker must handle gracefully.
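
The schema drift recovery described in item 1 (detect `KeyError`, call `get_schema()`, adapt) might look roughly like the sketch below. `DriftingCRM` and `Worker` are hypothetical toy classes, not the project's simulators, and the adaptation rule assumes exactly one field was renamed since the Worker last checked the schema.

```python
from typing import Dict, List


class DriftingCRM:
    """Toy stand-in for a system whose field names can be renamed mid-episode."""

    def __init__(self) -> None:
        self.records: List[Dict[str, str]] = [
            {"customer_id": "C1", "email": "a@example.com"},
        ]

    def apply_schema_drift(self, old: str, new: str) -> None:
        # Rename the field across all records, as the attack description says.
        for record in self.records:
            record[new] = record.pop(old)

    def get_schema(self) -> List[str]:
        # Report the field names currently in use.
        return sorted(self.records[0].keys())

    def read(self, index: int, field_name: str) -> str:
        return self.records[index][field_name]  # raises KeyError after a rename


class Worker:
    """Reads fields by name and refreshes its view of the schema on KeyError."""

    def __init__(self, crm: DriftingCRM) -> None:
        self.crm = crm
        self.known_fields = set(crm.get_schema())
        self.aliases: Dict[str, str] = {}  # requested name -> current name

    def read(self, index: int, field_name: str) -> str:
        current = self.aliases.get(field_name, field_name)
        try:
            return self.crm.read(index, current)
        except KeyError:
            self._adapt(field_name, current)
            return self.crm.read(index, self.aliases[field_name])

    def _adapt(self, requested: str, missing: str) -> None:
        schema = set(self.crm.get_schema())
        added = schema - self.known_fields
        # Simplifying assumption: exactly one field was renamed since the last check.
        if len(added) != 1:
            raise KeyError(f"cannot resolve renamed field {missing!r}")
        self.aliases[requested] = added.pop()
        self.known_fields = schema


crm = DriftingCRM()
worker = Worker(crm)
before = worker.read(0, "email")
crm.apply_schema_drift("email", "contact_email")
after = worker.read(0, "email")  # triggers KeyError, schema refresh, then retry
print(before == after)
```

A real Worker would need a stronger matching rule when several fields change at once; the single-rename assumption keeps the example short.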

	### MCP Tools
	19 tools exposed via FastMCP, organized by agent role:
	- Worker: lookup_customer, check_balance, issue_refund, create_ticket, get_schema, get_current_policy, etc.
	- Attacker: launch_attack, get_attack_budget
	- Oversight: flag_action, get_trajectory
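
To illustrate what "organized by agent role" could mean in practice, here is a standard-library sketch of a role-scoped tool registry. It deliberately does not use the FastMCP API; the `tool` decorator, `call_tool` dispatcher, the tool signatures, and the rule that a role may only call its own tools are assumptions made for the example.

```python
from typing import Callable, Dict

# role -> {tool name -> callable}
TOOLS: Dict[str, Dict[str, Callable[..., object]]] = {}


def tool(role: str) -> Callable[[Callable[..., object]], Callable[..., object]]:
    """Decorator that registers a function as a tool for one agent role."""
    def register(fn: Callable[..., object]) -> Callable[..., object]:
        TOOLS.setdefault(role, {})[fn.__name__] = fn
        return fn
    return register


@tool("worker")
def lookup_customer(customer_id: str) -> dict:
    # Placeholder body; a real tool would query the CRM simulator.
    return {"customer_id": customer_id}


@tool("attacker")
def get_attack_budget() -> int:
    return 3  # placeholder value, not the project's actual budget


@tool("oversight")
def flag_action(tick: int, reason: str) -> dict:
    return {"tick": tick, "reason": reason}


def call_tool(role: str, name: str, **kwargs: object) -> object:
    """Dispatch a call, rejecting tools that the role does not own."""
    try:
        fn = TOOLS[role][name]
    except KeyError:
        raise PermissionError(f"{role!r} has no tool named {name!r}") from None
    return fn(**kwargs)


print(call_tool("worker", "lookup_customer", customer_id="C1"))
print(sorted(TOOLS))
```

Scoping tools per role keeps the Attacker from reading customer data directly and keeps the Worker from launching attacks, which matches the division of responsibilities listed above.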

	## Training

	Training uses GRPO (Group Relative Policy Optimization) through TRL, with Unsloth as an optional accelerator:

	```bash
	# Train with Unsloth (recommended, 2x faster)
	python train.py --use_unsloth --model_name unsloth/Qwen2.5-0.5B-Instruct

	# Train without Unsloth
	python train.py --model_name Qwen/Qwen2.5-0.5B-Instruct
	```

	See `train.py` for the full training pipeline.
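
For readers new to GRPO: it scores each sampled completion relative to the other completions sampled for the same prompt, instead of relying on a learned value function. The sketch below shows only that normalization step in plain Python. The function name is hypothetical, and using the population standard deviation with a small `eps` is a choice made here; libraries may differ in those details.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against a zero-variance group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four completions sampled from the same prompt (made-up values).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)

# A group where every completion earns the same reward carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
print(flat)
```

Completions that beat their group's average get a positive advantage and are reinforced; those below it are discouraged.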

	## Partner Tracks

	- Fleet AI — Scalable Oversight: the Oversight agent monitors and explains Worker behavior
	- Patronus AI — Schema Drift: schema and policy drift are core attack types

	## Tech Stack

	- OpenEnv 0.2.x — Environment framework
	- FastMCP — MCP tool server
	- Gradio — Demo UI
	- HuggingFace TRL — GRPO training
	- Unsloth — Fast fine-tuning (2x speed, 70% less VRAM)
	- Pydantic — Data validation

	## Tests

	```bash
	python sentinelops_arena/test_phase1.py
	python sentinelops_arena/test_environment.py
	```