metadata
title: SentinelOps Arena
emoji: 🛡️
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
pinned: false

SentinelOps Arena

Multi-agent self-play RL environment for enterprise security training, built on OpenEnv for the OpenEnv Hackathon SF (March 7-8, 2026).

Three AI agents compete in a simulated enterprise environment:

  • RED TEAM (Attacker): Launches schema drift, policy drift, social engineering, and rate limiting attacks
  • BLUE TEAM (Worker): Handles customer requests across CRM, Billing, and Ticketing systems
  • AUDITOR (Oversight): Monitors worker actions and flags policy violations

Through adversarial self-play with GRPO training, all three agents improve simultaneously.

Quick Start

# Setup
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run Gradio demo
python app.py

# Run HTTP server
python -m sentinelops_arena.server --port 8000

# Run demo script
python -m sentinelops_arena.demo

Project Structure

NexusEnv/
├── sentinelops_arena/
│   ├── models.py              # Action, Observation, State, data models
│   ├── environment.py         # SentinelOpsArena (MCPEnvironment), core env
│   ├── systems/
│   │   ├── crm.py             # CRM simulator
│   │   ├── billing.py         # Billing simulator
│   │   └── ticketing.py       # Ticketing simulator
│   ├── attacks.py             # 4 attack types (schema/policy drift, social eng, rate limit)
│   ├── rewards.py             # Reward functions for all 3 agents
│   ├── task_generator.py      # Customer task generation
│   ├── demo.py                # Heuristic agents + episode runner
│   ├── server.py              # HTTP/WebSocket server
│   ├── test_phase1.py         # Unit tests
│   └── test_environment.py    # Integration tests
├── app.py                     # Gradio UI (HuggingFace Spaces)
├── train.py                   # GRPO training script (Unsloth + TRL)
├── requirements.txt
├── pyproject.toml
└── README.md

Architecture

3 Agents, 3 Systems, 30 Ticks per Episode

Each tick: Attacker acts → Worker acts → Oversight acts
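
The turn order above can be sketched as a plain loop. This is a minimal illustration, not the project's episode runner: `run_episode`, the `policies` mapping, and the placeholder actions "pass" and "approve" are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

TICKS_PER_EPISODE = 30
# Fixed acting order within a tick, as described above.
TURN_ORDER = ("attacker", "worker", "oversight")


def run_episode(
    policies: Dict[str, Callable[[int], str]],
    ticks: int = TICKS_PER_EPISODE,
) -> List[Tuple[int, str, str]]:
    """Return a log of (tick, role, action) in the order the agents acted."""
    log = []
    for tick in range(ticks):
        for role in TURN_ORDER:
            action = policies[role](tick)  # each policy maps a tick to an action name
            log.append((tick, role, action))
    return log


# Placeholder policies (assumed for illustration only).
policies = {
    "attacker": lambda tick: "pass",
    "worker": lambda tick: "lookup_customer",
    "oversight": lambda tick: "approve",
}
episode_log = run_episode(policies)
```

With three agents acting once per tick, a 30-tick episode produces 90 logged actions.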

Attack Types

  1. Schema Drift: Renames fields across all records. Worker must detect KeyError, call get_schema(), and adapt.
  2. Policy Drift: Changes business rules (refund windows, approval requirements). Worker must call get_current_policy().
  3. Social Engineering: Injects fake authority messages. Worker must resist manipulation.
  4. Rate Limiting: Throttles API calls. Worker must handle gracefully.
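
As an illustration of the schema-drift recovery described in item 1, here is a rough sketch of a worker that catches the KeyError, refreshes its field mapping, and retries. `FakeCRM`, `apply_schema_drift`, and `read_field` are hypothetical stand-ins, and the shape of the `get_schema()` return value (canonical name to current field name) is an assumption, not the project's actual API.

```python
class FakeCRM:
    """Hypothetical stand-in for a simulated system (not the project's CRM class)."""

    def __init__(self):
        self.records = {"C001": {"customer_id": "C001", "tier": "gold"}}
        # Assumed schema shape: canonical field name -> current field name.
        self.schema = {"customer_id": "customer_id", "tier": "tier"}

    def apply_schema_drift(self, canonical, new_name):
        """Rename a field across all records, as a schema-drift attack would."""
        current = self.schema[canonical]
        for record in self.records.values():
            record[new_name] = record.pop(current)
        self.schema[canonical] = new_name

    def get_schema(self):
        return dict(self.schema)


def read_field(crm, customer_id, field, field_map):
    """Read a field; on KeyError, refresh the mapping from get_schema() and retry once."""
    record = crm.records[customer_id]
    try:
        return record[field_map[field]]
    except KeyError:
        field_map.update(crm.get_schema())  # adapt to the drifted schema
        return record[field_map[field]]


crm = FakeCRM()
field_map = crm.get_schema()                    # worker caches the schema up front
crm.apply_schema_drift("tier", "customer_tier")  # attacker renames the field
tier = read_field(crm, "C001", "tier", field_map)
```

The worker keeps using canonical names in its own logic and only the cached mapping changes, so a rename costs one extra `get_schema()` call rather than a failed task.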

MCP Tools

19 tools exposed via FastMCP, organized by agent role:

  • Worker: lookup_customer, check_balance, issue_refund, create_ticket, get_schema, get_current_policy, etc.
  • Attacker: launch_attack, get_attack_budget
  • Oversight: flag_action, get_trajectory
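
The role grouping above can be pictured as a simple permission table. This sketch uses only the standard library and only the tool names listed in this README (the remaining tools are not enumerated here); `TOOLS_BY_ROLE` and `can_call` are hypothetical names, and the actual server may enforce roles differently.

```python
# Tool names taken from the list above; the table itself is an illustrative assumption.
TOOLS_BY_ROLE = {
    "worker": {
        "lookup_customer", "check_balance", "issue_refund",
        "create_ticket", "get_schema", "get_current_policy",
    },
    "attacker": {"launch_attack", "get_attack_budget"},
    "oversight": {"flag_action", "get_trajectory"},
}


def can_call(role: str, tool: str) -> bool:
    """Return True if the given role is allowed to call the given tool."""
    return tool in TOOLS_BY_ROLE.get(role, set())


worker_can_refund = can_call("worker", "issue_refund")
worker_can_attack = can_call("worker", "launch_attack")
```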

Training

Uses GRPO (Group Relative Policy Optimization) with Unsloth + TRL:

# Train with Unsloth (recommended, 2x faster)
python train.py --use_unsloth --model_name unsloth/Qwen2.5-0.5B-Instruct

# Train without Unsloth
python train.py --model_name Qwen/Qwen2.5-0.5B-Instruct

See train.py for the full training pipeline.
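
For readers new to GRPO, the core idea is that each sampled completion is scored relative to the other completions generated for the same prompt. The sketch below shows that group-relative normalization in plain Python. It is not taken from train.py; `group_relative_advantages` is a hypothetical helper, and the use of the population standard deviation and the `eps` value are assumptions (implementations differ on both).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Rewards for four completions sampled from the same prompt (made-up values).
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions that score above the group mean get a positive advantage and those below get a negative one, so no separate value network is needed to provide a baseline.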

Partner Tracks

  • Fleet AI (Scalable Oversight): the Oversight agent monitors and explains Worker behavior
  • Patronus AI (Schema Drift): schema and policy drift are core attack types

Tech Stack

  • OpenEnv 0.2.x: Environment framework
  • FastMCP: MCP tool server
  • Gradio: Demo UI
  • HuggingFace TRL: GRPO training
  • Unsloth: Fast fine-tuning (2x speed, 70% less VRAM)
  • Pydantic: Data validation

Tests

python sentinelops_arena/test_phase1.py
python sentinelops_arena/test_environment.py