Architecture

System Diagram

flowchart LR
    A[TrafficGenerator] --> E[FirewallEnvironment]
    B[ThreatEngine] --> E
    E --> C[RewardEngine]
    E --> D[Graders]
    E --> F[FastAPI App]
    F --> G[Client / Agent]
    G --> F

Runtime Data Flow

sequenceDiagram
    participant Agent
    participant Env as FirewallEnvironment
    participant TG as TrafficGenerator
    participant TH as ThreatEngine
    participant RW as RewardEngine

    Agent->>Env: reset(task, seed)
    Env->>TG: generate_benign_sessions
    Env->>TH: maybe_spawn_attacker + generate_attack_sessions
    Env-->>Agent: state
    Agent->>Env: step(action_map) or step_single(action)
    Env->>RW: reward(action, is_malicious, budget_remaining, phase)
    Env-->>Agent: reward, done, info, next state

Core Components

| Component | Responsibility | Key Outputs |
| --- | --- | --- |
| firewall_environment.py | Episode orchestration, budget tracking, session lifecycle, metrics | state(), step(), step_single(), tool APIs |
| traffic_generator.py | Benign + malicious metadata generation, normalization, scenario shaping | 22-dim normalized observation vectors |
| threat_engine.py | Multi-attacker orchestration, adaptation, lifecycle and outcomes | Attack sessions, attacker status map |
| reward_engine.py | Multi-objective reward calculation and action-cost accounting | Scalar reward + component breakdown |
| graders.py | Deterministic task scoring and pass/fail gating | Score in [0,1], pass constraints |
| baseline/evaluate.py | Policy benchmarking across tasks | JSON report for random/heuristic/block/allow |
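
To make the "scalar reward + component breakdown" output more concrete, here is a minimal sketch of a reward function with the same argument list as the `reward(action, is_malicious, budget_remaining, phase)` call in the sequence diagram. Everything else is an assumption made for illustration: the string action names, the component names (`security`, `action_cost`, `budget`), and all numeric weights are hypothetical and are not taken from reward_engine.py.

```python
# Hypothetical per-action costs; the real values are not given in this document.
ACTION_COST = {"allow": 0.0, "block": 0.25}


def reward(action, is_malicious, budget_remaining, phase=None):
    """Return (scalar_reward, components) for one decision.

    `phase` is accepted only to mirror the call shown in the diagram; this
    sketch ignores it because the document does not define its meaning.
    """
    blocked = action == "block"
    if blocked and is_malicious:
        security = 1.0      # attack stopped
    elif blocked:
        security = -1.0     # benign session blocked (false positive)
    elif is_malicious:
        security = -2.0     # attack let through (false negative)
    else:
        security = 0.5      # benign session allowed

    components = {
        "security": security,
        "action_cost": -ACTION_COST.get(action, 0.0),
        # Assumed rule: a flat penalty once the budget is exhausted.
        "budget": -1.0 if budget_remaining <= 0 else 0.0,
    }
    # The scalar is the plain sum of the components, so the breakdown
    # always accounts for the full reward.
    return sum(components.values()), components
```

Returning the breakdown alongside the scalar lets a caller log which objective drove a given reward without recomputing it.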

Environment Modes

  • Multi-session mode: step(action_map) handles a variable batch of sessions per tick.
  • Single-session mode: step_single(action) exposes one decision at a time with Discrete(6) semantics.
  • Inspect workflow: inspect is a first-stage evidence-collection action; a follow-up action then resolves the session.
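
The single-session mode and the two-stage inspect workflow can be sketched with a toy stand-in environment. This is not the project's `FirewallEnvironment`: the class `ToyFirewallEnv`, the `Action` enum values, the state dictionary keys, and the reward numbers are all hypothetical. Only three of the six discrete actions are named in this document, so the sketch models just those. The return order of `step_single` follows the sequence diagram (reward, done, info, next state) and is an assumption about the real API.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    # Hypothetical ids; the other three Discrete(6) actions are not named
    # in this document and are left out.
    ALLOW = 0
    BLOCK = 1
    INSPECT = 2


class ToyFirewallEnv:
    """Toy stand-in that serves one pending session at a time."""

    def __init__(self, n_sessions=5):
        self._n = n_sessions
        self._pending = []
        self._inspected = set()

    def reset(self, task="toy", seed=0):
        rng = random.Random(seed)
        self._pending = [
            {"id": i, "is_malicious": rng.random() < 0.3} for i in range(self._n)
        ]
        self._inspected.clear()
        return self.state()

    def state(self):
        if not self._pending:
            return None
        s = self._pending[0]
        # Evidence is only visible after the session has been inspected.
        evidence = s["is_malicious"] if s["id"] in self._inspected else None
        return {"session_id": s["id"], "evidence": evidence}

    def step_single(self, action):
        s = self._pending[0]
        if action == Action.INSPECT:
            # Stage 1: collect evidence; the session stays pending.
            self._inspected.add(s["id"])
            reward = -0.1  # made-up inspection cost
        else:
            # Stage 2: the follow-up action resolves the session.
            self._pending.pop(0)
            correct = (action == Action.BLOCK) == s["is_malicious"]
            reward = 1.0 if correct else -1.0
        done = not self._pending
        return reward, done, {"session_id": s["id"]}, self.state()


def policy(state):
    """Inspect first, then block or allow based on the evidence."""
    if state["evidence"] is None:
        return Action.INSPECT
    return Action.BLOCK if state["evidence"] else Action.ALLOW


def run_episode(seed=0):
    env = ToyFirewallEnv()
    state = env.reset(task="toy", seed=seed)
    total, steps, done = 0.0, 0, False
    while not done:
        reward, done, info, state = env.step_single(policy(state))
        total += reward
        steps += 1
    return total, steps
```

In this toy setup every session takes two steps, one inspect and one resolving action, so an episode of five sessions runs for ten steps. A real policy would weigh the inspection cost against the remaining budget rather than inspecting unconditionally.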