Spaces:

XcodeAddy
/

sentinel-env

Running

File size: 4,952 Bytes

# SENTINEL Visual System

This file is the diagram source of truth. Every diagram used in README, UI, blog, or slides should be derived from here.

## Diagram Inventory

| Diagram | Purpose | Status |
| --- | --- | --- |
| System stack | show the code architecture | ready |
| Episode lifecycle | explain `reset()` to terminal reward | ready |
| Trust and reward flow | show how state turns into learning signal | ready |
| Reward engine v2 | show process-aware reward components | ready |
| Before / after | show why SENTINEL matters | ready |
| Theme fit | map the project to the hackathon | ready |
| Training loop | show OpenEnv -> TRL / Unsloth pipeline | ready |

---

## 1. System Stack

```mermaid
flowchart TD
  A["HTTP client / UI / inference.py"] --> B["app.py<br/>FastAPI on port 7860"]
  B --> C["SentinelEnv<br/>environment.py"]
  B --> D["_sessions<br/>session_id -> SentinelEnv"]
  C --> E["TaskGraph<br/>task_graph.py"]
  C --> F["TrustLedger<br/>trust_ledger.py"]
  C --> G["SpecialistPool<br/>specialists.py"]
  C --> H["RewardEngine<br/>graders.py"]
  C --> I["Scenario dataset<br/>scenarios.py"]
  C --> J["Typed models<br/>models.py"]
  B --> K["openenv.yaml"]
  B --> L["static/index.html"]
```

---

## 2. Episode Lifecycle

```mermaid
flowchart TD
  A["reset(task_type, seed)"] --> B["sample scenario"]
  B --> C["reshuffle hidden specialist profiles"]
  C --> D["set trust priors to 0.50"]
  D --> E["build task graph"]
  E --> F["return first observation"]

  F --> G["orchestrator chooses action"]
  G --> H["delegate / verify / self solve / skip"]
  H --> I["specialist or self execution"]
  I --> J["record outcome in TaskGraph"]
  J --> K["update TrustLedger"]
  K --> L["compute step reward"]
  L --> M{"done?"}
  M -- "no" --> N["return next observation"]
  N --> G
  M -- "yes" --> O["compute terminal reward"]
  O --> P["return done=True with final info"]
```

---

## 3. Trust And Reward Flow

```mermaid
flowchart LR
  A["Observation<br/>subtask, stakes, trust snapshot"] --> B["Action choice"]
  B --> C["Specialist result<br/>outcome, confidence, adversarial flag, step_cost"]
  C --> D["TaskGraph update"]
  C --> E["TrustLedger Bayesian update"]
  D --> F["completion, detections, poisonings"]
  E --> G["calibration state"]
  F --> H["RewardEngine"]
  G --> H
  H --> I["step reward"]
  H --> J["terminal reward"]
```

---

## 4. Reward Engine V2

```mermaid
flowchart LR
  A["Specialist result<br/>outcome, confidence, metadata"] --> B["Step reward"]
  C["TaskGraph<br/>completion, detections, poisonings"] --> D["Terminal reward"]
  E["TrustLedger<br/>calibration, fingerprints"] --> D

  B --> B1["task accuracy"]
  B --> B2["stakes awareness"]
  B --> B3["efficiency"]
  B --> B4["confidence alignment"]
  B --> B5["verification quality"]
  B --> B6["domain routing"]

  D --> D1["completion rate"]
  D --> D2["detection rate"]
  D --> D3["trust calibration"]
  D --> D4["episode efficiency"]

  B --> R["reward-report endpoint"]
  D --> R
  R --> T["component trace for judges"]
```

---

## 5. Before / After

```mermaid
flowchart LR
  subgraph BEFORE["Before SENTINEL"]
    A1["Uniform trust"] --> A2["Blind delegation"]
    A2 --> A3["Poison accepted at high stakes"]
    A3 --> A4["Downstream subtasks inherit bad state"]
    A4 --> A5["Mission drifts or fails"]
  end

  subgraph AFTER["After SENTINEL"]
    B1["Behavior updates trust"] --> B2["Low-trust high-stakes node detected"]
    B2 --> B3["Verify instead of delegate"]
    B3 --> B4["Poison blocked before cascade"]
    B4 --> B5["Mission completes cleanly"]
  end
```

---

## 6. Theme Fit

```mermaid
flowchart TD
  S["SENTINEL"] --> T1["Theme 1<br/>multi-agent interaction"]
  S --> T2["Theme 2<br/>long-horizon planning"]
  S --> T4["Theme 4<br/>self-improvement"]
  S --> T5["Theme 5<br/>wild card"]

  T1 --> B1["orchestrator + five specialists<br/>partial observability<br/>adversarial dynamics"]
  T2 --> B2["task graph<br/>step budget pressure<br/>delayed terminal reward"]
  T4 --> B3["profile reshuffle<br/>auto-curriculum<br/>no memorization"]
  T5 --> B4["real production weakness<br/>blind trust in agent pipelines"]
```

---

## 7. Training Loop

```mermaid
flowchart LR
  A["Prompt / observation"] --> B["Model rollout"]
  B --> C["Action text or structured action"]
  C --> D["SENTINEL environment"]
  D --> E["Reward + next observation"]
  E --> F["TRL / GRPO trainer"]
  F --> G["updated policy"]
  G --> B

  H["training/evaluate.py"] --> I["random / heuristic / oracle-lite"]
  I --> J["evaluation_results.json"]
  I --> K["baseline_comparison.png"]
```

---

## Use Rules

1. Do not invent new component names in slide decks that do not exist in code.
2. Use `SentinelEnv`, `TrustLedger`, `SpecialistPool`, `TaskGraph`, `RewardEngine` consistently.
3. Use real baseline numbers in public before/after materials.
4. Export polished PNG versions from these mermaid sources later, but keep this file as the editable truth.