# OpenSecOpsEnv: Complete Code Reference

> **Auto-generated reference for the current codebase. Last updated: April 2026.**

---

## Project Structure

```
incident-ai/
├── README.md                        # Main submission README (judges start here)
├── hf_blog_post.md                  # HF blog post (copy to model card)
├── colab_training (2).ipynb         # GRPO training notebook (run on A100)
├── training_results.png             # Training plots (reward + loss + before/after)
├── openenv.yaml                     # OpenEnv manifest
├── pyproject.toml                   # Package config
├── requirements.txt                 # Runtime dependencies
├── Dockerfile                       # Container for HF Spaces deployment
├── inference.py                     # Standalone OpenEnv inference runner
├── demo.py                          # Local demo script
│
├── opensecops_env/                  # Core Python package
│   ├── __init__.py                  # Package init + version
│   ├── env.py                       # ⭐ Core environment (reset/step/state)
│   ├── grader.py                    # ⭐ Episode grader → [0, 1] score
│   ├── models.py                    # Data models (SecOpsAction, Observation, etc.)
│   ├── client.py                    # OpenEnv client wrapper
│   ├── inference.py                 # Inference utilities
│   ├── tasks/
│   │   ├── __init__.py
│   │   └── task_definitions.py      # ⭐ 4 task configs (easy→hard)
│   └── server/
│       ├── __init__.py
│       └── app.py                   # ⭐ FastAPI server + dashboard + SSE streams
│
├── training/
│   ├── train_grpo.py                # Standalone GRPO training script
│   └── plot_rewards.py              # Generate training_results.png
│
├── tests/
│   └── test_opensecops.py           # 33 unit tests
│
├── docs/                            # Internal documentation
│   ├── DASHBOARD_GUIDE.md           # Plain-English dashboard explanation
│   ├── TECHNICAL_ANALYSIS.md        # Full pipeline + theme alignment
│   ├── analysis_and_next_steps.md   # Session notes
│   ├── code_explainer.md            # This file
│   └── walkthrough.md               # Development walkthrough
```

---

## Core Environment: `opensecops_env/env.py`

### Class: `OpenSecOpsEnv`

The main OpenEnv-compliant environment. Implements `reset()`, `step()`, and `state`.

```python
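# Import paths follow the project structure above
from opensecops_env.env import OpenSecOpsEnv
from opensecops_env.grader import grade
from opensecops_env.models import SecOpsAction

env = OpenSecOpsEnv()
obs = env.reset("hard_data_exfiltration")  # returns SecOpsObservation
obs, reward, done, info = env.step(SecOpsAction(
    action_type="query_logs",
    parameters={"service": "db"}
))
result = grade(env.state.to_dict())  # grade the episode so far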
```

**Key internal state:**
- `env._hidden: HiddenState`: ground truth (true_root_cause, affected_services, attack_progress, noise_level)
- `env._metrics: dict[str, ServiceMetrics]`: current CPU/mem/latency/error_rate per service
- `env._rng: random.Random`: seeded RNG; overridden per episode for variety
- `env._task_cfg: dict`: full task config from task_definitions.py
- `env._state: EpisodeState`: tracks investigation_actions, mitigation_actions, step_count, done

**Reward logic** (inside `env.step()`):
- Investigating the wrong service: `-0.05`
- Investigating an affected service (logs/scan): `+0.20` to `+0.30`
- Correct mitigation on affected service: `+0.50`
- Wrong mitigation/harmful action: `-0.10` to `-0.50`
- Correct final diagnosis: `+1.00`
- Wrong final diagnosis: `-1.00`
- Per-step cost: `-0.02`
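
The reward schedule above can be summarised as a lookup table. This is a minimal sketch, not the code inside `env.step()`: the event names and the `reward_bounds` helper are hypothetical, and it assumes the per-step cost is simply added to every action's reward.

```python
# Hypothetical event names; the numeric values come from the reward list above.
STEP_COST = -0.02

REWARD_RANGES = {
    "investigate_wrong_service": (-0.05, -0.05),
    "investigate_affected_service": (0.20, 0.30),
    "correct_mitigation": (0.50, 0.50),
    "wrong_or_harmful_mitigation": (-0.50, -0.10),
    "correct_diagnosis": (1.00, 1.00),
    "wrong_diagnosis": (-1.00, -1.00),
}


def reward_bounds(event: str) -> tuple[float, float]:
    """Return the (low, high) reward for an event with the step cost applied.

    Assumption: the step cost is additive on every step.
    """
    low, high = REWARD_RANGES[event]
    return (round(low + STEP_COST, 2), round(high + STEP_COST, 2))


print(reward_bounds("correct_diagnosis"))             # (0.98, 0.98)
print(reward_bounds("investigate_affected_service"))  # (0.18, 0.28)
```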

---

## Task Definitions: `opensecops_env/tasks/task_definitions.py`

Four tasks, each with a fixed default seed (overridden per episode by `_randomise_env_seed()`):

| ID | Difficulty | Seed | Noise | Affected Services | Correct Label |
|----|-----------|------|-------|-------------------|---------------|
| `easy_memory_leak` | Easy | 42 | 5% | auth | `infra_failure:memory_leak` |
| `medium_ddos_cascade` | Medium | 123 | 25% | gateway, api | `cyber_attack:ddos` |
| `medium_hard_bad_deployment` | Med-Hard | 456 | 35% | api, cache | `misconfiguration:bad_config` |
| `hard_data_exfiltration` | Hard | 789 | 55% | db, auth | `cyber_attack:data_exfiltration` |

Each task config includes: `initial_metrics`, `initial_alerts`, `initial_logs`, `topology`, `correct_mitigations`, `attack_progress_start`.
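
For quick lookups, the table can be mirrored in a small registry. The sketch below is illustrative only: `TaskSummary`, `TASKS`, and `tasks_affecting` are hypothetical names, and the real configs in `task_definitions.py` carry more fields than the ones shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSummary:
    """Hypothetical summary record holding only the columns of the table above."""
    difficulty: str
    seed: int
    noise: float  # noise level as a fraction (5% -> 0.05)
    affected_services: tuple[str, ...]
    correct_label: str


TASKS = {
    "easy_memory_leak": TaskSummary(
        "Easy", 42, 0.05, ("auth",), "infra_failure:memory_leak"),
    "medium_ddos_cascade": TaskSummary(
        "Medium", 123, 0.25, ("gateway", "api"), "cyber_attack:ddos"),
    "medium_hard_bad_deployment": TaskSummary(
        "Med-Hard", 456, 0.35, ("api", "cache"), "misconfiguration:bad_config"),
    "hard_data_exfiltration": TaskSummary(
        "Hard", 789, 0.55, ("db", "auth"), "cyber_attack:data_exfiltration"),
}


def tasks_affecting(service: str) -> list[str]:
    """Return the IDs of all tasks in which `service` is affected."""
    return [tid for tid, t in TASKS.items() if service in t.affected_services]


print(tasks_affecting("auth"))  # ['easy_memory_leak', 'hard_data_exfiltration']
```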

---

## Grader: `opensecops_env/grader.py`

```python
def grade(episode_state: dict) -> GradeResult:
    score = (
        0.5 * diagnosis_correct        # Was the final label correct?
      + 0.3 * action_efficiency        # Were actions targeted? Or scattered?
      + 0.2 * investigation_quality    # Did agent query/scan affected services?
    )
```

- `diagnosis_correct`: 1.0 if exact match, 0.5 if correct category, 0.0 if wrong
- `action_efficiency`: `0.7 * mitigation_recall + 0.3 * step_bonus`
- `investigation_quality`: fraction of affected services that were investigated

Score is clamped to `[0.01, 0.99]`.
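
The formula and clamp can be worked through in a few lines. This is a sketch under stated assumptions rather than the contents of `grader.py`: the helper names are hypothetical, and "correct category" is assumed to mean that the part of the label before the colon matches.

```python
def diagnosis_score(predicted: str, correct: str) -> float:
    """1.0 for an exact match, 0.5 for a category match, 0.0 otherwise.

    Assumption: the category is the text before the first ':' in a label.
    """
    if predicted == correct:
        return 1.0
    if predicted.split(":", 1)[0] == correct.split(":", 1)[0]:
        return 0.5
    return 0.0


def combined_score(diagnosis_correct: float, mitigation_recall: float,
                   step_bonus: float, investigation_quality: float) -> float:
    """Weighted sum from the grader formula, clamped to [0.01, 0.99]."""
    action_efficiency = 0.7 * mitigation_recall + 0.3 * step_bonus
    raw = (0.5 * diagnosis_correct
           + 0.3 * action_efficiency
           + 0.2 * investigation_quality)
    return min(0.99, max(0.01, raw))


# Right category, wrong cause: half credit on the diagnosis component.
d = diagnosis_score("cyber_attack:ddos", "cyber_attack:data_exfiltration")
print(d)  # 0.5
# A perfect episode has a raw score of 1.0, which the clamp caps at 0.99.
print(combined_score(1.0, 1.0, 1.0, 1.0))  # 0.99
```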

---

## Multi-Agent System: `opensecops_env/server/app.py`

### Class: `MultiAgentSecOpsEnv`

Wraps `OpenSecOpsEnv` with two agents sharing the same environment state.

```python
# Import paths follow the project structure above
from opensecops_env.models import SecOpsAction
from opensecops_env.server.app import MultiAgentSecOpsEnv

ma_env = MultiAgentSecOpsEnv()
state = ma_env.reset("hard_data_exfiltration")

# Red (Attacker) acts first
state, red_reward, done, info = ma_env.red_step()  # heuristic auto

# Blue (Defender) acts
action = SecOpsAction(action_type="query_logs", parameters={"service": "db"})
state, blue_reward, done, info = ma_env.blue_step(action)
```

### Red Agent Strategy (`_heuristic_red_action`)

Adaptive 5-tier theory-of-mind strategy:
1. **Counter-investigate**: If Blue queried service X in the last 3 steps, plant a false alert on service Y
2. **Amplify**: If the root cause is a cyber_attack, attack_progress < 0.85, and Blue hasn't isolated, amplify
3. **Spread**: Spread to services Blue hasn't investigated yet via the topology graph
4. **Corrupt**: Spike metrics on healthy services Blue has already looked at (plant doubt)
5. **Inject noise**: Default; add misleading log entries
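
The tiers read as a priority cascade, which might be sketched as follows. This is not the body of `_heuristic_red_action`: the `RedView` fields and the `pick_red_tactic` function are hypothetical, and the sketch assumes tiers are checked in order with the first match winning.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RedView:
    """Hypothetical snapshot of what the Red agent can see."""
    recent_blue_queries: list[str] = field(default_factory=list)  # last 3 Blue steps
    all_services: list[str] = field(default_factory=list)
    category: str = "infra_failure"
    attack_progress: float = 0.0
    blue_isolated: bool = False
    unvisited_neighbours: list[str] = field(default_factory=list)
    healthy_visited: list[str] = field(default_factory=list)


def pick_red_tactic(view: RedView) -> tuple[str, Optional[str]]:
    """Return (tactic, target service or None); the first matching tier wins."""
    # Tier 1: Blue looked at X recently, so plant a false alert elsewhere.
    if view.recent_blue_queries:
        decoys = [s for s in view.all_services
                  if s not in view.recent_blue_queries]
        if decoys:
            return ("counter_investigate", decoys[0])
    # Tier 2: push an ongoing cyber attack forward while Blue has not isolated.
    if (view.category == "cyber_attack" and view.attack_progress < 0.85
            and not view.blue_isolated):
        return ("amplify", None)
    # Tier 3: move to a neighbouring service Blue has not investigated.
    if view.unvisited_neighbours:
        return ("spread", view.unvisited_neighbours[0])
    # Tier 4: spike metrics on a healthy service Blue already checked.
    if view.healthy_visited:
        return ("corrupt", view.healthy_visited[0])
    # Tier 5: default.
    return ("inject_noise", None)


view = RedView(recent_blue_queries=["db"], all_services=["db", "auth"])
print(pick_red_tactic(view))  # ('counter_investigate', 'auth')
```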

### Class: `CurriculumManager`

```python
_curriculum.record_score(task_id, score)  # Called after every episode
_curriculum.current_level                 # 1-5
_curriculum.episode_count                 # total episodes this session
```

Level-up logic: the manager keeps a rolling window of the last 5 episode scores at the current level. When the window average reaches the threshold (`avg >= threshold`), `current_level += 1`.
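
A rolling-window level-up rule of this kind can be sketched with a bounded `deque`. The class below is a hypothetical stand-in for `CurriculumManager`: the threshold value of 0.7, the requirement that the window be full, and the reset of the window after a level-up are all assumptions.

```python
from collections import deque


class CurriculumSketch:
    """Hypothetical rolling-window curriculum; not the real CurriculumManager."""

    def __init__(self, threshold: float = 0.7, window: int = 5,
                 max_level: int = 5) -> None:
        self.threshold = threshold  # assumed value; the source gives no number
        self.max_level = max_level  # levels run from 1 to 5
        self.current_level = 1
        self.episode_count = 0
        self._recent = deque(maxlen=window)  # scores at the current level

    def record_score(self, task_id: str, score: float) -> None:
        """Record one episode score and level up if the window average passes."""
        self.episode_count += 1
        self._recent.append(score)
        window_full = len(self._recent) == self._recent.maxlen
        if window_full and self.current_level < self.max_level:
            if sum(self._recent) / len(self._recent) >= self.threshold:
                self.current_level += 1
                self._recent.clear()  # start a fresh window for the new level


curriculum = CurriculumSketch()
for _ in range(5):
    curriculum.record_score("easy_memory_leak", 0.8)
print(curriculum.current_level, curriculum.episode_count)  # 2 5
```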

---

## SSE Streams: `/demo/stream` and `/battle/stream`

Both endpoints return `text/event-stream` with JSON events:

**Agent stream events:**
- `reset`: initial state + config
- `step`: action taken, reward, observation update, raw AI JSON
- `grade`: final scores + curriculum level
- `error`: exception message

**Battle stream events:**
- `battle_reset`: initial state
- `red_step`: attacker action + damage
- `blue_step`: defender action + reward + AI output
- `battle_end`: final scores + winner + curriculum level
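
A client consuming either stream has to split the response into events and decode each JSON payload. The parser below is a minimal sketch that assumes each event arrives as one or more `data:` lines terminated by a blank line, and that the JSON object carries the event name in a `type` field. The field name and the sample payloads are made up for illustration; the actual wire format is defined in `app.py`.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per server-sent event."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line ends the event; multi-line data is joined with "\n".
            yield json.loads("\n".join(data_parts))
            data_parts = []
    if data_parts:  # tolerate a stream that ends without a trailing blank line
        yield json.loads("\n".join(data_parts))


# Made-up sample frames, for illustration only.
sample = [
    'data: {"type": "reset", "task_id": "easy_memory_leak"}',
    "",
    'data: {"type": "grade", "score": 0.72}',
    "",
]
print([event["type"] for event in iter_sse_events(sample)])  # ['reset', 'grade']
```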

---

## Live AI Integration: `_query_ai_model()`

```python
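from typing import Optional

import httpx


async def _query_ai_model(endpoint, obs_dict, step) -> Optional[SecOpsAction]:
    # Build text prompt from observation
    prompt = _obs_to_text(obs_dict, step)

    # POST to HF Inference Endpoint
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": 128, "temperature": 0.3, "return_full_text": False}
    }
    headers = {"Authorization": f"Bearer {_HF_API_TOKEN}"}

    # The request step is elided in this excerpt; an httpx call with a
    # 30-second timeout is assumed here (httpx is listed in requirements.txt)
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(endpoint, json=payload, headers=headers)
        response.raise_for_status()
    response_text = response.text

    # Parse response (handles multiple output formats from the model)
    return _parse_ai_action(response_text)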
```

**Auth:** Set `HF_TOKEN` in the `.env` file; it is loaded automatically via `python-dotenv` at startup.

**Debug:** Send a GET request to `http://localhost:8000/debug/ai` to test the live endpoint.

**Fallback:** If the endpoint call fails, the server falls back to a deterministic heuristic playbook, so the dashboard never crashes.
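
The fallback behaviour can be illustrated with a small wrapper. This is a sketch of the pattern, not the server's code: `action_with_fallback`, `failing_model`, and `heuristic_playbook` are hypothetical names, and treating a `None` result the same as an exception is an assumption based on the `Optional` return type of `_query_ai_model()`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def action_with_fallback(
    query_model: Callable[[dict], Awaitable[Optional[Any]]],
    heuristic: Callable[[dict], Any],
    obs: dict,
) -> Any:
    """Try the model first; use the heuristic on any error or empty result."""
    try:
        action = await query_model(obs)
    except Exception:
        action = None  # network errors, bad responses, parse failures
    return action if action is not None else heuristic(obs)


async def failing_model(obs: dict) -> Optional[Any]:
    """Stand-in for an unreachable endpoint."""
    raise ConnectionError("endpoint unreachable")


def heuristic_playbook(obs: dict) -> dict:
    """Stand-in for the deterministic playbook; returns a fixed action."""
    return {"action_type": "query_logs", "parameters": {"service": "db"}}


result = asyncio.run(action_with_fallback(failing_model, heuristic_playbook, {}))
print(result["action_type"])  # query_logs
```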

---

## Key Configuration

### `.env` (gitignored)
```
HF_TOKEN=hf_xxxx
TRAINED_MODEL_ENDPOINT=https://xxx.endpoints.huggingface.cloud  # optional override
```

### `requirements.txt`
```
fastapi>=0.111.0
uvicorn[standard]>=0.29.0
pydantic>=2.0.0
httpx>=0.27.0
python-dotenv>=1.0.0
openenv-core>=0.2.0
```

### Running locally
```bash
cd incident-ai
.venv/bin/uvicorn opensecops_env.server.app:app --host 0.0.0.0 --port 8000 --reload
open http://localhost:8000/dashboard
```

---

## Training Pipeline

### GRPO Training (notebook: `colab_training (2).ipynb`)

```python
# Reward function: wraps the environment
# `env`, `parse_action`, and `task_ids` are assumed to be defined earlier in the notebook
def secops_reward_fn(prompts, completions, **kwargs):
    rewards = []  # one scalar reward per completion
    for completion, task_id in zip(completions, task_ids):
        action = parse_action(completion)
        env.reset(task_id)
        _, reward, _, _ = env.step(action)
        rewards.append(float(reward) - 0.02)  # step cost
    return rewards

# Trainer config
GRPOConfig(
    num_generations=4,       # 4 candidate responses per observation
    max_new_tokens=128,
    temperature=0.9,         # High temp for exploration during training
    learning_rate=2e-5,
)
```

**Model:** Qwen2.5-7B-Instruct + Unsloth 4-bit + LoRA (r=16)  
**Merge:** `model.push_to_hub_merged(repo, tokenizer, save_method="merged_16bit")`  
**Output:** `SapphireGaze429/opensecops-qwen2.5-7b-grpo`

---

## Tests: `tests/test_opensecops.py`

33 tests covering:
- Environment reset/step API contract
- All 4 task configs
- All 9 action types
- Reward bounds
- Grader formula correctness
- Partial diagnosis credit (category match)

```bash
pytest tests/ -v  # all 33 should pass
```