
OpenSecOpsEnv: Complete Code Reference

Auto-generated reference for the current codebase. Last updated: April 2026.


Project Structure

incident-ai/
├── README.md                        # Main submission README (judges start here)
├── hf_blog_post.md                  # HF blog post (copy to model card)
├── colab_training (2).ipynb         # GRPO training notebook (run on A100)
├── training_results.png             # Training plots (reward + loss + before/after)
├── openenv.yaml                     # OpenEnv manifest
├── pyproject.toml                   # Package config
├── requirements.txt                 # Runtime dependencies
├── Dockerfile                       # Container for HF Spaces deployment
├── inference.py                     # Standalone OpenEnv inference runner
├── demo.py                          # Local demo script
│
├── opensecops_env/                  # Core Python package
│   ├── __init__.py                  # Package init + version
│   ├── env.py                       # ⭐ Core environment (reset/step/state)
│   ├── grader.py                    # ⭐ Episode grader → [0, 1] score
│   ├── models.py                    # Data models (SecOpsAction, Observation, etc.)
│   ├── client.py                    # OpenEnv client wrapper
│   ├── inference.py                 # Inference utilities
│   ├── tasks/
│   │   ├── __init__.py
│   │   └── task_definitions.py      # ⭐ 4 task configs (easy→hard)
│   └── server/
│       ├── __init__.py
│       └── app.py                   # ⭐ FastAPI server + dashboard + SSE streams
│
├── training/
│   ├── train_grpo.py                # Standalone GRPO training script
│   └── plot_rewards.py              # Generate training_results.png
│
├── tests/
│   └── test_opensecops.py           # 33 unit tests
│
└── docs/                            # Internal documentation
    ├── DASHBOARD_GUIDE.md           # Plain-English dashboard explanation
    ├── TECHNICAL_ANALYSIS.md        # Full pipeline + theme alignment
    ├── analysis_and_next_steps.md   # Session notes
    ├── code_explainer.md            # This file
    └── walkthrough.md               # Development walkthrough

Core Environment: opensecops_env/env.py

Class: OpenSecOpsEnv

The main OpenEnv-compliant environment. Implements reset(), step(), and state.

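# Import paths follow the project structure above
from opensecops_env.env import OpenSecOpsEnv
from opensecops_env.grader import grade
from opensecops_env.models import SecOpsAction

env = OpenSecOpsEnv()
obs = env.reset("hard_data_exfiltration")  # returns SecOpsObservation
obs, reward, done, info = env.step(SecOpsAction(
    action_type="query_logs",
    parameters={"service": "db"}
))
result = grade(env.state.to_dict())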

Key internal state:

  • env._hidden: HiddenState - ground truth (true_root_cause, affected_services, attack_progress, noise_level)
  • env._metrics: dict[str, ServiceMetrics] - current CPU/mem/latency/error_rate per service
  • env._rng: random.Random - seeded RNG; overridden per episode for variety
  • env._task_cfg: dict - full task config from task_definitions.py
  • env._state: EpisodeState - tracks investigation_actions, mitigation_actions, step_count, done

Reward logic (inside env.step()):

  • Investigating the wrong service: -0.05
  • Investigating an affected service (logs/scan): +0.20 to +0.30
  • Correct mitigation on affected service: +0.50
  • Wrong mitigation/harmful action: -0.10 to -0.50
  • Correct final diagnosis: +1.00
  • Wrong final diagnosis: -1.00
  • Per-step cost: -0.02
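
The reward schedule above can be tabulated as a small lookup. This is a minimal sketch, not the code inside env.step(): the event names and the `reward_bounds` helper are hypothetical, ranged entries are stored as (low, high) pairs because the exact value within a range is not specified here, and the per-step cost is assumed to be added on top of every event.

```python
# Hypothetical event names; values are copied from the reward list above.
# Ranged rewards are kept as (low, high) bounds.
REWARD_TABLE = {
    "investigate_wrong_service": (-0.05, -0.05),
    "investigate_affected_service": (0.20, 0.30),
    "correct_mitigation": (0.50, 0.50),
    "wrong_mitigation": (-0.50, -0.10),
    "correct_diagnosis": (1.00, 1.00),
    "wrong_diagnosis": (-1.00, -1.00),
}
STEP_COST = -0.02  # assumed to apply to every step


def reward_bounds(events):
    """Return the (lowest, highest) possible episode return for a list of events."""
    low = sum(REWARD_TABLE[e][0] + STEP_COST for e in events)
    high = sum(REWARD_TABLE[e][1] + STEP_COST for e in events)
    return round(low, 2), round(high, 2)


episode = ["investigate_affected_service", "correct_mitigation", "correct_diagnosis"]
bounds = reward_bounds(episode)
print(bounds)
```

Under these assumptions, a three-step episode that investigates an affected service, mitigates correctly, and diagnoses correctly returns between 1.64 and 1.74.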

Task Definitions: opensecops_env/tasks/task_definitions.py

Four tasks, each with a default seed (overridden per episode by _randomise_env_seed()):

| ID | Difficulty | Seed | Noise | Affected Services | Correct Label |
|---|---|---|---|---|---|
| easy_memory_leak | Easy | 42 | 5% | auth | infra_failure:memory_leak |
| medium_ddos_cascade | Medium | 123 | 25% | gateway, api | cyber_attack:ddos |
| medium_hard_bad_deployment | Med-Hard | 456 | 35% | api, cache | misconfiguration:bad_config |
| hard_data_exfiltration | Hard | 789 | 55% | db, auth | cyber_attack:data_exfiltration |

Each task config includes: initial_metrics, initial_alerts, initial_logs, topology, correct_mitigations, attack_progress_start.
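
For quick lookups, the table above can be mirrored in a small registry. This is an illustrative sketch only: `TaskSummary`, `TASKS`, and `label_category` are hypothetical names, and the real configs in task_definitions.py carry many more fields (initial_metrics, topology, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSummary:
    difficulty: str
    seed: int                  # default seed from the table
    noise_level: float         # fraction, e.g. 0.25 for 25%
    affected_services: tuple
    correct_label: str         # "category:subtype"


TASKS = {
    "easy_memory_leak": TaskSummary(
        "Easy", 42, 0.05, ("auth",), "infra_failure:memory_leak"),
    "medium_ddos_cascade": TaskSummary(
        "Medium", 123, 0.25, ("gateway", "api"), "cyber_attack:ddos"),
    "medium_hard_bad_deployment": TaskSummary(
        "Med-Hard", 456, 0.35, ("api", "cache"), "misconfiguration:bad_config"),
    "hard_data_exfiltration": TaskSummary(
        "Hard", 789, 0.55, ("db", "auth"), "cyber_attack:data_exfiltration"),
}


def label_category(task_id):
    """Return the category part of a task's correct label."""
    return TASKS[task_id].correct_label.split(":", 1)[0]


print(label_category("hard_data_exfiltration"))
```

Splitting the label on the first colon gives the category, which is the part the grader's partial-credit rule compares.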


Grader: opensecops_env/grader.py

def grade(episode_state: dict) -> GradeResult:
    score = (
        0.5 * diagnosis_correct        # Was the final label correct?
      + 0.3 * action_efficiency        # Were actions targeted? Or scattered?
      + 0.2 * investigation_quality    # Did agent query/scan affected services?
    )
  • diagnosis_correct: 1.0 if exact match, 0.5 if correct category, 0.0 if wrong
  • action_efficiency: 0.7 * mitigation_recall + 0.3 * step_bonus
  • investigation_quality: fraction of affected services that were investigated

Score is clamped to [0.01, 0.99].
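
The weighting and clamping can be checked by hand with a short sketch. `combined_score` and `clamp` are hypothetical helper names; the sketch takes the component values as inputs rather than deriving them from an episode state, so it illustrates only the arithmetic described above, not the grader itself.

```python
def clamp(x, lo=0.01, hi=0.99):
    """Clamp x into [lo, hi]."""
    return max(lo, min(hi, x))


def combined_score(diagnosis_correct, mitigation_recall, step_bonus, investigation_quality):
    """Weighted score using the documented 0.5 / 0.3 / 0.2 split."""
    action_efficiency = 0.7 * mitigation_recall + 0.3 * step_bonus
    raw = (
        0.5 * diagnosis_correct
        + 0.3 * action_efficiency
        + 0.2 * investigation_quality
    )
    return clamp(raw)


# Category-only diagnosis match, full mitigation recall, half step bonus,
# and one of two affected services investigated.
partial = combined_score(0.5, 1.0, 0.5, 0.5)
print(round(partial, 3))  # 0.605
```

A perfect episode has a raw score of 1.0 and is reported as 0.99; an episode that scores zero on every component is reported as 0.01.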


Multi-Agent System: opensecops_env/server/app.py

Class: MultiAgentSecOpsEnv

Wraps OpenSecOpsEnv with two agents sharing the same environment state.

ma_env = MultiAgentSecOpsEnv()
state = ma_env.reset("hard_data_exfiltration")

# Red (Attacker) acts first
state, red_reward, done, info = ma_env.red_step()  # heuristic auto

# Blue (Defender) acts
action = SecOpsAction(action_type="query_logs", parameters={"service": "db"})
state, blue_reward, done, info = ma_env.blue_step(action)

Red Agent Strategy (_heuristic_red_action)

Adaptive 5-tier theory-of-mind strategy:

  1. Counter-investigate: If Blue queried service X in the last 3 steps → plant a false alert on service Y
  2. Amplify: If cyber_attack and attack_progress < 0.85 and Blue hasn't isolated → amplify
  3. Spread: Spread to services Blue hasn't investigated yet via the topology graph
  4. Corrupt: Spike metrics on healthy services Blue has already looked at (plant doubt)
  5. Inject noise: Default - add misleading log entries
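
One way to read the five tiers is as a priority cascade in which the first matching condition wins. The sketch below assumes that ordering; `choose_red_tier` and its parameters are hypothetical and do not reflect the signature of _heuristic_red_action.

```python
def choose_red_tier(
    blue_recent_queries,      # services Blue queried in the last 3 steps
    root_cause_category,      # e.g. "cyber_attack", "infra_failure"
    attack_progress,          # float in [0, 1]
    blue_has_isolated,        # True once Blue isolated a service
    uninvestigated_neighbors, # topology neighbors Blue has not examined
    healthy_investigated,     # healthy services Blue already examined
):
    """Return the name of the first tier whose condition holds."""
    if blue_recent_queries:
        return "counter_investigate"
    if (root_cause_category == "cyber_attack"
            and attack_progress < 0.85
            and not blue_has_isolated):
        return "amplify"
    if uninvestigated_neighbors:
        return "spread"
    if healthy_investigated:
        return "corrupt"
    return "inject_noise"  # default tier


print(choose_red_tier([], "cyber_attack", 0.2, False, ["cache"], []))
```

With no recent Blue queries and an attack still below 0.85 progress, this sketch selects the amplify tier before considering spread.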

Class: CurriculumManager

_curriculum.record_score(task_id, score)  # Called after every episode
_curriculum.current_level                 # 1-5
_curriculum.episode_count                 # total episodes this session

Level-up logic: a rolling window holds the last 5 episode scores at the current level. If the window average is >= threshold → current_level += 1.
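
The level-up rule can be sketched with a bounded deque. This is not the CurriculumManager source: the class name `CurriculumSketch`, the 0.7 threshold, the requirement that the window be full before promoting, the window reset on level-up, and the cap at level 5 are all assumptions made for illustration.

```python
from collections import deque


class CurriculumSketch:
    def __init__(self, threshold=0.7, window=5, max_level=5):
        self.threshold = threshold      # hypothetical value
        self.max_level = max_level
        self.current_level = 1
        self.episode_count = 0
        self._scores = deque(maxlen=window)  # scores at the current level

    def record_score(self, task_id, score):
        """Record one episode score and promote if the window average qualifies."""
        # task_id is accepted to mirror the documented call; this sketch ignores it.
        self.episode_count += 1
        self._scores.append(score)
        window_full = len(self._scores) == self._scores.maxlen
        if window_full and self.current_level < self.max_level:
            if sum(self._scores) / len(self._scores) >= self.threshold:
                self.current_level += 1
                self._scores.clear()  # start a fresh window for the new level
        return self.current_level


cur = CurriculumSketch()
for _ in range(5):
    cur.record_score("easy_memory_leak", 0.8)
print(cur.current_level, cur.episode_count)
```

Clearing the window on promotion means the agent needs five fresh episodes at each level before it can advance again.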


SSE Streams: /demo/stream and /battle/stream

Both endpoints return text/event-stream with JSON events:

Agent stream events:

  • reset - initial state + config
  • step - action taken, reward, observation update, raw AI JSON
  • grade - final scores + curriculum level
  • error - exception message

Battle stream events:

  • battle_reset - initial state
  • red_step - attacker action + damage
  • blue_step - defender action + reward + AI output
  • battle_end - final scores + winner + curriculum level
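
A client can consume either stream with a small parser for the text/event-stream framing. The sketch below follows the standard SSE wire format (`event:` and `data:` fields, blank line between events). Whether this server puts the event name in an `event:` field or inside the JSON payload is not stated here, so `parse_sse` (a hypothetical helper) returns both; the sample lines and the "type" key are invented for illustration.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event_name, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = None, []


# Illustrative input only; real payload fields may differ.
sample = [
    "event: step",
    'data: {"reward": 0.2, "action": "query_logs"}',
    "",
    'data: {"type": "grade", "score": 0.85}',
    "",
]
events = list(parse_sse(sample))
print(events)
```

Events that are not terminated by a blank line are not emitted, which matches how SSE clients treat a stream that is cut off mid-event.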

Live AI Integration: _query_ai_model()

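import httpx
from typing import Optional

async def _query_ai_model(endpoint, obs_dict, step) -> Optional[SecOpsAction]:
    # Build text prompt from observation
    prompt = _obs_to_text(obs_dict, step)

    # POST to HF Inference Endpoint
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": 128, "temperature": 0.3, "return_full_text": False}
    }
    headers = {"Authorization": f"Bearer {_HF_API_TOKEN}"}
    # The request itself is abbreviated in this reference; the httpx call
    # (httpx is in requirements.txt) and the 30 s timeout are assumptions.
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(endpoint, json=payload, headers=headers)
        resp.raise_for_status()
    response_text = resp.text

    # Parse response (handles multiple output formats from the model)
    return _parse_ai_action(response_text)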

Auth: Set HF_TOKEN in .env file. Auto-loaded via python-dotenv at startup.

Debug: GET http://localhost:8000/debug/ai to test live endpoint.

Fallback: If the endpoint call fails, the server falls back to a deterministic heuristic playbook, so the dashboard never crashes.


Key Configuration

.env (gitignored)

HF_TOKEN=hf_xxxx
TRAINED_MODEL_ENDPOINT=https://xxx.endpoints.huggingface.cloud  # optional override

requirements.txt

fastapi>=0.111.0
uvicorn[standard]>=0.29.0
pydantic>=2.0.0
httpx>=0.27.0
python-dotenv>=1.0.0
openenv-core>=0.2.0

Running locally

cd incident-ai
.venv/bin/uvicorn opensecops_env.server.app:app --host 0.0.0.0 --port 8000 --reload
open http://localhost:8000/dashboard

Training Pipeline

GRPO Training (notebook: colab_training (2).ipynb)

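# Reward function: wraps the environment
def secops_reward_fn(prompts, completions, **kwargs):
    # Assumes the dataset has a "task_id" column (TRL forwards dataset columns
    # as kwargs) and that `env` and `parse_action` are defined earlier in the notebook.
    task_ids = kwargs["task_id"]
    rewards = []
    for completion, task_id in zip(completions, task_ids):
        action = parse_action(completion)
        env.reset(task_id)
        _, reward, _, _ = env.step(action)
        rewards.append(float(reward) - 0.02)  # step cost
    return rewards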

# Trainer config
GRPOConfig(
    num_generations=4,       # 4 candidate responses per observation
    max_completion_length=128,  # GRPOConfig's name for the generation-length cap
    temperature=0.9,         # High temp for exploration during training
    learning_rate=2e-5,
)

Model: Qwen2.5-7B-Instruct + Unsloth 4-bit + LoRA (r=16)
Merge: model.push_to_hub_merged(repo, tokenizer, save_method="merged_16bit")
Output: SapphireGaze429/opensecops-qwen2.5-7b-grpo


Tests: tests/test_opensecops.py

33 tests covering:

  • Environment reset/step API contract
  • All 4 task configs
  • All 9 action types
  • Reward bounds
  • Grader formula correctness
  • Partial diagnosis credit (category match)
pytest tests/ -v  # all 33 should pass