-
----
-
-## What is NEXUS-AI?
-
-NEXUS is an autonomous dual-agent environment that investigates and validates software incidents in real time. An **Investigator** agent and a **Validator** agent form hypotheses, execute system tools, evaluate system behavior, and reach strict consensus on root causes.
-
-Manual debugging demands constant context switching and leads to tool fatigue. NEXUS addresses this through:
-1. **Dual-Agent Autonomy**: Two specialized models communicating word-by-word via WebSockets.
-2. **Dynamic Tool Execution**: Fully integrated system terminals allowing agents to run sandboxed validation scripts.
-3. **Semantic Reward Engine**: Scores conversational drift numerically, using embeddings computed on the GPU.
-
-The result: an AI "Incident Response Team" that navigates servers, traces logs, and fixes bugs much as a human SRE would.
-
----
-
-## Application Screenshots
-
-### Simulation Dashboard
-
-> The core command center. It features live agent terminals, a dual-agent consensus log, and a reward graph that plots investigation confidence.
-
-
-
-
-
----
-
-## Scenario Registry & Core Settings
-
-> The system is architected for instant adaptability: switch LLM providers and inject custom threat models entirely through the frontend.
-
-
-
-
-
- Scenario Registry
- A persistent LocalStorage-backed grid of tactical simulations. Users can dynamically inject custom infrastructure-specific incidents directly into the agent pipeline.
-
-
-
- Runtime Configuration
- Lists the locally installed Ollama models so the user can pair them (e.g., Qwen vs Dolphin-Phi), each with fully independent parameters.
-
-
-
-
----
-
-## System Architecture
-
-```text
-┌───────────────────────────────────────────────────────────────┐
-│                        CLIENT BROWSER                         │
-│             React SPA (Tailwind + Framer Motion)              │
-│                        localhost:5173                         │
-└───────────┬───────────────────────────────────┬───────────────┘
-            │ HTTP (REST)                       │ ws://
-            ▼                                   ▼
-┌───────────────────────────────────────────────────────────────┐
-│               FASTAPI BACKEND (localhost:7860)                │
-│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐  │
-│  │ /config  │ │/scenarios│ │ /reset   │ │ ws:// Simulator  │  │
-│  │ Env Sync │ │ DB Cache │ │ Injection│ │ Live Stream Sync │  │
-│  └──────────┘ └──────────┘ └──────────┘ └──────────────────┘  │
-└───────────┬───────────────────────────────────┬───────────────┘
-            │                                   │
-            ▼                                   ▼
-┌───────────────────────────────────────────────────────────────┐
-│                 OLLAMA ENGINE / LLM PIPELINE                  │
-│  Agent A (Investigator)  ◄──────►  Agent B (Validator)        │
-│  - Generates Hypotheses            - Challenges Assertions    │
-│  - Runs System Tools               - Requires Proof           │
-└───────────────────────────────────────────────────────────────┘
-```
-
----
-
-## Execution Environments
-
-NEXUS-AI supports two distinct execution models for agent tools, toggleable via the **Settings** dashboard:
-
-### 1. Simulated Mode (Safe Sandbox)
-* **Default Mode**: Agents interact with a pre-defined `clue_map` within the scenario YAML.
-* **No System Impact**: Commands like `read_logs` or `check_service` return mocked data.
-* **Use Case**: Training, logic validation, and "what-if" analysis without infrastructure risk.
-
-### 2. SSH Lab Node (Real-World Execution)
-* **Live Connection**: Commands are executed in real-time on a remote Linux server via SSH.
-* **Autonomous Terminal**: Agents use the `run_terminal_command` tool to browse logs, check systemd status, and inspect real configs.
-* **Security**: Includes a command blocklist to prevent highly destructive operations (e.g., `rm -rf /`).
-* **Use Case**: Actual incident response on isolated Lab/Staging nodes.
-
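The command blocklist mentioned for SSH Lab Node mode could look roughly like the sketch below. The README does not show the real implementation, so `BLOCKED_BINARIES` and `is_blocked` are hypothetical names and the rules are illustrative assumptions, not the project's actual list.

```python
import shlex

# Assumed examples of binaries that should never run on a lab node.
BLOCKED_BINARIES = {"mkfs", "shutdown", "reboot", "dd"}


def is_blocked(command: str) -> bool:
    """Return True when the command should be rejected before it reaches SSH."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # Unbalanced quotes: reject rather than guess.
    if not tokens:
        return True  # Empty commands are rejected as well.

    # Compare on the binary name so "/sbin/reboot" matches "reboot".
    binary = tokens[0].rsplit("/", 1)[-1]
    if binary in BLOCKED_BINARIES:
        return True

    if binary == "rm":
        flags = "".join(t.lstrip("-") for t in tokens[1:] if t.startswith("-"))
        targets = [t for t in tokens[1:] if not t.startswith("-")]
        recursive = "r" in flags or "R" in flags
        # Block recursive deletes aimed at the filesystem root.
        if recursive and any(t in ("/", "/*") for t in targets):
            return True
    return False
```

A token-based check like this is easy to bypass (for example through `sudo`, pipes, or shell substitution), so it works best alongside a restricted remote account rather than as the only safeguard.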
----
-
-## OpenEnv Specification
-
-NEXUS-AI strictly adheres to the **OpenEnv 1.0** standard for agent-environment interaction.
-
-### Action Space
-The environment accepts a typed **NexusAction** (Text-based with structured tool calls).
-- **agent_id**: `string` ("agent_a" or "agent_b")
-- **message**: `string` (The natural language reasoning/communication)
-- **tool_calls**: `List[ToolCall]` (Optional structured calls like `TOOL: read_logs(file='app.log')`)
-- **confidence**: `float` (0.0 - 1.0)
-
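As an illustration of the action fields listed above, the sketch below models them as Python dataclasses. Only the field names and the confidence range come from the README. The `ToolCall` shape (`name`, `arguments`), the default values, and the validation rules are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToolCall:
    # Hypothetical shape: the README only shows calls such as read_logs(file='app.log').
    name: str
    arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class NexusAction:
    agent_id: str  # "agent_a" or "agent_b"
    message: str  # natural-language reasoning or communication
    tool_calls: List[ToolCall] = field(default_factory=list)  # optional
    confidence: float = 0.0  # range 0.0 to 1.0; the default is an assumption

    def __post_init__(self) -> None:
        if self.agent_id not in ("agent_a", "agent_b"):
            raise ValueError(f"unknown agent_id: {self.agent_id!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


action = NexusAction(
    agent_id="agent_a",
    message="Error rate spiked after the last nginx reload; checking logs.",
    tool_calls=[ToolCall("read_logs", {"file": "app.log"})],
    confidence=0.6,
)
```

Validating at construction time means a malformed action fails before it reaches the environment, which keeps grading logic free of defensive checks.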
-### Observation Space
-The environment returns a structured **NexusObservation** summarizing the system state.
-- **scenario_description**: `string` (High-level objective)
-- **scenario_context**: `string` (Background telemetry/environment info)
-- **partner_message**: `string` (The last message from the other agent)
-- **tool_results**: `List[ToolResult]` (Output of any executed system tools)
-- **clues_found**: `List[string]` (Accumulated evidence identified by the Reward Engine)
-- **investigation_stage**: `string` (`investigating`, `narrowing`, `found`, `verified`)
-- **round**: `integer` (Current episode round)
-- **available_tools**: `List[string]` (List of permitted tools for the current mode)
-
-### Task Registry & Difficulty
-| Task Name | Difficulty | Objective | Grader Method |
-|---|---|---|---|
-| `software-incident` | **Easy** | Fix Nginx 503 rate-limit misconfiguration | State Check: `nginx-proxy.rate_limit` |
-| `business-process-failure` | **Medium** | Resolve inventory stockout logic error | State Check: `stock_threshold` + Red Herring Penalty |
-| `cascade-system-failure` | **Hard** | Fix Postgres connection exhaustion | Multi-Step: Query Termination + Config Update |
-
-### Baseline Benchmarks
-Validated using `inference.py` (Phi-3-mini & Qwen2.5-1.5B).
-- **Software Incident**: 0.88 / 1.00
-- **Business Process Failure**: 0.72 / 1.00
-- **Cascade System Failure**: 0.48 / 1.00
-
----
-
-## The AI Pipeline Deep-Dive
-
-### Step 1: Scenario Injection & Bootstrapping
-```python
-# The EpisodeManager receives the frontend custom scenario JSON
-# Broadcasts 'episode_start' natively over the WebSocket to synchronize the UI
-await broadcast("episode_start", {
-    "scenario": active_scenario,
-    "agent_a_model": settings.AGENT_A_MODEL
-})
-```
-
-### Step 2: Agent Consensus Loop
-```python
-# Agents interact sequentially. The Investigator attempts a solution
-# while the Validator challenges it. Both agents have access to dynamic system execution.
-client, model_name = model_manager.get_client(agent_id)
-stream = await client.chat.completions.create(
-    model=model_name,
-    messages=injected_history,
-    tools=available_tools,  # e.g. fix_proposer, run_terminal_command
-    stream=True
-)
-```
-
-### Step 3: Fast GPU Embeddings (Similarity Evaluation)
-```python
-# Embeddings are computed by Ollama (GPU-accelerated when available)
-# instead of in-process, which keeps heavy CPU work out of the backend.
-from functools import lru_cache
-from typing import List
-
-import httpx
-
-@lru_cache(maxsize=256)
-def get_embedding(text: str) -> List[float]:
-    # Repeated texts are served from the cache instead of calling Ollama again.
-    response = httpx.post("http://localhost:11434/api/embeddings", json={
-        "model": "all-minilm",
-        "prompt": text
-    }, timeout=60.0)
-    return response.json().get("embedding", [])
-```
-
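Once two messages have been embedded, one common way to compare them is cosine similarity. The README does not specify the Reward Engine's formula, so the sketch below is an assumed approach, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An empty or all-zero embedding carries no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)
```

Under this assumption, a score near 1.0 means consecutive messages stay on topic, while a falling score suggests the conversation is drifting.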
----
-
-## Full Technology Stack
-
-| Layer | Technology | Why |
-|---|---|---|
-| Frontend Framework | React 18 (Vite) | Lightning fast HMR, component isolation |
-| Frontend Styling | Tailwind CSS | Utility-first tactical glassmorphism |
-| Backend Framework | FastAPI | Async Python, explicit endpoint mapping |
-| Transport Layer | WebSockets | Word-by-word streaming across UI boundaries |
-| Local AI Engine | Ollama | Native device acceleration, absolute privacy |
-| Remote Provider | HuggingFace Inference API | Drop-in SaaS alternatives |
-| SSH Connectivity | Paramiko | Secure remote shell execution for Lab Nodes |
-| Data Persistence | LocalStorage & `.env` Injection | Avoids over-architected SQL constraints |
-
----
-
-## How to Run This Project (Full Step-by-Step Guide)
-
-### Prerequisites
-- Python 3.10+
-- Node.js 18+
-- [Ollama](https://ollama.com/) (installed locally for model hosting)
-- **Optional**: A remote Linux VM (Ubuntu/Kali) with SSH enabled for Lab Node mode
-
----
-
-### 1️⃣ Backend Setup (FastAPI / Python)
-
-```bash
-cd backend
-
-# Create and activate virtual environment
-python -m venv venv
-# source venv/bin/activate # Linux/macOS
-venv\Scripts\activate # Windows
-
-# Install all dependencies
-pip install -r requirements.txt
-```
-
-#### Start the Backend Engine
-```bash
-# This exposes the core REST API and the WebSocket simulation tunnel
-python main.py
-```
-
----
-
-### 2️⃣ Frontend Setup (React)
-
-Open a **new terminal tab**:
-
-```bash
-cd frontend
-
-# Install Node.js dependencies
-npm install
-
-# Start the Vite development server
-npm run dev
-```
-
-The application is now fully accessible at [http://localhost:5173](http://localhost:5173).
-
----
-
-### 3️⃣ Pulling Models
-
-To run the simulation locally without cloud API keys, pull suitable reasoning models through Ollama:
-
-```bash
-ollama pull qwen2.5:3b      # Compact model, a good fit for the Validator
-ollama pull dolphin-llama3  # Uncensored model for the Investigator's assertions
-ollama pull all-minilm      # Mandatory for semantic similarity scoring
-```
-
----
-
-## Automated Testing
-NEXUS-AI includes a comprehensive test suite to ensure environment stability and specification compliance.
-
-```bash
-# Run the OpenEnv specification validator
-python openenv_validator.py
-
-# Run unit tests for core logic
-pip install pytest
-pytest tests/
-```
-
----
-
-## Authors
-**Developed by: Ashish Menon** & Vector
+---
+title: NEXON-AI
+emoji: 🛡️
+colorFrom: blue
+colorTo: indigo
+sdk: docker
+app_port: 7860
+pinned: false
+---
+
+# NEXUS-AI 🛡️
+### Autonomous Incident Investigation Dashboard
+
+
+
+---
+
+## What is NEXUS-AI?
+
+NEXUS is an autonomous dual-agent environment that investigates and validates software incidents in real time. An **Investigator** agent and a **Validator** agent form hypotheses, execute system tools, evaluate system behavior, and reach strict consensus on root causes.
+
+Manual debugging demands constant context switching and leads to tool fatigue. NEXUS addresses this through:
+1. **Dual-Agent Autonomy**: Two specialized models communicating word-by-word via WebSockets.
+2. **Dynamic Tool Execution**: Fully integrated system terminals allowing agents to run sandboxed validation scripts.
+3. **Semantic Reward Engine**: Scores conversational drift numerically, using embeddings computed on the GPU.
+
+The result: an AI "Incident Response Team" that navigates servers, traces logs, and fixes bugs much as a human SRE would.
+
+---
+
+## Application Screenshots
+
+### Simulation Dashboard
+
+> The core command center. It features live agent terminals, a dual-agent consensus log, and a reward graph that plots investigation confidence.
+
+
+
+
+
+---
+
+## Scenario Registry & Core Settings
+
+> The system is architected for instant adaptability: switch LLM providers and inject custom threat models entirely through the frontend.
+
+
+
+
+
+ Scenario Registry
+ A persistent LocalStorage-backed grid of tactical simulations. Users can dynamically inject custom infrastructure-specific incidents directly into the agent pipeline.
+
+
+
+ Runtime Configuration
+ Lists the locally installed Ollama models so the user can pair them (e.g., Qwen vs Dolphin-Phi), each with fully independent parameters.
+