---
title: NEXUS-AI
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---
# NEXUS-AI 🛡️
### Autonomous Incident Investigation Dashboard





**Status:** Active Simulation Pipeline

**Architecture:** Real-time WebSockets + Multi-Agent Consensus

---
## What is NEXUS-AI?
NEXUS is an autonomous dual-agent environment for investigating and validating software incidents in real time. An **Investigator** agent and a **Validator** agent form hypotheses, execute system tools, evaluate system behavior, and must reach consensus on the root cause.

Manual debugging demands constant context switching and leads to tool fatigue. NEXUS addresses this through:
1. **Dual-Agent Autonomy**: Two specialized models that exchange messages streamed word by word over WebSockets.
2. **Dynamic Tool Execution**: Integrated system terminals that let agents run sandboxed validation scripts.
3. **Semantic Reward Engine**: Scores conversational drift numerically using embeddings computed through Ollama on the GPU.

The result: an AI "Incident Response Team" that navigates servers, traces logs, and fixes bugs much as a human SRE would.

---
## 🖼️ Application Screenshots
### Simulation Dashboard
> The core command center. Features live agent terminals, a dual-communication consensus log, and a mathematical performance reward graph plotting investigation confidence.
---
## Scenario Registry & Core Settings
> The system is built to be reconfigured on the fly: switch LLM providers and inject custom threat models entirely from the frontend.

| Scenario Registry | Runtime Configuration |
|---|---|
| A persistent, LocalStorage-backed grid of simulations. Users can inject custom, infrastructure-specific incidents directly into the agent pipeline. | Lists the locally installed Ollama models so the user can pair models (e.g., Qwen vs. Dolphin-Phi), each with fully independent parameters. |

---
## System Architecture
```text
┌─────────────────────────────────────────────────────────────────┐
│                         CLIENT BROWSER                          │
│              React SPA (Tailwind + Framer Motion)               │
│                         localhost:5173                          │
└───────────┬─────────────────────────────────┬───────────────────┘
            │  HTTP (REST)                    │  ws://
            ▼                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                FASTAPI BACKEND (localhost:7860)                 │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐    │
│  │ /config  │ │/scenarios│ │  /reset  │ │ ws:// Simulator  │    │
│  │ Env Sync │ │ DB Cache │ │ Injection│ │ Live Stream Sync │    │
│  └──────────┘ └──────────┘ └──────────┘ └──────────────────┘    │
└───────────┬───────────────────────────────────┬─────────────────┘
            │                                   │
            ▼                                   ▼
┌─────────────────────────────────────────────────────────────────┐
│                  OLLAMA ENGINE / LLM PIPELINE                   │
│   Agent A (Investigator)  ◄─────►  Agent B (Validator)          │
│   - Generates Hypotheses           - Challenges Assertions      │
│   - Runs System Tools              - Requires Proof             │
└─────────────────────────────────────────────────────────────────┘
```
---
## Execution Environments
NEXUS-AI supports two distinct execution models for agent tools, toggleable via the **Settings** dashboard:
### 1. Simulated Mode (Safe Sandbox)
* **Default Mode**: Agents interact with a pre-defined `clue_map` within the scenario YAML.
* **No System Impact**: Commands like `read_logs` or `check_service` return mocked data.
* **Use Case**: Training, logic validation, and "what-if" analysis without infrastructure risk.
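
As a rough illustration of simulated mode, the sketch below resolves tool calls against a `clue_map` held in memory. The map layout (tool name, then target, then mocked output), the sample strings, and the `run_simulated_tool` helper are assumptions made for this example; the project's actual scenario YAML schema may differ.

```python
from typing import Dict

# Hypothetical clue_map layout: tool name -> target -> mocked output.
CLUE_MAP: Dict[str, Dict[str, str]] = {
    "read_logs": {"app.log": "ERROR 503: rate limit exceeded on /api/orders"},
    "check_service": {"nginx-proxy": "active (running)"},
}


def run_simulated_tool(tool: str, target: str, clue_map: Dict[str, Dict[str, str]]) -> str:
    """Return mocked output for a tool call without touching any real system."""
    outputs = clue_map.get(tool)
    if outputs is None:
        return f"unknown tool: {tool}"
    return outputs.get(target, f"no data for {target}")
```

Because every answer comes from the map, the same scenario replays identically on each run.
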
### 2. SSH Lab Node (Real-World Execution)
* **Live Connection**: Commands are executed in real-time on a remote Linux server via SSH.
* **Autonomous Terminal**: Agents use the `run_terminal_command` tool to browse logs, check systemd status, and inspect real configs.
* **Security**: Includes a command blocklist to prevent highly destructive operations (e.g., `rm -rf /`).
* **Use Case**: Actual incident response on isolated Lab/Staging nodes.
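
A command blocklist of the kind mentioned above might look like the following sketch. The patterns and the `is_command_allowed` helper are hypothetical and are not taken from the project; a pattern list like this catches only the obvious cases, which is why Lab Node mode is meant for isolated hosts.

```python
import re
from typing import List

# Hypothetical patterns for illustration; the project's real blocklist is not shown here.
BLOCKED_PATTERNS: List[str] = [
    r"\brm\s+-\w*r\w*\s+/\s*$",      # recursive delete of the filesystem root
    r"\bmkfs(\.\w+)?\b",             # formatting a filesystem
    r"\bdd\s+.*\bof=/dev/",          # raw writes to a block device
    r"\b(shutdown|reboot|halt)\b",   # taking the node offline
]


def is_command_allowed(command: str) -> bool:
    """Return False when the command matches any blocked pattern."""
    stripped = command.strip()
    return not any(re.search(pattern, stripped) for pattern in BLOCKED_PATTERNS)
```
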
---
## OpenEnv Specification
NEXUS-AI strictly adheres to the **OpenEnv 1.0** standard for agent-environment interaction.
### Action Space
The environment accepts a typed **NexusAction** (Text-based with structured tool calls).
- **agent_id**: `string` ("agent_a" or "agent_b")
- **message**: `string` (The natural language reasoning/communication)
- **tool_calls**: `List[ToolCall]` (Optional structured calls like `TOOL: read_logs(file='app.log')`)
- **confidence**: `float` (0.0 - 1.0)
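
The fields above could be modeled roughly as follows. This is a sketch under assumptions: the `ToolCall` fields (`name`, `arguments`), the validation rules, and the `parse_tool_calls` helper are illustrative and not confirmed by the project; only the four `NexusAction` field names and the `TOOL: name(key='value')` text form come from the list above.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolCall:
    # Hypothetical shape: the source names the type but not its fields.
    name: str
    arguments: Dict[str, str] = field(default_factory=dict)


@dataclass
class NexusAction:
    agent_id: str
    message: str
    tool_calls: List[ToolCall] = field(default_factory=list)
    confidence: float = 0.0

    def __post_init__(self) -> None:
        if self.agent_id not in ("agent_a", "agent_b"):
            raise ValueError(f"unknown agent_id: {self.agent_id}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


TOOL_PATTERN = re.compile(r"TOOL:\s*(\w+)\((.*?)\)")
ARG_PATTERN = re.compile(r"(\w+)\s*=\s*'([^']*)'")


def parse_tool_calls(text: str) -> List[ToolCall]:
    """Extract calls written as TOOL: name(key='value') from agent text."""
    return [
        ToolCall(name, dict(ARG_PATTERN.findall(raw_args)))
        for name, raw_args in TOOL_PATTERN.findall(text)
    ]
```
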
### Observation Space
The environment returns a structured **NexusObservation** summarizing the system state.
- **scenario_description**: `string` (High-level objective)
- **scenario_context**: `string` (Background telemetry/environment info)
- **partner_message**: `string` (The last message from the other agent)
- **tool_results**: `List[ToolResult]` (Output of any executed system tools)
- **clues_found**: `List[string]` (Accumulated evidence identified by the Reward Engine)
- **investigation_stage**: `string` (`investigating`, `narrowing`, `found`, `verified`)
- **round**: `integer` (Current episode round)
- **available_tools**: `List[string]` (List of permitted tools for the current mode)
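
For illustration, the observation could be carried as a dataclass and serialized to JSON before being sent to an agent. The `ToolResult` fields, the default values, and the `to_json` helper are assumptions for this sketch; the field names and the four stage values are taken from the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

STAGES = ("investigating", "narrowing", "found", "verified")


@dataclass
class ToolResult:
    # Hypothetical fields; the source only names the type.
    tool: str
    output: str


@dataclass
class NexusObservation:
    scenario_description: str
    scenario_context: str
    partner_message: str = ""
    tool_results: List[ToolResult] = field(default_factory=list)
    clues_found: List[str] = field(default_factory=list)
    investigation_stage: str = "investigating"
    round: int = 0
    available_tools: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.investigation_stage not in STAGES:
            raise ValueError(f"invalid stage: {self.investigation_stage}")

    def to_json(self) -> str:
        """Serialize the observation, including nested tool results."""
        return json.dumps(asdict(self))
```
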
### Task Registry & Difficulty
| Task Name | Difficulty | Objective | Grader Method |
|---|---|---|---|
| `software-incident` | **Easy** | Fix Nginx 503 rate-limit misconfiguration | State Check: `nginx-proxy.rate_limit` |
| `business-process-failure` | **Medium** | Resolve inventory stockout logic error | State Check: `stock_threshold` + Red Herring Penalty |
| `cascade-system-failure` | **Hard** | Fix Postgres connection exhaustion | Multi-Step: Query Termination + Config Update |
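
A state-check grader with a red-herring penalty, as named in the table, might be sketched like this. The function name, the penalty size, and the expected value used in the usage line are hypothetical; the project's real graders may score differently.

```python
from typing import Any, Dict


def grade_state_check(
    state: Dict[str, Any],
    key: str,
    expected: Any,
    red_herrings_hit: int = 0,
    penalty: float = 0.1,  # assumed cost per red herring pursued
) -> float:
    """Score 1.0 when the state holds the expected value, minus penalties, clamped to [0, 1]."""
    base = 1.0 if state.get(key) == expected else 0.0
    return max(0.0, min(1.0, base - penalty * red_herrings_hit))
```

For example, `grade_state_check({"stock_threshold": 20}, "stock_threshold", 20, red_herrings_hit=2)` subtracts two penalties from a correct fix.
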
### Baseline Benchmarks
Validated using `inference.py` (Phi-3-mini & Qwen2.5-1.5B).
- **Software Incident**: 0.88 / 1.00
- **Business Process Failure**: 0.72 / 1.00
- **Cascade System Failure**: 0.48 / 1.00
---
## The AI Pipeline Deep-Dive
### Step 1: Scenario Injection & Bootstrapping
```python
# The EpisodeManager receives the custom scenario JSON from the frontend,
# then broadcasts 'episode_start' over the WebSocket to synchronize the UI.
await broadcast("episode_start", {
    "scenario": active_scenario,
    "agent_a_model": settings.AGENT_A_MODEL,
})
```
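
The `broadcast` helper itself is not shown in this README. One minimal way to write it, assuming each connected client exposes an async `send_text` method (as Starlette/FastAPI WebSocket objects do) and assuming a JSON envelope of `{"event": ..., "data": ...}`, is sketched below; `ConnectionHub` is a hypothetical name.

```python
import asyncio
import json
from typing import Any, Dict, List, Protocol


class Connection(Protocol):
    async def send_text(self, data: str) -> None: ...


class ConnectionHub:
    """Tracks connected clients and fans each event out to all of them."""

    def __init__(self) -> None:
        self._connections: List[Connection] = []

    def register(self, connection: Connection) -> None:
        self._connections.append(connection)

    async def broadcast(self, event: str, payload: Dict[str, Any]) -> None:
        # Assumed envelope shape; the real wire format may differ.
        message = json.dumps({"event": event, "data": payload})
        for connection in list(self._connections):
            try:
                await connection.send_text(message)
            except Exception:
                # Drop clients whose socket can no longer be written to.
                self._connections.remove(connection)
```
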
### Step 2: Agent Consensus Loop
```python
# Agents take turns: the Investigator proposes a solution and the
# Validator challenges it. Both agents can call system tools.
client, model_name = model_manager.get_client(agent_id)
stream = await client.chat.completions.create(
    model=model_name,
    messages=injected_history,
    tools=available_tools,  # e.g. fix_proposer, run_terminal_command
    stream=True,
)
```
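
The turn-taking around that call can be sketched without any model in the loop. Here each agent is a plain function returning a message and a confidence, and the stopping rule (both confidences at or above a threshold in the same round) is an assumption for illustration, not the project's documented consensus criterion.

```python
from typing import Callable, List, Optional, Tuple

# A stand-in agent: takes the transcript so far, returns (message, confidence).
Agent = Callable[[List[str]], Tuple[str, float]]


def run_consensus(
    investigator: Agent,
    validator: Agent,
    max_rounds: int = 5,
    threshold: float = 0.8,  # assumed consensus bar
) -> Tuple[Optional[int], List[str]]:
    """Alternate turns until both agents are confident or rounds run out.

    Returns the round in which consensus was reached (None if it never was)
    together with the full transcript.
    """
    history: List[str] = []
    for round_no in range(1, max_rounds + 1):
        message_a, confidence_a = investigator(history)
        history.append(f"agent_a: {message_a}")
        message_b, confidence_b = validator(history)
        history.append(f"agent_b: {message_b}")
        if confidence_a >= threshold and confidence_b >= threshold:
            return round_no, history
    return None, history
```
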
### Step 3: Fast GPU Embeddings (Similarity Evaluation)
```python
from functools import lru_cache
from typing import List

import httpx

# Embedding computation is offloaded to the local Ollama server (GPU pipeline)
# instead of running in-process. The HTTP call itself is synchronous, and
# repeated texts are served from the LRU cache.
@lru_cache(maxsize=256)
def get_embedding(text: str) -> List[float]:
    response = httpx.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "all-minilm", "prompt": text},
        timeout=60.0,
    )
    return response.json().get("embedding", [])
```
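
Once two embeddings are available, a similarity score can be derived from them. The README does not state which metric the reward engine uses; cosine similarity is shown here as one common choice, and the function below is a sketch rather than the project's implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```
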
---
## 🛠️ Full Technology Stack
| Layer | Technology | Why |
|---|---|---|
| Frontend Framework | React 18 (Vite) | Lightning fast HMR, component isolation |
| Frontend Styling | Tailwind CSS | Utility-first tactical glassmorphism |
| Backend Framework | FastAPI | Async Python, explicit endpoint mapping |
| Transport Layer | WebSockets | Word-by-word streaming across UI boundaries |
| Local AI Engine | Ollama | Native device acceleration, absolute privacy |
| Remote Provider | HuggingFace Inference API | Drop-in SaaS alternatives |
| SSH Connectivity | Paramiko | Secure remote shell execution for Lab Nodes |
| Data Persistence | LocalStorage & `.env` Injection | Avoids over-architected SQL constraints |
---
## How to Run This Project (Full Step-by-Step Guide)
### Prerequisites
- Python 3.10+
- Node.js 18+
- [Ollama](https://ollama.com/) (installed locally for model hosting)
- **Optional**: A remote Linux VM (Ubuntu/Kali) with SSH enabled for Lab Node mode
---
### 1️⃣ Backend Setup (FastAPI / Python)
```bash
cd backend
# Create and activate virtual environment
python -m venv venv
# source venv/bin/activate # Linux/macOS
venv\Scripts\activate # Windows
# Install all dependencies
pip install -r requirements.txt
```
#### Start the Backend Engine
```bash
# This exposes the core REST API and the WebSocket simulation tunnel
python main.py
```
---
### 2️⃣ Frontend Setup (React)
Open a **new terminal tab**:
```bash
cd frontend
# Install Node.js dependencies
npm install
# Start the Vite development server
npm run dev
```
The application is now available at [http://localhost:5173](http://localhost:5173).

---
### 3️⃣ Pulling Models
To run the simulation locally without cloud API keys, pull suitable reasoning models through Ollama:
```bash
ollama pull qwen2.5:3b       # Small model with solid validator logic
ollama pull dolphin-llama3   # Uncensored model for investigative assertions
ollama pull all-minilm # Mandatory for semantic similarity scoring
```
---
## 🧪 Automated Testing
NEXUS-AI includes a comprehensive test suite to ensure environment stability and specification compliance.
```bash
# Run the OpenEnv specification validator
python openenv_validator.py
# Run unit tests for core logic
pip install pytest
pytest tests/
```
---
## Authors
**Developed by: Ashish Menon** & Vector