name: nexus-incident-investigation
version: "1.0.0"
tags: ["openenv", "multi-agent"]
description: >
  NEXUS: Multi-Agent Incident Investigation Environment.
  Multiple AI agents (up to 10) collaborate to investigate real-world system incidents.
  Agents can take different roles: Investigator, Validator, Forensic Analyst,
  Network Engineer, System Admin, Security Architect, and Compliance Officer.
  Together they identify root causes across software, business-process,
  and cascade-system failure scenarios.
tasks:
  - name: software-incident
    description: Single-service software bug causing user-facing errors
    difficulty: easy
    max_steps: 8
    grader: scenarios/graders/easy_grader.py
  - name: business-process-failure
    description: Multi-team process breakdown with misleading red herrings
    difficulty: medium
    max_steps: 8
    grader: scenarios/graders/medium_grader.py
  - name: cascade-system-failure
    description: Multi-system cascade failure with misleading logs
    difficulty: hard
    max_steps: 8
    grader: scenarios/graders/hard_grader.py
action_space:
  type: text
  description: "Free-form natural language message with optional TOOL: calls"
observation_space:
  type: structured
  fields:
    scenario_description: string
    scenario_context: string
    partner_message: string
    tool_results: list
    clues_found: list
    investigation_stage: string
    round: integer
    available_tools: list
reward_range: [0.0, 1.0]
reward_description: >
  Dynamically computed from semantic similarity of hypothesis to root-cause,
  tool quality, fix correctness, and investigation efficiency.
inference_script: inference.py
entry_point: backend/main.py
docker_port: 7860
baseline_scores:
  software-incident: 0.88
  business-process-failure: 0.72
  cascade-system-failure: 0.48