openops / README.md
arya89's picture
Upload folder using huggingface_hub
d02897f verified
---
title: OpenOps Incident Commander
emoji: 🚨
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# OpenOps: AI Incident Commander Environment
Production incident management environment where AI agents learn to handle real-world outages.
## Overview
OpenOps simulates production incidents requiring an AI Incident Commander to:
- Investigate alerts and logs
- Identify root causes
- Execute mitigation actions (restart/rollback/scale)
- Communicate with teams and users
- Resolve incidents quickly to minimize revenue loss
## Environment Specification
### Observation Space
```python
{
"active_alerts": List[str],
"service_status": Dict[str, str],
"recent_logs": Dict[str, List[str]],
"metrics_summary": Dict[str, float],
"customer_complaints": int,
"time_elapsed": int,
"revenue_loss": float,
"teams_notified": bool,
"status_page_updated": bool,
"user_communication_sent": bool
}
```
### Action Space (21 actions)
- 0: read_alerts
- 1-4: inspect_logs_{service}
- 5-8: check_metrics_{service}
- 9-12: restart_{service}
- 13-14: rollback_{service}
- 15-16: scale_{service}
- 17-19: Communication actions
- 20: resolve_incident
### Three Tasks
**Task 1 (Easy): Simple API Crash**
- API service down due to OOM
- Solution: Inspect logs β†’ Restart API
**Task 2 (Medium): Bad Deployment**
- Database deployment broke queries
- Solution: Inspect logs β†’ Rollback deployment β†’ Notify team
**Task 3 (Hard): Cascading Failure**
- Database overload β†’ API timeouts β†’ Customer impact
- Solution: Inspect both services β†’ Scale database β†’ Restart API β†’ Communicate
## Installation
```bash
pip install -r server/requirements.txt
```
## Usage
### Run Locally
```bash
cd server
uvicorn app:app --reload
```
### Run Inference
```bash
export OPENAI_API_KEY="your-key"
python inference.py
```
## Grading
Each task scored 0.0-1.0 based on:
- Investigation quality
- Correct mitigation actions
- Communication
- Successful resolution
## Deployment
Deploy to HuggingFace Spaces:
```bash
openenv push
```
## πŸ“Š Sample Output
![local inference outputt](https://github.com/arya892004/OpenOps/blob/main/assets/output.png?raw=true)