AegisLM Incident Command Protocol
Overview
This document defines the incident command protocol for AegisLM operations, establishing clear roles and procedures during incident response.
Incident Command Structure
Command Roles
| Role | Responsibility | Authority |
|---|---|---|
| Incident Commander (IC) | Overall response coordination | Full incident authority |
| Operations Lead | Technical response | Deploy fixes |
| Communications Lead | Stakeholder updates | Public communications |
| Liaison | External coordination | Partner communications |
| Safety Officer | Safety of response team | Stop unsafe actions |
Incident Phases
1. Detection & Assessment
┌────────────────────────────────────────────────┐
│ DETECTION & ASSESSMENT                         │
├────────────────────────────────────────────────┤
│                                                │
│ 1. ALERT RECEIVED                              │
│    ├── Automated alert (monitoring)            │
│    ├── Manual report (user/staff)              │
│    └── Security alert (SIEM)                   │
│                                                │
│ 2. INITIAL ASSESSMENT                          │
│    ├── Confirm incident validity               │
│    ├── Determine scope and severity            │
│    └── Identify affected systems               │
│                                                │
│ 3. INCIDENT DECLARATION                        │
│    ├── Declare incident (if confirmed)         │
│    ├── Activate incident response              │
│    └── Notify incident commander               │
│                                                │
└────────────────────────────────────────────────┘
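The flow from alert to declaration can be sketched in code. This is a minimal illustration only, not part of AegisLM tooling: `AlertSource`, `Assessment`, and `declare_incident` are hypothetical names, and the sketch assumes an alert that fails validation produces no further actions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AlertSource(Enum):
    """Step 1: where the alert came from (hypothetical enum)."""
    MONITORING = "automated alert (monitoring)"
    MANUAL = "manual report (user/staff)"
    SIEM = "security alert (SIEM)"


@dataclass
class Assessment:
    """Step 2: result of the initial assessment (hypothetical record)."""
    confirmed: bool               # incident validity
    severity: str                 # e.g. "SEV2"
    affected_systems: List[str]   # systems identified as affected


def declare_incident(source: AlertSource, assessment: Assessment) -> List[str]:
    """Step 3: return the declaration actions, or [] if the alert is not confirmed."""
    if not assessment.confirmed:
        return []
    systems = ", ".join(assessment.affected_systems) or "unknown"
    return [
        f"Declare {assessment.severity} incident from {source.value} (systems: {systems})",
        "Activate incident response",
        "Notify incident commander",
    ]
```

For example, a confirmed SIEM alert yields the three declaration actions in order, while an unconfirmed manual report yields an empty list and no incident is declared.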
2. Response & Containment
┌────────────────────────────────────────────────┐
│ RESPONSE & CONTAINMENT                         │
├────────────────────────────────────────────────┤
│                                                │
│ 1. CONTAINMENT                                 │
│    ├── Isolate affected systems                │
│    ├── Block malicious activity                │
│    └── Preserve evidence                       │
│                                                │
│ 2. ERADICATION                                 │
│    ├── Remove threat                           │
│    ├── Patch vulnerabilities                   │
│    └── Reset compromised credentials           │
│                                                │
│ 3. RECOVERY                                    │
│    ├── Restore services                        │
│    ├── Verify system integrity                 │
│    └── Resume operations                       │
│                                                │
└────────────────────────────────────────────────┘
3. Post-Incident
┌────────────────────────────────────────────────┐
│ POST-INCIDENT                                  │
├────────────────────────────────────────────────┤
│                                                │
│ 1. LESSONS LEARNED                             │
│    ├── What happened                           │
│    ├── How we responded                        │
│    └── What we can improve                     │
│                                                │
│ 2. DOCUMENTATION                               │
│    ├── Timeline of events                      │
│    ├── Actions taken                           │
│    └── Evidence collected                      │
│                                                │
│ 3. PROCESS IMPROVEMENT                         │
│    ├── Update runbooks                         │
│    ├── Enhance detection                       │
│    └── Improve response                        │
│                                                │
└────────────────────────────────────────────────┘
Severity Levels
| Severity | Definition | Examples | Response Time |
|---|---|---|---|
| SEV1 - Critical | Complete service loss, data breach | Full outage, exfiltration | 15 min |
| SEV2 - High | Major feature broken | API down, certification failure | 1 hour |
| SEV3 - Medium | Feature degraded | Slow response, partial outage | 4 hours |
| SEV4 - Low | Minor issue | UI bug, documentation error | 24 hours |
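The response times above translate directly into deadlines. The following is a minimal sketch, assuming the clock starts at detection (the table does not say when it starts); `RESPONSE_TIME`, `response_deadline`, and `is_overdue` are illustrative names, not existing AegisLM functions.

```python
from datetime import datetime, timedelta

# Response times copied from the severity table.
RESPONSE_TIME = {
    "SEV1": timedelta(minutes=15),
    "SEV2": timedelta(hours=1),
    "SEV3": timedelta(hours=4),
    "SEV4": timedelta(hours=24),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Latest time by which a response is due (assumes the clock starts at detection)."""
    return detected_at + RESPONSE_TIME[severity]


def is_overdue(severity: str, detected_at: datetime, now: datetime) -> bool:
    """True once the response window for this severity has elapsed."""
    return now > response_deadline(severity, detected_at)
```

With an incident detected at 09:00, a SEV1 response is due by 09:15 and a SEV4 response by 09:00 the next day. An unknown severity label raises `KeyError`, which seems preferable to silently applying a default window.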
Communication Protocol
Internal Communication
| Stage | Channel | Audience | Timing |
|---|---|---|---|
| Detection | PagerDuty | On-call | Immediate |
| Declaration | Slack #incidents | Response team | 15 min |
| Updates | Slack #incidents | All hands | Hourly |
| Resolution | Slack #incidents | All hands | On resolution |
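The hourly update cadence can be tracked with a small helper. This sketch assumes the hourly slots are counted from the time of declaration, which the table does not specify; `UPDATE_INTERVAL` and `next_update_due` are hypothetical names.

```python
from datetime import datetime, timedelta

UPDATE_INTERVAL = timedelta(hours=1)  # "Hourly" from the table


def next_update_due(declared_at: datetime, now: datetime) -> datetime:
    """Return the first hourly update slot strictly after `now`.

    Slots are assumed to fall at declared_at + 1h, + 2h, and so on.
    """
    if now < declared_at:
        return declared_at + UPDATE_INTERVAL
    # Whole intervals already elapsed since declaration (timedelta // timedelta -> int).
    elapsed_slots = (now - declared_at) // UPDATE_INTERVAL
    return declared_at + (elapsed_slots + 1) * UPDATE_INTERVAL
```

If an incident is declared at 09:15 and the current time is 11:40, the next update to #incidents falls due at 12:15.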
External Communication
| Stage | Channel | Audience | Approval |
|---|---|---|---|
| Initial | Status page | Public | IC only |
| Updates | Status page | Public | IC + Comms |
| Post-Incident | Blog/Report | Public | Advisory Board |
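The approval column can be enforced before anything is published. The sketch below is one possible reading, not a confirmed policy: it treats "IC only" as requiring the IC's sign-off alone and "IC + Comms" as requiring both. The stage keys, approver labels, and function names are assumptions made for illustration.

```python
from typing import Set

# Required sign-offs per external communication stage (interpretation of the table).
REQUIRED_APPROVERS = {
    "initial": {"IC"},
    "updates": {"IC", "Comms"},
    "post-incident": {"Advisory Board"},
}


def missing_approvals(stage: str, approvals: Set[str]) -> Set[str]:
    """Return the approvers who still need to sign off for this stage."""
    return REQUIRED_APPROVERS[stage] - approvals


def can_publish(stage: str, approvals: Set[str]) -> bool:
    """True only when every required approver has signed off."""
    return not missing_approvals(stage, approvals)
```

Under this reading, an update approved only by the IC is blocked until the Communications Lead also signs off, and a post-incident report cannot be published on IC and Comms approval alone.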
Runbook Integration
Common Incident Runbooks
| Incident Type | Runbook Location | Status |
|---|---|---|
| API Outage | runbooks/api-outage.md | Complete |
| Database Failure | runbooks/db-failure.md | Complete |
| Security Breach | runbooks/security-breach.md | Complete |
| Certification Error | runbooks/cert-error.md | Complete |
| Data Loss | runbooks/data-loss.md | Complete |
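Responders can resolve a runbook from the incident type with a simple lookup. This is a sketch under the assumption that incident types are matched case-insensitively; the `RUNBOOKS` mapping mirrors the table, while `find_runbook` is a hypothetical helper.

```python
from typing import Optional

# Runbook paths copied from the table, keyed by lowercase incident type.
RUNBOOKS = {
    "api outage": "runbooks/api-outage.md",
    "database failure": "runbooks/db-failure.md",
    "security breach": "runbooks/security-breach.md",
    "certification error": "runbooks/cert-error.md",
    "data loss": "runbooks/data-loss.md",
}


def find_runbook(incident_type: str) -> Optional[str]:
    """Return the runbook path for an incident type, or None if no runbook is listed."""
    return RUNBOOKS.get(incident_type.strip().lower())
```

Returning `None` for an unlisted type makes the gap explicit, so the Incident Commander can decide how to proceed when no runbook exists.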
Version Information
| Item | Version | Date |
|---|---|---|
| Incident Command Protocol | 1.0 | January 15, 2025 |
This protocol is maintained by the Operations team and reviewed quarterly.