
Post-Mortem: [INCIDENT TITLE]

Metadata

Incident ID: INC-XXXX
Severity: P1/P2/P3
Date: YYYY-MM-DD
Duration: X hours Y minutes
Start Time: HH:MM UTC
End Time: HH:MM UTC
Authors: @engineer1, @engineer2
Status: Draft/Final
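
The Duration field can be derived from Start Time and End Time instead of being typed by hand. The following is a minimal sketch, assuming both times are on the same date or cross midnight at most once; the function name `incident_duration` and the sample values are hypothetical and not part of the template.

```python
from datetime import datetime, timedelta


def incident_duration(date: str, start: str, end: str) -> str:
    """Return 'X hours Y minutes' for HH:MM UTC start and end times."""
    start_dt = datetime.strptime(f"{date} {start}", "%Y-%m-%d %H:%M")
    end_dt = datetime.strptime(f"{date} {end}", "%Y-%m-%d %H:%M")
    # Assumption: an end time earlier than the start time means the
    # incident crossed midnight exactly once.
    if end_dt < start_dt:
        end_dt += timedelta(days=1)
    total_minutes = int((end_dt - start_dt).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours {minutes} minutes"


# Illustrative values only.
print(incident_duration("2024-01-15", "09:12", "11:47"))  # 2 hours 35 minutes
```

Incidents that last longer than 24 hours would need full start and end dates rather than a single Date field.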

Executive Summary

[1-2 sentences: what happened, customer impact, duration]

Impact

Customers affected: X / Y (Z%)
Requests failed: X
Revenue impact: $X
Error budget consumed: X% of 30d budget
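
The percentages in this section are simple ratios, but it helps to compute them the same way in every post-mortem. The sketch below assumes a request-based SLO, where the 30-day error budget is the number of requests allowed to fail: (1 - SLO target) × total requests in the window. The template does not say which SLO model is in use, so the function names, the 99.9% target, and the sample numbers are all hypothetical.

```python
def percent_customers_affected(affected: int, total: int) -> float:
    """Z in 'Customers affected: X / Y (Z%)'."""
    if total <= 0:
        raise ValueError("total customers must be positive")
    return 100.0 * affected / total


def error_budget_consumed(
    failed_requests: int, total_requests_30d: int, slo_target: float
) -> float:
    """Percent of the 30-day error budget used by this incident.

    Assumes a request-based SLO, e.g. slo_target=0.999 for 99.9%.
    """
    allowed_failures = (1.0 - slo_target) * total_requests_30d
    if allowed_failures <= 0:
        raise ValueError("SLO target leaves no error budget")
    return 100.0 * failed_requests / allowed_failures


# Illustrative values only.
print(round(percent_customers_affected(150, 12_000), 2))            # 1.25
print(round(error_budget_consumed(12_000, 50_000_000, 0.999), 1))   # 24.0
```

For a time-based SLO, the same ratio applies with minutes of downtime in place of failed requests and total minutes in the window in place of total requests.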

Timeline (UTC)

| Time | Event | Action |
|------|-------|--------|
| HH:MM | Alert fired | On-call paged |
| HH:MM | Root cause identified | [What was found] |
| HH:MM | Mitigation applied | [What was done] |
| HH:MM | Service restored | [Confirmation] |
| HH:MM | All-clear | Incident closed |

Root Cause

[5 Whys analysis]

1. Why did the incident occur?
2. Why was that condition present?
3. Why was that not caught?
4. Why was there no automated prevention?
5. Why was this not in our risk model?

What Went Well

[Detection was fast / alert was clear / etc.]

What Went Wrong

[Response was slow / runbook was missing / etc.]

Action Items

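| # | Action | Owner | Priority | Due Date | Type |
|---|--------|-------|----------|----------|------|
| 1 | [Fix] | @eng | P1 | YYYY-MM-DD | Remediate |
| 2 | [Prevent] | @eng | P2 | YYYY-MM-DD | Automate |
| 3 | [Detect] | @eng | P2 | YYYY-MM-DD | Monitoring |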

Lessons Learned

[Key takeaway 1]
[Key takeaway 2]

Appendices

Grafana dashboard screenshots
Alert screenshots
Log excerpts