Thanks for following along through our six-part AI Cost Ripples series. In this article, we’re switching gears to something different.
In my TPM career, I’ve attended countless post-mortem meetings and exercises. Each one has been a learning experience, almost like a living environment with constantly changing variables. Yet, within that dynamic setting, certain rules remain surprisingly rigid. I’ve outlined some of these, along with details and examples, below.
“Incident Post-Mortem Theater” is a creative framework used in the software industry to transform routine post-incident reviews into engaging, transparent learning sessions. It blends the structured discipline of a traditional IT post-mortem with the narrative, reflective, and human elements of a theatrical performance.
Here is how an Incident Post-Mortem Theater session can be designed:
A Play Based on a Particular Incident (The Reenactment)
Move beyond a dry timeline review. Use theatrical framing to humanize the incident. Responders become “actors,” narrating their experience as a story.
- Cast of Characters:
Roles such as Incident Commander, On-Call Engineer, Customer Support, “System Itself,” or “Rogue Deploy.”
- Screenplay Timeline:
Transform your technical timeline into a narrative script with timestamps, actions, observations, and beliefs captured at that moment.
Example:
- Timestamp: 14:02 UTC
- Character: On-Call Engineer
- Action: Noticed a spike in error rates via Datadog dashboard.
- Belief: Suspected a caching issue triggered by recent marketing traffic.
- Stage Directions & Setting:
Describe your environment, including “props” such as monitoring tools, runbooks, and Slack channels, plus any contributing conditions, for example “On-call engineer on bridge while fielding an unrelated page.”
- Plot Points:
- Lead-Up: Circumstances prior to the incident.
- Climax: Maximum impact or discovery moment.
- Turning Point: Action that drove mitigation.
- Final Resolution: Service restored.
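For teams that keep their timeline in a tool rather than on a slide, the screenplay format above maps onto a small data structure. The following is a minimal Python sketch, not a prescribed implementation: `ScriptLine` and `render_script` are hypothetical names, the second entry is an invented placeholder included only to show ordering, and the sketch assumes the incident stays within a single UTC day.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScriptLine:
    """One beat of the incident screenplay (hypothetical structure)."""
    timestamp: time   # assumed to be UTC, within a single day
    character: str    # a role from the cast, not a person's name
    action: str       # what the character did or observed
    belief: str       # what they believed at that moment, not in hindsight


def render_script(lines):
    """Return the timeline as a narrative script, ordered by timestamp."""
    ordered = sorted(lines, key=lambda entry: entry.timestamp)
    return "\n".join(
        f"[{entry.timestamp:%H:%M} UTC] {entry.character}: "
        f"{entry.action} (Belief: {entry.belief})"
        for entry in ordered
    )


timeline = [
    # Hypothetical entry, added only to demonstrate sorting.
    ScriptLine(
        timestamp=time(14, 5),
        character="Incident Commander",
        action="Opened the incident bridge.",
        belief="Assumed the impact was limited to one region.",
    ),
    # The example entry from the article.
    ScriptLine(
        timestamp=time(14, 2),
        character="On-Call Engineer",
        action="Noticed a spike in error rates via Datadog dashboard.",
        belief="Suspected a caching issue triggered by recent marketing traffic.",
    ),
]

script = render_script(timeline)
print(script)
```

Keeping `belief` as its own field is the point of the exercise: it preserves what each responder thought at the time, which is what the reenactment is meant to surface.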
A Blameless Review (The Critique)
Following this “performance,” your larger project team becomes the audience in a blameless critique focused on systems, not individuals.
- Set the Stage for Safety:
A facilitator (your “Director”) reinforces psychological safety and a no-blame culture.
- Blameless Discussion Prompts:
- What went well?
- Where did we get lucky?
- What could have been more successful?
- How did we not detect this sooner?
- Why did our process allow this? (Use “5 Whys” or similar techniques.)
Actionable Outcomes (Your “Next Season” Plan)
Insights must translate into clear improvements.
- Call Sheet for Action Items:
Create concise, prioritized, trackable tasks with owners and deadlines.
Example:
- Action Item XYZ: Update runbook for API latency issues
- Owner: @engineer_name
- Due: Nov 30
- Documentation & Sharing:
Publish a final post-mortem document/report, almost like a playbill, summarizing your story, timeline, root causes, and planned improvements. Circulate it widely to drive organizational learning.
- Awards Ceremony:
Celebrate responders and highlight positive contributions for cultural reinforcement.
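The call sheet described above can also be kept as structured data so that owners and deadlines stay trackable. This is a rough Python sketch under stated assumptions: `ActionItem` and `call_sheet` are hypothetical names, the priority scale (1 = highest) is my own convention, and the year on the due date is a placeholder because the example only gives “Nov 30.”

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    """One row of the call sheet (hypothetical structure)."""
    title: str
    owner: str
    due: date
    priority: int = 2   # assumption: 1 = highest priority
    done: bool = False


def call_sheet(items, today):
    """List open items, highest priority first, then earliest due date.

    Items whose due date has already passed are prefixed with OVERDUE.
    """
    open_items = [item for item in items if not item.done]
    open_items.sort(key=lambda item: (item.priority, item.due))
    rows = []
    for item in open_items:
        flag = "OVERDUE " if item.due < today else ""
        rows.append(
            f"{flag}P{item.priority} {item.title} | {item.owner} | "
            f"due {item.due.isoformat()}"
        )
    return rows


items = [
    ActionItem(
        title="Update runbook for API latency issues",
        owner="@engineer_name",
        due=date(2025, 11, 30),  # placeholder year; the example gives only "Nov 30"
        priority=1,
    ),
]

sheet = call_sheet(items, today=date(2025, 11, 1))
print("\n".join(sheet))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the overdue check easy to verify during the review itself.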
Using this framework turns a necessary yet often tedious process into a more engaging, insightful, and effective learning experience.

