Info Tech Incident Post-Mortem Theater

Thanks for following along through our six-part AI Cost Ripples series. In this article, we’re switching gears to something different.

In my TPM career, I’ve attended countless post-mortem meetings and exercises. Each one has been a learning experience, almost like a living environment with constantly changing variables. Yet, within that dynamic setting, certain rules remain surprisingly rigid. I’ve outlined some of these, along with details and examples, below.

“Incident Post-Mortem Theater” is a creative framework used in the software industry to transform routine post-incident reviews into engaging, transparent learning sessions. It blends the structured discipline of a traditional IT post-mortem with the narrative, reflective, and human elements of a theatrical performance.

Here is how an Incident Post-Mortem Theater session can be designed:

A Play Based on a Particular Incident (The Reenactment)

Move beyond a dry timeline review. Use theatrical framing to humanize the incident. Responders become “actors,” narrating their experience as a story.

Cast of Characters:
Roles such as Incident Commander, On-Call Engineer, Customer Support, “System Itself,” or “Rogue Deploy.”
Screenplay Timeline:
Transform your technical timeline into a narrative script with timestamps, actions, observations, and beliefs captured at that moment.
Example:

Timestamp: 14:02 UTC
Character: On-Call Engineer
Action: Noticed a spike in error rates via Datadog dashboard.
Belief: Suspected a caching issue triggered by recent marketing traffic.

Stage Directions & Setting:
Describe your environment, including “props” such as monitoring tools, runbooks, and Slack channels, plus any contributing conditions, for example: “On-call engineer on bridge while fielding an unrelated page.”
Plot Points:

Lead-Up: Circumstances prior to the incident.
Climax: Maximum impact or discovery moment.
Turning Point: Action that drove mitigation.
Final Resolution: Service restored.
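
For teams that keep their timeline in a script or notebook, the screenplay entries and plot points above could be captured in a small data structure. What follows is only a rough sketch: the `ScriptEntry` class, its field names, and the `render_script` helper are hypothetical names chosen for illustration, the calendar date is a placeholder, and tagging the 14:02 entry as the "Climax" is an assumption rather than something the example states.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ScriptEntry:
    """One line of the incident 'screenplay' (field names are illustrative)."""
    timestamp: datetime
    character: str
    action: str
    belief: str
    # Optional plot-point tag: "Lead-Up", "Climax", "Turning Point", "Final Resolution"
    plot_point: Optional[str] = None


def render_script(entries):
    """Return the entries as a chronological, human-readable script."""
    lines = []
    for entry in sorted(entries, key=lambda e: e.timestamp):
        header = f"[{entry.timestamp:%H:%M} UTC] {entry.character}"
        if entry.plot_point:
            header += f" ({entry.plot_point})"
        lines.append(header)
        lines.append(f"  Action: {entry.action}")
        lines.append(f"  Belief: {entry.belief}")
    return "\n".join(lines)


entries = [
    ScriptEntry(
        # Only the time (14:02 UTC) comes from the example; the date is a placeholder.
        timestamp=datetime(2024, 1, 1, 14, 2, tzinfo=timezone.utc),
        character="On-Call Engineer",
        action="Noticed a spike in error rates via Datadog dashboard.",
        belief="Suspected a caching issue triggered by recent marketing traffic.",
        plot_point="Climax",  # assumed tag, for illustration only
    ),
]
print(render_script(entries))
```

Keeping the belief as its own field, separate from the action, preserves what each responder thought at the time, which is the material the blameless critique works from.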

A Blameless Review (The Critique)

Following this “performance,” your larger project team becomes the audience in a blameless critique focused on systems, not individuals.

Set the Stage for Safety:
A facilitator (your “Director”) reinforces psychological safety and no-blame culture.
Blameless Discussion Prompts:

What went well?
Where did we get lucky?
What could have gone better?
What kept us from detecting this sooner?
Why did our process allow this? (Use “5 Whys” or similar techniques.)

Actionable Outcomes (Your “Next Season” Plan)

Insights must translate into clear improvements.

Call Sheet for Action Items:
Create concise, prioritized, trackable tasks with owners and deadlines.
Example:

Action Item XYZ: Update runbook for API latency issues
Owner: @engineer_name
Due: Nov 30

Documentation & Sharing:
Publish a final post-mortem document/report, almost like a playbill, summarizing your story, timeline, root causes, and planned improvements. Circulate it widely to drive organizational learning.
Awards Ceremony:
Celebrate responders and highlight positive contributions for cultural reinforcement.
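
If the call sheet lives outside a ticketing tool, the "prioritized, trackable tasks with owners and deadlines" described above could be modeled along these lines. This is a minimal sketch under stated assumptions: the `ActionItem` class, the numeric priority scale, the `call_sheet` helper, and the overdue flag are all hypothetical, and the year in the due date is a placeholder because the example gives only "Nov 30".

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    """One entry on the call sheet (field names are illustrative)."""
    title: str
    owner: str
    due: date
    priority: int = 2  # 1 = highest; the scale itself is an assumption
    done: bool = False


def call_sheet(items, today):
    """Return open items as text lines, ordered by priority, then due date."""
    open_items = sorted(
        (item for item in items if not item.done),
        key=lambda item: (item.priority, item.due),
    )
    lines = []
    for item in open_items:
        flag = " (OVERDUE)" if item.due < today else ""
        lines.append(
            f"P{item.priority} | {item.title} | {item.owner} | due {item.due:%b %d}{flag}"
        )
    return lines


items = [
    ActionItem(
        title="Update runbook for API latency issues",
        owner="@engineer_name",
        due=date(2025, 11, 30),  # year is a placeholder
        priority=1,
    ),
]
for line in call_sheet(items, today=date(2025, 11, 15)):
    print(line)
```

The same lines can be pasted into the published post-mortem document, so the playbill and the tracker do not drift apart.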

Using this framework turns a necessary yet often tedious process into a more engaging, insightful, and effective learning experience.
