If Dr House did DevOps
“Tests take time. Treatment’s quicker.”
In the TV series House, a diagnostician and his team tackle medical mysteries. Every episode a patient presents with serious symptoms of unknown cause. The race is on to find a cure.
At some point during the episode the team huddle around a whiteboard. Symptoms are written up. Hypotheses are shouted out.
When they have a set of possible diagnoses (the ones House hasn’t been able to shoot down), they start a course of treatment for the most serious one. The treatment will either help, rule out the diagnosis, or, as is most common in the show, uncover new information and new mysteries.
This process is called Differential Diagnosis (DDx).
Now switch gears: you’re a software engineer on your on-call rotation. What can you learn from House?

Modern software is complex. There are often multiple clients, multiple interacting services, and black-box third-party systems. When things go wrong, it’s often not clear why. This is where DDx can be useful:
- Rule out simple, common explanations.
- Gather all the data; create a list of symptoms.
- List possible causes for the collection of symptoms.
- Prioritize the list of causes, most urgent at the top.
- Treat possible causes, or rule out through additional data.
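To make the whiteboard concrete, here is a minimal Python sketch of the steps above. Everything in it is an assumption made for illustration: the `Hypothesis` class, the `differential` function, the 1 to 3 severity scale, and the example symptoms and causes are all hypothetical, not taken from a real incident or tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One line on the whiteboard (hypothetical structure)."""
    cause: str
    severity: int        # assumed scale: higher means more urgent
    explains: set[str]   # symptoms this cause would account for
    ruled_out: bool = False


def differential(symptoms: set[str], board: list[Hypothesis]) -> list[Hypothesis]:
    """Return the hypotheses still in play, most urgent first.

    A hypothesis stays on the list if it has not been ruled out and
    explains at least one observed symptom. Ties on severity are broken
    by how many observed symptoms the hypothesis explains.
    """
    in_play = [h for h in board if not h.ruled_out and h.explains & symptoms]
    return sorted(in_play, key=lambda h: (-h.severity, -len(h.explains & symptoms)))


# Gather the data: the list of observed symptoms (made-up examples).
symptoms = {"elevated 5xx rate", "slow checkout", "queue backlog"}

# List possible causes for the collection of symptoms.
board = [
    Hypothesis("connection pool exhausted", 3, {"elevated 5xx rate", "slow checkout"}),
    Hypothesis("third-party payment API degraded", 2, {"slow checkout"}),
    Hypothesis("bad deploy", 3, {"elevated 5xx rate", "slow checkout", "queue backlog"}),
    Hypothesis("DNS misconfiguration", 1, {"cannot resolve hosts"}),
]

# Prioritize: most urgent at the top.
first_pass = [h.cause for h in differential(symptoms, board)]

# "Treat" the top candidate. Suppose a rollback changes nothing,
# so the bad deploy is ruled out and the list is re-ranked.
board[2].ruled_out = True
second_pass = [h.cause for h in differential(symptoms, board)]

print(first_pass)
print(second_pass)
```

The point of the sketch is the loop, not the data structure: every treatment or new piece of data either strikes a line off the board or adds a new symptom, and the ranking is recomputed each time.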
This process will likely sound intuitive to experienced DevOps engineers and SREs. It’s not uncommon for the root cause to be identified late in an incident, or even afterwards during a postmortem. Essentially, through the act of treating symptoms, you find the cause.
I like Differential Diagnosis as a framework to formalize the investigation process, to help make decisions in a stressful situation, and to train less-experienced engineers in incident response.
Plus, it’s more fun to think you’re solving a mystery, rather than just responding to a snafu.