Root Cause Analysis for Managers: Stop Solving the Same Problems on Repeat


Early in my IT operations career, I inherited a team that filed the same three service tickets every month. Same failure mode, same affected system, same temporary workaround logged in the same incident template. My predecessor had been “fixing” these issues for over a year. Each time, the fix looked successful because the ticket got closed. The problem never went away because nobody had asked what was actually causing it.

When I finally pulled the team together and spent 45 minutes walking through the chain of events, we traced all three recurring tickets back to a single load balancer that had been misconfigured during a migration 18 months earlier. A 20-minute configuration change eliminated all three issues permanently.

That experience taught me something I have carried through two decades of operations leadership: the time you invest in finding the real cause of a problem always costs less than the time you spend fixing its symptoms on rotation.

The Diagnosis Problem in Management

A survey of executives published in Harvard Business Review found that 85% believe their organizations are bad at diagnosing problems, and 87% believe that this weakness carries significant costs. The numbers line up with what most operations managers already feel: teams spend enormous energy reacting to the same breakdowns, the same quality issues, and the same process failures quarter after quarter.

Root cause analysis is the discipline of stopping that cycle. Instead of asking “what happened?” and jumping to a fix, you ask “why did this happen?” repeatedly until you reach the underlying condition that produced the visible problem. The technique is generally credited to Sakichi Toyoda and was later refined by Taiichi Ohno as a core practice of the Toyota Production System. Ohno described it as the basis of Toyota’s scientific approach to manufacturing, and his phrase (“repeat ‘why’ five times”) became one of the most enduring principles in operations management.

The good news for managers: you do not need a Six Sigma black belt to use RCA effectively. You need structured curiosity, 30 to 60 minutes, and a willingness to follow the evidence past your first assumption.

Three Methods That Work for Management Teams

The 5 Whys

The 5 Whys is the simplest and most accessible RCA tool. You state the problem, then ask “why?” five times (or however many it takes) to peel back layers until you reach something systemic.

Example from an operations team:

  • Problem: The monthly report was delivered two days late.
  • Why? The data analyst didn’t get the raw data until Thursday.
  • Why? The sales team submitted their numbers late.
  • Why? The sales team wasn’t aware the deadline had moved up.
  • Why? The schedule change was communicated in a Slack channel they don’t monitor.
  • Root cause: Critical process changes are being announced through informal channels instead of the team’s established communication protocol.

The fix is not “remind the sales team about deadlines.” The fix is establishing a single, reliable channel for process change announcements and confirming receipt. That addresses every future schedule change, not just this one.

The 5 Whys works best when the problem is relatively contained and the team has direct knowledge of the events. For problems with multiple contributing causes, you need something that can branch.
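For teams that keep their analyses in a script or notebook, the chain from the example above can be captured as a small record. This is only a sketch: the `FiveWhys` class and its method names are hypothetical, chosen for illustration, and the content is taken straight from the monthly report example.

```python
from dataclasses import dataclass, field


@dataclass
class FiveWhys:
    """A problem statement plus the ordered chain of 'why' answers (hypothetical helper)."""
    problem: str
    answers: list[str] = field(default_factory=list)
    root_cause: str = ""

    def ask_why(self, answer: str) -> None:
        # Each answer explains the statement directly before it in the chain.
        self.answers.append(answer)

    def render(self) -> str:
        # Produce the same layout as the bulleted example: problem, whys, root cause.
        lines = [f"Problem: {self.problem}"]
        lines += [f"Why? {answer}" for answer in self.answers]
        if self.root_cause:
            lines.append(f"Root cause: {self.root_cause}")
        return "\n".join(lines)


chain = FiveWhys(problem="The monthly report was delivered two days late.")
chain.ask_why("The data analyst didn't get the raw data until Thursday.")
chain.ask_why("The sales team submitted their numbers late.")
chain.ask_why("The sales team wasn't aware the deadline had moved up.")
chain.ask_why("The schedule change was communicated in a Slack channel they don't monitor.")
chain.root_cause = (
    "Critical process changes are being announced through informal channels "
    "instead of the team's established communication protocol."
)

print(chain.render())
```

Keeping the answers in order matters more than the tooling. The same structure works in a spreadsheet or a shared document.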

The Fishbone (Ishikawa) Diagram

Kaoru Ishikawa developed this tool in the 1960s during Japan’s quality revolution. You draw the problem at the head of a horizontal line, then branch out into categories of potential causes. The standard categories (People, Process, Technology, Environment, Measurement, Materials) give your team a systematic way to brainstorm causes without getting tunnel vision on the first explanation someone offers.

This method is particularly useful when a problem could stem from multiple sources and you need the team to think broadly before narrowing down. I have used fishbone diagrams in process improvement work when the obvious answer (“the software is broken” or “the new hire messed up”) turned out to be masking a deeper process gap.

The key to making fishbone sessions productive: write every proposed cause on the board without evaluating it first. Let the team generate the full picture before testing which branches lead to the actual root.
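A fishbone diagram is essentially a mapping from categories to proposed causes, so it can be sketched in a few lines. The function names here are hypothetical, and the two sample causes are the "obvious answers" quoted above, recorded without evaluation. Listing the branches that are still empty is one possible way to nudge the team toward categories nobody has considered yet.

```python
# The six categories named in the text.
CATEGORIES = ("People", "Process", "Technology", "Environment", "Measurement", "Materials")


def new_fishbone(problem):
    """Start a diagram with the problem at the head and one empty branch per category."""
    return {"problem": problem, "causes": {category: [] for category in CATEGORIES}}


def add_cause(diagram, category, cause):
    """Record a proposed cause as-is; evaluation happens later, not during brainstorming."""
    if category not in diagram["causes"]:
        raise ValueError(f"Unknown category: {category}")
    diagram["causes"][category].append(cause)


def empty_branches(diagram):
    """Return the categories that have no proposed causes yet."""
    return [category for category, causes in diagram["causes"].items() if not causes]


diagram = new_fishbone("The Q1 client report missed its delivery date by four business days")
add_cause(diagram, "Technology", "The software is broken")
add_cause(diagram, "People", "The new hire messed up")

print(empty_branches(diagram))
# prints ['Process', 'Environment', 'Measurement', 'Materials']
```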

The Iceberg Model

The Harvard Business School problem-solving framework describes four layers of analysis: visible events at the surface, behavioral patterns beneath those events, systemic structures beneath the patterns, and mental models at the base.

This approach works best for recurring organizational problems where the root cause is cultural or structural rather than mechanical. If your team keeps missing deadlines, and the 5 Whys keeps landing on “people feel uncomfortable pushing back on unrealistic requests,” you are looking at a mental model problem (the belief that saying no equals disloyalty) that no process change alone will solve.

In my fractional COO work through Ops Harmony, the iceberg model has been the most useful tool for clients whose problems keep returning despite repeated process fixes. The process is rarely the deepest issue. The incentives, assumptions, or unspoken norms around the process usually are.

Running an RCA Session With Your Team

An RCA session is not a meeting where you diagnose the problem and hand down the answer. It is a structured conversation where the people closest to the work trace the failure to its source. Here is what makes these sessions productive.

Set the scope before you start. Define the specific problem in one sentence. “Project deliverables are consistently late” is too broad. “The Q1 client report missed its delivery date by four business days” gives the team something concrete to trace.

Bring the people who touched the work. RCA requires firsthand knowledge. If the problem spans multiple teams, get a representative from each. The after-action review format works well as a companion practice for structuring who contributes what.

Separate cause-finding from solution-finding. The most common failure mode in RCA sessions is jumping to solutions at the first plausible cause. Discipline the group to exhaust the “why” chain before proposing fixes. A useful phrase: “We’re still in diagnosis. Let’s hold solutions until we agree on the cause.”

Document the chain, not just the conclusion. Write down every step of the analysis. Six months from now, when a similar problem surfaces, that documentation will tell you whether you are looking at the same root cause or a different one. This also feeds into your team’s process documentation and reduces the rework that comes from solving problems without recording the reasoning.

Assign the fix to a specific person with a specific date. RCA that ends with “we should improve our communication” accomplishes nothing. RCA that ends with “update the project intake template to include a communication plan field, owner: project lead, due: end of month” has a chance of sticking.
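If RCA outcomes are tracked in a script or a small tool, the difference between those two endings can be made checkable. The sketch below is one possible shape, not a prescribed format: the `ActionItem` class is hypothetical, and the due date is a placeholder standing in for "end of month".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ActionItem:
    """One fix coming out of an RCA session (hypothetical record)."""
    description: str
    owner: Optional[str] = None
    due: Optional[date] = None

    def is_actionable(self) -> bool:
        # A fix only counts if someone owns it and it has a date attached.
        return bool(self.owner) and self.due is not None


vague = ActionItem("We should improve our communication")
specific = ActionItem(
    "Update the project intake template to include a communication plan field",
    owner="project lead",
    due=date(2025, 1, 31),  # placeholder date for "end of month"
)

print(vague.is_actionable())     # prints False
print(specific.is_actionable())  # prints True
```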

Where Most Managers Get the Analysis Wrong

Stopping at the human error layer. “Someone made a mistake” is almost never the root cause. People make mistakes when systems allow them to. If an employee entered data in the wrong field, the question is why the system accepted that entry, why there was no validation, or why the training didn’t cover that scenario. Blaming the individual closes the investigation before it reaches anything systemic. It also destroys the psychological safety your team needs to participate honestly in future analyses.

Treating RCA as a postmortem ritual instead of a regular practice. Many teams only do root cause analysis after catastrophic failures. By then, the pressure to assign blame overwhelms the search for systemic causes. The most effective operations teams I have worked with run lightweight RCA on small, frequent problems: a handoff that dropped, a meeting that produced no decisions, a deliverable that needed two rounds of revision. Small analyses build the muscle. Finding process bottlenecks before they escalate is the same principle applied to prevention.

Accepting the first root cause that sounds reasonable. Confirmation bias is real, especially in groups. The first person to offer a plausible explanation often anchors the entire discussion. Counter this by asking the team “What else could explain this?” at least twice before settling on a root cause. Organizations that adopt systematic, structured RCA processes report 40 to 60% fewer recurring quality issues compared to those that rely on informal troubleshooting.

Building the Habit

Root cause analysis is not a special event. It is a management habit. The operations teams that get the most from it build a simple trigger: any problem that recurs more than twice gets a 30-minute RCA session before anyone attempts another fix.

That single rule, applied consistently, changed the trajectory of every team I managed after I learned it the hard way with those three recurring service tickets. The problems that eat your team’s time are rarely mysteries. They are symptoms waiting for someone to trace them back to a cause that can actually be fixed.
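The trigger rule is simple enough to automate against a ticket export. Here is a minimal sketch, assuming each logged occurrence carries a consistent problem label. The function name and the labels are hypothetical, and it reads "recurs more than twice" as more than two repeats after the first occurrence. Adjust `max_recurrences` if your team counts differently.

```python
from collections import Counter


def problems_needing_rca(incident_log, max_recurrences=2):
    """Return problem labels that have recurred more than `max_recurrences` times."""
    counts = Counter(incident_log)
    # The first occurrence is not a recurrence, so recurrences = occurrences - 1.
    return sorted(label for label, seen in counts.items() if seen - 1 > max_recurrences)


# Hypothetical incident labels, one entry per logged occurrence.
log = [
    "vpn-drop", "late-report", "vpn-drop", "disk-full",
    "vpn-drop", "late-report", "vpn-drop",
]

print(problems_needing_rca(log))
# prints ['vpn-drop']
```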

Ty Sutherland

Ty Sutherland is an operations and technology leader with 20+ years of experience. He is Director of IT Operations at SaskTel, founder of Ops Harmony (fractional COO and EOS Integrator), and former COO at WTFast. He writes Management Skills Daily to share practical management frameworks that work in the real world.
