Category Archives: Root Cause Analysis


Creating a common reality is a part of the foundation of the Apollo Root Cause Analysis methodology.  It is important that language and definitions are consistent among all parties involved. When the Apollo Root Cause Analysis methodology is applied correctly, everyone who participates truly understands the value of the problem, what the solutions are, and how they will affect the problem.

Establishing a universal reality is a bigger challenge than you might think. No one shares the exact same experiences or interprets information in the exact same manner. Good problem solvers know to take these different perspectives into account as they forge a path to the solutions.

Just as individuals apply their own unique perspective when conducting specific RCAs, companies apply their unique organizational culture when implementing an RCA process.  Establishing company standards by defining an RCA champion with clear expectations and implementation procedures in place will keep your organization on the path to RCA success.

Another way to stay at the top of your game is to learn from the experiences of industry peers.  Here we take a look at a conversation between Tom, an Engineering Team Lead and RCA champion, and Jack, an expert Apollo Root Cause Analysis methodology instructor.

Tom (Engineering Team Lead): 

I have found that sometimes engineers and technicians do not have a real understanding of the meaning of “root cause.” They tend to think of it as a single poor design feature or failure like a “loose nut” or a single cause of the issue or failure. They seemed to be surprised when I recently identified ten root causes on the last job. They were confused and could not get their heads around having ten root causes. They said, “But what was the real single root cause?”

Jack (RCA Instructor):

You are so right. Many people have a preconceived idea that there can only be one root cause. They are driven by this perception to that end. It is quite a limiting concept for those people. They can become quite tunneled in their thinking, offering a close-minded approach to their problems rather than an all-embracing search for knowledge and information that could lead to enlightenment. Some anecdotal information even suggests that this mind frame is taught, and it is quite difficult to rattle their cages and try to shift their paradigms. How do you define root cause?


Tom (Engineering Team Lead):

I define root cause as an opportunity for improvement. A single root cause cannot exist on its own; there must also be at least one condition. Here, I cannot come across as too much of a know-it-all or people roll their eyes, so I need a quick, snappy go-to response that is brief and simple and does not make me sound like a nerd or a geek. That’s just where I work, as there are no formal RCA people in this division. We all share the work on investigations, and most are engineering failure investigations that I do of my own volition and share with my team.  In your experience, what are the major setbacks you have seen with people applying the RCA process? I’d like to get better and avoid these mistakes.


Jack (RCA Instructor):

You are doing a great job, so persevere. Changing people’s perspectives takes time, especially if you are the only one flying the flag. A major key to success is making sure you are asking enough questions and following a process that demands these questions be asked. Sometimes people take shortcuts to speed up the process…less to think about…less time…must be better! And they can still argue that they have a solution. For simple problems this may even work and they could achieve a satisfactory result, but for complex problems this approach simply doesn’t come close to being comprehensive enough. The lack of knowledge and training in this area now comes back to bite them, and their problems invariably don’t go away. Without a solid RCA foundation and process in place, the structures within the company they work for won’t raise any red flags that something may be incorrect or ineffective in any way, so the end product of a subpar RCA (the report) is accepted.  If management doesn’t embrace the change, then reverting to old acceptable habits is just easier. The key to avoiding these major failures lies in overcoming the resistance to change.  Involving your team in the RCA process and sharing your successes with management is a great way to gain support.


Tom (Engineering Team Lead):

I have now gotten into the habit of doing an initial draft RCA live in front of my team. I draft the RCA in a bound book which I have dedicated to this purpose and follow the cause and effect pathways like the software. I feel like this approach is more relatable for my team, and I am able to get their input quickly. We are usually able to identify half a dozen possible causes in just a few minutes.  Afterwards I go to the software and expand on it. Then I formalize and save the RCA in the software, which checks all my work.

Hope you are in Sydney sometime soon, Jack. Your teaching techniques really work and I liked your style. I think in 20 years of taking training your lessons are the ones that have stuck the most with me.


If you have questions or ideas to share and would like to connect with people who have been trained in the Apollo Root Cause Analysis methodology with ARMS Reliability, join our Apollo Root Cause Analysis methodology discussion group on LinkedIn.





By Kevin Stewart

Many of the Apollo Root Cause Analysis methodology training instructors often get asked the same question – “how long should it take to do a Root Cause Analysis (RCA) investigation?”  This is a difficult question to answer due to the variables associated with each individual RCA.   It’s a lot like asking someone, “How long will the trip take?”  How do you begin to answer that? Some questions that come to mind are – to where? Or how will you be traveling? Or what route will you take?  Or will you be stopping anywhere? And so on.

If it is so variable, how can we even talk about whether an RCA should take several days or not?  There are two general paths in the utilization of the Apollo methodology; let’s call them “long” and “short.” Since this article is about RCAs not taking several days, let’s focus on the short one.

Most people envision the Apollo Root Cause Analysis methodology as a large group of people in a conference room for several days as a necessary means to finding a valid solution.  It is true that many RCA investigations do take four to five solid, eight-hour days to determine an appropriate solution, but these should be problems that have a large significance where information may not be readily available.

I always point out to my students that not only is it possible to do an Apollo Root Cause Analysis in a short time, but I have personally done several that took less than a day.  How?

The Apollo Root Cause Analysis process involves a specific methodology of asking “why?” or “caused by ____?” and then identifying an appropriate answer, writing it down, and then asking “why” again.  You do this until you are stymied with no answers or reach a point where it doesn’t make sense to ask “why” anymore.  This process does not change regardless of the type or the size of the problem, or for any other reason.

Many of you may have heard of the “Five Whys” as an RCA process.  This was designed for small problems experienced by operators on the line at Toyota facilities.  These little RCAs were done in the moment by people involved in the incident.  If you’re familiar with both the Apollo Root Cause Analysis methodology and Five Whys process you may notice that they are very similar. Many times I point out to students that you can see several “Five Whys” branches inside any Apollo RCA chart. So it stands to reason that the Apollo Root Cause Analysis methodology can be used in a similar fashion to the Five Whys.

Here’s an example.  I was responsible for the reliability of a production area of a plant during my career.  It was not uncommon to find me walking around looking for problems, and during one such time I discovered some people working hard to unplug a jammed conveyor.  It was plugged with a 1,000-pound solid carbon block wedged in between some posts, and there was no good access to the block with a crane or other lifting device.  When they spotted me I got an earful; apparently this had been happening on a regular basis.  The specific frequency was unknown, but the emotion of the operator told me that it was at least once per shift.  I promised to fix it for him and he calmed down, they got the unit unplugged and back on line, and he went back to his job just downstream of the jam.

Since I promised to fix this, I decided to spend some time at the unit to see if I could observe what was causing the jam.

The Apollo Root Cause Analysis process went like this:


If you start the RCA chart in your mind, you quickly get to a dead end because no one could see why the jam had happened.  The operator in the area was busy doing his job, which required constant attention: pouring molten metal into a small cavity to “glue” a copper rod to the top of an anode.   This was done while the line was moving; he poured one about every 15 seconds so he really couldn’t be looking around.  There were not a lot of other spare personnel in the area who could spend the time looking, so I decided that was my job.

These blocks were pushed onto an automated system by a large pusher that had a paddle hanging down from a cylindrical steel piece with a bushing, since the paddle was designed to float.  It seemed pretty obvious that the pusher had something to do with it… but how?  After they started up the system, it worked like a charm just as designed, no glitches.  Intermittent problems are some of the hardest to fix because you need to be there when things go awry or gather data to identify the causes.

So there I was with one cause on my box – “Block jammed caused by ____?” I thought perhaps if I watched it I’d get lucky enough to catch the issue.  So I stood there, and stood there, and stood there for perhaps an hour. Nothing.  I didn’t want to leave quite yet but it did seem like a waste of time, so I decided to check out other items in the area.  I spent an hour or so away from the machine and then went back. Upon returning to the unit there didn’t seem to be anything obviously out of order.  However, something seemed different, though I couldn’t put my finger on it.

After spending another hour away and then coming back again, this time I noticed what appeared to be a difference: slight, but I was pretty sure it was happening.  One more hour away and then back and sure enough something was happening over a long period of time.

Now I just needed to verify my suspicions.  Believing I knew the cause, I figured I had enough time to go to lunch and do some more office work before returning to the unit to check my theory and gather evidence.  I was correct.

The cause of the issue was that the paddle was rotating counter-clockwise on the shaft ever so slightly with every push.  It was taking more than six hours for it to rotate enough to push on the corner of the block, shove it sideways off the conveyor, and cause the jam.  So my chart looked like this after about six to seven hours:

[Chart image: chart_1.png]

At this point I alerted everyone to the issue, and the maintenance personnel came over and safely moved the paddle back so the shift could finish.  Our facility had a swing shift crew that worked in the area after the production was done, so they were assigned the task of fixing the unit.

That evening they removed the unit, checked everything against the drawings and specifications, and found that the tolerance on the bushing was incorrect.  It was close, but the tolerance was tight enough that each push that was not exactly dead-on caused a slight twisting force, moving the paddle off course and eventually causing a jam. The team fixed the tolerance issue and put it back in place by the next shift start.

So my chart now looked like this:


This whole process took less than eight hours to complete but was spread out over two days.  If you look at my total time involvement it was perhaps four hours. (I am not charging the process with time that I was multitasking by doing other things.)
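The chain of “caused by ____?” questions in this story can be sketched as a small data structure. This is only an illustration: the `Cause` class and `walk` function are hypothetical names, and the wording of each box is paraphrased from the narrative above rather than copied from the original charts, which are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    """One box on a cause-and-effect chart (hypothetical structure)."""
    text: str
    caused_by: list["Cause"] = field(default_factory=list)


def walk(cause: Cause, depth: int = 0) -> list[str]:
    """Return the chart as indented lines, one 'caused by' level per indent."""
    lines = ["  " * depth + cause.text]
    for parent in cause.caused_by:
        lines.extend(walk(parent, depth + 1))
    return lines


# Box wording is paraphrased from the story; the real chart may differ.
bushing = Cause("Bushing tolerance incorrect")
twist = Cause("Off-center pushes apply a slight twisting force")
rotation = Cause("Paddle rotates slightly on the shaft with every push",
                 [bushing, twist])
corner = Cause("Paddle pushes on the corner of the block", [rotation])
sideways = Cause("Block shoved sideways off the conveyor", [corner])
jam = Cause("Block jammed", [sideways])

chart_lines = walk(jam)
print("\n".join(chart_lines))
```

Each level of indentation answers one more “caused by ____?” question, and the chain stops at the bushing tolerance, which is where the swing shift crew applied the fix.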

So as you can see, an RCA investigation doesn’t always have to take days.  Of course, some will take several days and you could stretch even a simple investigation into a longer process if you wish. But if you are close to the problem, get accurate information, act quickly, and stick with the process, you can do an RCA quickly and get an effective solution.


We occasionally receive questions to clarify the difference between our Incident Investigation training course and our RCA Facilitator course so we thought we would address some of the most commonly asked questions in this Q&A-style article. We hope you find it helpful. And if you have any questions, as always, please don’t hesitate to contact us.

How is the “Incident Investigation” course different from the “RCA Facilitator” course?

The Incident Investigation course covers the process of identifying, obtaining, documenting, and preserving the raw data related to an incident, then constructing a general timeline of the incident.

The RCA Facilitator course then trains our students on how to sort through this data using cause and effect principles to identify causes that are relevant and formulate workable solutions, or preventative measures, to prevent recurrence. There is also emphasis in the RCA Facilitator course on facilitation skills necessary to conduct the Apollo Root Cause Analysis methodology process.

What are the benefits of the Incident Investigation course?

Without truly understanding the key elements and possessing the necessary skills to conduct a thorough, effective investigation, people run the risk of missing key causal factors of an incident while conducting the actual analysis. This could potentially result in not identifying all possible solutions including those that may be more cost effective, easier to implement, or more effective at preventing recurrence. The Incident Investigation course equips students with the knowledge and skills to conduct a proper investigation to prevent this from happening.

Why is it important for people to attend the Incident Investigation course?

Students will learn the key elements and develop the skills necessary to conduct and document a thorough, effective investigation ensuring all the pertinent information is available for the actual root cause analysis process.

Students will learn:

  • The nature of undesirable incidents and why they often repeat
  • The value of a thorough, effective investigation – Why spend the time?
  • Investigation lead and team selection – Matching individual traits and skill sets to the needs of the investigation
  • The roles, or functions, that must be filled to ensure thoroughness and reliability of data
  • Possible sources of incident information and how to optimize the value and reliability of incident facts and evidence
  • Demonstrations of misconceptions about the reliability of evidence and how to avoid them
  • Critical interviewing skills for discovering valuable incident information without inadvertently tainting the outcome
  • Options to ensure timely incident response so that valuable evidence can be preserved and collected
  • The value of developing and using standard templates for use throughout the investigation process
  • How to create an incident timeline using multiple sources of information
  • Importantly, scaling the investigation effort based on the significance of the incident to avoid wasting precious resources while ensuring investigation thoroughness
  • Hands-on individual and group exercises for practicing the key elements of the knowledge and skills listed above

Are there any prerequisites for the Incident Investigation course?

No. Students can take the Incident Investigation course on its own, or combine it with the RCA Facilitator course if they wish to learn the ins-and-outs of the Apollo Root Cause Analysis methodology as well. The Incident Investigation course is designed to stand on its own and depending on a person’s role, they may only need to attend one or the other, or both.

Is there an option to train my team in Incident Investigation via a private, onsite course?

Yes. We can work with a team within a company and create a customized Incident Investigation training course that takes into account their specific processes, triggers, industry, regulations, goals, stats of their HSE incidents, and incident severity tiers, and develop a course to their definitions and templates that can then be used to train staff across the company.

For more information, please contact us.






Author: Jack Jager

When it comes to problems with quality in your operation, there are the obvious red flags—unhappy clients, defective products, poor reputation, delays, and exorbitant costs, to name a few. But there are other more subtle signs that your quality control department has room for improvement.

Your QC Department Looks Like a Firehouse

Those of us who work in quality control can easily fall into the pattern of fire fighting—running from one issue to the next, solving each problem in the near-term as it crops up. This can work okay for a time, but it’s not a great long-term strategy. When you only focus on solutions and never get down to the root causes that are creating your issues, you will find that the same types of issues keep occurring. “An ounce of prevention is worth a pound of cure” should be the mantra of every QC department. It’s worth the extra time up front to get at the root causes of an issue.

Your Quality Folks Aren’t Talking Cents

The universal language of business is dollars and cents, so if your quality control department isn’t translating your issues into actual cost to the business, they might not be heard. For example, you might calculate the cost of the time it takes to close different types of exceptions and add that information to your efficiency evaluations.
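As a rough sketch of that calculation, the snippet below turns exception counts and closure times into a cost figure. Every number here is invented for illustration: the hourly rate, the exception types, and the counts are assumptions, not data from any real QC department.

```python
# Hypothetical figures for illustration only.
HOURLY_RATE = 60.0  # assumed loaded labor cost per hour

# exception type -> (number closed this quarter, average hours to close)
exceptions = {
    "deviation": (12, 6.0),
    "out-of-spec result": (5, 10.0),
    "documentation error": (30, 1.5),
}


def closure_cost(count, avg_hours, rate=HOURLY_RATE):
    """Labor cost of closing `count` exceptions at `avg_hours` each."""
    return count * avg_hours * rate


costs = {name: closure_cost(n, h) for name, (n, h) in exceptions.items()}
total_cost = sum(costs.values())

# List the most expensive exception types first.
for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ${cost:,.0f}")
print(f"total: ${total_cost:,.0f}")
```

With these made-up inputs, the exception type that occurs least often is not the cheapest, which is the kind of point that is easier to make in dollars than in counts.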

There Is a Veil Over the QC Department

Sometimes the quality department is treated differently than manufacturing, engineering, or facilities when it comes to accountability. But it’s very important that QC personnel and their equipment are held to certain standards, too. While QC is often responsible for finding solutions, they also need to be held responsible for their share of the causes—for instance, the impact to the supply chain if raw materials or final product testing is not completed effectively. If there has never been an evaluation of your QC department’s process, it’s definitely time to QC your QC.

Your QC Department Sits in an Ivory Tower

Quality folks can do a much better job if they receive training in other areas, including manufacturing, validation, and project management. When a quality person is too specialized, it can prevent them from seeing the whole picture and finding more comprehensive solutions. If your QC department tends to be resistant to change, that might be a sign that it’s time to expand their horizons with some additional training outside their primary field of expertise.

Anything Short of Total Failure Is Considered Success

Let’s say you work for a chemical plant that manufactures plastic bags. You make a polymer that requires water, but the water you’re using has harmful bacteria in it. There is a corporate requirement that the water be clean, so the bacteria are a problem. However, the finished material passes the test even though there was a deviation earlier in the manufacturing process. So is it really a problem after all? If your client sees a pattern of failure within your process, they will begin to believe that you aren’t truly concerned with quality, even if the final product technically meets the specifications. Make sure that you’re taking all issues seriously, even if they don’t seem to affect the final outcome at first glance.

If any of these scenarios sound familiar, download our eBook “11 Problems With Your RCA Process and How to Fix Them” in which we provide best practice advice on using Apollo Root Cause Analysis to help eliminate problems in your QC process and beyond.

Author: Kevin Stewart

I recently wrote an article about auditing root cause analysis (RCA) investigations, and it only seemed appropriate to follow up with advice on auditing your overall RCA program. Let’s go back to the dictionary definition of “audit” — a methodical examination and review. In my mind this definition has two parts: 1) the methodical examination and 2) the review.

It might help to compare this process to a medical examination. In that case, the doctor would examine the patient, trying to find anything he can, either good or bad. This would include blood work, reflex test, blood pressure, etc. After that examination, he would then review his findings against some standard to help him determine if any action should be taken. Auditing an RCA program is no different; first, we must examine it and find out as much as we can about it, then we will need to review it against some standard or measure.

In my other article, I discussed at length the measures against which an RCA investigation could be judged. Those still apply, and one of the program audit items can and should be the quality of the RCA investigation.

Now we are faced with determining the characteristics of a good program. A list of characteristics is given below:

  • Quality of RCA investigations
  • Trigger Levels are set and adhered to
  • Triggers are reviewed on a regular basis and adjusted as required to drive improvement
  • A program champion has been designated, trained and is functioning
  • Upper management has been trained and provides involved sponsorship of the program
  • Middle management has been trained and provides involved sponsorship of the program
  • The floor employees have been trained and are involved in the process
  • The solutions are implemented and tracked for completion
  • The RCA effectiveness is tracked via looking for repeat incidents
  • Dedicated investigators / facilitators are in place
    • Investigators are qualified and certified on an ongoing basis
  • All program characteristics are reviewed / defined / agreed to by management and include:
    • An audit system is defined, funded, and adhered to
    • Resource requirements
    • Triggers
    • Training requirements are in place and funded
    • Sponsorship statements and support
  • The RCA program is incorporated into the onboarding and continuous review training for new and existing employees

The next step in developing an audit is to generate a set of items that your program will be gauged against. This list can come from the items above, your own list, or a combination of the two. Once you have a final list of items to audit against, you need to generate a ratings scale. This can be a pass/fail situation or a scale that gives a rating from 0 to 5 for each item. A scale allows you to give partial credit for items that do not quite meet the full standard. You can also provide a weighting scale if deemed appropriate. This would mean that some of the items in the list carry more importance or weight in the scoring based on the local feelings or culture of your facility. This scale can be anything you wish, but be cautious about making the scale too large. Can you really tell the difference between a 7 and an 8 on a 10-point scale? Perhaps a 1 to 4 scale would be better.

Next, develop a score sheet with each item listed and a place to put a score for each one. It’s handy to add some guidelines with each item to give the reviewer a gauge on how to score the item. A sample of such guidelines might look like:

0    Does not exist

1    Some are in place but not correct

2    Many are in place and some are correct

3    All are in place but only some are correct

4    All are in place and most are correct

5    All are in place and correct

Don’t forget to leave a space for notes from the reviewer to explain the reasons for partial credit. With this in place either next to each item or easily available as a reference, it helps ensure consistency in the scoring, especially if multiple people will be scoring your RCA program.
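A score sheet along these lines could be sketched as follows. This is a minimal illustration, not a prescribed format: the item names are shortened from the characteristics list, while the scores, weights, reviewer notes, and the 70 percent passing threshold are all assumptions made up for the example.

```python
MAX_SCORE = 5  # top of the 0-5 guideline scale

# item -> (score 0-5, weight, reviewer note); all values are hypothetical
score_sheet = {
    "Quality of RCA investigations": (4, 3, "Two charts lacked evidence"),
    "Trigger levels are set and adhered to": (3, 2, "Set, not always followed"),
    "Program champion designated and trained": (5, 1, ""),
    "Solutions implemented and tracked": (2, 2, "Tracking is informal"),
}


def weighted_percent(sheet):
    """Weighted score as a percentage of the maximum possible score."""
    for item, (score, weight, _note) in sheet.items():
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"score out of range for {item!r}: {score}")
    earned = sum(score * weight for score, weight, _ in sheet.values())
    possible = sum(MAX_SCORE * weight for _, weight, _ in sheet.values())
    return 100.0 * earned / possible


PASSING_PERCENT = 70.0  # assumed minimum for passing
overall = weighted_percent(score_sheet)
passed = overall >= PASSING_PERCENT
print(f"overall: {overall:.1f}%  passed: {passed}")
```

Keeping the note next to each score means a reader of the sheet can see why an item earned partial credit without going back to the reviewer.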

The goal for a standardized audit process would be that several different people could independently review and score a program and would come up with essentially the same score. This may seem like a simple thing, but it turns out to be the largest issue because everyone interprets the questions slightly differently. There are several things you can do to minimize discrepancies:

  1. Provide the scoring guidelines above to every auditor.
  2. Require the auditors to be trained and certified by the same process/people and then have them provide a sample audit and check it against the standard. Review and adjust any discrepancies until you are sure they will apply the same thinking against the real audit.
  3. Always ensure that if multiple auditors are used in a program review, at least one has significant experience to provide continuity. In other words, don’t allow an audit to be done with all first-time auditors.

With these measures in place, all you have to do is review the RCA program against your list, score it, and have some sort of minimum for passing. Likewise, you’ll want to have some sort of findings report where the auditor can provide improvement opportunities against the individual items instead of simply saying: “did not pass.”

These measures will ensure that the program is gauged against a consistent standard and can be repeated by multiple auditors. There will always be differences if multiple people are auditing an RCA program, but by utilizing the steps above these differences can be minimized to provide the highest level of credibility for the audit.
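One way to check whether independent auditors are landing on essentially the same score is to compare their item-by-item results on a sample audit. The sketch below assumes three hypothetical auditors, invented scores on the 0 to 5 scale, and an arbitrary tolerance of one point; none of these values come from a real audit.

```python
# auditor -> {item: score}; names and scores are hypothetical
audits = {
    "auditor_a": {"investigation quality": 4, "triggers": 3, "champion": 5},
    "auditor_b": {"investigation quality": 4, "triggers": 1, "champion": 5},
    "auditor_c": {"investigation quality": 3, "triggers": 3, "champion": 5},
}

MAX_SPREAD = 1  # assumed tolerance between highest and lowest score


def score_spread(audits):
    """Highest minus lowest score given to each item across auditors."""
    items = next(iter(audits.values())).keys()
    return {
        item: max(a[item] for a in audits.values())
        - min(a[item] for a in audits.values())
        for item in items
    }


spreads = score_spread(audits)
needs_review = [item for item, s in spreads.items() if s > MAX_SPREAD]
print("items to discuss before the real audit:", needs_review)
```

Items that exceed the tolerance are the ones to review and adjust during auditor training, since they are where the question is being interpreted differently.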


According to a definition applicable to the insurance industry, an accident is an event which is not deliberately caused and which is not inevitable[1]. A typical insurance policy has a significant number of exclusions that are the “evitable” circumstances.

Logically, any situation which is reasonably evitable and which likely has harmful consequences ought to have been identified.

Those of us who are the safety leads at our organizations have a lot riding on our shoulders. That pressure gives us a constant incentive to improve, because we can never do our job too well. This post highlights some of the questions we ask ourselves that ultimately ladder up to the larger question, “How can we do better?”

For example:

1. How many injuries have been recorded at your location(s) in the past year? 

The often cited adage “you can’t manage what you don’t measure” is pertinent here.

Data is king; knowing how many injuries have been recorded at all locations for your enterprise will not only enable comparisons between sites and an analysis of the common and different causes, but also can be used to motivate greater improvements at the lesser site(s).

2. Does that number include the near misses? Or aren’t they reported?

The expression “near miss” clearly indicates a close call, but all too frequently it occasions relief rather than analysis. This is because people look on the bright side and put the escape down to good luck. Overcoming this complacency is a challenge. The issue for the organization is that all too often these events are simply not reported, or reported too long after the event to enable an accurate re-construction of the event. This compromises the ability to derive any “lessons learned” that could generate appropriate improvements.

3. Would you know if the near misses hadn’t all been captured? 

The simple fact is that “you don’t know what you don’t know”; this situation calls for a process of acknowledgement, if not reward, so that the incident participants have no fear of punitive measures being applied when they report the circumstances of the near miss. This necessitates the clear communication of a “no blame” philosophy. If employees feel that they will suffer some negative consequence they will be loath to volunteer information about the near-miss incidents.    

4. Is your record improving?

Unless the data is being promptly collected, accurately recorded, and analyzed, trending will not be possible and improvement not apparent. The objective is to have a demonstrable improvement evidenced by the statistical record. The accuracy of this data will depend not only on the creation of the “no blame” culture but also on the refinement of the methodology and tools employed in the investigation of incidents.
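A very simple form of trending is a year-over-year comparison of the recorded counts. The sketch below uses invented numbers for a single site; the years, injury counts, and near-miss counts are assumptions for illustration only.

```python
# year -> (recorded injuries, near misses reported); hypothetical data
records = {2021: (14, 20), 2022: (11, 35), 2023: (9, 41)}

INJURIES, NEAR_MISSES = 0, 1  # positions within each record tuple


def year_over_year(records, index):
    """Change in one measure from each year to the next."""
    years = sorted(records)
    return {
        later: records[later][index] - records[earlier][index]
        for earlier, later in zip(years, years[1:])
    }


injury_change = year_over_year(records, INJURIES)
near_miss_change = year_over_year(records, NEAR_MISSES)
improving = all(delta < 0 for delta in injury_change.values())

print("injury change:", injury_change)
print("near-miss change:", near_miss_change)
print("injuries falling every year:", improving)
```

A rising near-miss count is not necessarily bad news. Given how often near misses go unreported, it may simply reflect better reporting, so the two measures are worth reading together.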

5. Have you set targets for improvement?

Establishing fresh targets and goals periodically is the only way to ensure the improvement is continuous. Even a site with an almost blemish-free record needs to be totally vigilant about the changes that are being undertaken there. Change is the only constant and, regretfully, is also an opportunity for hazards and harm to arise. The fresh targets ought to be reflected in the key performance indicators (KPIs) applicable to the respective safety roles for your enterprise.

6. Are there any unidentified hazards facing the personnel? 

Only systematic inspection and auditing processes will reveal previously unrecognized hazards. The certainty that you have minimized risks and hazards will grow proportionally as the employees who encounter the hazards demonstrate their ownership of the safety program. They have the ultimate control of the likely causes of their own potential harm. But whether the personnel have accepted ownership of the program or not, it is incumbent on the responsible officer to implement the specific hazard identification process. This will necessitate close engagement with the plant or equipment operators, technicians, or any person with an exposure to their work environment. Yes, that’s everybody.

There are also hazards of the interpersonal type that may never be apparent to the observer; bullying and stress are increasingly the causes of substantial claims for compensation and can only be detected by building a trusting relationship with the personnel and developing confidentiality protocols.

7. How effectively are you learning the lessons from each “accident”?

The parlance “lessons learned” is commonplace but not consistently applied. These are words that express an intention to make improvements in the organization but all too often focus on the actors in the event rather than the systems and processes that are central to the business.

“Human error” is the categorical expression most commonly heard when blame is being attached and represents a plethora of mistakes that humans make. Discovering that precise error in this unique event and the reason(s) for it can add value and lead to preventive measures being implemented — but not in isolation, not as the so-called “root” cause.

Perfect knowledge, perfect understanding, and perfect operation by all humans in the enterprise are a fantasy. Humans are fallible and accidents will happen if the situation exists.


8. Which causes are “evitable”? 

The “evitable” causes are simply the known, designed, or planned components of the situation – the hardware, equipment, systems, and processes that are used in the production of the goods or service in question. These are all possible causes which, with a human interface, can create hazards with potential negative safety consequences. They are the opportunities for establishing controls or installing barriers that prevent harm.

The safety program needs to identify improvements to the systems or equipment, which would at least minimize the likelihood of a repeat occurrence given the fallibility of the human factor. What are the possible failure modes or the mis-operations that could occur?   

9. Can you demonstrate that you have thoroughly and methodically analyzed every event in order to prevent recurrence?

A thorough and methodical causal analysis is not possible without the creation of a cause map. This is best achieved through a mediated process involving the pertinent stakeholders and subject matter experts and identifying and arranging the proven causes in a logical manner. It needs to be both comprehensive and comprehensible to win the confidence of the decision makers who are looking for recommendations that will effectively modify, substitute, or eliminate the causes.

There are regulatory authorities that have expectations in this sphere and will want to see the assiduous application of a method that has proven to be effective regardless of the industry or problem-type.


ARMS Reliability is here to help you answer these questions. Our free eBook “11 Problems With Your RCA Program And How To Fix Them” is a great first step to figuring out “How can we do better?” Download it here.



While there are three main reasons organizations typically perform Root Cause Analysis (RCA) following an issue with their asset or equipment, there are a whole host of other indicators that RCA should be performed.

Odds are, you’re recording a lot of valuable information about the performance of your equipment – information that could reveal opportunities to perform an RCA, find causes, and implement solutions that will solve recurring problems and improve operations. But are you using your recorded information to this extent?

First, let’s quickly talk about three reasons why RCA is typically performed:

1. Because you have to

There may be a regulatory requirement to demonstrate that you are doing something about a problem that’s occurred.

2. You have breached a trigger point

Your own company has identified the triggers for significant incidents that warrant root cause analysis.

3. Because you want to

An opportunity has presented itself to make changes for the better. Or perhaps you’ve decided you simply don’t want to lose so much money all the time.

At the core of all industry is the desire to make money. Anything that negatively impacts this goal is usually attacked by performing root cause analysis.

I was having a conversation with a reliability engineer at an oil and gas site, and I asked him what lost opportunity or downtime might cost that company over the course of a year. He said it was in the vicinity of three quarters of a billion dollars – $750,000,000. Is this a good enough reason to perform root cause analysis? Even a 10% reduction, roughly $75 million a year, would have a huge impact on the bottom line.

The monetary impact to the business was of course not due to any single event, but to a multitude of events both large and small.

Each event presents itself as an opportunity to learn and to make any changes necessary to prevent its recurrence. A single occurrence can be written off as happenstance… things happen, serious or minor, and that’s life. But to let it happen continuously means that something is seriously wrong.

While these are all valid reasons to perform an RCA, there are at least ten more tell-tale equipment-related clues that an RCA needs to happen – most of which can be identified through the information you’re probably already recording.

Here are ten tell-tale signs that your organization needs to perform Root Cause Analysis:

  1. Increased downtime to plant, equipment or process.
  2. Increase in recurring failures.
  3. Increase in overtime due to unplanned failures.
  4. Increase in the number of trigger events.
  5. Less availability of equipment.
  6. High level of reactive maintenance.
  7. Lack of time… simply can’t do everything that needs doing.
  8. Increase in the number of serious events… nearing the top of the pyramid.
  9. Longer planned “shut” durations.
  10. More frequent “shut” requirement.

These indicators imply that we need to be doing more in the realm of root cause analysis before these issues snowball.
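Several of the indicators above can be checked directly against data you already record. The following is a minimal sketch of that idea, not a prescribed method: it compares one reporting period against a baseline and lists which signs have worsened. The `PeriodMetrics` fields, the `rca_warning_signs` function, the sample figures, and the 50% reactive-maintenance threshold are all hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeriodMetrics:
    """Totals for one reporting period (hypothetical field names)."""
    downtime_hours: float
    recurring_failures: int
    overtime_hours: float
    trigger_events: int
    reactive_work_orders: int
    total_work_orders: int


def rca_warning_signs(baseline, current, reactive_threshold=0.5):
    """Return the indicators that worsened from baseline to current."""
    signs = []
    if current.downtime_hours > baseline.downtime_hours:
        signs.append("increased downtime")
    if current.recurring_failures > baseline.recurring_failures:
        signs.append("increase in recurring failures")
    if current.overtime_hours > baseline.overtime_hours:
        signs.append("increase in overtime")
    if current.trigger_events > baseline.trigger_events:
        signs.append("increase in trigger events")
    # Share of work orders that were reactive. The default 50% threshold
    # is an assumed placeholder, not an industry standard.
    if current.total_work_orders > 0:
        share = current.reactive_work_orders / current.total_work_orders
        if share > reactive_threshold:
            signs.append("high level of reactive maintenance")
    return signs


# Made-up sample figures for two consecutive years.
last_year = PeriodMetrics(120.0, 4, 300.0, 2, 40, 100)
this_year = PeriodMetrics(180.0, 7, 280.0, 3, 65, 100)

for sign in rca_warning_signs(last_year, this_year):
    print(sign)
```

With these sample figures, overtime fell, so it is not flagged; the other four checks are. In practice you would set the baseline period and thresholds to match your own trigger points.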

If you can identify with some of these pain points, download our eBook “11 Problems With Your RCA Process and How to Fix Them” in which we provide best practice advice on using RCA to help eliminate some of these problems.

