Author: Kevin Stewart

This question was posed to a discussion group, and it got me thinking: how do you grade an investigation?

The ultimate measure of success is whether the solution actually prevents recurrence of the problem.  One definition of Root Cause Analysis is: “A structured process used to understand the causes of past events for the purpose of preventing recurrence.” So a reasonable assessment of the quality of the analysis is to determine whether the RCA addressed the problem it set out to fix by ensuring that it never happens again. (This may take a long time to prove if the MTBF of the problem is 5 years, or if it has only happened once.)

Are there other tangibles that can help you assess the quality of an RCA?  Every RCA follows some sort of process, so it stands to reason that there are things you can look for to gauge the quality of the process followed. While this is no guarantee of a correct analysis, confirming that due diligence was followed lends more credibility to the solutions.

What are some of these criteria by which you can judge an analysis?

  • Are the cause statements ‘binary’? By this we mean unambiguous or explicit. Each should be only a few words of precise language, without vague adjectives like “poor”, which can be very subjective.


  • Are the causes free of conjunctions? Words such as and, if, or, but, and because may indicate that there are multiple causes in one statement.


  • Is there valid evidence for each cause? If causes don’t have evidence, they may not belong in the analysis or, worse yet, solutions may be tied to them and be ineffective.


  • Does each cause path have a valid reason for stopping? It is easy to stop too soon, and sometimes it is obvious that this has happened. For example, if a cause of “no PM” has no cause of its own and the branch simply stops, an analyst would in most cases want to know why there was no PM.


  • Does the structure of the chart match the process being used? If it is a principle-based process then it should be easy to check the causal elements to verify that they satisfy those principles. These might be causal logic checks, space-time logic checks, or others associated with the particular process.


  • Is the chart or analysis completed? Does it have a lot of unfinished branches or questions that need to be answered or action items to complete?


  • Are the solutions SMART (Specific, Measurable, Actionable, Relevant, and Timely)? Or do they include words like: investigate, review, analyze, gather, contact, observe, verify, etc.


  • Do the solutions meet a set of criteria against which they can be judged?


  • Do the solutions address specific causes or are they general in nature?  Even though they may be identified against specific causes, if they don’t directly address those causes they may still be a guess.


  • If there is a report, is it well written, short and specific, and does it cover just the basics that an executive would be interested in? The requisites are information such as cost, time to implement, completion date, a brief causal description and the solutions that will solve the identified problem.
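Several of the wording criteria above are mechanical enough to screen for automatically before a human review. The following is a minimal sketch only, not part of any RCA tool: the function names are hypothetical, the conjunction and verb lists come from the criteria above, and every vague adjective other than “poor” is an illustrative assumption. A flagged statement is a prompt for a closer look, not proof of a bad analysis.

```python
import re

# Word lists taken from the criteria above. "poor" is the article's own
# example; the other vague adjectives are illustrative assumptions.
CONJUNCTIONS = {"and", "if", "or", "but", "because"}
VAGUE_ADJECTIVES = {"poor", "bad", "inadequate", "insufficient"}
NON_ACTION_VERBS = {"investigate", "review", "analyze", "gather",
                    "contact", "observe", "verify"}


def _words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def check_cause(statement):
    """Return a list of warnings for one cause statement (hypothetical helper)."""
    words = _words(statement)
    warnings = []
    found = sorted(CONJUNCTIONS.intersection(words))
    if found:
        warnings.append("conjunction(s): " + ", ".join(found))
    vague = sorted(VAGUE_ADJECTIVES.intersection(words))
    if vague:
        warnings.append("vague adjective(s): " + ", ".join(vague))
    return warnings


def check_solution(statement):
    """Return a list of warnings for one solution statement (hypothetical helper)."""
    found = sorted(NON_ACTION_VERBS.intersection(_words(statement)))
    if found:
        return ["non-actionable verb(s): " + ", ".join(found)]
    return []


if __name__ == "__main__":
    print(check_cause("Poor lubrication and no PM"))
    print(check_solution("Review the PM schedule"))
```

A simple word match like this will produce false positives, so treat the output as a list of statements to re-read rather than a grade.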


These are some of the things that I currently look at when I review the projects submitted by clients. I’d be interested to know about other things that may be added to the list.

The key to efficiency is found along the shortest path between any two points.

It is remarkably simple to think of an efficient operation as one that runs in a straight line. Getting from point A to point B is rather “straightforward”, they say.

The challenge is to step back far enough from the daily nuances to be able to see the path we propose. With a clear view, we can see if it is relatively straight or if it is “remarkable” (in its curviness).

In order to travel the path of “straightness”, we need to understand each step we must take along the path. This allows us to understand which steps are then deemed as “extra steps” and are wasted energy without value. Knowing which steps we do not need helps us sharpen the focus on those we do need.

The “extra steps” that do not need to be taken – do not need to be taken.

They are simply wasted energy.

In Reliability circles, the path between point A and point B is the path between the RCM study and the fully prepared CMMS that has delivered a new work instruction document to the technician. Arguably, it contains more than just a few simple steps, as I have illustrated.

This has added complexity when you are confronted with so much new technical jargon, like maintenance items, schedule suppression, document information records, PRT, task lists, secondary tasks, and the like. The hidden purpose of these terms might seem to be simply to confuse the issue, so set them aside for now, and just focus on getting from point A to point B.

The action of integrating a process for efficient action invokes a myriad of words that include “combination, amalgamation, unification, merging, fusing, meshing and blending”. The use of a consistent tool, like the Reliability Integration Tool (RIT), allows you to navigate this jargon with a simple process and traverse from point A to Point B easily.

When we integrate RCM with a leading CMMS like SAP or MAXIMO® we are faced with many additional choices. These additional choices arise because the flexibility in modern CMMS systems has evolved to service a broader spectrum of the market. The market has pushed CMMS designers to allow their CMMS to fit a great many organisations easily. This means the CMMS can probably do anything, but in doing so the CMMS can also do several things you probably don’t need.

Knowing which CMMS features you need now is perhaps the most important “current issue” you will have to solve.

Key in this choice is to not install what I call a “Glass Ceiling”. A glass ceiling is an artificial barrier which limits your organisation’s growth, because you have configured your CMMS in a way that accidentally retards future growth. This can be avoided if you know which CMMS features you will “need in the next three years” before you lock down how your CMMS should operate today.

Today, your goal is still to get from point A to point B. Deliver into the hands of the waiting technician a fully featured professional work instruction when it is required, using the data from your RCM study.

To illustrate the point a little more clearly, let us consider we own a new large piece of equipment.

To ensure the equipment provides many years of trouble free operation, you will apply the RCM method to generate the “content” needed to prepare your initial maintenance strategy.

We can call this Point A!

With this initial strategy and some improvement activities, you intend to operate the new asset following the straightforward, prudent application of maintenance when it is needed, neither earlier nor later. This is all considered best practice stuff – well done!

Now – let’s define Point B as the Preventative Work Instruction you will print from a CMMS work order and hand to your technician. This document is very important because it will serve as the transfer vehicle for all of your hard work. Recall you started with RCM preparing the maintenance strategy and have transitioned to the work instruction content that the technician can execute.

The challenge is of course getting the CMMS to print this document, on time, not early, and complete in the format needed by the technician.

This is not as easy as it sounds.

Understanding the underlying RCM Analysis database tables alone is complex. Aligning the RCM tables to the CMMS database tables is complex integration work that forces the data into load sheets for each CMMS table. This typically is a format that few understand well.
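To make the idea of a “load sheet” concrete, here is a minimal sketch of turning RCM task records into CSV rows for one CMMS table. Everything specific in it is an assumption for illustration: the record fields, the column names in `FIELD_MAP`, and the `to_load_sheet` function are hypothetical and do not represent the actual layout used by SAP, MAXIMO or the Reliability Integration Tool. A real integration repeats this kind of mapping for every CMMS table, which is where the bulk of the effort goes.

```python
import csv
import io

# Hypothetical RCM task records; the field names are illustrative only.
rcm_tasks = [
    {"asset": "PUMP-001", "task": "Inspect mechanical seal", "interval_days": 90},
    {"asset": "PUMP-001", "task": "Replace bearing", "interval_days": 365},
]

# Hypothetical mapping from RCM field names to load-sheet column names.
FIELD_MAP = {
    "asset": "EQUIPMENT_ID",
    "task": "TASK_DESCRIPTION",
    "interval_days": "CYCLE_DAYS",
}


def to_load_sheet(tasks, field_map):
    """Render the task records as CSV text, one row per task."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(field_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for task in tasks:
        # Rename each RCM field to its load-sheet column.
        writer.writerow({column: task[field] for field, column in field_map.items()})
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_load_sheet(rcm_tasks, FIELD_MAP))
```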

Factor in the requirement to produce a work instruction document from a standard template, so that it looks like a professional work instruction document, and you will generally be facing a “heap” of work.

Faced with such a large amount of work, we all want an “easy” way out. What you need is a simple to use, consistently formatted set of tools that help you get from point A to point B. It is important to know that such a set of tools exists and that they are easy to use. ARMS Reliability’s Reliability Integration Tool (RIT) is the leading example of the tools currently available.


The individual toolset items take more space than that provided in this blog, so I will leave you with a teaser.

The tools do exist – and working with ARMS Reliability we can help you travel from point A to point B easily. We can help you do this without wasted steps and produce professional work instruction documents AND also load SAP, MAXIMO and most every other CMMS known to man.

To learn more:

This is just one of the many topics we will cover at the Reliability Summit 2014, October 27 to 30th. For further event details on speakers, topics, workshops, visit If you would like to discuss in further detail the Reliability Integration Tool with one of our consultants, please contact us at


By Kevin Stewart

With all the preparation work (Honing your Facilitation Skills: Part 1) behind you, you’re now ready to start facilitating an Apollo Root Cause Analysis. Follow the steps below to ensure a smooth process and successful outcome.

Step 1. Introductions

First, do some simple introductions and housekeeping. Cover things like:

  • Introductions all around
  • The meeting guidelines: when to take breaks, phone and email policy, and so on
  • The objective: we’re here to fix the problem, not appoint blame
  • A review of the Apollo Root Cause Analysis methodology for those who may not be familiar with it (spend 15 – 45 minutes depending on the audience)
  • Your role as facilitator: you may need to ‘direct traffic’ or change the direction of discussions to help them discover more causes or to reach effective solutions

Step 2. Timeline

It’s now time to capture the ‘story’. What has happened that brought you all here? Get several people to provide a narrative, and develop a timeline of events as you go.

This timeline will prove very useful. It should reveal the event or issue that becomes your primary effect or starting point – and ensure that all the items beyond this starting point capture the group’s issues.

In the example below, if I start from T1 I’ll discover why I left my iPad in the bathroom.  However if I start at T7 I will also discover why my check process didn’t function as desired.

Date | Time | Event                                          | Comment
     | T1   | Leave iPad in department restroom stall        |
     | T2   | Meet wife                                      |
     | T3   | Have lunch                                     |
     | T4   | Return to car to leave                         |
     | T5   | Wife asks if we have everything before we leave |
     | T6   | Pat pocket and look, run through check list    |
     | T7   | Head home without iPad                         |
     | T8   | Get call halfway home asking if I have iPad    |

While the time that each event occurs is important, it might not always be known. In these instances, you can represent the time sequence as simply T1, T2 and so on.
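The effect of choosing a starting point can be sketched in a few lines of code. This is an illustration only: the `TimelineEvent` class and the `events_in_scope` function are hypothetical names, and treating “every event up to and including the starting point” as the material the analysis can draw on is a simplifying assumption, not a rule of the Apollo methodology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimelineEvent:
    label: str                  # sequence label such as "T1"
    event: str                  # what happened
    time: Optional[str] = None  # clock time, left empty when unknown
    comment: str = ""


timeline = [
    TimelineEvent("T1", "Leave iPad in department restroom stall"),
    TimelineEvent("T2", "Meet wife"),
    TimelineEvent("T3", "Have lunch"),
    TimelineEvent("T4", "Return to car to leave"),
    TimelineEvent("T5", "Wife asks if we have everything before we leave"),
    TimelineEvent("T6", "Pat pocket and look, run through check list"),
    TimelineEvent("T7", "Head home without iPad"),
    TimelineEvent("T8", "Get call halfway home asking if I have iPad"),
]


def events_in_scope(events, start_label):
    """Return the events up to and including the chosen starting point."""
    labels = [e.label for e in events]
    return events[: labels.index(start_label) + 1]


if __name__ == "__main__":
    for start in ("T1", "T7"):
        scope = [e.label for e in events_in_scope(timeline, start)]
        print(f"Starting at {start}: {scope}")
```

Starting at T1 leaves only the event of leaving the iPad behind, while starting at T7 also brings the check at T5 and T6 into view, which is why the later starting point uncovers the failed check process as well.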

Step 3. Define the problem

You’re now ready to define the problem. Often, the problem definition comes out easily and everyone agrees. However, sometimes you’ll find that the group can’t arrive at a Primary Effect. In this case, as facilitator, it’s your job to regroup and ask some questions about why everyone is interested. Often, it’s about money.

One thing you don’t want to do is get stuck trying to find the perfect starting point. I’m reminded of a saying I heard once:

Dear Optimist and Pessimist,

While you were trying to decide if the glass was half empty or half full, I drank it!


The Realist

The Apollo Root Cause Analysis methodology is robust enough to handle an imperfect starting point. If the problem changes or evolves as you go, just put it down as the new starting point, adjust the chart and go on!

With a defined problem, and its significance well understood, you’re ready to start the charting process. The team should also know by now why they’re here, and how much time and money can be spent on the investigation.

Would you like to learn more about the Apollo Root Cause Analysis methodology? Our 2 Day Root Cause Analysis Facilitators course is perfect for anyone needing to understand fundamental problem solving processes and how to facilitate an effective investigation.

Reduce costs and boost output by getting more out of your assets.

Across many industries, business conditions have become even tougher. Companies need new and smarter ways to boost margins if they want to remain successful into the future.

Economically, the short-term outlook for many industries is grim. Rising costs and falling commodity prices are putting pressure on the bottom line. Globally, demand in China is falling and businesses must become more competitive to survive. In this tough environment, companies need new ways to increase their margins.

This paper explores a range of methodologies for making assets run more productively, and for increasing their value or reducing their costs in a substantial way that delivers long-term results for business.

CEO, ARMS Reliability

If your asset management strategy is driven by a dynamic RCM simulation model, your forward labour predictions can take into account the likely corrective (unplanned) maintenance as well as, of course, the proactive strategies being followed.

It is often the unplanned element that breaks budgets and upsets planned maintenance, so forward failure forecasts from models updated with the latest equipment history will help initiate investigations into root cause and/or optimisation studies. It is these activities that continue to drive down corrective maintenance in a deliberate manner. For those reasons, the earlier you start making decisions according to dynamic models rather than static experience or subjective judgement, the sooner you can start making improvements. However, it is never too late to start. The start point just determines how much history (read: failures) you have to build your models.

ARMS Reliability are getting ready for the annual AMPEAK conference this June. ARMS will be presenting and exhibiting at this event.

Michael Moulton, Software Sales & Training

The AMPEAK conference has been organized by the Asset Management Council Australia.  Over 4 days, 300+ asset management and maintenance professionals from Australia and around the globe will receive access to the most up-to-date asset management information across a broad range of industries via presentations, tutorials and workshops.

WA Lead Engineer, Weylon Malek, will be presenting ‘Production Reliability Analysis to Improve Asset Management’, and Apollo Trainer and Facilitator, Jack Jager, will be presenting ‘6 Critical Steps for Facilitating a Successful Root Cause Analysis’.

You can see Weylon’s presentation on Tuesday, 3rd June at 1.30pm, and Jack’s presentation on Thursday 5th June at 11.30am.

Don’t forget to stop by the Exhibit area where you can see Michael Moulton at the ARMS booth.

Conference Details:
Perth, WA | 2-5 June | Crown Convention Centre

AMPEAK Conference Website

By Kevin Stewart

A facilitator conducting a Root Cause Analysis using the Apollo method performs a crucial role throughout an investigation. Here are some tips and steps to keep in mind when facilitating:

Over many years, I have repeatedly heard that the ‘Apollo Root Cause Analysis methodology is only used for big, serious investigations.’ This statement always makes me smile – because it is completely untrue.

An RCA using the Apollo Root Cause Analysis methodology can be performed on any problem, large or small, as long as the right facilitator is on board. This article, part 1 of 2, explores the strategies and processes a facilitator should keep in mind when an investigation proceeds.


In my Apollo Root Cause Analysis methodology training classes, I always ask whether anyone is a certified facilitator. I’ve only received one ‘yes’ from the 2,000 or so students that have attended my courses. This sole person will have been trained in how to manage a group of different personalities; how to progress a group towards its goal; how to be firm and fair; and so on.

Yes, these are valuable skills to learn. And, in an ideal world, every facilitator would have the time and resources to complete the training. But you can facilitate a Root Cause Analysis using the Apollo Root Cause Analysis methodology without this certification.

Facilitating RCAs requires flexibility – yet it also requires that you follow a standard outline. While every RCA has its own path, it will generally adhere to these main steps:

  1. Gather information
  2. Define the problem
  3. Create a RealityChart™
    a. Phase one: Create the draft RealityChart™
    b. Phase two: Finish and formalise the RealityChart™
  4. Identify solutions
  5. Finalise the report

The process – as laid out above in its basic format – may look a little daunting to someone who has never facilitated an RCA before, particularly if you are contending with other feelings – like being anxious in front of a crowd, or feeling responsible for the outcome. You will need to deal with these latter issues in your own way.

What you can take charge of is finding a way to shape a group of disparate people into a highly functioning team, who share the common goal of reaching a solution. By following the steps below, you can prepare for a smooth facilitation process.


Step 1. Familiarise yourself with the Apollo Root Cause Analysis methodology.

First, ensure you are familiar with the Apollo Root Cause Analysis methodology – after all, it’s what you’re trying to facilitate. If you need a review, the RealityCharting™ learning centre is a great place to visit to recap on the basics. Here, you can complete a simulated scenario to really fine-tune your understanding of the process.  It would also be a good idea to review the facilitation guidelines in the manual that you received with your original training.  It gives an excellent overview of the entire process.

Step 2. Gather your supplies.

Stock up on post-it notes – and get the good, super-sticky ones that will stay on the wall.

We suggest that you use post-it notes instead of a computer to perform the analysis, as these help to enhance the common reality.  With post-it notes, all participants can see what’s happening.

If you think the analysis will take a few days, get multiple colours of post-it notes so you can easily distinguish between the changes to the chart created on different days or at different times.

Ensure the room you’re working in has plenty of wall space. And, if the walls are unsuitable for post-it notes, tape poster paper to the wall first and then adhere your post-its. Using paper has the extra advantage of making the chart easy to remove. If it’s sensitive subject matter, you can roll it up and take it with you at the end of the day.

Step 3. Prepare the participants.

Ensure that all participants know what to expect before beginning an RCA. An RCA can require a significant time commitment, so make it clear from the outset how much time is needed from them.

Step 4. Gather information.

The more information you have at the outset, the smoother the journey.

You may already have information at hand in the form of pictures, emails, reports, write-ups, witness statements, and so on.  There may be some useful physical evidence. Request evidence from the right people, collect it and store it in one file.

You may also choose to take the entire team to see the area under investigation, so that everyone has a clear picture in mind about what you’re discussing.

Be aware that, no matter how hard you try, there will always be some missing information.  This is not a problem. You can call someone, look it up at the time, or make an action item for someone to gather the evidence later.

By Kevin Stewart

One definition of Root Cause Analysis  is:
Root Cause Analysis is any structured process used to understand the causes of past events for the purpose of preventing recurrence.

This basic premise is the reason that the RCA is done.

On the surface, it always appears to be a simple matter. However, there are always pitfalls and nuances.

One such pitfall that RCA investigators or facilitators face is something I call the “problem is fixed” syndrome. In my work at plants I would run across situations where a problem occurred and a solution was implemented. The particular solution used may or may not have been arrived at by using RCA. In either case the solution is implemented and the “problem is fixed.”

How is this statement validated as being true? Those involved will justify the solution by the simple fact that the problem hasn’t recurred, at least not in the immediate future, which unfortunately is sometimes the focus of plant management due to pressures, career goals or other reasons. On the surface this may seem to be difficult to argue – after all the problem is fixed – or is it?

In the cases I have been involved with, what has really happened is that the MTBF (Mean Time Between Failure) of the problem is actually a long time, say 5 years or greater. I was involved in two investigations where the incident hadn’t happened in the previous 5 years and most likely wouldn’t happen for another 5 years. Investigations had been performed and solutions were offered and implemented.

When asked about the effectiveness of the solutions, the evidence given was that the incident hadn’t recurred, so the solution must have been effective. On the surface this may appear difficult to argue against, since it is true that the problem hasn’t recurred. However, by looking at the MTBF of the incident, you can point out that because the MTBF is long, the effectiveness of the solution put in place will not be known until the problem recurs at some time in the future. At this particular time, doing nothing, or applying any other proffered solution, would appear just as effective, since the problem wouldn’t have recurred anyway. You can easily see that if a facility is not careful, it could be “fixing” problems with long MTBFs and claiming success without actually having provided effective solutions. This argument supports a thorough and complete RCA that is based on the cause and effect principle and supported by evidence, to ensure an effective solution is implemented.
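A rough calculation shows how weak “it hasn’t happened again” is as evidence when the MTBF is long. The sketch below assumes failures occur at a constant average rate (an exponential model), which is an assumption made purely for illustration and was not part of the investigations described; the function name `prob_no_recurrence` is likewise hypothetical.

```python
import math


def prob_no_recurrence(mtbf_years, elapsed_years):
    """Chance of seeing no failure over the elapsed period with no fix at all,

    assuming a constant failure rate (exponential model).
    """
    return math.exp(-elapsed_years / mtbf_years)


if __name__ == "__main__":
    for years in (1, 2, 5):
        p = prob_no_recurrence(5.0, years)
        print(f"{years} year(s) without recurrence, no fix applied: {p:.0%}")
```

Under that assumption, a problem with a 5 year MTBF has roughly an 82% chance of staying quiet for a full year even if nothing is changed, so a quiet year says very little about whether the solution worked.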

In one of the cases above, the solution was to do more frequent maintenance to ensure the problem was identified. While this would have worked for anything that had an MTBF longer than the maintenance interval chosen, it would not have worked for something that had an MTBF shorter than that interval. In addition to not working in all cases, it would have increased the cost of maintenance significantly. In this particular case, a little more investigation and some additional causes on the chart revealed that some external damage had been done and not reported, which caused the issue. If they could fix the unreported damage issue, they would have an effective solution that covered the situation that brought this incident on, and it would most likely also prevent other incidents that hadn’t even happened yet.

In this case you can see that the offered solution would have appeared to work just fine, and since they did “something”, everyone would feel good about the work and the “effective” solution.

The other incident involved someone who had recently returned to work after an extended leave. During an operating situation this employee correctly followed an incorrect procedure that was posted at the unit. The solution was to replace the incorrect posted procedure. While replacing the procedure was necessary, they would not know whether it was effective for quite a while. Again, a little more investigation and a few more causes identified that there was no process to replace modified procedures around the plant. If this was fixed, then an effective solution would be in place. You can see that here also the plant management would be thrilled because an investigation was done, something was put in place and the problem hasn’t happened again. I’m sure you can see that this situation could very well happen again, either at this unit or at other similar pieces of equipment.

Both of these examples also point out that a good RCA must be done using valid principles and evidence for the causes and you must not stop too soon! Stopping too soon is another common mistake in RCA – but that is another tip.

In the meantime, be aware of incidents with long MTBF and of offered solutions that are based on poor analysis or inappropriate causes.



What are your thoughts on conducting an RCA facilitation / Investigation and how much time have you spent preparing the analysis and implementing solutions?  Do you have a successful tip worth sharing or discussing? We look forward to reading your feedback and perspective via comments below or let’s connect on our LinkedIn Group – ARMS Reliability – Reliability & RCA for further discussion.


ARMS Reliability’s CEO, Mick Drew, recently attended the COO Leaders Resources Summit held on the Gold Coast. The summit brought together a number of the resource sector’s most influential executive managers and operators for two days of corporate and management level discussions concerning some of the most important issues currently facing the Australian resources sector.

Mick Drew said, “It was a great opportunity to engage in frank discussions with some of the industry’s leaders. They told us more about their unique challenges, and what issues are impacting their companies.” Mick continued, “They wanted to understand more about how we can use our expertise and experience to make a positive change to their maintenance practices, asset management and bottom line.”

From the various panel discussions, workshops, and one-on-one meetings, it was clear there is high demand for experts in Reliability & Maintenance who are able to offer guidance and clarity around the key challenges and issues facing the mining industry.

Become part of a vibrant community that will share knowledge, experience and innovation.

Mainstream 2014 kicks off in Perth on Monday, 12th May. It’s an exciting event where Asset Management leaders and teams come together to share knowledge, experience and innovation.

Mainstream is different to other asset management conferences; it’s an interactive experience rather than a sit-and-listen event. With 40+ sessions, workshops, roundtables, panel discussions and live interviews, you’re sure to find a session that will be of interest to you.