Author: Kevin Stewart

At some point, most companies will want to see quantifiable metrics showing that their Root Cause Analysis (RCA) program has resulted in a positive return on investment (ROI).

ROI is relatively easy to calculate as a dollar value when it comes to tangibles such as equipment or production time. Things can seem trickier when trying to assign a dollar value to safety improvements resulting from an RCA program. Try to keep it simple.

This formula –

Cost of the Problem x Likely Recurrence / Cost of the Fix = ROI

is a straightforward way to begin quantifying the ROI of your RCA program, including its effects on safety.
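As a minimal sketch, the formula can be written as a small Python function. The function name and the dollar figures below are hypothetical, chosen only to illustrate the arithmetic.

```python
def rca_roi(cost_of_problem: float, likely_recurrence: float, cost_of_fix: float) -> float:
    """ROI = Cost of the Problem x Likely Recurrence / Cost of the Fix."""
    if cost_of_fix <= 0:
        raise ValueError("cost_of_fix must be greater than zero")
    return cost_of_problem * likely_recurrence / cost_of_fix


# Hypothetical figures: a $50,000 problem expected to recur 3 times,
# with a total fix cost (investigation plus implementation) of $30,000.
example_roi = rca_roi(cost_of_problem=50_000, likely_recurrence=3, cost_of_fix=30_000)
print(example_roi)  # 5.0
```

A result above 1 suggests the fix costs less than the recurrences it is expected to prevent.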

Let’s look at how we might calculate these costs.

Cost of the Fix

  • Cost of an RCA investigation (you may need to include the initial training, though this should drop off as it is amortized over the life of the program, as well as whatever time, resources, and people are required to conduct the investigation itself).
  • Cost of whatever resources are needed to implement a solution. Don’t forget to include new equipment, parts, additional training, and anything else that is directly attributable to the implementation.

When you eliminate a problem, calculating what you have saved depends a lot on the problem itself and what its rate of reoccurrence is. For instance, if you figure out what was causing a particular machine to fail at a rate of once/year, you won’t see the benefits of your solution for another year. It can take several years and solving many different problems to see the total value of an RCA program.

Improved safety isn’t as impossible to quantify as it might seem. While most companies don’t publicly discuss this type of equation because it can seem insensitive, chances are your company does calculate the monetary cost of an injury or death on the job. These figures may be a bit outdated, but the Mine Safety and Health Administration at the US Department of Labor offers an online calculator, which takes into account both direct costs (like workers’ comp claims) and indirect costs (like training a new worker and lower morale), as one example.

Cost of the Problem Reoccurring

Cost of the initial problem in equipment, production delays, man hours, workers’ comp claims, medical costs, absenteeism, turnover, training new employees, lower productivity, decreased morale, legal fees, increased insurance costs.

At first glance the equation doesn’t quite make sense for a safety “near miss.” If it missed, then what did it cost? Is the answer nothing? So the ROI is: 0 x likely recurrence / cost of the fix = 0? The answer obviously must include the potential cost: the cost to the business if the issue had been on target and hadn’t missed. It all becomes subjective then. How do you put a cost on maybes?

It might help to look at the statistics of how an incident occurs. Take the cost to the business if a single major accident occurred (every business has this unspoken cost locked away somewhere) and then very simply do the math. One near miss will be worth 0.003 of that cost. Tally up your near misses and now go back to the formula.

[Figure: accident pyramid]

As an example, say your data indicates you have 3000 near misses in two years, or 4.1 incidents per day. Then you put a program in place and now you have 3000 near misses in four years, or 2.1 incidents per day. At the old rate you would have had 6000 near misses over those four years, so this translates to 3000 fewer near misses in four years’ time. Per the above calculations, this would generate 3000 x 0.003, or nine, fewer major incidents at whatever cost your company assigns to that type of incident. This becomes the savings for your ROI (or the Cost of the Problem in our equation) and can be attributed to the safety program of which the RCA process is a part.
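The near-miss arithmetic can be sketched in Python as follows. The 0.003 weight and the near-miss counts come from the example above; the function names and the $1,000,000 major-accident cost are assumptions for illustration only. Both rates are compared over the same four-year window.

```python
# Weight taken from the article: one near miss is worth 0.003 of a major accident.
NEAR_MISS_WEIGHT = 0.003


def near_misses_avoided(count_before, years_before, count_after, years_after, window_years):
    """Near misses avoided over a common window, comparing the two observed rates."""
    rate_before = count_before / years_before  # near misses per year before the program
    rate_after = count_after / years_after     # near misses per year after the program
    return (rate_before - rate_after) * window_years


def equivalent_major_incidents(avoided, weight=NEAR_MISS_WEIGHT):
    """Convert avoided near misses into equivalent major incidents."""
    return avoided * weight


# Article's figures: 3000 near misses in two years before, 3000 in four years after.
avoided = near_misses_avoided(3000, 2, 3000, 4, window_years=4)
incidents = equivalent_major_incidents(avoided)

# Hypothetical cost of a single major accident; substitute your company's own figure.
major_accident_cost = 1_000_000
savings = incidents * major_accident_cost

print(avoided, round(incidents, 2), round(savings))
```

The resulting savings figure is what goes into the Cost of the Problem side of the ROI formula.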

This formula will assist in calculating an ROI on an individual RCA, which is necessary to show that the process is working and providing value so you can justify the program. However, since most safety programs track TRIR (Total Recordable Injury Rate) or something to that effect, you will also need to show that the RCA program affects this, too. This will be difficult because the safety program is in place and doing other things to prevent safety incidents before they happen. How do you attribute a reduction in near misses to preventive programs versus items put in place from an RCA?

You may never be able to separate these items. Even with detailed records, it is not always clear why people do what they do. The best thing you can do is to track when an RCA program was incorporated and then show the improvement in your safety metric, in TRIR, or near misses.

You can use this information to justify the program with the argument that the RCA process is part of the overall safety program and it really doesn’t matter which gets the credit as long as we have continued to drive safety improvements. The RCA program should be a small part of the overall safety program costs since there are usually several full time safety people involved, committee meetings, safety initiatives, programs, etc.

It doesn’t matter how you slice and dice it, the return on investment for your RCA program boils down to: What will it cost me to fix the problem now? – versus – What is the cost if this problem happens again?

Author: Jack Jager

An effective root cause analysis process can improve business outcomes significantly. Why is it then that few organisations have a functioning root cause analysis process in place?

Here are the top 6 sure-fire ways to kill off a Root Cause Analysis program

1. Don’t use it.


The company commits to the training, creates an expectation of use and then doesn’t follow through with commitment, process and resources! Now come on, how easy is it to devalue the training and deliver a message that the training was just to tick someone’s KPI box and that the process doesn’t really need to be used.

2. Don’t support it.

Success in Root Cause Analysis would be the ultimate goal of each and every defect elimination program. Achieving that success, however, requires a bit more than just training people in how to do it. It requires structures that support the initial training, that mentor and provide feedback on the journey towards excellence in application, and that thereafter delineate exactly when an investigation needs to take place and deliver clear support in terms of time and people to achieve the desired outcome. Without support for the chosen process the expected outcomes are rarely delivered.

3. Don’t implement solutions.

To do all of the work involved in an investigation and then notice that there have been no corrective actions implemented, that the problem has recurred because nothing has changed, has got to be one of the easiest ways to kill off a Root Cause Analysis process. What happens when people get asked to get involved in RCAs or to facilitate them when the history indicates that nothing happens from the efforts expended in this pursuit? “I’m too busy to waste my time on that stuff!”

 

4. Take the easy option and implement soft solutions.

Why are the soft controls implemented instead of the hard controls? Because they are easy and they don’t cost much and we are seen to be doing something about the problem. We have ticked all the boxes. But will this prevent recurrence of the problem? There is certainly no guarantee of this if it is only the soft controls that we implement. We aren’t really serious about problem solving are we, if this is what we continue to do?

5. Continue to blame people.

The easy way out! Find a scapegoat for any problem that you don’t have time to investigate or that you simply can’t be bothered to investigate properly. But will knowing who did it actually prevent recurrence of the problem?

Ask a different question! How do you control what people do? You control them, or more correctly their actions, by training them, by putting in the right procedures and protocols, by providing clear guidelines on what they can or can’t do, by creating standard work instructions for everyone to follow and by clearly establishing what the rules are in the workplace that must be adhered to.

What sort of controls are these if we measure them against the hierarchy of controls? They are all administrative controls, deemed to be soft controls that will give you no certainty that the problem will not happen again. We know this! So why do we implement these so readily? Because it is the easy way out! It ticks all the boxes, except the one that says “will these corrective actions prevent recurrence of the problem?”

We all understand the hierarchy of controls but do we actually use it to the extent that we should?

6. We don’t know if we are succeeding because we don’t measure anything.

You get what you measure! When management don’t implement or audit a process for completed RCAs it sends a strong message that there is no interest, or little, in the work that is being done to complete the analysis.

Track KPIs such as: How many RCAs have been raised against the triggers set? How many actions have been raised in the month as a result and, of those actions raised, how many have been completed? If management is not interested in reviewing these things regularly, along with the number of RCAs subsequently closed off in a relevant period, then it won’t be long before people notice that no one is interested in the good work being done.

The additional work done to complete RCAs will not be seen as necessary, as it’s not important enough to review and the work or the effort in doing this will then drop away until it’s no longer done at all.
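A minimal sketch of how these KPIs might be tallied each month is shown below. The class, field names and sample numbers are hypothetical; they simply mirror the questions above (RCAs raised against triggers, RCAs closed, actions raised and completed).

```python
from dataclasses import dataclass


@dataclass
class MonthlyRcaStats:
    """Hypothetical monthly counts for an RCA program."""
    triggers_hit: int        # events that met an RCA trigger
    rcas_raised: int         # RCAs actually raised against those triggers
    rcas_closed: int         # RCAs closed off in the period
    actions_raised: int      # corrective actions raised from RCAs
    actions_completed: int   # corrective actions completed


def ratio(part: int, whole: int) -> float:
    """Return part / whole, or 0.0 when there is nothing to measure against."""
    return part / whole if whole else 0.0


def kpi_summary(stats: MonthlyRcaStats) -> dict:
    """Summarise the month as three simple ratios for management review."""
    return {
        "rcas_raised_vs_triggers": ratio(stats.rcas_raised, stats.triggers_hit),
        "rca_closure_rate": ratio(stats.rcas_closed, stats.rcas_raised),
        "action_completion_rate": ratio(stats.actions_completed, stats.actions_raised),
    }


# Hypothetical month: 10 triggers, 8 RCAs raised, 4 closed, 20 actions, 15 completed.
summary = kpi_summary(MonthlyRcaStats(10, 8, 4, 20, 15))
print(summary)
```

Reviewing even a short summary like this on a regular schedule signals that the work is being noticed.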


Another interesting point is that if only the number of investigations is reported, and there is no check on the quality of the analysis being completed, then anything can be whipped up as no one is looking! If a random audit is completed on just one of the analyses completed in a month then this implies that the quality of the analysis is important to the organisation.

What message do we send if we don’t measure anything?

 

 

In closing, the first step on the road to implementing an effective and sustainable Root Cause Analysis program is to pinpoint what’s holding it back. These Top 6 sure-fire ways to kill off a Root Cause Analysis program will help you identify your obstacles, and allow you to develop a plan to overcome them.


Philip Sage, Principal Engineer

An “unreliable” manufacturing process costs more money to operate.

Management “always says” we need to improve.

Individually, we know that “You cannot improve what you do not measure”.

So we must conclude if we want to make our process “reliable” we must measure the process reliability.

The search for measurable data that can be utilised may seem hard. Equally hard can be a high-level understanding of when a process is reliable: what, specifically, must a process exhibit to be deemed reliable?

This question was posed to a discussion group and it got me thinking: how do you grade an investigation?
The overall measure of success is whether the solution actually prevents recurrence of the problem. One definition of Root Cause Analysis is: “A structured process used to understand the causes of past events for the purpose of preventing recurrence.” So a reasonable assessment of the quality of the analysis would be to determine whether the RCA addressed the problem it set out to fix by ensuring that it never happens again (this may take a long time to prove if the problem’s mean time between failures (MTBF) is 5 years, or if it has only happened once).

Are there some other tangibles that can help you assess the quality of an RCA?  RCAs use some sort of process to accomplish their task. If this is the case then it would stand to reason that there will be some things you can look for in order to gauge the quality of the process followed. While this is no guarantee of a correct analysis, ensuring that due diligence was followed in the process  would lend more credibility to the solutions.

What are some of these criteria by which you can judge an analysis?

  • Are the cause statements ‘binary’? By this we mean unambiguous or explicit. A few words only and precise language use without vague adjectives like “poor” since they can be very subjective.
  • Are the causes void of conjunctions? If they have conjunctions there may be multiple causes in the statement. Words such as: and, if, or, but, because.
  • Is there valid evidence for each cause? If causes don’t have evidence they may not belong in the analysis or worse yet solutions may be tied to them and be ineffective.
  • Does each cause path have a valid reason for stopping that makes sense? It is easy to stop too soon and is sometimes obvious. For example, if a cause of “no PM” has no cause for it so that the branch stops, it would seem that an analyst in most cases would want to know why there was no PM.
  • Does the structure of the chart meet the process being used? If it is a principle-based process then it should be easy to check the causal elements to verify that they satisfy those principles. These might be causal logic checks or space time logic checks or others that were associated with the particular process.
  • Is the chart or analysis completed? Does it have a lot of unfinished branches or questions that need to be answered or action items to complete?
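The first two checks in the list above lend themselves to a rough automated screen. The sketch below is only an illustration: the function name is hypothetical, the conjunction list is the one given above, and the vague-word list contains just the article’s own example (“poor”), which you would extend for your own use. It flags statements for a human reviewer and does not judge whether a cause is correct.

```python
import re

# Conjunctions listed in the article as signs of multiple causes in one statement.
CONJUNCTIONS = {"and", "if", "or", "but", "because"}

# Vague adjectives; "poor" is the article's example, extend as needed (assumption).
VAGUE_WORDS = {"poor"}


def review_cause_statement(statement: str) -> list:
    """Return a list of issues found in a cause statement (empty if none)."""
    words = set(re.findall(r"[a-z']+", statement.lower()))
    issues = []
    found_conjunctions = sorted(words & CONJUNCTIONS)
    if found_conjunctions:
        issues.append("conjunction(s): " + ", ".join(found_conjunctions))
    found_vague = sorted(words & VAGUE_WORDS)
    if found_vague:
        issues.append("vague word(s): " + ", ".join(found_vague))
    return issues


print(review_cause_statement("Pump seal failed"))
print(review_cause_statement("Poor lubrication and no PM"))
```

Whole-word matching is used so that a word such as “corroded” is not flagged for containing “or”.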


  • Are the solutions SMART (Specific, Measurable, Actionable, Relevant, and Timely)? Or do they include words like: investigate, review, analyze, gather, contact, observe, verify, etc.
  • Do the solutions meet a set of criteria against which they can be judged?
  • Do the solutions address specific causes or are they general in nature?  Even though they may be identified against specific causes if they don’t directly address those causes then it may still be a SWAG*.
  • If there is a report, is it well written, short and specific, and does it cover just the basics that an executive would be interested in? Information such as cost, time to implement, when it will be completed, a brief causal description and solutions that will solve the identified problem are the requisites.

These are some of the things that I currently look at when I review the projects submitted by clients. I’d be interested to know about other things that may be added to the list.

* SWAG =  Scientific Wild Ass Guess


The key to efficiency is found along the shortest path between any two points.

It is remarkably simple to think of an efficient operation as one that runs in a straight line. Getting from point A to point B is rather “straight forward” they say.

The challenge is to step back far enough from the daily nuances to be able to see the path we propose. With a clear view, we can see if it is relatively straight or if it is “remarkable” (in its curviness).

In order to travel the path of “straightness”, we need to understand each step we must take along the path. This allows us to understand which steps are then deemed as “extra steps” and are wasted energy without value. Knowing which steps we do not need helps us sharpen the focus on those we do need.

The “extra steps” that do not need to be taken – do not need to be taken.

They are simply wasted energy.

In Reliability circles, the Path between point A and Point B is the path between the RCM study and the fully prepared CMMS system that has delivered a new work instruction document to the technician. Arguably they contain more than just a few simple steps as I have illustrated.

This has added complexity when you are confronted with so much new technical jargon, like maintenance items, schedule suppression, document information records, PRT, task lists, secondary tasks, and the like. The hidden purpose of these terms might seem to be simply to confuse the issue, so set them aside for now, and just focus on getting from point A to point B.

The action of integrating a process for efficient action invokes a myriad of words that include “combination, amalgamation, unification, merging, fusing, meshing and blending”. The use of a consistent tool, like the Reliability Integration Tool (RIT), allows you to navigate this jargon with a simple process and traverse from point A to Point B easily.

When we integrate RCM with a leading CMMS like SAP or MAXIMO® we are faced with many additional choices. These additional choices arise because the flexibility in modern CMMS systems has evolved to service a broader spectrum of the market. The market has pushed CMMS designers to make their CMMS fit a great many organisations easily. This means the CMMS can probably do anything, but in doing so the CMMS can also do several things you probably don’t need.

Knowing which CMMS features you need now is perhaps the most important “current issue” you will have to solve.

Key in this choice is to not install what I call a “Glass Ceiling”. A glass ceiling is an artificial barrier which limits your organisation’s growth, because you have configured your CMMS in a way that accidentally retards future growth. This can be avoided if you know which CMMS features you will “need in the next three years” before you lock down how your CMMS should operate today.

Today, your goal is still to get from point A to point B. Deliver into the hands of the waiting technician a fully featured professional work instruction when it is required, using the data from your RCM study.

To illustrate the point a little more clearly, let us consider we own a new large piece of equipment.

To ensure the equipment provides many years of trouble free operation, you will apply the RCM method to generate the “content” needed to prepare your initial maintenance strategy.

We can call this Point A!

It is with this initial strategy, plus some improvement activities, that you intend to operate the new asset, following the straightforward, prudent application of maintenance when it is needed, not before or later than needed. This is all considered best practice stuff – well done!

Now – let’s define Point B as the Preventative Work Instruction you will print from a CMMS work order and hand to your technician. This document is very important because it will serve as the transfer vehicle for all of your hard work. Recall you started with RCM preparing the maintenance strategy and have transitioned to the work instruction content that the technician can execute.

The challenge is of course getting the CMMS to print this document, on time, not early, and complete in the format needed by the technician.

This is not as easy as it sounds.

Understanding the underlying RCM Analysis database tables alone is complex. Aligning the RCM tables to the CMMS database tables is complex integration work that forces the data into load sheets for each CMMS table. This typically is a format that few understand well.

Factor in the requirement to produce the work instruction from a standard template, so that it looks like a professional work instruction document, and you will generally face a “heap” of work.

Faced with such a large amount of work, we all want an “easy” way out. What you need is a simple-to-use, consistently formatted set of tools that helps you get from point A to point B. It is important to know that such a set of tools exists and that the tools are easy to use. ARMS Reliability’s Reliability Integration Tool (RIT) is the leading example of the tools currently available.


The individual toolset items take more space than that provided in this blog, so I will leave you with a teaser.

The tools do exist – and working with ARMS Reliability we can help you travel from point A to point B easily. We can help you do this without wasted steps and produce professional work instruction documents AND also load SAP, MAXIMO and most every other CMMS known to man.

To learn more:

This is just one of the many topics we will cover at the Reliability Summit 2014, October 27 to 30th. For further event details on speakers, topics, workshops, visit armsreliabilityevents.com. If you would like to discuss in further detail the Reliability Integration Tool with one of our consultants, please contact us at info@armsreliability.com


 

Join the leading global provider of reliability engineering to learn how to get more from your assets, avoid unplanned downtime and reduce operating costs

The ARMS Reliability Summit will be held from 27-30 October on the Gold Coast, Australia, and, for the first time, in Austin, TX.

The Reliability Summit is a weeklong event, and the only one dedicated to helping reliability engineers learn, network and develop their skills. This year’s event will feature many of the new developments in integrating reliability with asset management, and the tools and methods that have proven to deliver the fastest, most efficient ways to build a proactive asset management system.

The event will also feature the latest from Apollonian, including online RCA; the latest from Isograph, including enterprise versions of their traditional software products; and the latest developments from ARMS, including Vulnerability Analysis and integrating plan and work instruction documentation for rapid maintenance strategy implementation. As in past years, this event provides opportunities to roll the sleeves up, challenge your knowledge, and network and learn from others’ experience.

 

 

For further information and to register your interest, visit armsreliabilityevents.com

Reduce costs and boost output by getting more out of your assets.

Across many industries, business conditions have become even tougher. Companies need new and smarter ways to boost margins if they want to remain successful into the future.

Economically, the short-term outlook for many industries is grim. Rising costs and falling commodity prices are putting pressure on the bottom line. Globally, demand in China is falling and businesses must become more competitive to survive. In this tough environment, companies need new ways to increase their margins.

This paper explores a range of methodologies for making assets run more productively, and for increasing their value or reducing their costs in a substantial way that delivers long-term results for business.

CEO, ARMS Reliability

If your asset management strategy is driven by a dynamic RCM simulation model, it means your forward labour predictions can take into account the likely corrective (unplanned) maintenance as well as, of course, the proactive strategies being followed.

It is often the unplanned element that breaks budgets and upsets planned maintenance, so forward failure forecasts from updated models using the latest equipment history will help initiate investigations into root cause and/or optimisation studies. It is these activities that help continue to drive down the corrective maintenance in a deliberate manner. For those reasons, the earlier you start making decisions according to dynamic models rather than static experience or subjective judgement, the sooner you can start making improvements. However, it is never too late to start. The start point just determines how much history (read: failures) you have to build your models.

ARMS Reliability are getting ready for the annual AMPEAK conference this June. ARMS will be presenting and exhibiting at this event.

Michael Moulton, Software Sales & Training

The AMPEAK conference has been organized by the Asset Management Council Australia.  Over 4 days, 300+ asset management and maintenance professionals from Australia and around the globe will receive access to the most up-to-date asset management information across a broad range of industries via presentations, tutorials and workshops.

WA Lead Engineer, Weylon Malek, will be presenting ‘Production Reliability Analysis to Improve Asset Management’, and Apollo Trainer and Facilitator, Jack Jager, will be presenting ‘6 Critical Steps for Facilitating a Successful Root Cause Analysis’.

You can see Weylon’s presentation on Tuesday, 3rd June at 1.30pm, and Jack’s presentation on Thursday 5th June at 11.30am.

Don’t forget to stop by the Exhibit area where you can see Michael Moulton at the ARMS booth.

Conference Details:
Perth, WA | 2-5 June | Crown Convention Centre

AMPEAK Conference Website