ARMS Reliability’s CEO, Mick Drew, recently attended the COO Leaders Resources Summit held on the Gold Coast. The summit brought together a number of the resource sector’s most influential executives and operators for two days of corporate and management-level discussions on some of the most important issues currently facing the Australian resources sector.

Mick Drew said, “It was a great opportunity to engage in frank discussions with some of the industry’s leaders. They told us more about their unique challenges, and what issues are impacting their companies.” Mick continued, “They wanted to understand more about how we can use our expertise and experience to make a positive change to their maintenance practices, asset management and bottom line.”

From the various panel discussions, workshops, and one-on-one meetings, it was clear there is high demand for experts in Reliability & Maintenance who are able to offer guidance and clarity around the key challenges and issues facing the mining industry.

Become part of a vibrant community that will share knowledge, experience and innovation.

Mainstream 2014 kicks off in Perth on Monday, 12th May. It’s an exciting event where Asset Management leaders and teams come together to share knowledge, experience and innovation.

Mainstream is different to other asset management conferences; it’s an interactive experience rather than a sit-and-listen event. With 40+ sessions, workshops, roundtables, panel discussions and live interviews, you’re sure to find a session that will be of interest to you.

The manufacturing industry is under immense pressure. Globalisation and increased competition, coupled with a more demanding consumer base, force manufacturers to seek new ways to boost the bottom line.

To improve ROI and respond to customer demands for faster supply at a lower cost, many companies are required to run their manufacturing plants 24 hours a day, 7 days a week. They are squeezing every last drop of availability and capacity from their assets.

By Jack Jager

How often have you looked at corrective actions and thought that they would have little, if any, impact in preventing the problem from recurring? It wasn’t just once… and it continues to happen.

The Question is Why?

Yet the answer is not a simple or straightforward one. Do we believe that the person(s) creating these corrective actions aren’t trying to do their best? No, I don’t think so. I firmly believe that almost all people are trying to do their best. So where does that leave us?

I think that we are caught up in a system where reactive quick fixes are the goal, the way of dealing with incidents on a day-to-day basis. If you have a downtime incident and bring the power back on quickly after an outage, or get the machine back in operation after a short space of time, then the reaction from the management group and from all of your peers is typically “Well done! Great job!” A pat on the back for those who have performed the job well. In other words, we give respect and accolades to those who can fix it quickly. Conversely, there is often little reward or acknowledgement for hours of diligent work in the pursuit of actions that will resolve the issue once and for all. We reinforce the quick fixes.

Now don’t get me wrong here, because the ability to do the quick fix is and always will be a valuable skill, but the real challenge is to understand whether we have prevented the problem from recurring.

What happens after the initial fix is put into place? Where do you go from there? In the completely reactive model, the fire-fighting model, where breakdown maintenance often takes precedence over planned maintenance (which then sets you up for the next round of failures), there is always a fire that needs tending, so we typically jump to that fire, to the next problem on the list. “I have dealt with that one, what’s next?”

The Blame Game

From my conversations with people who attend the courses that I present on the Apollo Root Cause Analysis methodology, something else becomes blatantly clear. We still seem, on many different levels, to be playing the “blame game”. The question of “who” still seems to be of paramount importance to some, perhaps many, people. The question I would put to these people is “Will knowing who did it stop it from happening again?” To my way of thinking, by far the most common answer to this question will be “No” (although there are exceptions). So why do we feel that we need to focus on the “who”? If the goal of doing Root Cause Analysis is to prevent recurrence of the problem, the challenge lies not so much in who was involved but rather in focusing on what you can do to stop it from happening again. This focus will lead to gathering more factual information, which is the essence of understanding the problem first and foremost.

The “who” side of the question is pretty easy to determine, but if that is what we focus on then it is likely to limit thorough questioning and lead quickly and easily down a blame path. Sanctions are given or jobs lost, all based on the knowledge of “who” was at fault. But where does this lead? Wouldn’t it lead to a lack of reporting of mistakes or faults, as there will be unwanted consequences because of the report? Doesn’t it elevate risk, as there would now be a culture of hiding or covering up mistakes? When you ask questions, what are you likely to get? The truth?

Something else to consider is whether people intend to cause damage, create failures, injure themselves or hurt others. Again the overwhelming answer is “NO”. That people are often involved in many incidents, and make mistakes, is seemingly the constant part of the equation. But that is the nature of the beast. People are fallible; they do make mistakes, and no matter how hard we try to control this aspect, the “human error” side of causes, the attempt is forever doomed to failure. If we rely on trying to control people then our solutions will have no certainty in their outcome. Going down this path is simply not reliable.

Hierarchy of Control

This is echoed in the concept of the “Hierarchy of control”, where corrective actions are placed within the hierarchy as being a form of Elimination, Substitution, Engineering, Administrative or PPE control.

The first three of these are perceived to be very strong controls, or hard controls, with almost guaranteed, reliable, consistent results. They are, however, more time-consuming and typically involve spending money to achieve your desired outcome. Administrative controls or the use of PPE as a form of control are perceived to be soft controls. They are relatively quick to implement and don’t cost too much, and yet if you were to ask the question “Will they prevent recurrence?”, almost universally the response will be “NO”!
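
The hard/soft split described above can be sketched in a few lines of Python. This is only an illustration: the names `Control`, `is_hard_control` and `soft_only`, and the two sample corrective actions, are hypothetical and not part of the Apollo methodology or any standard. Only the ordering of the five levels and the hard/soft boundary are taken from the text.

```python
from enum import IntEnum


class Control(IntEnum):
    """Hierarchy of control levels, strongest first (order as listed in the article)."""
    ELIMINATION = 1
    SUBSTITUTION = 2
    ENGINEERING = 3
    ADMINISTRATIVE = 4
    PPE = 5


def is_hard_control(control: Control) -> bool:
    """The first three levels are treated as hard controls; the rest are soft."""
    return control <= Control.ENGINEERING


def soft_only(actions: dict[str, Control]) -> bool:
    """True when a non-empty set of corrective actions contains no hard control."""
    return bool(actions) and all(not is_hard_control(c) for c in actions.values())


# Hypothetical corrective actions from an incident report (illustrative only).
actions = {
    "Rewrite the isolation procedure": Control.ADMINISTRATIVE,
    "Issue face shields": Control.PPE,
}

if soft_only(actions):
    print("Only soft controls proposed: recurrence is unlikely to be prevented.")
```

A check like this could flag reports where every proposed action sits in the bottom two levels, prompting the question of whether a hard control was considered before sign-off.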

They may, however, satisfy the need to report. I have “ticked the box” and created a perception of having done something about the incident. To take this a step further, these “soft options” now get signed off by management who are fully cognisant of the “Hierarchy of control”. If we keep taking the soft options, however, is it any wonder that we are still “fire fighting”? If we don’t fundamentally change or control the causes that create the problem, then the problem can still happen again, regardless of the “who”, the person involved. This could be anyone.

Creating another Procedure

How often have you heard or seen, as a response to a problem, “create another procedure”? Would you be certain that this will prevent recurrence of the problem? It could be said that you have tried to control the problem. You can certainly show that you have done something. Would it, however, be defensible in a court of law if someone were to subsequently get hurt? Is it feasible to expect someone to remember every single procedure, for every single task, of the many tasks that they need to perform every single day? And we all know it is a soft control! An administrative one. So do the courts.

The Argument about Sanctions…

Who learns the most from the mistakes that are made? Isn’t it the person or the people involved? This was put into perspective for me by another Apollo instructor at a conference in Indianapolis. He said to me, “If someone makes a mistake, for instance, and the cost of that mistake might be, say, $500,000, and you are so angered by this that you then sack the person who made the mistake (quite possible, even probable)… it is like sending someone on a $500,000 training course and then sacking them the next day.”

Does this make any sense?


What have you learned from conducting an RCA? Do you have any successful tips or feedback worth sharing or discussing? We look forward to reading your feedback via the comments below, or let’s connect on our LinkedIn Group – ARMS Reliability – Apollo Root Cause Analysis for further discussion.


By Amir Datoo, Senior Reliability Engineer

The power industry is investing heavily in new technologies to harvest power from renewable sources like wind, solar and hydro. Yet with these new technologies come massive maintenance costs – if strategies are not put in place from the outset.

As we march further into the 21st century, the power sector is undergoing a massive shift. With climate change high on the global agenda, the industry as a whole is committed to finding alternatives to conventional fossil fuel power generation. Low-carbon power sources like wind, solar and hydro are being pursued by even the most traditional power companies.

By Joel Smeby, Senior Reliability Engineer, and Michael Drew, Managing Director, ARMS Reliability.

It’s a common phrase and one that is thrown around often.  But what does it really mean to have an optimised strategy?  If someone asks if your strategies have been optimised, can you answer with a resounding ‘Yes!’ and explain exactly what that means?

An optimised maintenance strategy means that your equipment is being maintained and operated at the lowest possible cost with respect to labour, spare parts, equipment, and failure effects. Failure effects may consider cost of downtime, safety and environmental considerations, or operational impact. In these cases it means that your facility is being maintained and operated in a way that is within your corporate risk thresholds, meets operational goals and has the lowest overall costs.

Landmark product release puts users of Isograph’s Availability Workbench™ just one click away from saving even more time, energy and cost in their asset management and maintenance programs.

Global Reliability and Asset Management consulting firm ARMS Reliability has released the Reliability Integration Tool™, a powerful new software tool with global application in the resource, utilities, power and transport industries.

The Reliability Integration Tool™ equips reliability engineers and asset managers with the power to seamlessly upload and download data between Isograph’s Availability Workbench™ and their CMMS.

Some successful implementations of Continuous Improvement (“CI”) use the approach known as Kaizen (#1). One of the core principles of Kaizen is self-reflection on processes, which is also known as “Feedback”. The purpose of CI is the identification, reduction, and elimination of suboptimal processes; in other words, to become efficient. If you follow Kaizen, efficiency is achieved through incremental steps, or evolutionary change (#2).

The purpose of this article is to introduce how Availability Workbench™ (“AWB”) can be used to achieve each of the three Kaizen aspects of Continuous Improvement, namely Feedback, Efficiency and Evolutionary change. We begin with Feedback.



When an incident or accident occurs at your workplace, what do you do to fix the problem?

In many cases, the “5 Whys process” is a proven and accepted means to get to the root cause of the incident. But what do you do if this technique doesn’t dive deep enough – and only presents further symptoms rather than the real cause or, indeed, causes?

This eBook reveals the benefits and limitations of the 5 Whys process, and then presents a useful method for taking the analysis further.


By Ned Callahan

Everybody agrees, don’t they, that the whole point of investigating safety incidents, whether injuries have actually been suffered or the potential for them was high, is to prevent their recurrence? Regrettably, the tendency to blame is more apparent in these cases than in mechanical failures or supply chain deviations, for example, presumably because of the deeper emotional responses from the affected parties.

The significance of the particular event can then be intensified because the variety and depth of the participants’ emotional responses are undeniably “real” and can, if not appropriately accommodated in the total incident management process, cloud the judgement of the investigator/s and even complicate the task for the team of analysts assembled for the RCA. Minimising the risk of friction, avoiding undue “heat” being generated by the harm (nearly) caused, can be achieved by the prompt application of an investigation process which both encourages and relies upon the frank sharing of information in order to achieve the agreed objective.

A mature business will have a risk matrix which pre-determines the level at which the investigation is undertaken and therefore, which “tool” or methodology may be prescribed for the particular event. The previous deliberations about which method to use for what level/type of event will have been influenced by the organisation’s previous analysis history, incorporating the relative success or otherwise of previous investigations. These results will have been generated by multiple factors such as the quality of evidence, determined by the care taken in its collection and preservation, the rigour of the facilitation process, the relative “influence” of stakeholders and significantly, the co-operation of the incident actors, being the victim/s and witness/es.
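
As a rough illustration of how a risk matrix might pre-determine the level of investigation, consider the sketch below. Everything in it is an assumption made for the example: the three-point severity and likelihood scales, the score thresholds, and the names of the investigation levels are hypothetical, and a real organisation would substitute its own matrix.

```python
# Hypothetical 3x3 risk matrix. The scales, scores and thresholds are
# illustrative assumptions, not values from the article or from any standard.
SEVERITY = {"minor": 1, "moderate": 2, "major": 3}
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3}


def investigation_level(severity: str, likelihood: str) -> str:
    """Map an event's (potential) severity and likelihood to an investigation level."""
    score = SEVERITY[severity] * LIKELIHOOD[likelihood]
    if score <= 2:
        return "trouble-shooting"
    if score <= 4:
        return "internal RCA"
    return "facilitated RCA"


print(investigation_level("minor", "rare"))
print(investigation_level("major", "likely"))
```

Because the point of the investigation is prevention, the severity fed into such a lookup would normally be the potential severity of the event, not only the harm actually suffered.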

An event, being the first of its type in the organisation, with a very minor injury and no time lost may only require a “trouble-shooting” type approach. The expectations of regulatory authorities in hazardous industries can be another influence on the choice.

But then all that experience, positive, negative or mixed, can be neutralised by the emergence of a different principal with responsibility for the RCA process who has experience of another method or specific training and expertise and has the clout to sway the choice. It may well be simply based on a personal preference arising from familiarity rather than an objective assessment of alternatives.

Regardless of the methodology selected, the purpose must be to prevent recurrence and not to blame. If the investigation focuses primarily on “who” did or did not do something or other, the tenor of the subsequent analysis may become negative and the opportunity to really learn from the experience will be subordinate to the search for a culprit. By the way, this “no blame” attitude does not exempt personnel who are repeatedly and wilfully negligent in the performance of their duties or associated activities in the workplace. The owners have a duty of care to provide a safe workplace for all, and if misbehaviours increase the risk of harm they are obliged to respond. Reprimand is a reasonable sanction. Or, in the most severe but rare cases, dismissal might be reasonably justified. The justification would be the thorough, objective analysis. Otherwise the organisation could find itself liable to unfair dismissal or similar charges.

The need for objectivity cannot be over-stated and explains why best practice for significant events is to engage a third party facilitator who has no “skin in the game”. If the broad business context for deep analysis is Continuous Improvement, the enhanced safety of the workplace and all processes and equipment operations used by its employees must be the outcome.

Keeping in mind that every event is unique in some respects – the most obvious being that it happened at a different time to every other one (you know of) – the purpose of the RCA is to discover what is different or distinctive about this event. What are the other unique causes which might be effectively controlled or negated in order to significantly reduce the likelihood of a repetition or similar occurrence?

So, after the exhaustive process has been followed, with the facts associated with the incident having been recorded, the consequences measured and documented, the timeline and sequence of events mapped, any cans of worms expertly opened and explored, you have discovered a number of causes. Typically and ideally, you will have discovered causes of which you were ignorant at the beginning of the analysis. And these will only be discovered if the event is sliced thinly, if every phase is considered very carefully. These ought to be documented in some graphical form so that the team’s understanding of the event can be shared and agreed as complete. The cause and effect chart or tree is the most common display form employed and there needs to be provision for the display of the pertinent evidence for each cause.

It is imperative that all of the causes are revealed before you can be confident that prevention is assured. Being persistent in the quest for causes is a very desirable trait. Don’t stop too soon. Then, the existence of clearly defined relationships between the causes and their effects will provide the clarity necessary to instil confidence that the consequent solutions will be effective. It is the solutions, targeting specific causes, which combine to assure prevention, or at least, serious mitigation of the consequences.

But the job is incomplete. The solutions need to be implemented in a timely fashion to have an effect on the probability of recurrence. If, for example, one of the causes is the failure of some mechanism, then identifying a solution for that may also entail deeper investigation to determine other failure modes which could have similar, potentially harmful effects. Note, however, that the investigation is not per se a solution, even though it may provide data which leads one to alternative or complementary solutions.

Establishing the priorities for that implementation and assigning ownership and due dates for completion provide the closure everybody needs. It will be a learning experience for all intimately concerned, but it can and should be shared more widely in a large organisation. Nobody disagrees with a safe workplace. That attitude will reflect well on the organisation, and community regard may well be heightened. A safe workplace also reduces the likelihood of interruptions to business, and this increased reliability will strengthen long-term relationships with customers and suppliers alike.
