Author: Philip Sage, CMRP, CRL

Traditionally, SAP is populated with Master Data with no real consideration of future reliability improvement. Only once maintenance is actually being executed does the real pressure of any underperforming assets drive consideration of the reliability strategy. At that point the mechanics of what’s required for ongoing reliability improvement, based upon the SAP Master Data structure, are exposed and, quite typically, almost unviable.

The EAM system is meant to support reliability. Getting your EAM system to support reliability requires some firm understanding of what must happen. If we look a little closer at reliability and the phases of life of an asset, we can see why the EAM settings must vary and not be fixed.

The initial reliability performance of any system is actually determined by its design and component selection.

This is probably not a big surprise for anyone close to reliability, but it may spark some debate from those who have not heard this before.

As evidence to support this statement, a newly commissioned and debugged system should operate nearly failure free for an initial period of time and only become affected by chance failures on some components. An even closer inspection can show that during this period, we can expect that most wear out failures would be absent after a new machine or system is placed into service. During this “honeymoon period” preventative replacement is actually not necessary, nor would an inspection strategy provide benefit, until such time as wear (or unpredictable wear) raises the possibility of a failure. Within this honeymoon period the components of the system follow an exponential failure distribution (a constant failure rate) and fail due to their individual chance failures only. They should only be replaced if they actually fail and not because of some schedule. Minor lubrication or service might be required, but during this initial period, the system is predominantly maintenance free and largely free from failure.
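To put a rough number on the honeymoon period, here is a minimal sketch in Python. It assumes each component has a constant chance-failure rate (the exponential model) and that the components sit in series, so any single failure stops the system. The series arrangement and the failure rates are illustrative assumptions, not figures from a real asset.

```python
import math

def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Probability of surviving `hours` under a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)

def series_failure_rate(component_rates: list[float]) -> float:
    """For independent components in series, constant failure rates add up."""
    return sum(component_rates)

# Hypothetical chance-failure rates (failures per operating hour).
rates = [1e-5, 2e-5, 5e-6]
system_rate = series_failure_rate(rates)
print(f"System survives 1,000 h with probability {reliability(system_rate, 1000):.3f}")
```

Because the rate is constant, a component that has survived so far is no more likely to fail in the next hour than a new one, which is why scheduled replacement adds no benefit during this period.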

Here is where the first hurdle occurs.

After the initial period of service has passed, then it is reasonable to expect both predictable and unpredictable forms of wear out failures to gradually occur and increase in rate, as more components reach their first wear out time.

Now if repair maintenance (fixing failures) is the only strategy practiced, then the system failure rate would be driven by the sporadic arrivals of the component wear out failures, which will predictably rise rather drastically, then fluctuate wildly resulting in “good” days followed by “bad” days. The system failure rate driven by component wear out failures, would finally settle to a comparatively high random failure rate, predominantly caused by the wear out of components then occurring in an asynchronous manner.

With a practice heavily dependent upon repair maintenance, the strength of the storeroom becomes critical, as it makes or breaks the system availability which can only be maintained by fast and efficient firefighting repairs. The speed at which corrective repairs can be actioned and the logistical delays encountered, drive the system availability performance.

From this environment, “maintenance heroes” are born.

As the initial honeymoon period passes, the overall reliability of the system becomes a function of the maintenance policy, i.e. the overhaul, parts replacement, and inspection schedules.

The primary role of the EAM is to manage these schedules.

The reduction or elimination of predictable failures is meant to be managed through preventative maintenance tasks, housed inside the EAM, that counter wear out failures. Scheduled inspections help to counter the unpredictable failure patterns of other components.

If the EAM is properly configured for reliability, there is a tremendous difference in the reliability of a system. The system reliability becomes a function of whether or not preventative maintenance is practiced or “only run to failure then repair” maintenance is practiced. As a hint: the industry wide belief is that some form of preventative practice is better than none at all.

Preventative maintenance is defined as the practice that prevents the wear failure by preemptively replacing, discarding or performing an overhaul to “prevent” failure. For long life systems the concept revolves around a minimal repair, made by replacing the failed component, that restores the system to service in “like new” condition. Repair maintenance, by contrast, is defined as a strategy that waits until the component in the system fails during the system’s operation.

If the EAM is not programmed correctly or if the preventative tasks are not actioned, then the reliability of a system can fall to ridiculously low levels, where random failures of components of the recoverable system plague the performance and start the death spiral into full reactive maintenance.

This is quite costly, because to be even marginally effective it also requires a fully stocked storeroom, which raises inventory carrying costs. Without a well-stocked storeroom, there are additional logistical delays associated with each component, and these add up in their impact on system availability and uptime, so system availability becomes a function of spare parts.

An ounce of prevention goes a long way.

Perhaps everything should be put on a PM schedule…? This is actually the old school approach, and I find it still exists in practice all over the world.

The reliability of a system is an unknown hazard and is affected by the relative timing of the preventative task. This timing comes from the EAM in the form of a work order which is supposed to be generated relative to the wear out of the component. How well this task aligns with reality is quite important. If the preventative work order produced by the EAM system comes out at the wrong time, there is a direct adverse effect on system reliability.

EAM systems are particularly good at forecasting the due date of the next work order and creating a work order to combat a component wear out failure. However, wear is not always easily predicted by the EAM and so we see in practice, that not all EAM generated work orders suppress the wear out failures. One reason for this variance is the EAM system work order was produced based on the system calendar time base along with a programmed periodicity that was established in the past to predict the future wear performance.

We don’t always get this right.

As a result we generate work orders for work that is not required, or work that should have been performed before the component failed, not just after the component failed.

Maybe this sounds familiar?

Calendar based forecasts assume wear is constant with time. It is not.

A metric based on operating hours is often a more complete and precise predictor of a future failure. It’s true most EAM systems today allow predictable work to be actioned and released by either calendar time or operating hours and allow other types of time indexed counters to trigger PM work orders.
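The difference between the two triggers can be sketched in a few lines of Python. This is not how any particular EAM implements its scheduling. The `PmPlan` class, its field names, the 2,000 hour and 90 day intervals, and the 90% release point are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class PmPlan:
    interval_hours: float        # operating hours between services (assumed)
    interval_days: float         # calendar days between services (assumed)
    lead_fraction: float = 0.9   # release the order at 90% of the interval

def due_by_counter(plan: PmPlan, hours_since_service: float) -> bool:
    """Release the work order based on the running-hours counter."""
    return hours_since_service >= plan.lead_fraction * plan.interval_hours

def due_by_calendar(plan: PmPlan, days_since_service: float) -> bool:
    """Release the work order based on elapsed calendar days."""
    return days_since_service >= plan.lead_fraction * plan.interval_days

plan = PmPlan(interval_hours=2000, interval_days=90)

# Duty pump: runs 24 h/day, last serviced 80 days ago.
duty_hours, duty_days = 24 * 80, 80
# Standby pump: runs about 2 h/day, last serviced 85 days ago.
standby_hours, standby_days = 2 * 85, 85

print("duty    counter:", due_by_counter(plan, duty_hours),
      "calendar:", due_by_calendar(plan, duty_days))
print("standby counter:", due_by_counter(plan, standby_hours),
      "calendar:", due_by_calendar(plan, standby_days))
```

With these made-up numbers the calendar trigger is late for the hard-working duty pump and early for the lightly used standby pump, which is the pattern of missed and unnecessary work orders described above.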

A key to success is producing the work order just ahead of the period of increased risk of failure due to wear. Whether triggered by calendar or by some other counter, this anticipation of failure, and the work order to combat it, is what we call the traditional view of maintenance.


This all sounds simple enough.

The basic job of a reliability engineer is to figure out when something will likely fail based on its past performance and schedule a repair or part change. The EAM functionality is used to produce a work order ahead of the failure, and if that work is performed on-time, we should then operate the system with high reliability.

The reliability side of this conjecture, when combined with an EAM to support it, is problematic.

If the work order is either ill-timed from the EAM or not performed on time during the maintenance work execution, there is an increased finite probability that the preventative task will not succeed in its purpose to prevent a failure. Equally devastating, if the PM schedule is poorly aligned or poorly actioned, the general result mirrors the performance expected from a repair maintenance policy, and the system can decay into a ridiculously low level of reliability, with near constant sporadic wear out of one of the many components within the system.

When preventative maintenance is properly practiced so that it embraces all components known to be subject to wear out, a repairable system can operate at high reliability and availability with a very low “pure chance” failure rate and do so for indefinitely long periods of time.

Determining what to put into the EAM is really where the game begins.


MASTERING ENTERPRISE ASSET MANAGEMENT WITH SAP, 23-26 October 2016, Crown Promenade, Melbourne

Phil Sage will be running a full day workshop “Using SAP with Centralised Planning to Continually Improve RCM Derived Maintenance Strategies” Wednesday 26 October

Come learn what works, and what does not work, as you integrate SAP EAM to support your reliability and excellence initiatives, which are needed to be best in class in asset management. The workshop covers how and where these tools fit into an integrated SAP framework, what is required to make the process work, and the key links between reliability excellence, failure management and work execution using SAP PM.

This question came up during one of our most recent webinars and we thought it raised a very interesting point. Joel Smeby is an experienced reliability engineer who leads our North American engineering team and has helped implement reliability initiatives in many different organizations across a variety of industries.

Here is what Joel had to say about the role of a reliability team as it relates to calculating the cost of downtime:

Reliability is typically not directly responsible for production. But when you look at all of the different areas within an organization (purchasing, spare parts, warehouse, operations, maintenance, safety), Reliability is the one area that should stand across all of them.  The organizational structure may not necessarily be set up in that way, but in terms of being able to talk to people in maintenance, operations, or purchasing and leverage all of that information into a detailed analysis and then make decisions at that level – I think it is Reliability that needs to do that.

I recently worked on a site and went to the operations department to validate their cost of downtime and they weren’t able to give us a solid number. It changed from day to day or week to week and from an organizational perspective it’s very difficult to make decisions based on data when you haven’t defined that number.  As Reliability Engineers we need that downtime number to justify holding spare parts or performing preventive/predictive maintenance tasks.  If Operations has not defined that then I think that a Reliability Engineer is the perfect person to facilitate that discussion.  It can sometimes be a difficult conversation to have, especially if you’re gathering the information from people in upper management.  One strategy is to help people understand why you’re gathering that information and how it will be used.  Justifying maintenance and reliability decisions is all about balancing the cost of performing maintenance against the cost of downtime in order to get the lowest overall cost of ownership.  The managers who have a budget responsibility that includes both maintenance and operations will typically appreciate this approach in finding the lowest cost to the organization.

Some organizations are able to determine the cost of downtime as a $/hour.  This is done in the most basic sense by taking the annual profit that the equipment is responsible for and dividing by the number of hours the equipment runs each year (8,760 hours for continuous operation).  A deeper level of analysis may be required in more complex operations such as batch processes.
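As a minimal sketch of that arithmetic in Python: the first function is the basic $/hour calculation described above, and the second adds maintenance spend to the expected downtime cost so that two strategies can be compared. The profit figure, maintenance costs and downtime hours are hypothetical, and the function names are ours.

```python
HOURS_PER_YEAR_CONTINUOUS = 8760  # 365 days x 24 hours

def downtime_cost_per_hour(annual_profit: float,
                           run_hours_per_year: float = HOURS_PER_YEAR_CONTINUOUS) -> float:
    """Basic cost of downtime: annual profit divided by annual run hours."""
    if run_hours_per_year <= 0:
        raise ValueError("run_hours_per_year must be positive")
    return annual_profit / run_hours_per_year

def annual_cost_of_strategy(maintenance_cost: float,
                            expected_downtime_hours: float,
                            cost_per_hour: float) -> float:
    """Maintenance spend plus the cost of the downtime that strategy leaves behind."""
    return maintenance_cost + expected_downtime_hours * cost_per_hour

# Hypothetical equipment earning $8.76M a year in continuous operation.
rate = downtime_cost_per_hour(annual_profit=8_760_000)
run_to_failure = annual_cost_of_strategy(20_000, 40, rate)
preventive = annual_cost_of_strategy(35_000, 10, rate)
print(f"${rate:,.0f}/h, run to failure ${run_to_failure:,.0f}, preventive ${preventive:,.0f}")
```

In this made-up case the strategy with the higher maintenance spend is still the cheaper one overall, which is the balance a reliability engineer needs the downtime number to demonstrate.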

The traditional view of a maintenance strategy is that the level of effort put into preventing a failure is dependent on the type and size of equipment. The reliability based approach understands the cost of downtime, and therefore the equipment’s importance. This enables the maintenance strategy to be optimized to the overall lowest cost for the organization.

Join the conversation in our reliability discussion group on LinkedIn

The Age of Renewables 



Landsvirkjun is Iceland’s largest producer of electricity, and one of the 10 largest renewable energy companies in Europe. Its power infrastructure is ranked among the World’s best and most reliable—an important competitive advantage that allows the company to attract and retain industrial clients like Alcoa, Rio Tinto Alcan and others. With its asset base both growing and aging, Landsvirkjun was outgrowing its existing asset management systems and needed a more robust approach to investment decision making and long-term planning.

In this case study from the December 2015 issue of Assets magazine, ARMS Reliability’s partner in Asset Investment Planning and Management—Copperleaf Technologies—describes the journey the company took to implement C55, and the benefits they’ve achieved.


ARMS Reliability and Copperleaf Technologies are partners in providing asset intensive industries in the Australian and New Zealand markets with cutting edge solutions in the area of Asset Investment Planning and Management (AIPM). Under this partnership agreement, ARMS Reliability acts as the distributor for Copperleaf’s AIPM solution, C55, and provides implementation services and on-going support for the C55 product in the ANZ region.

Click here for more information about Copperleaf and C55.


By: Gary Tyne CMRP, CRL

Engineering Manager – ARMS Reliability Europe

Working for a global organization has taken me to some weird and wonderful places around the world. Experiencing different cultures, traditions, religions and people certainly opens your eyes to the wonderful and colorful place we all call home.

I would say in most of these countries I have at some stage taken a taxi or at least been chauffeured by a driver in a customer’s company vehicle. These experiences have led to some interesting conversations on life, travel, politics, and football with some very knowledgeable and diverse taxi drivers. On the other hand, I have had drivers who have not spoken a word and have just delivered me to my destination in silence, even after I tried to engage them in conversation.

A recent taxi encounter occurred when I had just left my customer and was going to call for a taxi, when I spotted someone being dropped off at my current location. I asked the driver if he could take me to Dublin airport and he obliged.

This is when I met Mohammed, an immigrant from Kenya who had moved to Ireland 17 years ago. He was smiling and cheerful and had a generally happy persona about him. We discussed weather in Ireland versus Mombasa, we mentioned football briefly, and then we started to discuss cars. This occurred when a brand new Mercedes went past us in the fast lane and I passed comment on what a beautiful car that was.

Mohammed started to discuss the Toyota Corolla in which we were driving and how he loved his car for its level of reliability. I asked how many miles his vehicle had driven and he pointed out that he had covered over 300,000 miles since he purchased the car brand new in Northern Ireland. He went on to explain how he ensured that it was regularly maintained to a high standard, with the best quality oil and original OEM parts being used when any replacements were required. The engine and gearbox were original and, as he put it, ‘you look after your car, it will look after you.’ Mohammed was proud of the length of service he had achieved from his vehicle and that the car had never let him down. However, as the vehicle operator he recognized the importance of regular maintenance and the use of the right quality parts. He also said that he only allowed one mechanic to work on his vehicle, because that mechanic was very skilled and competent at his job and Mohammed could not trust others to do work on his taxi.

Mohammed was also proud to be a taxi driver in Ireland, and that pride, combined with his ‘Reliability’ story, certainly made the trip to Dublin airport a memorable one. Mohammed did not know my job role or that I had spent over 30 years in Maintenance and Reliability, but he gave me a textbook account of what ‘Reliability’ is! I said goodbye to Mohammed after he let me take a picture of his mileage and car. I wished him luck and many more years of happy motoring in his reliable Toyota motor vehicle.

Sitting in the departure lounge, I found that my trip to the airport and conversation with Mohammed had certainly made me think:

  • Do we see this level of passion and ownership amongst today’s industrial operators?
  • Should Operators take more care for their assets, ensuring high reliability through a program of basic care?
  • How do we ensure the right levels of competence in our technicians?
  • How do we ensure that the correct specification and quality of parts are being purchased?
  • How do we ensure that maintenance is being performed at the right frequency on the right asset?

This ‘Reliability Tale from the Taxi’ may have also generated further questions in your own mind. For me, it provided another great ‘Reliability’ story that I can share during one of our global reliability training courses.


As its name suggests, an “asset” is a useful or valuable thing. Indeed, the antonym of “asset” is “liability”. Hence, an organization’s assets should deliver value; not cost money. With the right techniques and strategies in place, asset managers can ensure that their plant and equipment are performing, and being maintained, at optimum levels. These many and varied techniques can be applied across the different phases of an asset’s life to ensure that, instead of draining money from the bottom line, it actively contributes to margin increases.

Managed the right way, assets can contribute significantly to profit margins. It takes a strategic approach to maintenance and asset management, in key areas such as:

  1. Increasing availability and plant capacity
  2. Reducing unnecessary maintenance costs
  3. Reducing unnecessary spares holding costs
  4. Planning optimum retirement of plant and equipment

Once you determine a key focus area, it’s important to apply the right technique.

Margin Increase Techniques

System Analysis

The primary objective of System Analysis is to identify and eliminate bottlenecks in a system. It is particularly useful in complex operations where the contributions of different parts of the system are not clear. An analyst performing System Analysis builds a representative model using reliability block diagrams, and runs a simulation to produce a quantitative view of the contribution of all parts of a system. The technique is used to assess the reliability of individual components and their dependencies on other events or assets in order to assess the overall availability of the system. This helps to determine the importance of each element, so that the analyst can play “what if” with different levels of redundancy, size of buffers, maintenance strategies, and spares holding levels, in order to find the optimum.
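For a feel of how a reliability block diagram turns into a number, here is a minimal Python sketch. It uses closed-form series and parallel formulas in place of the simulation a full System Analysis would run, and it assumes every block fails independently. The three-block line and its availabilities are hypothetical.

```python
from math import prod

def series(*availabilities: float) -> float:
    """All blocks must be up: availabilities multiply."""
    return prod(availabilities)

def parallel(*availabilities: float) -> float:
    """Redundant blocks: the group is down only if every block is down."""
    return 1 - prod(1 - a for a in availabilities)

# Hypothetical line: feeder -> pump -> filter.
base = series(0.98, 0.90, 0.97)

# "What if" we install a second, redundant pump?
with_spare_pump = series(0.98, parallel(0.90, 0.90), 0.97)

print(f"base {base:.3f}, with redundant pump {with_spare_pump:.3f}")
```

The pump is the weakest block in this example, so adding redundancy there lifts system availability more than the same investment elsewhere would. A simulation-based model extends the same idea to buffers, repair times and spares.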

Maintenance Benefit Analysis

Unfortunately, there has been a long tradition of organizations fostering a culture of maintenance in which the maintenance crews are lauded as heroes when they step in to fix things that are broken. In such cultures, preventative maintenance is less appreciated, despite it being proven to save money. Maintenance Benefit Analysis – similar to Maintenance Optimization – is used to evaluate a maintenance plan and identify any areas where maintenance is either not needed or is not optimal. It identifies where current practice can be improved by choosing a different type of strategy or frequency.

Spares Optimization

Typically, maintenance crews love spares and want lots of them in their plant or facility. Yet plant managers resent having too many spares in stock as they tie up capital and take up storage space. Spares Optimization is all about finding the optimum level of spares to hold; a level that balances the cost of not having spares available against the cost of holding the spares in stock.
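One way to sketch that balance in Python is shown below. It assumes demand for the spare during a replenishment period follows a Poisson distribution, and it prices each unmet demand at a flat stockout cost. The distribution, the cost figures and the function names are illustrative assumptions rather than a description of any particular spares tool.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k demands when the average is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def expected_shortage(stock: int, mean_demand: float, max_demand: int = 60) -> float:
    """Expected number of demands that cannot be met from stock."""
    return sum((k - stock) * poisson_pmf(k, mean_demand)
               for k in range(stock + 1, max_demand + 1))

def total_cost(stock: int, mean_demand: float,
               holding_cost: float, stockout_cost: float) -> float:
    """Cost of holding the spares plus the expected cost of running out."""
    return stock * holding_cost + stockout_cost * expected_shortage(stock, mean_demand)

def best_stock_level(mean_demand: float, holding_cost: float,
                     stockout_cost: float, max_stock: int = 20) -> int:
    """Stock level (0..max_stock) with the lowest total cost."""
    return min(range(max_stock + 1),
               key=lambda s: total_cost(s, mean_demand, holding_cost, stockout_cost))

# Hypothetical part: 2 demands per period on average, $100 to hold one
# spare for the period, $10,000 for each demand that finds the shelf empty.
print(best_stock_level(mean_demand=2.0, holding_cost=100.0, stockout_cost=10_000.0))
```

Raising the stockout cost pushes the optimum up and raising the holding cost pushes it down, which is the argument between the maintenance crew and the plant manager expressed as a calculation.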

Repair vs Replace Analysis

Knowing when to replace a piece of equipment shouldn’t be guesswork, as the right time to replace can save hundreds of thousands of dollars in repairs. Repair vs Replace Analysis is used to predict or track the costs of repairs against the cost of replacement. As the cost of repairs increases (which incorporates costs like labor and parts), it becomes less viable to maintain the asset. Plus, as the cost of new equipment falls, it becomes more viable to buy it new. Life Cycle Cost analysis can be applied to assess the optimum point to switch from repair-mode to replace-mode.
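A deliberately simple Python sketch of the switch-over point follows. It flags the first year in which the projected repair bill exceeds the straight-line annualized cost of a new asset. A real Life Cycle Cost analysis would also account for discounting, downtime and the new asset’s own maintenance, and all figures and names here are hypothetical.

```python
from typing import Optional, Sequence

def first_replace_year(annual_repair_costs: Sequence[float],
                       replacement_cost: float,
                       new_asset_life_years: float) -> Optional[int]:
    """Return the first year (1-based) in which repairs cost more than
    the annualized cost of a replacement, or None if that never happens."""
    annualized_replacement = replacement_cost / new_asset_life_years
    for year, repair_cost in enumerate(annual_repair_costs, start=1):
        if repair_cost > annualized_replacement:
            return year
    return None

# Hypothetical projection: repairs climb as the asset ages.
projected_repairs = [10_000, 20_000, 30_000, 60_000, 90_000]
year = first_replace_year(projected_repairs,
                          replacement_cost=200_000,
                          new_asset_life_years=4)
print("Switch to replace-mode in year:", year)
```

Tracking actual repair costs against the same threshold, rather than projected ones, turns this into a running check instead of a forecast.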

ARMS Reliability can show you how to achieve great cost savings and margin increases across the whole organization by using these techniques and their associated software tools; and will train your team to implement and manage these changes proactively.

In most cases, there is much to gain by working through maintenance strategy optimization. To identify where your company’s maintenance strategy sits on the spectrum, you can perform a simple self-assessment that looks for the most common symptoms, which are described in detail in our guide “5 Symptoms Your Maintenance Strategy Needs Optimizing.” If the symptoms are evident, then there is a strong business case to invest in maintenance strategy optimization. The primary question in diagnosing the health of your maintenance strategy is a simple one. Does your maintenance strategy need optimizing? Ideally, your maintenance strategy is already optimized. Perhaps it was, but is in need of a tune-up. Or, as is the case in many companies, maybe you are experiencing endemic symptoms that lead to:

  • Recurring problems with equipment.
  • Budget blow-outs from costly fixes to broken equipment.
  • Unplanned downtime that has a flow-on effect on production.
  • Using equipment that is not performing at 100 percent.
  • Risk of safety and environmental incidents.
  • Risk of catastrophic failure and major events.

To identify where your company’s maintenance strategy sits on the spectrum, you can perform a simple self-assessment that looks for the most common symptoms.

  1. Increase in unplanned maintenance – A sure sign that your maintenance strategy is not working is the simple fact that you are performing more unplanned maintenance, which is caused by an increase in the occurrence of breakdowns.
  2. Rising maintenance costs – In companies that apply best practice maintenance strategy optimization, total maintenance costs are flat or slightly decreasing month-on-month. These optimized strategies combine preventative tasks with various inspection and root cause elimination tasks, which in turn produces the lowest cost solution.
  3. Excessive variation in output – A simple definition of the reliability of any process is that it does the same thing every day. In other words, equipment should run at nameplate capacity day in and day out. When it doesn’t, this is an indication that some portion of the maintenance strategy is misaligned and not fully effective.
  4. Strategy sticks to OEM recommendation – Sticking to the maintenance schedule prescribed by Original Equipment Manufacturers (OEMs) may seem like a good starting point for new equipment. But it’s only that: a starting point. There are many reasons why you should create your own optimized maintenance strategy soon after implementation.
  5. An inconsistent approach – Consistency implies lack of deviation. And this implies standardization. When it comes to maintenance strategies, standardization is essential.

For an in depth look at these symptoms download the complete guide “5 Symptoms Your Maintenance Strategy Needs Optimizing” 

If your maintenance activities have a large proportion of reactive repairs then the costs of maintaining your assets are larger than they need to be, because the cost of performing unplanned maintenance is typically three times the cost of performing maintenance in a planned manner. Furthermore, if your system is reactive, it is a sign that you are not managing failures. Your biggest costs may be catastrophic failure, systemic failure or equipment defects.
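The “three times” rule of thumb makes the penalty easy to estimate. The Python sketch below takes what the maintenance workload would cost if all of it were planned, then applies the multiplier to the share done reactively. The $100,000 workload and the function name are hypothetical, and the multiplier is simply the typical figure quoted above.

```python
UNPLANNED_COST_MULTIPLIER = 3.0  # "typically three times" the planned cost

def maintenance_cost(planned_equivalent_cost: float,
                     reactive_fraction: float,
                     multiplier: float = UNPLANNED_COST_MULTIPLIER) -> float:
    """Total cost when `reactive_fraction` of the work is done unplanned."""
    if not 0.0 <= reactive_fraction <= 1.0:
        raise ValueError("reactive_fraction must be between 0 and 1")
    planned_part = (1.0 - reactive_fraction) * planned_equivalent_cost
    reactive_part = reactive_fraction * planned_equivalent_cost * multiplier
    return planned_part + reactive_part

# Hypothetical workload worth $100,000 if every job were planned.
for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"{fraction:.0%} reactive -> ${maintenance_cost(100_000, fraction):,.0f}")
```

Even before counting lost production or safety impacts, a site doing half its work reactively is paying roughly double under this assumption.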

These major meltdowns or one-off events can cost millions of dollars in reactive repairs, lost production and/or major safety/environmental impacts. If you need to lower the cost of maintenance, this is an area where you can make a significant impact on the P&L.

Proactive maintenance – which is aimed at avoiding such scenarios – is a much more cost-effective approach.

First, what is reactive maintenance? Put simply, it is any maintenance or repair done to a piece of equipment after a failure event. If a gear-box grinds to a halt and your maintenance team rushes to repair it, they are engaging in reactive maintenance.

While the immediate cost of such maintenance may seem low – a day of labor and the purchase of a new part for the machine – the flow-on costs associated with downtime and lost production can be much higher, and there is a greater risk of safety and environmental incidents during the shutting down or starting up of equipment.

In companies where reactive maintenance is a large proportion of work performed, there are many hidden costs carried by the business such as higher inventories; premium rates for purchasing spare parts; higher stocking levels for critical spares; more wasted time queuing for tools, materials, and labor; higher overtime levels; more plant downtime; interruption to customer orders; stockouts; off-spec quality. The organization and management system has a short term, busy focus, often under budget pressure, with variations in production and lots of “things to do”.


On the flip side, proactive maintenance takes a preventative approach. It involves making assets work more efficiently and effectively so that downtime and unexpected failures become a thing of the past. It’s also about trimming unnecessary expenditure from asset management budgets. From a bottom line perspective, it’s about boosting the assets’ contribution to earnings before interest and tax (EBIT).

Strategies associated with proactive maintenance involve understanding and managing the likelihood of failures. Some of the common analytical methods used to understand the impact of failures on the business include:

  • System Analysis – to understand the way equipment failures can impact the availability and production capacity of a system; it allows the analyst to identify and eliminate potential bottlenecks in a system, and thus increase plant capacity
  • Criticality Analysis – to rank equipment by the likelihood and severity of failure impact on key business objectives, so you can then channel maintenance resources into the more critical pieces of equipment
  • Maintenance Benefit Analysis – to evaluate a maintenance plan and identify areas where maintenance is either not needed or not optimal.
  • Spares Optimization – to find the optimum level of spares to hold in-stock, which balances the cost of not having spares available versus taking up storage space on-site
  • Repair Vs Replace Analysis – to predict or track the cost of repairs against the cost of replacement, so it becomes clear when to replace assets for best value
  • Root Cause Analysis – to analyze the root cause of failures and focus resources on eliminating their reoccurrence, not just fixing the symptoms time and time again.
  • Vulnerability Analysis – to systematically review all aspects of the operation in order to discover tomorrow’s failure, so it can be eliminated in a planned fashion.
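To make one of these methods concrete, here is a minimal Python sketch of a Criticality Analysis ranking. Scoring each item as likelihood multiplied by severity on 1 to 5 scales is one common convention and is an assumption here, as are the equipment names and scores.

```python
from dataclasses import dataclass

@dataclass
class Equipment:
    name: str
    likelihood: int  # 1 (rare) to 5 (frequent), hypothetical scale
    severity: int    # 1 (minor) to 5 (severe), hypothetical scale

    @property
    def criticality(self) -> int:
        """Simple risk score: likelihood of failure times severity of impact."""
        return self.likelihood * self.severity

def rank_by_criticality(items: list[Equipment]) -> list[Equipment]:
    """Most critical equipment first."""
    return sorted(items, key=lambda e: e.criticality, reverse=True)

fleet = [
    Equipment("Conveyor drive", likelihood=3, severity=4),
    Equipment("Sump pump", likelihood=4, severity=1),
    Equipment("Main compressor", likelihood=2, severity=5),
]

ranked = rank_by_criticality(fleet)
for item in ranked:
    print(f"{item.name}: {item.criticality}")
```

The sump pump fails most often in this example but ranks last, because its failures barely touch the business objectives. That is the point of ranking before allocating maintenance resources.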

As these strategies attest, proactive maintenance is about much more than building a schedule of ongoing maintenance tasks. By understanding and managing failure, maintenance resources can be directed in a planned manner to those areas that require attention, and you can save significant amounts of money over the long term.

And, above all, it is important to remember that a culture of reactive maintenance is not ideal. In fact, unplanned reactive maintenance is one of the key symptoms that your maintenance strategy isn’t working.

Learn more by downloading our guide: 5 Symptoms Your Maintenance Strategy Needs Optimizing

Certification is the term applied to the process whereby an individual voluntarily submits his/her credentials for review based upon clearly identified competencies, criteria, or standards. The primary purpose of certification is to ensure that the personnel employed meet high standards of performance set out for that role by the certifying authority.

Certifications demonstrate to employers and/or clients that you are, indeed, an expert in a particular area or areas, and that a reputable, recognizable organization, the Association of Asset Management Professionals, is willing to attest to that.

The body of knowledge lays out the required skill and knowledge each Certified Reliability Leader™ (CRL) must possess in order to be certified. The CRL is unique in that it certifies an individual across 29 subjects in 5 inclusive domains and in two key respects, namely their leadership and their reliability expertise. These 29 concepts are embodied in the form of a complete body of knowledge with color coded Uptime® Elements™ that span far beyond the typical roles of a maintenance and reliability organization.

The breadth of the CRL certification makes it difficult. There are some areas in which not everyone has the requisite experience or knowledge, and these represent learning milestones. The first step to becoming an excellent leader is the willingness to learn and lead. It takes knowledge, practice with feedback, and passion for professional development. The CRL is an experience-based journey. It is a journey many of us are familiar with, as the most common functions of reliability are well known and generally understood by most maintenance managers and reliability professionals.

The areas most often associated with this discipline include:

  • Preventative Maintenance (PM)
  • Reliability Engineering
  • Reliability Centered Maintenance
  • Planning and Scheduling Work
  • Computerized Maintenance Management Systems
  • Lubrication, Minor cleaning, and Servicing.

In this regard, the CRL Certification provides an objective measure of the reliability professional’s expertise. An individual who is certified is well qualified, and their particular qualification has been measured voluntarily by their action in sitting for the certification exam.

The predictive technologies could easily serve to form another group which could include:

  • Vibration Analysis
  • Oil Analysis
  • Ultrasonic testing
  • Infrared Thermography
  • Motor Circuit Analysis
  • Alignment and Balancing
  • Non-Destructive Testing

What makes the CRL certification unique is the LEADERSHIP certification component. This is an area we seldom associate with the reliability disciplines; these qualities are more often listed as qualifiers for the maintenance management job function.

The leadership certification and body of knowledge containing broad leadership skills is what makes the CRL different, and is what sets the CRL apart from other technical certifications. The CRL is not only a technical certification process, it is also a formidable leadership certification process. This is very important in today’s global economy, as leaders are necessary at every level.

Imagine the value a well-qualified individual could bring to your organization if they were knowledgeable and capable of operating in these circles:

  • Human Capital Management
  • Integrity
  • Competency Based Learning
  • Executive Sponsorship
  • Operational Excellence
  • Operator Driven Reliability
  • Defect Elimination

Now imagine they also know a thing or two about reliability!

The leadership elements arguably would not be the first qualifications most HR managers would associate with a maintenance and reliability role, but it is these leadership qualities that deliver value back into an organization. Those who are familiar with the CRL have noted that the benefits extend beyond just the individual who was certified, and are passed on to the team members they interact with, and the organization as a whole.

Working on leadership makes sense at every level of an organization. Certifying your leaders sets you apart.

ARMS Reliability provides Reliability Leadership training as well as the Certified Reliability Leader™ (CRL) exam.


Can you quantify the financial impact of your maintenance program on your business? Do you take into account not only the direct costs of maintaining equipment, such as labour and parts, but also the costs of not maintaining equipment effectively, such as unplanned downtime, equipment failures and production losses?

The total financial impact of maintenance can be difficult to measure, yet it is a very valuable task to undertake. It is the first step in finding ways to improve profit and loss. In other words, it is the first step towards an optimised maintenance strategy.

In a 2001 study of maintenance costs for six open pit mines in Chile [1], maintenance costs were found to average 44% of mining costs. It’s a significant figure, and it highlights the direct relationship between maintenance and the financial performance of mines. More recently, a 2013 Industry Mining Intelligence and Benchmarking study [2] reported that mining equipment productivity had decreased 18% since 2007, and that it fell 5% in 2013 alone. Besides payload, operating time was a key factor.

So how do you know if you are spending too much or too little on maintenance? Industry benchmarks certainly provide a guide. In manufacturing, best-practice benchmarks put maintenance costs at less than 10% of total manufacturing costs, or less than 3% of asset replacement value [3].
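As a rough illustration of how these two benchmark ratios can be checked, the sketch below computes maintenance spend as a share of total manufacturing cost and of asset replacement value (ARV). The function names and all dollar figures are hypothetical, chosen only to show the arithmetic; the 10% and 3% limits are the benchmarks quoted above.

```python
def maintenance_ratios(maintenance_cost, total_manufacturing_cost, asset_replacement_value):
    """Return maintenance spend as a fraction of manufacturing cost and of ARV."""
    return (
        maintenance_cost / total_manufacturing_cost,
        maintenance_cost / asset_replacement_value,
    )


def compare_to_benchmarks(share_of_cost, share_of_arv, cost_limit=0.10, arv_limit=0.03):
    """Flag whether each ratio sits below the quoted best-practice limit."""
    return {
        "within_cost_benchmark": share_of_cost < cost_limit,
        "within_arv_benchmark": share_of_arv < arv_limit,
    }


# Hypothetical annual figures for one plant (assumptions, not real data).
share_of_cost, share_of_arv = maintenance_ratios(
    maintenance_cost=4_200_000,
    total_manufacturing_cost=35_000_000,
    asset_replacement_value=120_000_000,
)
result = compare_to_benchmarks(share_of_cost, share_of_arv)

print(f"Maintenance / manufacturing cost: {share_of_cost:.1%}")
print(f"Maintenance / ARV: {share_of_arv:.1%}")
print(result)
```

With these made-up figures the plant spends 12% of manufacturing cost and 3.5% of ARV on maintenance, so it sits above both benchmarks. On its own, that says nothing about whether the spend is too high or simply reflects the plant's circumstances.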

While these benchmarks may be useful, a more effective way to answer the question is to look at the symptoms of over- or under-spending in maintenance. After all, benchmarks cannot take into account your unique history and circumstance.

Symptoms of under-spending on maintenance include:

  • Rising ‘hidden failure costs’ due to lost production
  • Safety or environmental risks and events
  • Equipment damage
  • Reputation damage
  • Waiting time for spares
  • Higher spares logistics cost
  • Lower labour utilisation
  • Delays to product shipments
  • Stockpile depletion or stock outs

Other symptoms are explored in more detail in our guide: 5 Symptoms Your Maintenance Strategy Needs Optimizing.


Figure 1

In most cases, it is these ‘hidden failure costs’ that have the most impact on your bottom line. These costs can be many times higher than the direct cost of maintenance – causing significant and unanticipated business disruption. As such, it is very important to find ways to measure the effects of not spending enough on maintaining equipment.

Various tools and software exist to help simulate the scenarios that can play out when equipment is damaged, fails or, conversely, is proactively maintained. A Failure Modes, Effects and Criticality Analysis (FMECA) is a proven methodology for evaluating all the likely failure modes for a piece of equipment, along with the consequences of those failure modes.
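To make the idea concrete, here is a minimal sketch that ranks failure modes by their expected annual cost (frequency multiplied by consequence). It is a deliberately simplified stand-in for the criticality step, not a full FMECA and not the method of any particular tool. The `FailureMode` class, the failure modes, and every number are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    failures_per_year: float  # assumed expected frequency
    downtime_hours: float     # assumed downtime per failure
    repair_cost: float        # assumed direct repair cost per failure

    def annual_cost(self, lost_production_per_hour: float) -> float:
        """Expected yearly cost: frequency x (repair cost + lost production)."""
        per_event = self.repair_cost + self.downtime_hours * lost_production_per_hour
        return self.failures_per_year * per_event


# Hypothetical failure modes for a single pump.
modes = [
    FailureMode("Bearing seizure", 0.5, 24.0, 15_000.0),
    FailureMode("Seal leak", 2.0, 4.0, 2_000.0),
    FailureMode("Coupling misalignment", 1.0, 8.0, 5_000.0),
]
LOST_PRODUCTION_PER_HOUR = 10_000.0  # assumed value of one hour of lost output

# Highest expected annual cost first.
ranked = sorted(
    modes,
    key=lambda m: m.annual_cost(LOST_PRODUCTION_PER_HOUR),
    reverse=True,
)
for mode in ranked:
    print(f"{mode.name}: {mode.annual_cost(LOST_PRODUCTION_PER_HOUR):,.0f} per year")
```

Note how the least frequent failure mode comes out on top here: its long downtime makes lost production, not repair cost, the dominant term.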

Extending the FMECA to Reliability Centred Maintenance (RCM) provides guidance on the optimum choice of maintenance task. Combining RCM with a simulation engine allows rapid feedback on the worth of maintenance and the financial impact of not performing maintenance.
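The kind of feedback a simulation engine gives can be sketched with a small Monte Carlo model. The example below compares running a component to failure against replacing it at a fixed age. It assumes Weibull-distributed lives with a wear-out shape, ignores repair time, and lumps lost production into a single failure cost. All parameter values are hypothetical, and this is not how any commercial package works internally.

```python
import random


def simulate_cost(horizon_h, replace_at_h, scale_h, shape,
                  planned_cost, failure_cost, rng):
    """Total cost over one simulated horizon.

    replace_at_h=None means run to failure (no preventive replacement).
    """
    clock = 0.0
    cost = 0.0
    while True:
        # Draw a time to failure for the newly installed component.
        life = rng.weibullvariate(scale_h, shape)
        if replace_at_h is not None and life > replace_at_h:
            clock += replace_at_h   # survives until the planned change-out
            event_cost = planned_cost
        else:
            clock += life           # fails in service
            event_cost = failure_cost
        if clock > horizon_h:
            return cost
        cost += event_cost


def mean_cost(runs, seed, **scenario):
    """Average cost over many simulated horizons, with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return sum(simulate_cost(rng=rng, **scenario) for _ in range(runs)) / runs


# Assumed inputs: 50,000 operating hours, characteristic life 10,000 h, shape 3 (wear-out).
common = dict(horizon_h=50_000, scale_h=10_000, shape=3.0,
              planned_cost=5_000, failure_cost=50_000)

run_to_failure = mean_cost(runs=2_000, seed=1, replace_at_h=None, **common)
preventive = mean_cost(runs=2_000, seed=1, replace_at_h=5_000, **common)

print(f"Run to failure:        {run_to_failure:,.0f}")
print(f"Replace at 5,000 h:    {preventive:,.0f}")
```

Under these assumptions the preventive strategy costs less, because a failure is priced at ten times a planned replacement and the component wears out. With a shape parameter near 1 (random failures), the same model would show scheduled replacement adding cost for no benefit.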

Armed with the information gathered in these analyses, you will gain a clear picture of the optimum costs of maintenance for particular equipment – and can use the data to test different ways to reduce costs. It may be that there are redundant maintenance plans that can be removed; or a maintenance schedule that can become more efficient and effective; or opportunity costs associated with a particular turnaround frequency and duration. Perhaps it is more beneficial to replace equipment rather than continue to maintain it.

It’s all about optimising plant performance for peak production while minimising the risk of failure for key pieces of equipment. Get it right, and overall business costs will fall.



[1] Knights, P.F. and Oyanander, P. (2005, Jun) “Best-in-class maintenance benchmarks in Chilean open pit mines”, The CIM Bulletin, p. 93

[2] PwC (2013, Dec) “PwC’s Mining Intelligence and Benchmarking, Service Overview”


Figure 1: This image shows Isograph’s RCMCost™ software module, which is part of their Availability Workbench™. Availability Workbench, Reliability Workbench, FaultTree+, Hazop+ and NAP are registered trademarks of Isograph Software. ARMS Reliability are authorized distributors, trainers and implementors.

While there are three main reasons organizations typically perform Root Cause Analysis (RCA) following an issue with their asset or equipment, there are a whole host of other indicators that RCA should be performed.

Odds are, you’re recording a lot of valuable information about the performance of your equipment – information that could reveal opportunities to perform an RCA, find causes, and implement solutions that will solve recurring problems and improve operations. But are you using your recorded information to this extent?

First, let’s quickly talk about three reasons why RCA is typically performed:

1. Because you have to

There may be a regulatory requirement to demonstrate that you are doing something about a problem that’s occurred.

2. You have breached a trigger point

Your own company has identified the triggers for significant incidents that warrant root cause analysis.

3. Because you want to

An opportunity has presented itself to make changes for the better. Or perhaps you’ve decided you simply don’t want to lose so much money all the time.

At the core of all industry is the desire to make money. Anything that negatively impacts this goal is usually attacked by performing root cause analysis.

I was having a conversation with a reliability engineer at an oil and gas site, and I asked him what lost opportunity or downtime might cost that company over the course of a year. He said it was in the vicinity of three quarters of a billion dollars – $750,000,000. Is this a good enough reason to perform root cause analysis? Even a 10% improvement, roughly $75 million a year, would have a huge impact on bottom-line figures.

The monetary impact to the business was of course not due to any single event, but to a multitude of events both large and small.

Each event presents itself as an opportunity to learn and to make any changes necessary to prevent its recurrence. Once can be written off as happenstance… things happen, serious or minor, and that’s life. But to let it happen repeatedly means that something is seriously wrong.

While these are all valid reasons to perform an RCA, there are at least ten more tell-tale equipment-related clues that an RCA needs to happen – most of which can be identified through the information you’re probably already recording.

Here are ten tell-tale signs that your organisation needs to perform Root Cause Analysis:

  1. Increased downtime to plant, equipment or process.
  2. Increase in recurring failures.
  3. Increase in overtime due to unplanned failures.
  4. Increase in the number of trigger events.
  5. Less availability of equipment.
  6. High level of reactive maintenance.
  7. Lack of time… simply can’t do everything that needs doing.
  8. Increase in the number of serious events… nearing the top of the pyramid.
  9. Longer planned “shut” durations.
  10. More frequent “shut” requirement.
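Several of these signs can be pulled directly from maintenance records. The sketch below is one possible way to surface two of them, recurring failures (sign 2) and the share of reactive work (sign 6). It works from a hypothetical work-order extract; the record layout, asset tags, and failure codes are assumptions and will differ in any real CMMS or EAM export.

```python
from collections import Counter

# Hypothetical work orders: (period, asset, failure_code, planned)
work_orders = [
    ("2023-Q1", "PUMP-01", "seal leak", False),
    ("2023-Q1", "CONV-02", "belt inspection", True),
    ("2023-Q1", "FAN-03", "lubrication", True),
    ("2023-Q1", "PUMP-01", "service", True),
    ("2023-Q2", "PUMP-01", "seal leak", False),
    ("2023-Q2", "CONV-02", "belt tear", False),
    ("2023-Q2", "FAN-03", "lubrication", True),
    ("2023-Q2", "PUMP-01", "seal leak", False),
]


def recurring_failures(records, min_repeats=2):
    """Count unplanned work per (asset, failure code) and keep the repeats."""
    counts = Counter(
        (asset, code) for _, asset, code, planned in records if not planned
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


def reactive_share_by_period(records):
    """Fraction of work orders in each period that were unplanned."""
    total = Counter()
    reactive = Counter()
    for period, _, _, planned in records:
        total[period] += 1
        if not planned:
            reactive[period] += 1
    return {period: reactive[period] / total[period] for period in sorted(total)}


repeats = recurring_failures(work_orders)
shares = reactive_share_by_period(work_orders)

print("Recurring failures:", repeats)
print("Reactive share by period:", shares)
```

In this made-up data set the same seal leak on one pump appears three times, and the reactive share of work rises from a quarter to three quarters between periods. Either pattern would be a reasonable prompt to start an RCA.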

These indicators imply that we need to be doing more in the realm of root cause analysis before these issues snowball.

If you can identify with some of these pain points, download our eBook “11 Problems With Your RCA Process and How to Fix Them” in which we provide best practice advice on using RCA to help eliminate some of these problems.