Author Archives: Admin

A true Asset Strategy Management program delivers predictable outcomes and avoids unexpected failures, outages, safety exposures and costs.

Poor reliability of equipment and processes can have sudden and disastrous effects on the ability of an organisation to deliver operational or project objectives. Reliability problems can lead to unexpected downtime, poor quality product or service, missed operational targets, significant remedial costs, poor safety and a rise in incidents.

Managing reliability well seems elusive to most organisations, which find it difficult to connect reliability strategy to maintenance execution. In many organisations, the tendency is to focus on maintenance execution alone, in the belief that plant reliability will improve. In order to improve execution, focus is placed on the work management process, work management KPIs, and Master Data. The reality is that even world-class execution of a poor strategy won’t deliver on operational objectives in a predictable, consistent way. Many organisations are executing inconsistent or sub-optimal strategies, leading to variable results, continued under-performance, and significant failures and outages.

Institutionalising Asset Strategy Management (ASM) into the operation reduces failures, downtime and risk and, as a consequence, lowers the total cost of operations. Deploying the optimal strategy across all assets and monitoring performance provides the means to improve reliability across all assets and to sustain the improved performance over time and throughout periods of change.

ASM removes the inconsistent outcomes from asset strategies, allows for any pockets of excellence to be deployed to all relevant assets, and drives continuous reliability improvement.

What’s the roadblock?

Enterprise Resource Planning (ERP) systems are designed to execute strategy; they are not designed to develop, maintain and manage good strategy. Many organisations have not yet realised that strategy is separate from execution. ERP systems are designed to support efficient execution and, in order to be effective, have to be continually populated with appropriate Master Data and optimal strategies.

What’s the Solution?

An ASM solution acts as the thread across all systems. It allows organisations to capture and review data from all sources and leverage learnings to enhance reliability strategies by identifying the pockets of strategy excellence and deploying those strategies across the organisation wherever they are relevant.

Standardising and leveraging good strategy

At the core of an ASM solution sits an asset strategy library which houses reliability-based tactics. These asset strategies can be deployed rapidly and support regional or local variations to cater for different operational or environmental conditions. Strategy variation is visible organisation-wide via a reporting functionality, where all learnings drive continual improvement in the asset strategy library, which can be accessed and redeployed to any asset.
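As a rough illustration of how a strategy library with local variations might be organised, the Python sketch below keeps one generic task list per equipment type and applies site-specific interval overrides at deployment. The names (Task, GENERIC_LIBRARY, deploy, variations) and the sample tasks are hypothetical and are not taken from any particular ASM product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    name: str
    interval_days: int


# Hypothetical generic strategies, keyed by equipment type.
GENERIC_LIBRARY = {
    "centrifugal_pump": [
        Task("Vibration survey", 30),
        Task("Lubricate bearings", 90),
    ],
}


def deploy(equipment_type, overrides=None):
    """Return the task list for one asset, applying local interval overrides."""
    overrides = overrides or {}
    return [
        replace(task, interval_days=overrides.get(task.name, task.interval_days))
        for task in GENERIC_LIBRARY[equipment_type]
    ]


def variations(equipment_type, deployed):
    """Report where a deployed strategy differs from the generic one."""
    generic = {task.name: task.interval_days for task in GENERIC_LIBRARY[equipment_type]}
    return {
        task.name: (generic[task.name], task.interval_days)
        for task in deployed
        if task.interval_days != generic[task.name]
    }


# A site with a harsher environment shortens one interval.
coastal_pump = deploy("centrifugal_pump", {"Lubricate bearings": 60})
print(variations("centrifugal_pump", coastal_pump))
```

Keeping the generic strategy and the local override separate is what makes the variation reportable: the difference can always be recomputed against the library rather than being buried in each asset's task list.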

Achieve better benchmarking

Operations executives are held accountable for performance, but they don’t have access to all of the data and knowledge they need in order to make accurate decisions. In numerous multi-site organisations, reliability strategies are not standardised across all sites, adding to the confusion between data, strategies, and outcomes. These factors make it difficult to benchmark and thus compare costs and performance of like equipment across the organisation. An ASM solution captures data from many sources and presents it in one place. It allows managers to set up benchmarks, develop and deploy the best strategies consistently, monitor KPIs and align strategies across their whole operation.

Gain control over execution

Asset Managers often have no control over the deployment and execution of the strategies they develop. An ASM solution gives managers the ability to ensure that standardised procedures for strategies are deployed to all assets, at all sites, and to make certain that any modifications to procedures go through an approval process first. In addition, managers gain the ability to monitor the effectiveness of all strategies and to identify system wide and specific enhancements that should be made.


When changeover takes place among maintenance, reliability and project engineering personnel, quality and consistency issues can arise. It’s critical that standardisation is maintained over the longer term regardless of personnel changes, such that baseline strategies are deployed and monitored according to standards and quality assurance rules. To ensure strategies remain optimum over the asset life, the rationale for each strategy decision is maintained and can be revised, improved or changed as business needs change.

Rapid integration

Time is money. The sooner a reliability strategy can be developed and deployed, the better. An ASM system integrates with an organisation’s existing ERP system for easy, efficient and rapid deployment.

Case in Point

Major LNG operator develops & deploys strategies in only 44 days

The Goal

  • To develop maintenance strategies for a major LNG brownfield operation.

The Situation

  • Had no clear method to develop and standardise maintenance strategies in a rapid and efficient manner for all brownfield assets.
  • Many existing PMs were outdated.
  • Many assets did not have strategies.

How an ASM solution was leveraged

  • Generic maintenance strategies were developed for 122 unique equipment types.
  • Variations were made on generic strategies where applicable to meet asset operating context.
  • Strategies were then uploaded to SAP and deployed to 3,631 assets.


  • Client now has a single, standardised database in which strategies can be quickly updated and uploaded to SAP.
  • Entire process was completed in 44 days vs. 90+ days if a traditional method had been used.
  • Client is going to leverage the strategies and learning from this project to assist with the rollout of an upcoming Greenfield initiative.

Would you like to know how you can leverage your organisation’s pockets of excellence and build a best in class asset strategy management program? OnePM® is an innovative reliability strategy management solution, created by ARMS Reliability.

OnePM® is a trade mark of ARMS Reliability and registered in Australia.

Author: Jason Ballentine

Developing a maintenance strategy requires careful consideration and due process. Yet from what I’ve seen, many organizations are making obvious errors right from the start: missteps that can torpedo the success of the strategies they’re trying so hard to put in place.

Without further ado, here are five common maintenance strategy mistakes:

  1. Relying solely on original equipment manufacturer (OEM) or vendor recommendations.

It seems like a good idea — you’d think the people who made or sold the equipment would know best. It’s what they don’t know that can hurt you.

Outside parties don’t know how a piece of equipment functions at your facility. They don’t understand how much this equipment is needed, the cost of failure, whether there’s any redundancy within the system… OEM and vendor maintenance guidelines are geared to maximize the availability and reliability of the machine, but their strategies might not be appropriate for your unique circumstances or needs. As a result, your team could end up over-maintaining the equipment, which can actually create more problems than it solves. The more you mess with a piece of equipment, the more you introduce the possibility of error or failure. Some things, in some situations, are better left alone.

What’s more, OEMs and vendors have a vested interest in selling more spare parts (so they can make more money). That means that their replacement windows might not be accurate or appropriate to your business needs. Rather than relying on calendar-driven replacement, your maintenance strategy might focus more on inspecting the equipment to proactively identify any issues or deterioration, then repairing or replacing only as needed.

It’s fine to use OEM/vendor maintenance guidelines as a starting point. Just make sure you thoroughly review their recommendations to see if they align with your unique needs for the given piece of equipment. Don’t just blindly accept them — make sure they fit first.

  2. Relying heavily on generic task libraries for your maintenance strategy.

This is surprisingly common. Some organizations purchase a very generic set of activities for a piece of equipment or equipment category, and attempt to use them to drive maintenance strategy. But generic libraries are even worse than OEM/vendor recommendations because they are just that — generic. They aren’t written for the specific equipment make and model you have. They might even include tasks that simply don’t apply, such as “inspect the belt” on a pump that uses an entirely different drive mechanism. Once a mechanic attempts to perform one of these generic, ill-suited tasks, he or she stops trusting your overall maintenance strategy. Without credibility and compliance, you might as well not have a strategy at all.

Like OEM and vendor recommendations, generic task libraries can help you get started on a robust maintenance strategy, if (and only if) you carefully examine them first and only use the tasks that make sense for your particular equipment and operational needs.

  3. Failing to include a criticality assessment in your strategy decisions.

If you choose and define tasks without factoring in criticality, you run the risk of wasted effort and faulty maintenance. Think about it: If a piece of equipment is low on the criticality scale, you might be okay to accept a generic strategy and be done with it. But for equipment that’s highly critical to the success of your operations, you need to capture as much detail as possible when selecting and defining tasks. How can you know which is which without fully assessing the relative importance of each piece of equipment (or group of equipment) to the overall performance of your site?

  4. Developing maintenance strategies in a vacuum.

Sometimes, organizations will hire an outside consultant to develop maintenance strategies and send them off to do it, with no input from or connection with the maintenance team (or the broader parts of the organization). Perhaps they figure, “you’re the expert, you figure it out.” Here’s the problem: For a maintenance strategy to be successful, it must be developed within the big picture. You’ve got to talk to the mechanic who’ll be doing the work, the planner for that work, and the reliability engineer who’ll be responsible for the performance of that equipment, production, or operation. Their input is extremely valuable, and their buy-in is absolutely critical. Without it, even the best maintenance strategy can be met with resistance and non-compliance.

  5. Thinking of maintenance strategy development as a “one-and-done” effort.

For some organizations, the process of developing a maintenance strategy from the ground up seems like something you do once and just move on. But things change — your business needs change, the equipment you have on site changes, personnel changes, and much more. That’s why it’s vitally important to keep your maintenance strategies aligned with the current state of your operations.

In fact, a good maintenance strategy is built with the idea of future revisions in mind. That means the strategy includes clear-cut plans for revisiting and optimizing the strategy periodically. A good strategy is also designed to make those revisions as easy as possible by capturing all of the knowledge that went into your strategy decisions. Don’t just use Microsoft Word or put tasks directly into the system without documenting the basis for the decisions you made. What were your considerations? How did you evaluate them? What ultimately swayed your decision? In the future, if the key factors or circumstances change, you’ll be able to evaluate those decisions more clearly, without having to guess or rely on shaky recall.
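One way to capture the basis for a decision alongside the task itself is a small structured record. The Python sketch below is only an illustration; the StrategyDecision class, its fields, and the sample values are hypothetical and would need to be adapted to whatever system actually holds your strategies.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class StrategyDecision:
    asset: str
    task: str
    interval_days: int
    considerations: List[str]   # What were your considerations?
    evaluation: str             # How did you evaluate them?
    deciding_factor: str        # What ultimately swayed your decision?
    decided_on: date
    review_due: date

    def needs_review(self, today: date) -> bool:
        """True once the planned review date has been reached."""
        return today >= self.review_due


# Illustrative record only; the asset and values are made up.
decision = StrategyDecision(
    asset="Cooling water pump P-101",
    task="Vibration survey",
    interval_days=30,
    considerations=["No installed spare", "Bearing wear is detectable by vibration"],
    evaluation="Compared the inspection cost with the cost of an unplanned outage",
    deciding_factor="Cost of unplanned downtime",
    decided_on=date(2024, 1, 15),
    review_due=date(2025, 1, 15),
)
print(decision.needs_review(date(2025, 6, 1)))
```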

If you’ve found yourself making any of these mistakes, don’t despair. Most errors and missteps can be addressed with an optimization project. In fact, ARMS Reliability specializes in helping organizations make the most of their maintenance strategies. Contact us to learn more.


Author: Jason Ballentine

Many organizations believe that making sound maintenance decisions requires a whole lot of data. It’s a logical assumption: you do need to know things like the number of times an event has occurred, its duration, the number of spare parts needed, and the number of people engaged in addressing the event, plus the impact on the business and the reason why it happened.

A lot of this information is captured in your Computerized Maintenance Management System (CMMS). The more detail you have, the more accurate results you can get from maintenance scenario simulation tools like Isograph’s Availability Workbench™. Unfortunately, your CMMS data may be lacking enough detail to yield optimal results.

It’s enough to make anybody want to throw his or her hands up and put off the decision indefinitely. If you do, you could be making a big mistake.

No matter what, you’re still going to have to make a decision. You have to.

The truth is, you can still do a lot with limited or poor quality data, supported by additional sources of knowledge. Extract any and all information you have available, not just what is in the CMMS. Document what you’ve got, then use it to make a timely decision that’s as informed as possible.

Don’t get caught up in the fact that it’s not perfect data — circumstances in the real world are hardly ever ideal. In fact, as reliability engineers, most of the data we get is related to failure, which is exactly what we’re trying to avoid. Actually, if we are tracking failures, having less data means we are likely doing our jobs well because that means we are experiencing a low number of failures.

The bottom line is: we can’t afford to sit and wait for more data to make decisions, and neither can you.

Gather as much information as you can from all available sources:


Your CMMS

In an ideal world, the CMMS is the master data record of all activities performed. As discussed previously, that is almost never the case; however, it is an important starting point to reveal where data gaps exist.

Personal experience and expertise

There’s a wealth of information stored within the experience of people who are familiar with any given piece of equipment. Consider holding a facilitated workshop to gather insight on the equipment’s likely performance. Even a series of informal conversations can yield useful opinions and real-world experiences.

The Original Equipment Manufacturer (OEM)

Most OEMs will have documentation you can access, possibly also a user forum you can mine for additional information.

Industry databases, e.g., the Offshore and Onshore Reliability Data Handbook (OREDA) and the Process Equipment Reliability Database (PERD) by the Center for Chemical Process Safety (CCPS)

Some information is available in these databases, but it’s generic — not specific to your unique site or operating context. For example, you can find out how often a certain type of pump fails, but you can’t discover whether that pump is being used on an oil platform, refinery, power station or mine site. Industry data does, however, provide useful estimates on which you can base your calculations and test your assumptions.

Capture all these insights in an easily accessible way, then use what you’ve learned to make the best decision currently possible. And be sure to record the basis for your decision for future reference. If you get better data down the road, you can always go back and revise your decisions — after all, most maintenance strategies should remain dynamic by design.

Don’t let a lack of data paralyze you into inaction. Gather what you can, make a decision, see how it works, and repeat. It’s a process of continuous improvement which, given the right framework, is simple and efficient.

Availability Workbench™, Reliability Workbench™, FaultTree+™, and Hazop+™ are trademarks of Isograph Limited, the author and owner of products bearing these marks. ARMS Reliability is an authorized distributor of those products, and a trainer in respect of their use.


Author: Jason Ballentine

As with any budget, you’ve only got a certain amount of money to spend on maintenance in the coming year. How do you make better decisions so you can spend that budget wisely and get maximum performance out of your facility?

It is possible to be strategic about allocating funds if you understand the relative risk and value of different approaches. As a result, you can get more bang for the same bucks.

How can you make better budget decisions?

It can be tempting to just “go with your gut” on these things. However, by taking a systematic approach to budget allocation, you’ll make smarter decisions — and more importantly you’ll have concrete rationales for why you made those decisions —  which can be improved over time. Work to identify the specific pieces of equipment (or types of equipment) that are most critical to your business, then compare the costs and risks of letting that equipment run to failure against the costs and risks of performing proactive maintenance on that equipment. Let’s take a closer look at how you can do that.

4 steps to maximize your maintenance budget

1.  Assign a criticality level for each piece of equipment. Generally, this is going to result in a list of equipment that would cause the most pain — be it financial, production loss, safety, or environmental pain — in the event of failure. Perform a Pareto analysis for maximum detail. 

2.  For your most critical equipment, calculate the ramifications of a reactive/run-to-failure approach.

  • Quantify the relative risk of failure. (You can use the RCMCost™ module of Isograph’s Availability Workbench™ to better understand the risk of different failure modes.)
  • Quantify the costs of failure. Keep in mind that equipment failures can affect multiple aspects of your business in different ways — not just direct hard costs. In every case, consider all possible negative effects, including potential risks.
    • Maintenance: Staff utilization, spare parts logistics, equipment damage, etc.
    • Production Impact: Downtime, shipment delays, stock depletion or out-of-stock, rejected/reworked product, etc.
    • Environmental Health & Safety (EHS) Impact: Injuries, actual/potential releases to the environment, EPA visits/fines, etc.
    • Business Impact: Lost revenue, brand damage, regulatory issues, etc.

For a more detailed explanation of the various potential costs of failure, consult our eBook, Building a Business Case for Maintenance Strategy Optimization.

3.  Next, calculate the impact of a proactive maintenance approach for this equipment.

  • Outline the tasks that would best mitigate existing and potential failure modes.
  • Evaluate the cost of performing those tasks, based on the staff time and resources required to complete them.
  • Specify any risks associated with the proactive maintenance tasks. These risks could include the possibility of equipment damage during the maintenance task, induced failures, and/or infant mortality for newly replaced or reinstalled parts.

4. Compare the relative risk costs between these approaches for each maintenance activity. This will show you where to focus your maintenance budget for maximum return.
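The four steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: every consequence of a failure is rolled into a single cost figure, failure frequencies are annual averages, and the Equipment class, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    name: str
    cost_per_failure: float        # step 2: all consequences of one failure
    rtf_failures_per_year: float   # step 2: expected failures with no proactive work
    pm_cost_per_year: float        # step 3: labor, parts and downtime for the tasks
    pm_failures_per_year: float    # step 3: residual and induced failures

    def run_to_failure_cost(self) -> float:
        return self.rtf_failures_per_year * self.cost_per_failure

    def proactive_cost(self) -> float:
        return self.pm_cost_per_year + self.pm_failures_per_year * self.cost_per_failure


def pareto(items):
    """Step 1: rank by annual run-to-failure cost, with cumulative share of the total."""
    ranked = sorted(items, key=lambda e: e.run_to_failure_cost(), reverse=True)
    total = sum(e.run_to_failure_cost() for e in ranked)
    rows, running = [], 0.0
    for e in ranked:
        running += e.run_to_failure_cost()
        rows.append((e.name, e.run_to_failure_cost(), running / total))
    return rows


def recommend(e):
    """Step 4: pick the approach with the lower expected annual cost."""
    return "proactive" if e.proactive_cost() < e.run_to_failure_cost() else "run-to-failure"


# Illustrative figures only.
fleet = [
    Equipment("Feed pump", 50_000, 2.0, 15_000, 0.2),
    Equipment("Conveyor drive", 40_000, 0.5, 8_000, 0.1),
    Equipment("Surge arrester", 30_000, 0.1, 5_000, 0.1),
]

for name, cost, share in pareto(fleet):
    print(f"{name}: {cost:,.0f} per year, cumulative share {share:.0%}")
for e in fleet:
    print(e.name, "->", recommend(e))
```

In this made-up fleet the third item comes out as run-to-failure because the proactive tasks add cost without lowering its failure frequency.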

When is proactive maintenance not the best plan?

For the most part, you’ll want to allocate more of your budget towards proactive maintenance for equipment that has the highest risk and the greatest potential negative impact in the event of failure. Proactive work is more efficient so your team can get more done for the same dollar value. Letting an item run to failure can create an “all hands on deck” scenario under which nothing else gets done, whereas many proactive tasks can be performed quickly and possibly even concurrently.

That said, it’s absolutely true that sometimes run-to-failure is the most appropriate approach for even a critical piece of equipment. For example, a maintenance team might have a scheduled task to replace a component after five years, but the problem is that component doesn’t really age: the only known failure mode is getting struck by lightning. No matter how old that component is, the risk is the same. Performing replacement maintenance on this type of component might actually cost more than simply letting it run until it fails. (In these cases, a proactive strategy would focus on minimizing the impact of a failure event by adding redundancy or stocking spares.) But you can’t know that without quantifying the probability and cost of failure.
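The lightning example can be checked numerically. The Python sketch below assumes the component's life follows a Weibull distribution, which is a modelling assumption and not something the example above specifies. A shape parameter of 1 represents purely random failures such as lightning strikes, while a shape above 1 represents wear-out. The function names and parameter values are illustrative.

```python
import math


def weibull_reliability(t, eta, beta):
    """Probability of surviving to age t for a Weibull(eta, beta) life distribution."""
    return math.exp(-((t / eta) ** beta))


def failure_prob_next_period(age, period, eta, beta):
    """P(fail within the next `period` | survived to `age`)."""
    return 1.0 - weibull_reliability(age + period, eta, beta) / weibull_reliability(age, eta, beta)


for age in (0, 5, 10):
    random_mode = failure_prob_next_period(age, 1, eta=20, beta=1.0)  # constant hazard
    wear_out = failure_prob_next_period(age, 1, eta=20, beta=3.0)     # hazard grows with age
    print(f"age {age:>2} y: random {random_mode:.4f}, wear-out {wear_out:.4f}")
```

For the random failure mode the chance of failing in the next year does not depend on age, so swapping an old component for a new one buys no reduction in risk. For the wear-out mode the chance grows with age, which is the situation where planned replacement can pay off.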

Side note: Performing this analysis can help you see where your maintenance budget could be reduced without a dramatic negative effect on performance or availability. Alternatively, this analysis can help you demonstrate the likely impact of a forced budget reduction. This can be very helpful in the event of budget pressure coming down from above.   

At ARMS Reliability, we help organizations understand how to forecast, justify and prioritize their maintenance budgets for the best possible chances of success. Contact us to learn more.


Author: Scott Gloyna

For any given asset there are typically dozens of different predictive or preventive maintenance tasks that could be performed. However, selecting the right maintenance tasks that contribute effectively to your overall strategy can be tricky. The benefit of getting it right is the difference between meeting production targets and the alternative of lost revenue, late night callouts, and added stress from unplanned downtime events.

Step 1: Build out your FMEA (Failure Mode and Effects Analysis) for the asset under consideration.

Make sure you get down to appropriate failure modes in enough detail so that the causes are understood and you can identify the proper maintenance to address each specific failure mode.

Once you’ve made a list of failure modes, then it’s detailed analysis time. If you want to be truly rigorous, perform the following analysis for every potential failure mode. Depending on the criticality of the asset you can simplify by paring down your list to include only the failure modes that are most frequent or result in significant downtime.

Step 2: Identify the consequences of each failure mode on your list.

Failure modes can result in multiple types of negative impact. Typically, these failure effects include production costs, safety risks, and environmental impacts. It is your job to identify the effects of each failure mode and quantify them in a manner that allows them to be reviewed against your business’s goals. Often when I am facilitating a maintenance optimization study, people will say things like “There is no effect when that piece of equipment fails.” If that’s the case, why is that equipment there? All failures have effects; they may just be small or hard to quantify, perhaps because workarounds are available or because a certain amount of time passes after the failure before an effect is realized.

Step 3: Understand the failure rate for each particular mode.

Gather information on the failure rates from any available industry data and personnel with experience on the asset or a similar asset and installation, as well as any records of past failure events at your facility. This data can be used to evaluate the frequency of failure through a variety of methods — ranging from a simple Mean Time To Failure (MTTF) to a more in-depth review utilizing Weibull distributions.

(Note: The Weibull module of Isograph’s Availability Workbench™ can help you to quickly and easily understand the likelihood of different failure modes occurring.)
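For a feel of what the two ends of that range look like, the Python sketch below computes a simple MTTF and then estimates Weibull parameters by median rank regression, one common hand-calculation method. It assumes complete failure data with no censored (still running) units, which real records rarely provide, and it is not a description of how any commercial tool performs the fit. The function names and sample hours are illustrative.

```python
import math
from statistics import mean


def mttf(times_to_failure):
    """Simple Mean Time To Failure: the average of the observed lives."""
    return mean(times_to_failure)


def weibull_fit(times_to_failure):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression."""
    ts = sorted(times_to_failure)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    # Benard's approximation of the median rank for the i-th ordered failure.
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar, y_bar = mean(xs), mean(ys)
    # Least-squares line y = beta * x - beta * ln(eta).
    beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
    eta = math.exp(x_bar - y_bar / beta)
    return beta, eta


hours = [1200, 1900, 2600, 3100, 4000]  # illustrative, uncensored failure data
beta, eta = weibull_fit(hours)
print(f"MTTF {mttf(hours):.0f} h, beta {beta:.2f}, eta {eta:.0f} h")
```

A shape estimate well above 1 suggests wear-out, near 1 suggests random failures, and below 1 suggests early-life failures. With only a handful of data points the estimate is rough and should be sanity-checked against the experience of people who know the asset.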

Step 4: Make a list of possible reactive, planned or inspection tasks to address each failure mode.

Usually, you start by listing the actions you take when that failure mode occurs (reactive maintenance). Then broaden your list to any potential preventive maintenance and/or inspection tasks that could help prevent the failure mode from happening, or reduce the frequency at which it occurs.

  • Reactive tasks
    • Replacement
    • Repair
  • Preventive tasks
    • Daily routines (clean, adjust, lubricate)
    • Periodic overhauls, refurbishments, etc.
    • Planned replacement
  • Inspection tasks
    • Manual (sight, sound, touch)
    • Condition monitoring (vibration, thermography, ultrasonics, x-ray and gamma ray)

Step 5: Gather details about each potential task.

In order to compare and contrast different tasks, you have to understand the requirements of each:

  • What exactly does the task entail? (basic description)
  • How long would the work take?
  • How long would it take to start the work after shutdown/failure?
  • Who would do the work?
  • What labor costs are involved? (the hourly rates of the employees or outside contractors who would perform the task)
  • Would any spare parts be required? If so, how much would they cost?
  • Would you need to rent any specialized equipment? If so, how much would it cost?
  • Do you have to take the equipment offline? If so, for how long?
  • How often would you need to perform this task (frequency)?

A key consideration for inspection tasks only: What is the P-F interval for this failure mode? This is the window between the time you can detect a potential failure (P) and the time the equipment actually fails (F), similar to calculating how long you can drive your car after the fuel light comes on before you actually run out of fuel. Understanding the P-F interval is key in determining the interval for each inspection task.

The P-F interval can vary from hours to years and is specific to the type of inspection, the specific failure mode and even the operating context of the machinery.

It can be hard to determine the P-F interval precisely but it is very important to ensure that the best approximation is made because of the impact it has on task selection and frequency.
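The link between the P-F interval and inspection frequency can be shown with a short Python sketch. It uses a common rule of thumb, inspecting at half the P-F interval, as its default. That rule is an assumption here, not a recommendation from this article, and the function names and the 60-day figure are illustrative.

```python
def inspection_interval(pf_interval_days, inspections_per_pf=2):
    """Interval that fits the requested number of inspections into one P-F window."""
    if inspections_per_pf < 1:
        raise ValueError("need at least one inspection per P-F interval")
    return pf_interval_days / inspections_per_pf


def worst_case_warning(pf_interval_days, interval_days):
    """Shortest time left to act if degradation becomes detectable just after an inspection."""
    return pf_interval_days - interval_days


pf = 60  # assumed P-F interval in days
interval = inspection_interval(pf)
print(interval, worst_case_warning(pf, interval))
```

The worst-case warning time is the figure to compare against how long it realistically takes to plan the repair, obtain parts, and schedule the shutdown. If it is shorter than that, the inspection finds the problem too late to avoid a reactive repair.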

Step 6: Evaluate the lifetime costs of different maintenance approaches.

Once you understand the cost and frequency of different failure modes, as well as the cost and frequency of various maintenance tasks to address them, you can model the overall lifetime costs of various options.

For example, say you have a failure mode with a moderate business impact — enough to affect production, but not nosedive your profits for the quarter. If that failure mode has a mean time between failures (MTBF) of six months, you might take a very aggressive maintenance approach. On the other hand, if that failure mode only happens once every ten years, your approach would be very different. “Run to Failure” is often a completely legitimate choice, but you need to understand and be able to justify that choice.

These calculations can be done manually, in spreadsheets or using specialized modeling software such as the RCMCost™ module of Isograph’s Availability Workbench™.

Ultimately you try to choose the least expensive maintenance task that provides the best overall business outcome.
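A spreadsheet-style version of this comparison fits in a few lines of Python. The sketch below is deliberately simplified: it assumes a constant average failure frequency (life divided by MTBF), treats the preventive task as removing a fixed fraction of failures, and ignores discounting. The function name, the 80% effectiveness figure, and all costs are hypothetical.

```python
def lifetime_cost(life_years, mtbf_years, failure_cost,
                  task_cost=0.0, tasks_per_year=0.0, failure_reduction=0.0):
    """Expected cost over the asset life: residual failures plus planned tasks."""
    expected_failures = (life_years / mtbf_years) * (1.0 - failure_reduction)
    return expected_failures * failure_cost + life_years * tasks_per_year * task_cost


LIFE = 20              # years of remaining asset life (assumed)
FAILURE_COST = 20_000  # cost of one failure event (assumed)

for mtbf in (0.5, 10.0):
    rtf = lifetime_cost(LIFE, mtbf, FAILURE_COST)
    pm = lifetime_cost(LIFE, mtbf, FAILURE_COST,
                       task_cost=1_000, tasks_per_year=4, failure_reduction=0.8)
    print(f"MTBF {mtbf} y: run to failure {rtf:,.0f}, quarterly task {pm:,.0f}")
```

With these made-up numbers the quarterly task is clearly cheaper when the failure mode recurs every six months, while run to failure is cheaper when it occurs once in ten years, which mirrors the example given earlier.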

 Ready to learn more? Gain the skills needed to develop optimized maintenance strategies through our training course: Introduction to Maintenance Strategy Development


Author: Dan DeGrendel

Regardless of industry or discipline, we can probably all agree that routine maintenance — sometimes referred to as preventative, predictive, or even scheduled maintenance — is a good thing. Unfortunately, through the years I’ve found that most companies don’t have the robust strategies they need.

Typical issues and the kinds of trouble they can create:

1. Lack of structure and schedule

In many cases, routine tasks are just entries on a to-do list of work that needs to be performed — with nothing within the work pack to drive compliance. In particular, a list of tasks beginning with “Check” which have no guidance of an acceptable limit can have limited value. The result can be a “tick and flick” style routine maintenance program that fails to identify impending failure warning conditions.

2. Similar assets, similar duty, different strategies

Oftentimes, maintenance views each piece of equipment as a standalone object, with its own unique maintenance strategy. As a result, one organization could have dozens of maintenance strategies to manage, eating up time and resources. In extreme cases, this can lead to similar assets having completely different recorded failure mechanisms and routine tasks, worded differently, grouped differently and structured differently within the CMMS.

3. Operational focus 

Operations might be reluctant to take equipment out of service for maintenance, so they delay or even cancel the appropriate scheduled maintenance. At times this decision is driven by the thought that the repair activity is the same in a planned or reactive manner. But experience tells us that without maintenance, the risk is even longer downtime and more expensive repairs when something fails.

4. Reactive routines

Sometimes, when an organization has been burned in the past by a preventable failure, they overcompensate by performing maintenance tasks more often than necessary. The problem is, the team might be wasting time doing unnecessary work — worse still it might even increase the likelihood of future problems, simply because unnecessary intrusive maintenance can increase the risk of failure.

5. Over-reliance on past experience 

There’s no substitute for direct experience and expertise. But when tasks and frequencies are based too heavily on opinions and “what we’ve always done” rather than on sound assumptions, maintenance teams can run into trouble through either over- or under-maintaining. Without documented assumptions, business decisions are based on little more than a hunch. “Doing what we’ve always done” might not be the right approach for the current equipment, with the current duty, in the current business environment (and it certainly makes future review difficult).

6. Failure to address infrequent but high consequence failures 

Naturally, routine tasks account for the most common failure modes. They should, however, also address failures that happen less frequently but may have a significant impact on the business. Developing a maintenance plan that addresses both types prevents unnecessary risk. For example, a bearing may be set up on a lubrication schedule, but if there’s no plan to detect performance degradation due to a lubrication deficiency, misalignment, material defect, etc., then undetected high-consequence failures can occur.

7. Inadequate task instructions

Developing maintenance guidelines and best practices takes time and effort. Yet, all too often, the maintenance organization fails to capture all that hard-won knowledge by creating clear, detailed instructions. Instead, they fall back on the maintenance person’s knowledge — only to lose it when a person leaves the team. Over time, incomplete instructions can lead to poorly executed, “bandaid-style” tasks that get worse as the months go by.

8. Assuming new equipment will operate without failure for a period of time

There’s a unique situation that often occurs when new equipment is brought online. Maintenance teams assume they have to operate the new equipment first to see how it fails before they can identify and create the appropriate maintenance tasks. It’s easy to overlook the fact that they likely have similar equipment with similar points of failure. Their data from related equipment provides a basic foundation for constructing effective routine maintenance.

9. Missing opportunity to improve

If completed tasks aren’t reviewed regularly to gather feedback on instructions, tools needed, spare parts needed, and frequency, the maintenance process never gets better. The quality and effectiveness of the tasks then degrade over time and, with them, so does the equipment.

10. Doing what we can and not what we should 

Too often, maintenance teams decide which tasks to perform based on their present skill sets — rather than equipment requirements. Technical competency gaps can be addressed with a training plan and/or new hires, as necessary, but the tasks should be driven by what the equipment needs.

Without a robust routine maintenance plan, you’re nearly always in reactive mode, conducting ad-hoc maintenance that takes more time, uses more resources, and could incur more downtime than simply taking care of things proactively. What’s worse, it’s a vicious cycle. The more time maintenance personnel spend fighting fires, the more their morale, productivity, and budget erode. The less effective routine work gets done, the more equipment uptime and business profitability suffer. At a certain point, it takes a herculean effort simply to regain stability and prevent further performance declines.

Here’s the good news: an optimized maintenance strategy, constructed with the right structure, is simpler and easier to sustain. By fine-tuning your approach, you make sure your team is executing the right number and type of maintenance tasks, at the right intervals, in the right way, using an appropriate amount of resources and spare parts. And with a framework for continuous improvement, you can ultimately drive towards higher reliability, higher availability and more efficient use of your production equipment.

Want to learn more? Check out our next blog in this series, Plans Can Always Be Improved: Top 5 Reasons to Optimize Your Maintenance Strategy.


Author: Dan DeGrendel

Maintenance optimization doesn’t have to be time-consuming or difficult. Really it doesn’t. Yet many organizations simply can’t get their maintenance teams out of a reactive “firefighting mode” so they can focus on improving their overall maintenance strategy.

Stepping back to evaluate and optimize does take time and resources, which is why some organizations struggle to justify the project. They lack the data and/or the framework to demonstrate the real, concrete business value that can be gained.

And even when organizations do start to work on optimization, sometimes their efforts stall when priorities shift, results are not immediate and the overall objectives fade from sight.

If any of these challenges sound familiar, there are some very convincing reasons to forge ahead with maintenance optimization:

1. You can make sure every maintenance task adds value to the business

Through the optimization process, you can eliminate redundant and unnecessary maintenance activities, and make sure your team is focused on what’s really important. You’ll outline the proper maintenance tasks, schedules and personnel assignments, then incorporate everything into the overall equipment utilization schedule and departmental plans to help drive compliance. Over time, an optimized maintenance strategy will save time and resources, including by reducing the hidden costs of insufficient maintenance (production downtime, scrap product, risks to personnel or equipment, expediting and warehousing of spare parts, etc.).

2. You’ll be able to plan better

Through the optimization process, you’ll be allocating resources to various tasks and scheduling them throughout the year. This gives you the ability to forecast resource needs, by trade, along with spare parts and outside services. It also helps you create plans for training and personnel development based on concrete needs.
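As a rough illustration of forecasting resource needs by trade, the Python sketch below adds up annual labor hours per trade from a list of scheduled tasks. The task names, trades, hours and frequencies are hypothetical values chosen for the example, not figures from a real plan.

```python
from collections import defaultdict

def annual_hours_by_trade(tasks):
    """Sum the yearly labor hours required from each trade."""
    totals = defaultdict(float)
    for task in tasks:
        totals[task["trade"]] += task["hours"] * task["per_year"]
    return dict(totals)

# Hypothetical scheduled tasks: hours per occurrence and occurrences per year
tasks = [
    {"name": "Lubricate conveyor bearings", "trade": "mechanical", "hours": 1.5, "per_year": 12},
    {"name": "Inspect pump coupling", "trade": "mechanical", "hours": 2.0, "per_year": 4},
    {"name": "Thermography on motor control center", "trade": "electrical", "hours": 3.0, "per_year": 2},
]

for trade, hours in annual_hours_by_trade(tasks).items():
    print(f"{trade}: {hours:.1f} h/year")
```

The same kind of roll-up could be extended to spare parts or outside services by adding the relevant fields to each task.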

3. You’ll have a solid framework for a realistic maintenance budget

The plans you establish through the optimization process give you a real-world outline of what’s needed in your maintenance department, why it’s needed, and how it will impact your organization. You can use this framework to establish a realistic budget with strong supporting rationales to help you get it approved. Any challenges to the budget can be assessed and a response prepared to indicate the impact on performance that any changes might make.

4. You’ll just keep improving

Optimization is a project that turns into an ongoing cycle of performing tasks, collecting feedback and data, reviewing performance, and tweaking maintenance strategies based on current performance and business drivers.

5. You’ll help the whole business be more productive and profitable

Better maintenance strategies keep your production equipment aligned to performance requirements, with fewer interruptions. That means people can get more done, more of the time. That’s the whole point, isn’t it?

Hopefully, this article has convinced you of the benefits of optimizing your maintenance strategies. Ready to get started or re-energize your maintenance optimization project? Check out our next blog article, How To Optimize Your Maintenance Strategy: A 1,000-Foot View.


Author: Dan DeGrendel

Optimizing your maintenance strategy doesn’t have to be a huge undertaking. The key is to follow core steps and best practices using a structured approach. If you’re struggling to improve your maintenance strategy, or just want to make sure you’ve checked all the boxes, here’s a 1,000-foot view of the process.

1. Sync up

  • Identify key stakeholders from maintenance, engineering, production, and operations — plus the actual hands-on members of your optimization team.
  • Get everybody on board with the process and trained in the steps you’re planning to take. A mix of short awareness sessions and detailed education sessions for the right people is vital for success.
  • Make sure you fully understand how your optimized maintenance strategies will be loaded into and executed from your Computerized Maintenance Management System (CMMS).

2. Organize

  • Review/revise the site’s asset hierarchy for accuracy and completeness. Standardize the structure if possible.
  • Gather all relevant information for each piece of equipment.
    • Empirical data sources: CMMS, FMEA (Failure Mode and Effects Analysis) studies, industry standards, OEM recommended maintenance
    • Qualitative data sources: Team knowledge and past records

3. Prioritize

  • Assign a criticality level for each piece of equipment; align this to any existing risk management framework
  • Consider performing a Pareto analysis to identify equipment causing the most production downtime, highest maintenance costs, etc.
  • Determine the level of analysis to perform on each resulting criticality level
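The Pareto analysis suggested in this step can be sketched in a few lines of Python. The example ranks equipment by downtime and keeps the largest contributors until they account for roughly 80% of the total. The asset names, downtime hours and the 80% cut-off are hypothetical values for illustration only.

```python
def pareto(downtime_by_asset, cutoff=0.80):
    """Return (asset, hours, cumulative share) tuples, largest contributor
    first, until the cumulative share of total downtime reaches `cutoff`."""
    total = sum(downtime_by_asset.values())
    if total <= 0:
        return []
    ranked = sorted(downtime_by_asset.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for asset, hours in ranked:
        if running / total >= cutoff:
            break
        running += hours
        selected.append((asset, hours, running / total))
    return selected

# Hypothetical annual downtime hours per asset
downtime = {"Conveyor A": 120, "Feed pump": 310, "Crusher": 450, "Screen": 40, "Compressor": 80}

for asset, hours, cumulative in pareto(downtime):
    print(f"{asset:12s} {hours:5d} h  cumulative {cumulative:.0%}")
```

The same function works for maintenance cost or any other measure: pass a different dictionary. The assets it returns are candidates for the most detailed level of analysis.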

4. Strategize

  • Using the information you’ve gathered, define the existing and potential failure modes for each piece of equipment, or apply an existing library template.
  • Assign tasks to mitigate the failure modes.
  • Assign resources to each task (e.g., the time, number of mechanics, tools, spare parts needed, etc.)
  • Compare various options to determine the most cost-effective strategy
  • Bundle selected activities to develop an ideal maintenance task schedule (considering shutdown opportunities). Use standard grouping rules if available.

This is your proposed new maintenance strategy.
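To make the "compare various options" step concrete, here is a deliberately simplified Python sketch. It estimates the annual cost of each option as planned task cost plus expected failure cost, then picks the cheapest. This is an assumed expected-cost formula, not the simulation an RCM tool would run, and every name and number in it is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    tasks_per_year: float      # how often the maintenance task is performed
    cost_per_task: float       # labor, parts and tools for one task
    failures_per_year: float   # failures still expected under this option
    cost_per_failure: float    # repair cost plus lost production per failure

    def annual_cost(self) -> float:
        planned = self.tasks_per_year * self.cost_per_task
        unplanned = self.failures_per_year * self.cost_per_failure
        return planned + unplanned

def cheapest(options):
    """Return the option with the lowest estimated annual cost."""
    return min(options, key=lambda o: o.annual_cost())

# Hypothetical options for a single failure mode
options = [
    Option("Run to failure", 0, 0, 0.5, 40_000),
    Option("Quarterly inspection", 4, 500, 0.1, 40_000),
    Option("Monthly inspection", 12, 500, 0.05, 40_000),
]

for o in options:
    print(f"{o.name:22s} {o.annual_cost():10,.0f} per year")
print("Lowest cost:", cheapest(options).name)
```

With these made-up figures the monthly inspection costs more overall than the quarterly one, which illustrates how doing a task more often than necessary can add cost without a matching reduction in risk.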

5. Re-sync

  • Review the proposed maintenance strategy with the stakeholders you identified above, then get their buy-in and/or feedback (and adjust as needed)

6. Go!

  • Implement the approved maintenance strategy by loading all of the associated tasks into your CMMS, ideally through direct integration with your RCM simulation software, or otherwise manually or via an Excel sheet loader.

7. Keep getting better

  • Continue to collect information from work orders and other empirical and qualitative data sources.
  • Periodically review maintenance tasks so you can make continual improvements.
  • Monitor equipment maintenance activity for unanticipated defects, new equipment and changing plant conditions. Update your maintenance strategy accordingly.
  • Build a library of maintenance strategies for your equipment.
  • Take what you’ve learned and the strategies and best practices you’ve developed and share them across the entire organization, wherever they are relevant.

Of course, this list provides only a very high-level view of the optimization process.

If you’re looking for support in optimizing your maintenance strategies, or want to understand how to drive ongoing optimization, ARMS Reliability is here to help.


Can you quantify the financial impact of your maintenance program on your business? Do your calculations include not only the direct costs of maintenance, such as labor and spare parts, but also the costs of not maintaining your equipment effectively, such as unplanned downtime, equipment failures and lost production?

Measuring the financial impact of maintenance can be difficult, but it is a highly valuable exercise. It is the first step toward finding ways to improve your profits; in other words, the first step toward an optimized maintenance strategy.

In a maintenance study carried out at six open-pit mines in Chile [1], maintenance costs were found to be approximately 44% of mine operating costs. This is a significant figure, and it highlights the relationship between maintenance and a mine’s financial performance. More recently, in 2013, a mining benchmarking study [2] reported that the productivity of mining equipment had fallen 18% since 2007, losing 5% in 2013 alone. Besides payload, operating time is a key factor.

So how do you know whether you are spending too much or too little on maintenance? Industry benchmarks certainly provide a guide. Manufacturing best practice indicates that maintenance cost should be less than 10% of total manufacturing costs, or less than 3% of the equipment replacement cost.
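The two benchmark ratios above are simple to compute. The Python sketch below does so for a set of made-up annual figures. The function name and the input values are hypothetical, and the 10% and 3% thresholds are the guideline figures quoted above, not hard limits.

```python
def maintenance_benchmarks(maintenance_cost, total_manufacturing_cost, replacement_value):
    """Compare annual maintenance cost against the two guideline ratios."""
    share_of_manufacturing = maintenance_cost / total_manufacturing_cost
    share_of_replacement = maintenance_cost / replacement_value
    return {
        "share_of_manufacturing_cost": share_of_manufacturing,
        "share_of_replacement_value": share_of_replacement,
        "within_10pct_of_manufacturing": share_of_manufacturing < 0.10,
        "within_3pct_of_replacement": share_of_replacement < 0.03,
    }

# Hypothetical annual figures, all in the same currency
result = maintenance_benchmarks(
    maintenance_cost=1_200_000,
    total_manufacturing_cost=10_000_000,
    replacement_value=50_000_000,
)
print(f"Share of manufacturing cost: {result['share_of_manufacturing_cost']:.1%}")
print(f"Share of replacement value:  {result['share_of_replacement_value']:.1%}")
```

In this invented case the site exceeds one guideline and meets the other, which is a reminder that benchmarks are only a starting point for the question.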

While these benchmarks can be useful, a more effective way to answer the question is to look at the symptoms of spending too little or too much on maintenance. After all, benchmarks do not take into account your particular history or operating circumstances.

The symptoms of spending too little on maintenance include:

  • Increase in ‘hidden failure costs’ due to lost production
  • Safety and environmental risks and events
  • Equipment damage
  • Reputational damage
  • Waiting times for spare parts
  • High spare parts logistics costs
  • Lower labor utilization
  • Delays in product shipments
  • Stockouts

Other symptoms are explored in greater detail in our guide, 5 Symptoms That Indicate Your Maintenance Strategy Needs Optimization.

Figure 1

In most cases, it is these ‘hidden failure costs’ that have the greatest impact on the bottom line. These costs can be several times higher than the direct cost of maintenance, causing significant unplanned interruptions to the business. This is why it is important to find ways to measure the effects of not spending enough on maintaining equipment.

Various tools and software packages exist to help simulate the scenarios that can occur when a piece of equipment breaks down or fails or, conversely, is maintained proactively. A Failure Modes, Effects and Criticality Analysis (FMECA) is a proven methodology for evaluating all the likely failure modes of a piece of equipment and their consequences.

Extending an FMECA into Reliability Centered Maintenance (RCM) provides guidance for choosing the optimal maintenance task. Combining RCM with a simulation engine gives rapid feedback on the value of maintenance and the financial impact of not performing it.

Armed with the information from these analyses, you will have a clear picture of the optimal maintenance costs for a particular piece of equipment, and you can use this data in several ways to reduce operating costs. There may be redundant maintenance plans that can be removed, a maintenance program that could be made more efficient and effective, or opportunity costs associated with a specific shutdown frequency and duration. It may even be more beneficial to replace the equipment than to continue maintaining it.

The idea is to optimize plant performance to obtain maximum production while minimizing the risk of failure of key equipment components. Do this correctly and business costs will start to fall.

Want to keep reading? Download our guide: 5 Symptoms That Indicate Your Maintenance Strategy Needs Optimization.

[1] Knights, P.F. and Oyanander, P (2005, Jun) “Best-in-class maintenance benchmarks in Chilean open pit mines”, The CIM Bulletin, p 93

[2] PwC (2013, Dec) “PwC’s Mining Intelligence and Benchmarking, Service Overview”,


Figure 1. This image shows Isograph’s RCMCost™ module, which is part of its Availability Workbench™ software. Availability Workbench, Reliability Workbench, FaultTree+, Hazop+ and NAP are registered trademarks of Isograph software. ARMS Reliability is an authorized distributor, trainer and implementer.


“How long should an RCA take?”

This question is like asking how long a piece of string is.

I have heard of one plant manager who stipulated a maximum of two hours for any Root Cause Analysis (RCA) carried out in his organization. Another expects at least a brainstorm of solutions before the end of the first day, within 6 or 7 hours. It is not uncommon for a draft report to be required within 48 hours of the RCA starting.

The following three tips will help you meet the deadlines and expectations that have been set when time is short. One of the advantages of the Apollo Root Cause Analysis method is that it is a fast process, but it requires an effective facilitator to obtain the desired results, namely effective solutions.

Tip #1: You Define the Problem

Imagine the RCA has been triggered by an unplanned incident or event that falls into any of the safety, environment, production, quality, equipment failure or similar categories. You have been appointed as the facilitator by a superior or manager who is responding to the particular event. Your superior or manager may understand the trigger mechanism and may well indicate the problem title.

For example, “upper arm laceration”, “ammonia spill”, “production delay” and so on could be what you offer the team as the starting point for the analysis. Typically, as facilitator you will have gathered some of the “facts” from first responders’ reports, interviews, data sheets, photos and so on. A good first step, then, is to draft a problem definition statement, including the significance reflected by the consequences or impacts. The team then has a starting point for the analysis, even though the problem statement may change as more detail is provided.

Ideally, you will already have created a file in RealityCharting™, and the Problem Definition table can be projected onto a screen or even onto the clear wall where your charting will be done with Post-It™ notes. Team member information should already have been entered and can be confirmed quickly on this display. You can even show the Incident Report format and focus on the Disclaimer option that you have deliberately selected: Purpose: To prevent recurrence, not to place blame.

This preparatory work could save at least 20 minutes of team members’ time and allow an immediate launch into the analysis phase.

Important: Save yourself hours of rework and potential embarrassment by saving the file as soon as this first step is complete, if you have not already done so, and regularly thereafter. Maintain some form of version control so that the evolution of the chart over the following days can be tracked if necessary.

If you are particularly well resourced, the chart development can be recorded in the software at the same time as the hard copy is created on the wall space. A small group may choose to create the chart directly in the software with a decent means of projection.

Tip #2: Direct the Analysis

It is essential that your initiative in drafting the problem definition is not seen by the team members as disempowering them. The analysis step, in which everyone has the opportunity to contribute, should ensure they feel they have “ownership” of the problem.

To reinforce this, it is advisable to choose a sequence for addressing each member, typically left to right or vice versa, depending on the seating. This establishes, first, that one person speaks at a time; second, that each and every statement will be documented; and third, that each person has an equal opportunity. Your rapid and accurate recording of each piece of information will provide the discipline needed to minimize the idle chatter that wastes time by distracting focus. When you get a run of “no comment” from the team members, because the process has exhausted their immediate knowledge of events, start creating the chart.

It is worth reminding the team that each item of information that has been recorded and posted in the parking area may not appear in its original form on the chart or, in some cases, may not appear at all. Because the information gathering casts a wide net to capture as much knowledge as possible about what happened, when and why, there will be no particular focus. But because the items come from people with experience and expertise, or intimate knowledge of events and circumstances, they have some value. The exact value will be determined by where the information sits in the cause-and-effect logic that starts at the problem and is linked by “caused by” relationships.

Important: The Cause text should be written in CAPITAL LETTERS. It will be easier for the team to read and decipher at the time, and perhaps later from photographs of the chart. Likewise, using capitals in the software itself makes projecting the chart more effective and improves the printing of the various views.

Tip #3: The “How and If” of Creating a Reality Chart

Many proponents explore the existing understanding of the event by capturing as many action causes as possible. These may come from a 5 Whys process, for example, starting at the Primary Effect.

Plant Stopped (Problem or Primary Effect)

Why? Feed Pump Not Pumping

Why? Coupling Broken

Why? Motor Bearing Damaged

Why? Bearing Race Collapsed

Why? Fatigue

The Apollo RCA method requires the use of the expression “caused by” to connect cause-and-effect relationships. Understanding that there must be at least one action and one condition helps to reveal the “hidden” causes, especially the conditional causes that do not initially come to mind.

To support this expression and the essential “why”, it is advisable to ask “how”. This can be used initially by the most impartial member of your team, who has been engaged specifically because of his or her lack of association with the problem and can sincerely ask the supposedly “dumb” questions. Invariably these questions generate more causes or a more accurate arrangement of the existing causes. The question “How does that happen exactly?” can lead the team to take the necessary “baby steps”. It also often exposes differences between “experts”, and resolving these differences is always enlightening.

The facilitator needs to be aware of the need to gently “challenge” the team’s understanding while ensuring enough rigor is applied to generate the best representation of the causal relationships. This can be done in a neutral way using the “IF” proposition.

Given that every effect requires at least two causes, you can put this proposition to the team: “If ‘one exists’ and ‘three exists’ (two conditions), then with ‘four added’ (the action), will the effect be ‘eight’ every time?” Using this technique on each causal element will generate the clarity and certainty being sought in understanding the causes of the problem. If each “equation” (causal element) on the chart is “real” and the causes themselves are “real” (supported by evidence), then the team is well placed to consider the kinds of controls it could implement to prevent recurrence of the problem.
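The rule that each effect needs at least one action and one condition, each supported by evidence, can be checked mechanically. The Python sketch below is an illustration only. It is not part of RealityCharting™, and the class names, the `kind` labels and the `evidence` field are hypothetical. It loads the 5 Whys chain from above as action causes and reports what each causal element is still missing.

```python
from dataclasses import dataclass, field

@dataclass
class Cause:
    text: str
    kind: str            # "action" or "condition"
    evidence: str = ""   # empty string: no supporting evidence recorded yet

@dataclass
class CausalElement:
    effect: str
    causes: list = field(default_factory=list)

    def problems(self):
        """List what is missing before this element can be called 'real'."""
        issues = []
        kinds = {c.kind for c in self.causes}
        if "action" not in kinds:
            issues.append("no action cause")
        if "condition" not in kinds:
            issues.append("no condition cause")
        issues += [f"no evidence for {c.text}" for c in self.causes if not c.evidence]
        return issues

# The 5 Whys chain from the text; each link is treated as a single action cause
chain = ["PLANT STOPPED", "FEED PUMP NOT PUMPING", "COUPLING BROKEN",
         "MOTOR BEARING DAMAGED", "BEARING RACE COLLAPSED", "FATIGUE"]
elements = [CausalElement(effect, [Cause(cause, "action")])
            for effect, cause in zip(chain, chain[1:])]

for element in elements:
    print(element.effect, "->", element.problems())
```

Every element built from the plain 5 Whys chain is flagged as lacking a condition cause and evidence, which is where the “how” and “IF” questions come in.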

The more causes that are revealed, the more opportunities the team has to identify possible solutions.


To speed up the RCA process:

Step 1 – The facilitator gathers information about the event and completes the Problem Definition Statement.

Step 2 – The facilitator directs the information gathering, casting a wide net and systematically soliciting information from the participants.

Step 3 – Use the information gathered to build a RealityChart™ with actions based on what happened, then look for other causes, such as conditions, that may initially be hidden. Use How and If to help validate that the causal relationships are logical.

With a completed chart, the solution-finding step can begin.

Our Root Cause Analysis (RCA) Facilitators course teaches students to conduct an investigation with confidence and to find practical solutions to their problems. Public training courses are offered in major cities around the world throughout the year. Learn more about the advantages of attending a public training course, or see our worldwide training calendar for upcoming courses and book online.