This is a guest post written by Copperleaf.  ARMS Reliability is an authorized distributor of Copperleaf’s C55 Asset Investment Planning & Management solution. 

Author: Stefan Sadnicki


Anglian Water is an innovative company whose mission is “to put water at the heart of a whole new way of living” and raise awareness about how essential water is to life, to the environment, and to a vibrant and growing economy. The company is the largest water and water recycling company in England and Wales—and Copperleaf’s first client in this sector!

We recently completed the implementation of Copperleaf C55 and it was one of the most challenging, yet rewarding projects any of us have ever worked on. I sat down to catch up with Chris Royce, our primary stakeholder and project champion, to get his thoughts on how everything went. As Head of Strategic Investment Management for Anglian Water, Chris was involved with the project from before it was a project! As the implementation draws to a close, I’d like to share some of the highlights:

What was the most challenging part of the project?

For Copperleaf, this was a new country (UK) and a new sector (water). We could see the potential of the C55 system and the benefits it would provide, and in reality many utility assets are very similar and the principles of risk-based decision making are similar. Ultimately, we now have a fantastic solution that combines the power and capability of the core C55 solution with the maturity of the UK water sector. It’s really exciting to see. This continuous planning and management capability really puts us in a new space.

What was the most rewarding part?

For the procurement process, we put together our set of requirements, including many ambitious areas of functionality that we were going to need to meet future challenges. We were unsure if any suppliers could achieve them all, but we knew what best practice could look like—and the Copperleaf team committed to deliver them all. It’s been hugely rewarding to see the vision become reality throughout the project.

Is there anything unique that AW is doing with C55 – something that hasn’t been done before?

There are lots of things. In particular, we started capturing cost data in 2005 and have been carrying out cost estimation-linked investment planning since 2007, using over 1,800 cost models built up from that data. As such, it was very important for our new solution to be able to build on that library of knowledge. Working closely with our Cost Estimating Team, Copperleaf built out a new Cost Estimation module, integrated with the rest of C55, to execute our cost models within the planning process.

Have there been any other added benefits?

At the start of the process, we undertook a comprehensive process mapping of ‘as is’ and ‘to be’. This highlighted a number of pinch points in our process, which Copperleaf was able to ‘systemise’ as part of the implementation.

How has the C55 solution been received in the wider business?

We’ve had a great response from end users. As one user put it during a training session: “I’ve only been using C55 10 minutes and it’s already a significant improvement over our previous system.”

Any anecdotes from the project?

During evaluation, we held a number of reference calls and I joked that Copperleaf must have some magic stardust they put on their users’ keyboards, because I had never heard such positive references about an IT provider. I have to say they were honest! I believe it’s Copperleaf’s focus on the customer experience that made the difference.

What made the project a success?

It may be a cliché, but the joint Anglian Water and Copperleaf delivery team deserves a large amount of credit. We started from a strong position; we had a clear idea of what we wanted to achieve due to our maturity, and the right product to deliver it. But ultimately, the drive and dedication of the team is what has carried us to a successful go-live. Anglian Water is very strong in alliancing and is recognised as an industry leader in this regard, so I wanted to carry this through into this project. And on the Copperleaf side, just the simple thing of having one dedicated project manager for the duration of the project made all the difference in having a collaborative and innovative delivery approach.

To learn more about Copperleaf’s work with Anglian Water, click here.

About Stefan Sadnicki

Stefan is Managing Director for Copperleaf in Europe. He works both with Copperleaf partners and directly with asset-intensive organisations to solve their asset investment planning challenges. His background is in business analytics and consulting and he is an active member of The Institute of Asset Management (IAM). Connect with him on LinkedIn.


Our previous article in this RCA Program Development blog series introduced the six roles required in your root cause analysis team in order to cover all of the necessary functions. This installment in the series goes into more depth on the team structure and provides some important considerations for deciding who should be involved.

Roles and Responsibilities

The RCA effort in any organization will require a number of participants at various levels of the organization having distinct roles with very specific responsibilities.  Once the workflow processes are established it is time to assign these roles and responsibilities to specific individuals, determine training needs, and decide if a formal human change management plan will be desirable to assist employees in assuming their new assignments.  These responsibilities must be clearly understood to ensure that the appropriate personnel are assigned to the various roles which include the following:

 

[Figure: RCA team roles graphic]

 

To assist in the above, it is helpful to review the nature and dynamics of a functioning RCA team.

The role of an analysis team is to apply the RCA methodology to a particular incident. A predetermined set of conditions or triggers is used to determine which incidents qualify for a formal RCA. The RCA Champion, in collaboration with a designated Facilitator, then decides through the problem definition process how much effort is to be expended on the analysis, i.e. which skills participants will be on the team and how many of each, based upon the severity of the incident. As a general rule, the cost of the resources expended should be significantly less than the potential benefit of a successful outcome. In other words, the effort should yield a favorable ROI or greatly reduce the risk of a safety or environmental incident recurring. It is often useful to define threshold levels, or break points, for minor, significant, and major RCA efforts to provide guidance.
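As a rough illustration of the break points and the ROI check described above, here is a minimal Python sketch. The dollar thresholds, the hourly rate and the function names are all hypothetical; each organization would set its own values.

```python
# Hypothetical break points, in dollars of estimated incident impact.
# Listed from highest to lowest so the first match wins.
EFFORT_BREAK_POINTS = [
    (250_000, "major"),
    (50_000, "significant"),
    (0, "minor"),
]


def effort_level(estimated_impact: float) -> str:
    """Return the RCA effort tier for an incident's estimated impact."""
    for floor, level in EFFORT_BREAK_POINTS:
        if estimated_impact >= floor:
            return level
    return "minor"  # negative or unknown impact falls back to the lowest tier


def analysis_roi(expected_benefit: float, team_hours: float, hourly_rate: float) -> float:
    """Ratio of expected benefit to the cost of the people doing the analysis."""
    cost = team_hours * hourly_rate
    if cost <= 0:
        raise ValueError("analysis cost must be positive")
    return expected_benefit / cost


# Example: a 7-person team spending 16 hours each at an assumed $100/hour.
print(effort_level(60_000))
print(round(analysis_roi(60_000, team_hours=7 * 16, hourly_rate=100), 2))
```

A ratio well above 1 suggests the effort is justified on cost alone. Safety and environmental incidents may warrant a full analysis regardless of the ratio.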

The Facilitator leads the analysis process. Teams should be cross-functional and consist of 6 to 8 members depending upon the nature, complexity, and severity of the incident under investigation. When assembling an RCA team, the Champion or Facilitator has the option of including people directly involved in the incident being analyzed on the team, or involving them as interviewees only. It is often beneficial to include a team member who has no firsthand knowledge of the incident but is familiar with the process, equipment, etc., in order to help bring objectivity to the group and avoid “groupthink”. These members need to be carefully selected to ensure they add value. Assigning people to the analysis team simply because they are available will only waste time. The team should also have at its disposal all the facts, evidence, and a timeline surrounding the incident.

Certain points in the analysis may require additional information that is not available within the organization.  For example, it may be necessary to contact suppliers or vendors concerning design issues. The incident owner will need to understand and encourage the follow-up on this type of information gathering.

When the RCA program is launched, it is recommended to focus on the quality of the analyses and not the quantity. A few quick wins are important to building momentum and confidence in the process.

Specific RCA team responsibilities include the following:

  • Review the facts, evidence, and timeline surrounding the incident
  • Interview eyewitnesses or others who may have useful knowledge of the incident at hand but are not on the analysis team
  • Interview subject matter experts
  • Provide the technical expertise for problem solving including action and condition recognition and possible solution proposals
  • Perform the root cause analysis. The Apollo Root Cause Analysis methodology includes:
      • Validating the problem definition as previously determined by the Champion and Facilitator
      • Creating the cause and effect diagram using RealityCharting® software
      • Identifying conditions and actions that precipitated the incident under investigation
      • Proposing effective solutions, realistic and under the control of the facility’s management, to eliminate selected conditions and actions
  • Summarize the findings in a report to the Champion to be presented to the Steering Committee. Be available to answer any question the Steering Committee may have regarding the report.

Once the roles and responsibilities are assigned, a comprehensive training plan covering the RCA methodology and any identified change management needs must be developed with a targeted timeline.

So far, this blog series has covered:

The Key Steps of Designing Your Program

Defining Goals and Current Status

Setting KPIs and Establishing Trigger Thresholds

RCA and Solution Tracking and Roles and Responsibilities

Every year, millions of dollars needlessly go down the drain in large organisations. It’s money that can easily be saved, if you know why it’s disappearing and how to save it.

To illustrate, let’s look at a real-life example.

We are regularly asked to lead projects to review maintenance strategies for sites and assets that are not meeting their availability targets, are suffering frequent unplanned failures, or are incurring high costs.

We typically set to work collecting the asset hierarchy, work order history and current maintenance plans. Using all this data, we apply sophisticated methodologies to build an optimised maintenance strategy. In one particular project, the resulting revised strategies were forecast to reduce maintenance costs by 18% per annum and improve availability by 3%.

It was a great outcome. But – and herein lies the problem – the site failed to effectively implement and execute the strategy, and so continued to suffer from unplanned failures and poor availability. There’s the money down the drain.

To truly realise the value, a good strategy needs to be implemented and then updated over time. In essence, the strategy needs to be managed. This includes workflows, review and approval by appropriate subject matter experts, use of generic content wherever possible, and data-driven decision making.

Learning from past failures

Ten years ago, when Reliability Centred Maintenance (RCM) was really hitting its stride, more and more organisations started investing in the task of developing maintenance strategies. But according to research, a massive 60 per cent of these strategies were never implemented. Think of the money wasted.

Or, if a strategy was implemented, it was likely to be changed over time with little or no oversight. Typically, the good strategy work was undone and the strategy was put back to how it had been.

Realistically, any change to a strategy, such as to the interval, duration, specific tasks, or instruction content, should be managed with a dedicated workflow that includes a justification and the opportunity to apply any worthwhile improvement across your entire asset base.
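To make the idea of a dedicated change workflow concrete, here is a minimal Python sketch. The class, field and status names are hypothetical and do not describe any particular product. The point is simply that a change cannot move forward without a justification and a review.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class StrategyChange:
    """One proposed change to a maintenance strategy (hypothetical model)."""
    asset_type: str
    field_changed: str   # e.g. "interval", "duration", "task", "instructions"
    old_value: str
    new_value: str
    justification: str
    status: Status = Status.DRAFT
    history: list = field(default_factory=list)

    def submit(self) -> None:
        """Send the change for review; a justification is mandatory."""
        if not self.justification.strip():
            raise ValueError("a justification is required")
        self._move(Status.DRAFT, Status.IN_REVIEW, "author")

    def review(self, reviewer: str, approve: bool) -> None:
        """Record the reviewer's decision."""
        target = Status.APPROVED if approve else Status.REJECTED
        self._move(Status.IN_REVIEW, target, reviewer)

    def _move(self, expected: Status, target: Status, actor: str) -> None:
        # Only allow the transition if the change is in the expected state.
        if self.status is not expected:
            raise RuntimeError(f"cannot go from {self.status.value} to {target.value}")
        self.history.append((actor, self.status.value, target.value))
        self.status = target


change = StrategyChange("pump", "interval", "30 days", "45 days",
                        justification="No defects found in last 12 inspections")
change.submit()
change.review("reliability engineer", approve=True)
print(change.status.value, len(change.history))
```

Keeping the history alongside the change means the reasoning behind each interval or task is still available when the strategy is next reviewed.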

The power of combining Work Management with Strategy Management

To fix these endemic problems, the focus of an organisation needs to evolve to strategy management as well as work management.

 Think about it. Work management is all about executing tasks. Strategy management is all about deciding what tasks should be executed. You can have the best work execution process, but if you’re not working on the right strategies then it won’t deliver results. Asset Managers need to make sure that teams are effectively executing the right strategy.


Furthermore, reliability and maintenance teams need the agility to adapt if a positive change is made to a strategy at one site in a multi-site organisation, or a common asset used multiple times on a single site. How do you quickly deploy this cost-saving change across other sites in the organisation?

For example, think of a water utility that operates 400 pump stations across the country, with each one operating the same equipment. Say there’s a pump failure at one site, and a technician does some good root cause analysis work which leads to a recommended strategy around a task that needs to be done. If their decision could come back to a central area for review and approval, and then get deployed efficiently and electronically to all the other pump stations, the utility could potentially save thousands on future fixes, reduce risk and improve performance.
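The pump station scenario can be sketched in a few lines of Python. Everything here is illustrative: the asset model, the task name and the 90-day interval are assumptions, and a real deployment would go through the organisation's work management system rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    site: str
    asset_id: str
    equipment_type: str
    tasks: dict = field(default_factory=dict)   # task name -> interval in days


def deploy_task(assets, equipment_type, task, interval_days, approved):
    """Apply a centrally approved task to every asset of the given type.

    Returns the list of sites that were updated.
    """
    if not approved:
        raise PermissionError("change must be approved centrally before deployment")
    updated = []
    for asset in assets:
        if asset.equipment_type == equipment_type:
            asset.tasks[task] = interval_days
            updated.append(asset.site)
    return updated


# Hypothetical fleet: 400 stations with the same pump model, plus one unrelated asset.
fleet = [Asset(f"station-{n:03d}", f"P-{n:03d}", "pump-model-A") for n in range(1, 401)]
fleet.append(Asset("depot", "C-001", "compressor"))

sites = deploy_task(fleet, "pump-model-A", "inspect seal", 90, approved=True)
print(len(sites))  # 400
```

The approval flag stands in for the central review step: the improvement found at one site only reaches the other 399 once it has been checked.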

Wherever you find pockets of excellence you need to deploy them everywhere, effectively.

Adopt a best practice approach and create a culture of excellence

 The secret of successful strategy management lies in looking beyond the SAPs and Maximos of the world. You can try to standardise these systems for a “generate once, use many times” approach, but it won’t work. A CMMS is designed to manage work tasks, not manage strategy.

Instead, you need a separate approach and solution for strategy management, which directly integrates with your work management system. This way, if your reliability team and subject matter experts devise a new asset strategy that is going to save your organisation millions of dollars, then you can be assured that it will successfully be applied to all the relevant assets across all sites. Likewise, you will gain visibility into single site strategy excellence and be able to quickly and easily deploy it enterprise wide. With an Asset Strategy Management program, your asset strategies will be dynamic and constantly evolving, and will instil a culture of excellence in reliability.

This makes reliability a reality.

Find out how you can adopt a best practice approach to Asset Strategy Management  and unlock unrealised value, enterprise wide.

A true Asset Strategy Management program delivers predictable outcomes and avoids unexpected failures, outages, safety exposures and costs.

Poor reliability of equipment and processes can have sudden and disastrous effects on the ability of an organisation to deliver operational or project objectives. Reliability problems can lead to unexpected downtime, poor quality product or service, missed operational targets, significant remedial costs, poor safety and a rise in incidents.

Managing reliability well seems elusive to most organisations, which find it difficult to connect reliability strategy to maintenance execution. In many organisations, the tendency is to focus on maintenance execution alone, in the belief that plant reliability will improve. In order to improve execution, focus is placed on the work management process, work management KPIs, and Master Data. The reality is that even world class execution of a poor strategy won’t deliver on operational objectives in a predictable, consistent way. Many organisations are executing inconsistent or sub-optimal strategies, leading to variable results, continued under-performance, and significant failures and outages.

Institutionalising Asset Strategy Management (ASM) into the operation reduces failures, downtime and risk, and as a consequence, the total cost of operations is lower. Deploying the optimal strategy across all assets and monitoring performance provides the means to improve reliability across all assets, and to sustain the improved performance over time and throughout periods of change.

ASM removes the inconsistent outcomes from asset strategies, allows for any pockets of excellence to be deployed to all relevant assets, and drives continuous reliability improvement.

What’s the roadblock?

Enterprise Resource Planning (ERP) systems are designed to execute strategy; they are not designed to develop, maintain and manage good strategy. Many organisations have not yet realised that strategy is separate from execution. ERP systems are designed to support efficient execution and, in order to be effective, have to be continually populated with appropriate Master Data and optimal strategies.

What’s the Solution?

An ASM solution acts as the thread across all systems. It allows organisations to capture and review data from all sources and leverage learnings to enhance reliability strategies by identifying the pockets of strategy excellence and deploying those strategies across the organisation wherever they are relevant.

Standardising and leveraging good strategy

At the core of an ASM solution sits an asset strategy library which houses reliability-based tactics. These asset strategies can be deployed rapidly and support regional or local variations to cater for different operational or environmental conditions. Strategy variation is visible organisation-wide via a reporting functionality, where all learnings drive continual improvement in the asset strategy library, which can be accessed and redeployed to any asset.
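One way to picture a strategy library with local variations is as a generic strategy plus a small set of overrides. The Python sketch below is an illustration only: the equipment type, task names and intervals are invented, and the two functions are hypothetical helpers rather than part of any ASM product.

```python
def resolve_strategy(library, equipment_type, variation=None):
    """Return the generic strategy with any local overrides applied.

    The generic entry in the library is copied, never modified.
    """
    resolved = dict(library[equipment_type])
    resolved.update(variation or {})
    return resolved


def variation_report(library, equipment_type, resolved):
    """List every task where a deployed strategy differs from the generic one.

    Maps task name -> (generic interval, deployed interval); a generic
    interval of None means the task was added locally.
    """
    generic = library[equipment_type]
    return {
        task: (generic.get(task), interval)
        for task, interval in resolved.items()
        if generic.get(task) != interval
    }


# Hypothetical library: task name -> interval in days.
library = {"centrifugal pump": {"vibration check": 30, "lubricate bearings": 90}}

# A coastal site shortens one interval and adds a corrosion inspection.
coastal = resolve_strategy(
    library,
    "centrifugal pump",
    {"lubricate bearings": 60, "corrosion inspection": 180},
)
print(variation_report(library, "centrifugal pump", coastal))
```

Because the variation is stored as a difference from the generic strategy, it stays visible in reporting and can be promoted into the library if it proves to be an improvement.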

Achieve better benchmarking

Operation executives are held accountable for performance, but they don’t have access to all of the data and knowledge they need in order to make accurate decisions. In numerous multi-site organisations, reliability strategies are not standardised across all sites, adding to the confusion between data, strategies, and outcomes. These factors make it difficult to benchmark and thus compare costs and performance of like equipment across the organisation. An ASM solution captures data from many sources and presents it in one place. It allows managers to set up benchmarks, develop and deploy the best strategies consistently, monitor KPIs and align strategies across their whole operation.

Gain control over execution

Asset Managers often have no control over the deployment and execution of the strategies they develop. An ASM solution gives managers the ability to ensure that standardised procedures for strategies are deployed to all assets, at all sites, and to make certain that any modifications to procedures go through an approval process first. In addition, managers gain the ability to monitor the effectiveness of all strategies and to identify system wide and specific enhancements that should be made.

Future-proofing

When changeover takes place among maintenance, reliability and project engineering personnel, quality and consistency issues can arise. It’s critical that standardisation is maintained over the longer term regardless of personnel changes, such that baseline strategies are deployed and monitored according to standards and quality assurance rules. To ensure strategies remain optimum over the asset life, the rationale for each strategy decision is maintained and can be revised, improved or changed as business needs change.

Rapid integration

Time is money. The sooner a reliability strategy can be developed and deployed, the better. An ASM system integrates with an organisation’s existing ERP system for easy, efficient and rapid deployment.

Case in Point

Major LNG operator develops & deploys strategies in only 44 days

The Goal

  • To develop maintenance strategies for a major LNG brownfield operation.

The Situation

  • Had no clear method to develop and standardise maintenance strategies in a rapid and efficient manner for all brownfield assets.
  • Many existing PMs were outdated.
  • Many assets did not have strategies.

How an ASM solution was leveraged

  • Generic maintenance strategies were developed for 122 unique equipment types.
  • Variations were made on generic strategies where applicable to meet asset operating context.
  • Strategies were then uploaded to SAP and deployed to 3,631 assets.

Outcomes

  • Client now has a single, standardised database in which strategies can be quickly updated and uploaded to SAP.
  • Entire process was completed in 44 days vs. 90+ days if a traditional method had been used.
  • Client is going to leverage the strategies and learning from this project to assist with the rollout of an upcoming Greenfield initiative.

Would you like to know how you can leverage your organisation’s pockets of excellence and build a best in class asset strategy management program? OnePM® is an innovative reliability strategy management solution, created by ARMS Reliability. LEARN MORE

OnePM® is a trade mark of ARMS Reliability and registered in Australia.

For the fourth installment in this blog series on the key steps of RCA program development we are covering step #5 – RCA and Solution Tracking Systems and #6 – Roles and Responsibilities. Remember, to have a successful implementation and adoption of your new (or redesigned) RCA program, it’s crucial to have all the elements of an effective and efficient program clearly identified and agreed upon in advance.

Here’s what we’ve covered in this blog series so far:

The Key Steps of Designing Your Program

Defining Goals and Current Status

Setting KPIs and Establishing Trigger Thresholds

Now we’ll dive into the important aspects of tracking your RCA solutions and the responsibilities of each role that must be played within your RCA processes.         

RCA and Solution Tracking Systems

It is very important that the status of outstanding RCAs and solutions be monitored for two basic reasons: To assure timely completion of the tasks, and to measure their effectiveness in preventing recurrence of the undesirable incident.

When a formal RCA is triggered and assigned to a facilitator, the assignment should also include a reasonable completion date that is agreed upon with the facilitator. I emphasize the word “reasonable” because human nature is to cut ourselves short on schedule time leading to rushing and taking shortcuts, which in turn can result in an inferior work product.  I’ve not known anyone that has ever been admonished for completing a task ahead of time, so be sure the due date is readily achievable.

Once an RCA has been completed, a list of possible solutions will be submitted for approval.  As with the RCAs, solution implementation should have a realistic due date and a responsible party for completion. Progress towards meeting the due dates must be tracked to ensure timely completion. In addition, solution effectiveness, as measured by incident recurrence (or hopefully the lack thereof), should be monitored as well in order to demonstrate the value of both the individual RCAs and the RCA program as a whole.
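To illustrate what tracking might look like at its simplest, here is a short Python sketch. The record fields and the three measures are my own assumptions about a minimal tracker, not an export format from any tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Solution:
    description: str
    owner: str
    due: date
    completed: Optional[date] = None
    recurrences: int = 0   # incidents of the same kind since implementation


def overdue(solutions, today):
    """Open solutions whose due date has passed."""
    return [s for s in solutions if s.completed is None and s.due < today]


def on_time_rate(solutions):
    """Share of completed solutions finished on or before their due date."""
    done = [s for s in solutions if s.completed is not None]
    if not done:
        return None
    return sum(s.completed <= s.due for s in done) / len(done)


def effective_rate(solutions):
    """Share of completed solutions with no recurrence of the incident."""
    done = [s for s in solutions if s.completed is not None]
    if not done:
        return None
    return sum(s.recurrences == 0 for s in done) / len(done)


# Hypothetical records for illustration.
tracker = [
    Solution("Replace seal", "A. Smith", date(2024, 3, 1), completed=date(2024, 2, 20)),
    Solution("Revise procedure", "B. Jones", date(2024, 3, 15),
             completed=date(2024, 4, 1), recurrences=1),
    Solution("Add guard", "C. Lee", date(2024, 4, 1)),
]
print([s.description for s in overdue(tracker, date(2024, 5, 1))])
print(on_time_rate(tracker), effective_rate(tracker))
```

The first measure covers timely completion and the last covers effectiveness, which are the two reasons for tracking given above.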

When discussing action tracking, often the first thought is to create a new system. However, I suggest thinking about your existing systems and technology first and whether there is an opportunity (and if it makes sense) to integrate action tracking into those. Solution and action item lists can be exported from RealityCharting® to Excel® and then imported into your current action tracking system. If you don’t have an existing tracking system that will serve all your needs, then consider an enterprise RCA tool, such as RC Pro™.

Roles and Responsibilities

An effective, efficient, and sustainable RCA effort will require a number of functions to be fulfilled at various levels of the organization. There must be distinct RCA program roles to accomplish this, each with either a unique set of responsibilities or, in some cases, shared duties. The responsibilities of each role must be clearly understood to assure that the appropriate personnel are assigned to the proper roles while at the same time balancing existing position duties with any added RCA workload.

Here are the necessary functions:

  1. Steering Committee – What existing management team will oversee the RCA effort?
  2. Program Champion – Who will be the RCA sponsor that brings legitimacy to the effort?
  3. RCA Methodology Expert – Who will quality-check completed RCAs or facilitate the most difficult ones?
  4. RCA Facilitators – How many and who will be trained as standard RCA facilitators?
  5. First Responders – When a triggered event occurs, who will be responsible for preserving evidence?
  6. Skills Participants – What people will participate in RCAs?

We will go into more depth on each of the above in our next blog installment.

Learn more about our recommended facilitated workshop that covers all 11 of the key steps, and contact us for more information.

 

Author: Jason Ballentine

Developing a maintenance strategy requires careful consideration and due process. Yet from what I’ve seen, many organizations are making obvious errors right from the start, missteps that can torpedo the success of the strategies they’re trying so hard to put in place.

Without further ado, here are five common maintenance strategy mistakes:

  1. Relying solely on original equipment manufacturer (OEM) or vendor recommendations.

It seems like a good idea — you’d think the people who made or sold the equipment would know best. It’s what they don’t know that can hurt you.

Outside parties don’t know how a piece of equipment functions at your facility. They don’t understand how much this equipment is needed, the cost of failure, whether there’s any redundancy within the system… OEM and vendor maintenance guidelines are geared to maximize the availability and reliability of the machine, but their strategies might not be appropriate for your unique circumstances or needs. As a result, your team could end up over-maintaining the equipment, which can actually create more problems than it solves. The more you mess with a piece of equipment, the more you introduce the possibility of error or failure. Some things, in some situations, are better left alone.

What’s more, OEMs and vendors have a vested interest in selling more spare parts (so they can make more money). That means that their replacement windows might not be accurate or appropriate to your business needs. Rather than relying on calendar-driven replacement, your maintenance strategy might focus more on inspecting the equipment to proactively identify any issues or deterioration, then repairing or replacing only as needed.

It’s fine to use OEM/vendor maintenance guidelines as a starting point. Just make sure you thoroughly review their recommendations to see if they align with your unique needs for the given piece of equipment. Don’t just blindly accept them — make sure they fit first.

  2. Relying heavily on generic task libraries for your maintenance strategy.

This is surprisingly common. Some organizations purchase a very generic set of activities for a piece of equipment or equipment category, and attempt to use them to drive maintenance strategy. But generic libraries are even worse than OEM/vendor recommendations because they are just that — generic. They aren’t written for the specific equipment make and model you have. They might even include tasks that simply don’t apply, such as “inspect the belt” on a pump that uses an entirely different drive mechanism. Once a mechanic attempts to perform one of these generic, ill-suited tasks, he or she stops trusting your overall maintenance strategy. Without credibility and compliance, you might as well not have a strategy at all.

Like OEM and vendor recommendations, generic task libraries can help you get started on a robust maintenance strategy, if (and only if) you carefully examine them first and only use the tasks that make sense for your particular equipment and operational needs.

  3. Failing to include a criticality assessment in your strategy decisions.

If you choose and define tasks without factoring in criticality, you run the risk of wasted effort and faulty maintenance. Think about it: If a piece of equipment is low on the criticality scale, you might be okay to accept a generic strategy and be done with it. But for equipment that’s highly critical to the success of your operations, you need to capture as much detail as possible when selecting and defining tasks. How can you know which is which without fully assessing the relative importance of each piece of equipment (or group of equipment) to the overall performance of your site?
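As a simple illustration, criticality is often scored as consequence multiplied by likelihood. The Python sketch below assumes 1 to 5 scales and two cut-off values that I have picked purely for the example; your own assessment method and thresholds will differ.

```python
def criticality(consequence: int, likelihood: int) -> int:
    """Score criticality as consequence x likelihood, each on an assumed 1-5 scale."""
    for score in (consequence, likelihood):
        if not 1 <= score <= 5:
            raise ValueError("scores must be between 1 and 5")
    return consequence * likelihood


def strategy_depth(score: int, high: int = 15, low: int = 6) -> str:
    """Map a criticality score to how much detail the strategy deserves.

    The cut-offs (15 and 6) are hypothetical examples.
    """
    if score >= high:
        return "detailed analysis"
    if score <= low:
        return "generic strategy, reviewed for fit"
    return "generic strategy with targeted review"


# Hypothetical equipment list: name -> (consequence, likelihood).
equipment = {
    "main feed pump": (5, 4),
    "office HVAC fan": (2, 2),
    "standby compressor": (3, 3),
}
for name, (c, l) in equipment.items():
    print(name, strategy_depth(criticality(c, l)))
```

Even a rough ranking like this helps direct detailed effort to the equipment that matters most to site performance.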

  4. Developing maintenance strategies in a vacuum.

Sometimes, organizations will hire an outside consultant to develop maintenance strategies and send them off to do it, with no input from or connection with the maintenance team (or the broader parts of the organization). Perhaps they figure, “you’re the expert, you figure it out.” Here’s the problem: For a maintenance strategy to be successful, it must be developed within the big picture. You’ve got to talk to the mechanic who’ll be doing the work, the planner for that work, and the reliability engineer who’ll be responsible for the performance of that equipment, production, or operation. Their input is extremely valuable, and their buy-in is absolutely critical. Without it, even the best maintenance strategy can be met with resistance and non-compliance.

  5. Thinking of maintenance strategy development as a “one-and-done” effort.

For some organizations, the process of developing a maintenance strategy from the ground up seems like something you do once and just move on. But things change — your business needs change, the equipment you have on site changes, personnel changes, and much more. That’s why it’s vitally important to keep your maintenance strategies aligned with the current state of your operations.

In fact, a good maintenance strategy is built with the idea of future revisions in mind. That means the strategy includes clear-cut plans for revisiting and optimizing the strategy periodically. A good strategy is also designed to make those revisions as easy as possible by capturing all of the knowledge that went into your strategy decisions. Don’t just use Microsoft Word or put tasks directly into the system without documenting the basis for the decisions you made. What were your considerations? How did you evaluate them? What ultimately swayed your decision? In the future, if the key factors or circumstances change, you’ll be able to evaluate those decisions more clearly, without having to guess or rely on shaky recall.

If you’ve found yourself making any of these mistakes, don’t despair. Most errors and missteps can be addressed with an optimization project. In fact, ARMS Reliability specializes in helping organizations make the most of their maintenance strategies. Contact us to learn more.


As outlined in our previous blog article, “RCA Program Development: The Key Steps of Designing Your Program”, there are 11 key steps to a successful RCA program. Last month we introduced the first two steps – Defining Goals and Current Status. In this article we’ll break down steps 3 and 4 – Setting KPIs for your RCA program and establishing trigger thresholds to initiate an RCA.

  3. Key Performance Indicators

Key Performance Indicators, or KPIs, are the benchmarks used to measure the success of a program or effort. They can generally be divided into two categories: leading indicators and lagging indicators.  Both of these measure the degree to which progress is being made in achieving a specific goal.  Leading indicators tend to be objectives that progress you towards achieving the ultimate goal. They can be measured over a short period and act as mileposts to gauge how you’re tracking towards your goal. Lagging indicators are often the goals themselves.  If the relationship between the two is correctly defined, then achieving the short-term (leading) indicators virtually guarantees achieving the long-term goals.

To provide perspective in measuring progress using KPIs, a baseline must first be established. Baselines for the selected KPIs should cover at least 3 years of historical performance. Once these are established, goals or targets for improvement should be set for a period going forward, say 3 years. This process should be reviewed at least annually, with baselines and targets adjusted accordingly.
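As a rough illustration of the baseline-and-target idea, the sketch below averages three years of a lagging KPI and derives a three-year set of improvement targets. The KPI (unplanned downtime hours), the historical figures, and the 10% annual improvement rate are all made-up assumptions for illustration, not recommendations.

```python
from statistics import mean

# Hypothetical history: unplanned downtime hours per year (illustrative numbers only).
history = {2021: 420.0, 2022: 380.0, 2023: 400.0}


def baseline(values):
    """Baseline as a simple average of the historical period (one possible choice)."""
    return mean(list(values))


def targets(base, annual_improvement, years):
    """Compound an assumed fixed annual improvement over the planning horizon."""
    return [base * (1 - annual_improvement) ** n for n in range(1, years + 1)]


base = baseline(history.values())
plan = targets(base, annual_improvement=0.10, years=3)

print(f"Baseline: {base:.1f} h/yr")
for year, target in enumerate(plan, start=1):
    print(f"Year {year} target: {target:.1f} h/yr")
```

Re-running this each year with the latest history is one simple way to keep baselines and targets adjusted, as the annual review calls for.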

  4. Formal RCA Threshold Criteria

An effective incident prevention program will have RCAs being performed at two levels: 1) On an informal or ad hoc basis for smaller, nuisance-level problems that may be specific to individuals or departments; and 2) on a formal level where challenges to the organization’s goals exist.

Leaders must communicate the organizational trigger criteria, but they should also encourage and support teams and individuals to set their own trigger criteria. When your employees can solve smaller day-to-day problems more effectively, your organization will realize the benefits of proactive problem solving, because many smaller problems will be rectified before they can grow into larger organization-level problems.

For RCA to be a core competency at all levels of the organization, and for people to be proactively preventing organizational problems, it is important to have clear guidance for formal RCAs. This is the function of the Trigger Criteria diagram. High-level challenges should be formally identified and assigned a threshold that when exceeded will automatically trigger a formal RCA.  Triggers should generally be leading indicators of some form or another and derived from specific organizational goals, or KPIs.  They are the trip wires to engage the RCA process for finding solutions to problems that are inhibiting organization goal achievement.

Organizations at higher levels of maturity will most often have triggers for multiple categories including safety, environmental compliance, revenue loss, unbudgeted costs, production loss, and sometimes repeat incidents.
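To make the idea of trigger criteria concrete, here is a minimal sketch of a threshold check across the categories mentioned above. The category names, threshold values, and the `Incident` structure are hypothetical; real thresholds should be derived from your own organizational goals and KPIs. The sketch treats a threshold as triggered when it is met or exceeded, which is an assumption you may want to change.

```python
from dataclasses import dataclass

# Hypothetical thresholds per category (illustrative values only).
TRIGGERS = {
    "safety_recordable_injuries": 1,   # any recordable injury
    "environmental_releases": 1,       # any reportable release
    "revenue_loss_usd": 100_000,
    "unbudgeted_cost_usd": 50_000,
    "production_loss_hours": 8,
    "repeat_incidents_12mo": 3,
}


@dataclass
class Incident:
    description: str
    impacts: dict  # category name -> measured value


def triggered_categories(incident, triggers=TRIGGERS):
    """Return the categories whose threshold the incident meets or exceeds."""
    return [
        category
        for category, threshold in triggers.items()
        if incident.impacts.get(category, 0) >= threshold
    ]


def requires_formal_rca(incident, triggers=TRIGGERS):
    """A formal RCA is triggered as soon as any single category trips."""
    return bool(triggered_categories(incident, triggers))


pump_trip = Incident(
    "Feed pump trip",
    {"production_loss_hours": 12, "unbudgeted_cost_usd": 20_000},
)
seal_weep = Incident("Minor seal weep", {"unbudgeted_cost_usd": 1_500})

print(requires_formal_rca(pump_trip), triggered_categories(pump_trip))
print(requires_formal_rca(seal_weep))
```

Incidents that fall below every threshold, like the second one here, are the ones that would be left to informal or ad hoc RCAs.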

For a deeper dive into the topic of trigger thresholds and scaling your RCA investigation, check out our whitepaper “Matching the Scale of Your RCA Investigation to the Significance of the Incident”.

In this blog series, we’ve now covered:

  • Defining Goals and Current Status

  • Setting KPIs and Establishing Trigger Thresholds

But there is more to setting up your RCA program for success. ARMS Reliability’s RCA experts can assist you with designing your complete RCA program or reinvigorating your current one. This includes determining the status of your current RCA effort, walking you through the process of establishing and aligning goals, helping you set KPIs for your program, and establishing trigger thresholds that make sense for your organization. Learn more about our recommended facilitated workshop that covers all 11 of the key steps, and contact us for more information.

 


Author: Jason Ballentine

Many organizations believe that making sound maintenance decisions requires a whole lot of data. It’s a logical assumption: you do need to know things like the number of times an event has occurred, its duration, the number of spare parts needed, and the number of people engaged in addressing the event, plus the impact on the business and the reason why it happened.

A lot of this information is captured in your Computerized Maintenance Management System (CMMS). The more detail you have, the more accurate results you can get from maintenance scenario simulation tools like Isograph’s Availability Workbench™. Unfortunately, your CMMS data may be lacking enough detail to yield optimal results.

It’s enough to make anybody want to throw his or her hands up and put off the decision indefinitely. If you do, you could be making a big mistake.

No matter what, you’re still going to have to make a decision. You have to.

The truth is, you can still do a lot with limited or poor-quality data, supported by additional sources of knowledge. Extract any and all information you have available, not just what is in the CMMS. Document what you’ve got, then use it to make a timely decision that’s as informed as possible.

Don’t get caught up in the fact that it’s not perfect data — circumstances in the real world are hardly ever ideal. In fact, as reliability engineers, most of the data we get is related to failure, which is exactly what we’re trying to avoid. Actually, if we are tracking failures, having less data means we are likely doing our jobs well because that means we are experiencing a low number of failures.

The bottom line is: we can’t afford to sit and wait for more data to make decisions, and neither can you.

Gather as much information as you can from all available sources:

CMMS

In an ideal world, this is the master data record of all activities performed.  As discussed previously, that is almost never the case; however, this is an important starting point to reveal where data gaps exist.

Personal experience and expertise

There’s a wealth of information stored within the experience of people who are familiar with any given piece of equipment. Consider holding a facilitated workshop to gather insight on the equipment’s likely performance. Even a series of informal conversations can yield useful opinions and real-world experiences.

The Original Equipment Manufacturer (OEM)

Most OEMs will have documentation you can access, possibly also a user forum you can mine for additional information.

Industry databases, e.g., the Offshore and Onshore Reliability Data Handbook (OREDA) and the Process Equipment Reliability Database (PERD) by the Center for Chemical Process Safety (CCPS)

Some information is available in these databases, but it’s generic — not specific to your unique site or operating context. For example, you can find out how often a certain type of pump fails, but you can’t discover whether that pump is being used on an oil platform, refinery, power station or mine site. Industry data does, however, provide useful estimates on which you can base your calculations and test your assumptions.

Capture all these insights in an easily accessible way, then use what you’ve learned to make the best decision currently possible. And be sure to record the basis for your decision for future reference. If you get better data down the road, you can always go back and revise your decisions — after all, most maintenance strategies should remain dynamic by design.
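One common way to combine a generic industry figure with sparse site data is a gamma-Poisson (Bayesian) update, sketched below. It is only one possible approach and is not prescribed by anything above. The generic failure rate, the weight given to it, and the site observations are made-up numbers, and the function name is hypothetical.

```python
def update_failure_rate(prior_rate, prior_weight_years, failures, exposure_years):
    """Gamma-Poisson update of a constant failure rate.

    prior_rate:         generic failures per unit-year (e.g. from an industry handbook)
    prior_weight_years: how many unit-years of evidence the prior is treated as worth
    failures:           failures observed on site
    exposure_years:     unit-years of site operation over which they were observed
    Returns the posterior mean failure rate in failures per unit-year.
    """
    alpha = prior_rate * prior_weight_years  # gamma shape: "pseudo-failures"
    beta = prior_weight_years                # gamma rate: "pseudo-exposure"
    return (alpha + failures) / (beta + exposure_years)


generic_rate = 0.20  # failures per year, illustrative only
site_rate = update_failure_rate(
    generic_rate, prior_weight_years=5, failures=1, exposure_years=10
)
print(f"Generic: {generic_rate:.3f}/yr, updated with site data: {site_rate:.3f}/yr")
```

Recording the prior, its weight, and the site observations alongside the result is one way to keep the basis for the decision on file, so the estimate can be revised when better data arrives.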

Don’t let a lack of data paralyze you into inaction. Gather what you can, make a decision, see how it works, and repeat. It’s a process of continuous improvement, which given the right framework is simple and efficient.

Availability Workbench™, Reliability Workbench™, FaultTree+™, and Hazop+™ are trademarks of Isograph Limited the author and owner of products bearing these marks. ARMS Reliability is an authorized distributor of those products, and a trainer in respect of their use.

Author: Jason Ballentine

As with any budget, you’ve only got a certain amount of money to spend on maintenance in the coming year. How do you make better decisions so you can spend that budget wisely and get maximum performance out of your facility?

It is possible to be strategic about allocating funds if you understand the relative risk and value of different approaches. As a result, you can get more bang for the same bucks.

How can you make better budget decisions?

It can be tempting to just “go with your gut” on these things. However, by taking a systematic approach to budget allocation, you’ll make smarter decisions — and more importantly you’ll have concrete rationales for why you made those decisions —  which can be improved over time. Work to identify the specific pieces of equipment (or types of equipment) that are most critical to your business, then compare the costs and risks of letting that equipment run to failure against the costs and risks of performing proactive maintenance on that equipment. Let’s take a closer look at how you can do that.

4 steps to maximize your maintenance budget

1.  Assign a criticality level for each piece of equipment. Generally, this is going to result in a list of equipment that would cause the most pain — be it financial, production loss, safety, or environmental pain — in the event of failure. Perform a Pareto analysis for maximum detail. 

2.  For your most critical equipment, calculate the ramifications of a reactive/run-to-failure approach.

  • Quantify the relative risk of failure. (You can use the RCMCost™ module of Isograph’s Availability Workbench™ to better understand the risk of different failure modes.)
  • Quantify the costs of failure. Keep in mind that equipment failures can affect multiple aspects of your business in different ways — not just direct hard costs. In every case, consider all possible negative effects, including potential risks.
    • Maintenance: Staff utilization, spare parts logistics, equipment damage, etc.
    • Production Impact: Downtime, shipment delays, stock depletion or out-of-stock, rejected/reworked product, etc.
    • Environmental Health & Safety (EHS) Impact: Injuries, actual/potential releases to the environment, EPA visits/fines, etc.
    • Business Impact: Lost revenue, brand damage, regulatory issues, etc.

For a more detailed explanation of the various potential costs of failure, consult our eBook, Building a Business Case for Maintenance Strategy Optimization.

3.  Next, calculate the impact of a proactive maintenance approach for this equipment.

  • Outline the tasks that would best mitigate existing and potential failure modes.
  • Evaluate the cost of performing those tasks, based on the staff time and resources required to complete them.
  • Specify any risks associated with the proactive maintenance tasks. These risks could include the possibility of equipment damage during the maintenance task, induced failures, and/or infant mortality for newly replaced or reinstalled parts.

4. Compare the relative risk costs between these approaches for each maintenance activity. This will show you where to focus your maintenance budget for maximum return.
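The comparison in steps 2 to 4 can be sketched as a simple expected annual cost calculation. This is a simplified illustration rather than the method used by any particular tool; the equipment names, failure frequencies, and costs are invented, and the `Equipment` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    name: str
    failures_per_year_rtf: float  # expected failure frequency if run to failure
    cost_per_failure: float       # maintenance + production + EHS + business impact
    pm_cost_per_year: float       # staff time and resources for proactive tasks
    failures_per_year_pm: float   # residual frequency with proactive tasks,
                                  # including induced failures and infant mortality

    def run_to_failure_cost(self):
        return self.failures_per_year_rtf * self.cost_per_failure

    def proactive_cost(self):
        return self.pm_cost_per_year + self.failures_per_year_pm * self.cost_per_failure

    def saving(self):
        """Positive value: proactive maintenance is expected to be cheaper per year."""
        return self.run_to_failure_cost() - self.proactive_cost()


# Illustrative figures only.
fleet = [
    Equipment("Feed pump", 0.50, 200_000, 15_000, 0.10),
    Equipment("Surge arrester", 0.02, 50_000, 4_000, 0.02),
    Equipment("Conveyor gearbox", 0.25, 80_000, 6_000, 0.05),
]

ranked = sorted(fleet, key=lambda e: e.saving(), reverse=True)
for item in ranked:
    choice = "proactive" if item.saving() > 0 else "run to failure"
    print(f"{item.name:18s} RTF {item.run_to_failure_cost():>9,.0f}  "
          f"PM {item.proactive_cost():>9,.0f}  -> {choice}")
```

Sorting by expected saving gives a rough priority list for where proactive spending pays back most. In this made-up fleet, the surge arrester's failure frequency is unchanged by the proactive tasks, so the tasks add cost without reducing risk.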

When is proactive maintenance not the best plan?

For the most part, you’ll want to allocate more of your budget towards proactive maintenance for equipment that has the highest risk and the greatest potential negative impact in the event of failure. Proactive work is more efficient so your team can get more done for the same dollar value. Letting an item run to failure can create an “all hands on deck” scenario under which nothing else gets done, whereas many proactive tasks can be performed quickly and possibly even concurrently.

That said, it’s absolutely true that sometimes run-to-failure is the most appropriate approach for even a critical piece of equipment. For example, a maintenance team might have a scheduled task to replace a component after five years, but the problem is that the component doesn’t really age: the only known failure mode is getting struck by lightning. No matter how old that component is, the risk is the same. Performing replacement maintenance on this type of component might actually cost more than simply letting it run until it fails. (In these cases, a proactive strategy would focus on minimizing the impact of a failure event by adding redundancy or stocking spares.) But you can’t know that without quantifying the probability and cost of failure.
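The lightning example can be checked numerically. The sketch below assumes a Weibull life distribution, which is a modelling choice of ours and not stated above, and compares the chance of failing in the next year for a component that does not age (shape 1) against one that wears out (shape 3). The scale and shape values are illustrative.

```python
import math


def conditional_failure_prob(age, horizon, scale, shape):
    """P(fail within `horizon` | survived to `age`) for a Weibull life distribution."""

    def survive(t):
        return math.exp(-((t / scale) ** shape))

    return 1.0 - survive(age + horizon) / survive(age)


ages = (0, 5, 10)

# shape = 1: constant hazard (random events such as lightning strikes).
random_failure = [conditional_failure_prob(a, 1.0, scale=20.0, shape=1.0) for a in ages]

# shape = 3: wear-out behaviour, where risk climbs with age.
wear_out = [conditional_failure_prob(a, 1.0, scale=20.0, shape=3.0) for a in ages]

print("random :", [round(p, 4) for p in random_failure])
print("wearout:", [round(p, 4) for p in wear_out])
```

With shape 1 the next-year failure probability is the same at every age, so replacing the part on a schedule buys no risk reduction. With shape 3 the probability rises with age, which is the situation where age-based replacement can pay off.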

Side note: Performing this analysis can help you see where your maintenance budget could be reduced without a dramatic negative effect on performance or availability. Alternatively, this analysis can help you demonstrate the likely impact of a forced budget reduction. This can be very helpful in the event of budget pressure coming down from above.   

At ARMS Reliability, we help organizations understand how to forecast, justify and prioritize their maintenance budgets for the best possible chances of success. Contact us to learn more.
