What I want to know from my failure data is “where” to focus limited resources to make the largest impact now.

It is that simple. And as a rule, KISS (Keep It Simple, Stupid) applies.

For me, if I am collecting failure data, I want to use it to prevent failures in the most useful way. ISO 55000 will be released next year, and that standard for asset management includes requirements for a robust FRACAS (Failure Reporting and Corrective Action System).

The intent of the ISO 55000 standard is to use the data to improve the maintenance and reliability decision making that went into the FMEA process that developed your maintenance strategy in the first place.

Kind of like the RCM (Reliability Centered Maintenance) “living program”.

Oddly, I have been part of some organizations that collected data and rarely used it. I have also seen failure data that was collected, but proved to be useless for analysis. It is pretty sad to sit down to heaps of useless data, but it happens because we fail to understand what data is required to make a meaningful impact.

Collecting failure data has a cost. Collecting it and then not using it is waste, and not very LEAN.

Failure data goes unused too often

To be useful, the failure data must be sortable so you can focus on the biggest problem first: the Pareto, 80-20 rule approach. This implies classifying the data records using an index (cause codes). Ideally these cause codes align 1:1 with your FMECA cause codes, and you have used the full power of the ERP cause code catalog system.
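As a rough illustration of that sort, here is a minimal Python sketch of a Pareto ranking by cause code. The records, the cause code names, and the 80% threshold are all made up for the example; they do not come from any real plant or ERP catalog.

```python
from collections import Counter

# Hypothetical failure records: (equipment area, cause code). Illustrative only.
records = [
    ("pump", "SEAL_LEAK"),
    ("pump", "SEAL_LEAK"),
    ("pump", "BEARING_WEAR"),
    ("pump", "SEAL_LEAK"),
    ("pump", "MISALIGNMENT"),
    ("pump", "BEARING_WEAR"),
    ("pump", "SEAL_LEAK"),
    ("pump", "IMPELLER_EROSION"),
    ("pump", "SEAL_LEAK"),
    ("pump", "BEARING_WEAR"),
]


def pareto(records):
    """Return (code, count, cumulative share) rows, most frequent code first."""
    counts = Counter(code for _, code in records)
    total = sum(counts.values())
    rows, running = [], 0
    for code, count in counts.most_common():
        running += count
        rows.append((code, count, running / total))
    return rows


def top_contributors(rows, threshold=0.8):
    """Return the leading codes that together cover at least `threshold` of failures."""
    selected = []
    for code, _, cumulative in rows:
        selected.append(code)
        if cumulative >= threshold:
            break
    return selected


for code, count, cumulative in pareto(records):
    print(f"{code:18s} {count:3d} {cumulative:6.0%}")
print("Focus on:", top_contributors(pareto(records)))
```

With this sample data, two of the four codes account for 80% of the failures, which is the short list you would work on first.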

Far too often we implement cause catalogs that are easy for the IT team to install and maintain, rather than ones that are easy for the technician to use when coding a failure.

Failure codes need to point back to the most probable credible events in that specific area of the machine.

Implementing one common catalog for an entire plant is a waste of time, about the same waste as a technician searching fruitlessly for the right code in a list of several hundred.

You see, you cannot work on every failure mode in your plant all the time. There are not enough people in your team. So after you sort the data, you need to focus on the highest rate of return for your efforts, the low hanging fruit.

Usually this implies you work on the worst repeat offender first, because it is happening OFTEN.

To be recognized as a true contribution to the organization, we need to invoke one of the theory of constraints principles and subordinate all resources to the current bottleneck. If you focus anywhere else, you may improve that machine's performance, but you have made no more money. So focus on your plant's current bottleneck, and work to mitigate those failure modes first.

It makes little sense to mitigate failures on a non-bottleneck unit that has excess capacity if your bottleneck unit(s) still have failure modes to mitigate during a sold-out market condition.

During times of economic hardship, like now, all units most likely have excess capacity. Mitigating failure modes to add capacity that cannot be sold ALSO makes little sense, so the highest recurring cost should be the focal point of attack when times are tough.
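The two situations above lead to two different sorts of the same data. The Python sketch below is one way to express that, under assumptions of my own: the unit names, cause codes, costs, and the `sold_out` flag are hypothetical, and a real ranking would likely weigh downtime as well.

```python
from collections import defaultdict

# Hypothetical work-order records; field names and values are illustrative.
work_orders = [
    {"unit": "extruder", "cause": "HEATER_BURNOUT", "cost": 1200.0},
    {"unit": "extruder", "cause": "HEATER_BURNOUT", "cost": 1100.0},
    {"unit": "extruder", "cause": "SCREW_WEAR", "cost": 9000.0},
    {"unit": "packer", "cause": "SENSOR_FAULT", "cost": 300.0},
    {"unit": "packer", "cause": "SENSOR_FAULT", "cost": 250.0},
    {"unit": "packer", "cause": "SENSOR_FAULT", "cost": 350.0},
    {"unit": "packer", "cause": "BELT_TEAR", "cost": 4000.0},
]


def rank_failure_modes(work_orders, sold_out, bottleneck_units=()):
    """Rank (unit, cause) pairs, highest priority first.

    sold_out=True:  only bottleneck units count, ranked by failure count.
    sold_out=False: all units count, ranked by total recurring cost.
    """
    count = defaultdict(int)
    cost = defaultdict(float)
    for wo in work_orders:
        if sold_out and wo["unit"] not in bottleneck_units:
            continue  # non-bottleneck failures do not limit sales
        key = (wo["unit"], wo["cause"])
        count[key] += 1
        cost[key] += wo["cost"]
    metric = count if sold_out else cost
    return sorted(metric, key=lambda k: metric[k], reverse=True)


# Sold-out market: repeat offenders on the bottleneck come first.
print(rank_failure_modes(work_orders, sold_out=True, bottleneck_units=("extruder",)))
# Tough times: highest total cost comes first, regardless of unit.
print(rank_failure_modes(work_orders, sold_out=False))
```

Note how the sensor fault, the most frequent failure in the sample, drops to the bottom of the cost-based ranking because each occurrence is cheap.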

To be easily used, the failure data needs to be well classified by cause codes. This implies the cause code is a required field in your CMMS on work types that represent the failures discussed during the FMECA that created your maintenance strategy. And if times are tough and you intend to use cost-based data sorts, you of course need to capture costs routinely.
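The required-field idea can be sketched as a simple check run before a work order is closed. This is Python for illustration only, not the API of any real CMMS; the work type names and field names are hypothetical.

```python
# Hypothetical work types that represent failures covered by the FMECA.
FAILURE_WORK_TYPES = {"BREAKDOWN", "CORRECTIVE"}


def validate_work_order(wo):
    """Return a list of problems that should block closing the work order."""
    problems = []
    if wo.get("work_type") in FAILURE_WORK_TYPES:
        if not wo.get("cause_code"):
            problems.append("cause_code is required for failure work types")
        if wo.get("cost") is None:
            problems.append("cost is required for cost-based sorting")
    return problems


print(validate_work_order({"work_type": "BREAKDOWN", "cost": 450.0}))
print(validate_work_order({"work_type": "INSPECTION"}))
```

The first call reports the missing cause code; the second returns an empty list because an inspection is not failure data in this sketch.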

About 15 codes is all I recommend. If you employ too many, the data becomes so granular that you can never sort it to find the high hitters.

There is a relationship between the number of codes you deploy to classify and how long you have to wait (endure failures) to get enough data to sort (more than one failure in a category).
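A back-of-the-envelope version of that relationship, in Python, under a deliberately crude assumption that failures spread evenly across codes: the expected count per code after T months is (failures per month × T) / (number of codes). The function name and the numbers are mine, chosen only to show the trend.

```python
def months_to_sortable(n_codes, failures_per_month, min_per_code=2):
    """Months until each code averages `min_per_code` failures.

    Assumes failures are spread evenly across codes, which real data
    never is; treat the result as a rough indication, not a forecast.
    """
    if n_codes <= 0 or failures_per_month <= 0:
        raise ValueError("n_codes and failures_per_month must be positive")
    return min_per_code * n_codes / failures_per_month


# Same hypothetical failure rate, different catalog sizes.
for n in (5, 15, 100):
    print(n, "codes:", round(months_to_sortable(n, failures_per_month=6), 1), "months")
```

At six coded failures a month, 15 codes need about five months to average more than one failure per category, while 100 codes need nearly three years. In practice the worst offenders show up sooner than the average suggests, but the wait still grows in step with the size of the catalog.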

Finding the right number of codes to use is key to getting the most out of your failure data

If you need more than 15, your failure code catalog design is broken, and you need to revisit your FMECA to see how to better segment your failure catalog.

In a way, if you have avoided RCM and the FMEA, you have probably disabled your ability to effectively utilize failure codes. Interesting how it all weaves together.

I have used failure data classified by only 5 codes, but that also required reading the long text memos to really glean what was up, and effectively sub-classifying the failure codes by manual interpretation.

End of the day:
We want to alter the maintenance strategy OR redesign the machine. As we know, each machine has an inherent reliability that no amount of maintenance can improve upon. So if the maintenance strategy is not deficient, we need to alter the design to eliminate the failure mode.

So, the next time a machine breaks in your plant – ask the technician to show you how he/she codes the failure. If you have > 15 codes, or codes that are irrelevant, or codes that make absolutely no sense, call me.

About Philip Sage

SAP PM Expert - Design Architect of Green Field SAP and Reliability Plant. SAP MM QM, HR, PP extremely knowledgeable. Configured SAP DMS and integrated with SAP PM.
