Yearly Archives: 2015


By Kevin Stewart

Apollo Root Cause Analysis methodology training instructors are often asked the same question: “How long should a Root Cause Analysis (RCA) investigation take?”  This is a difficult question to answer because of the variables associated with each individual RCA.  It’s a lot like asking someone, “How long will the trip take?”  How do you begin to answer that? Some questions that come to mind are: To where? How will you be traveling? What route will you take? Will you be stopping anywhere? And so on.

If it is so variable, how can we even talk about whether an RCA should take several days or not?  There are two general paths in the utilization of the Apollo methodology; let’s call them “long” and “short.” Since this article is about RCAs not taking several days, let’s focus on the short one.

Most people envision the Apollo Root Cause Analysis methodology as a large group of people in a conference room for several days, as if that were necessary to find a valid solution.  It is true that many RCA investigations do take four to five solid, eight-hour days to determine an appropriate solution, but those should be problems of large significance where information may not be readily available.

I always point out to my students that not only is it possible to do an Apollo Root Cause Analysis in a short time, but I have personally done several that took less than a day.  How?

The Apollo Root Cause Analysis process involves a specific methodology of asking “why?” or “caused by ____?” and then identifying an appropriate answer, writing it down, and then asking “why” again.  You do this until you are stymied with no answers or reach a point where it doesn’t make sense to ask “why” anymore.  This process does not change regardless of the type or size of the problem, or for any other reason.
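The repeated “caused by ____?” loop can be sketched as a tiny program. This is only an illustration of the questioning pattern; the cause names and the `known_causes` lookup are hypothetical, and a real Apollo chart records evidence and branches rather than a single chain:

```python
# Sketch of the why-loop: keep asking "caused by ____?" until there is
# no known answer or it no longer makes sense to ask. The example chain
# is hypothetical.

def build_cause_chain(effect, known_causes, max_depth=10):
    """Follow 'caused by?' answers from an effect until we run out."""
    chain = [effect]
    current = effect
    for _ in range(max_depth):
        cause = known_causes.get(current)  # ask "caused by ____?"
        if cause is None:                  # stymied: no known answer
            break
        chain.append(cause)
        current = cause
    return chain

known_causes = {
    "conveyor jammed": "block pushed sideways",
    "block pushed sideways": "paddle out of position",
    # no entry for "paddle out of position" -> stop asking why
}

chain = build_cause_chain("conveyor jammed", known_causes)
```

In practice each answer would itself split into an action and a condition, but the stopping behavior is the same.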

Many of you may have heard of the “Five Whys” as an RCA process.  This was designed for small problems experienced by operators on the line at Toyota facilities.  These little RCAs were done in the moment by people involved in the incident.  If you’re familiar with both the Apollo Root Cause Analysis methodology and Five Whys process you may notice that they are very similar. Many times I point out to students that you can see several “Five Whys” branches inside any Apollo RCA chart. So it stands to reason that the Apollo Root Cause Analysis methodology can be used in a similar fashion to the Five Whys.

Here’s an example.  I was responsible for the reliability of a production area of a plant during my career.  It was not uncommon to find me walking around looking for problems, and during one such time I discovered some people working hard to unplug a jammed conveyor.  It was plugged with a 1,000-pound solid carbon block wedged in between some posts, and there was no good access to the block with a crane or other lifting device.  When they spotted me I got an earful; apparently this had been happening on a regular basis.  The specific frequency was unknown, but the emotion of the operator told me that it was at least once per shift.  I promised to fix it for him and he calmed down, they got the unit unplugged and back on line, and he went back to his job just downstream of the jam.

Since I promised to fix this, I decided to spend some time at the unit to see if I could observe what was causing the jam.

The Apollo Root Cause Analysis process went like this:


If you start the RCA chart in your mind, you quickly get to a dead end because no one could see why the jam had happened.  The operator in the area was busy doing his job, which required constant attention—pouring molten metal into a small cavity to “glue” a copper rod to the top of an anode.  This was done while the line was moving; he poured one about every 15 seconds, so he really couldn’t be looking around.  There were not a lot of other spare personnel in the area who could spend the time looking, so I decided that was my job.

These blocks were pushed onto an automated system by a large pusher that had a paddle hanging down from a cylindrical steel piece with a bushing, since the paddle was designed to float.  It seemed pretty obvious that the pusher had something to do with it… but how?  After they started up the system, it worked like a charm, just as designed, no glitches.  Intermittent problems are some of the hardest to fix because you need to be there when things go awry, or gather data to identify the causes.

So there I was with one cause box on my chart – “Block jammed caused by ____?” I thought perhaps if I watched it I’d get lucky enough to catch the issue.  So I stood there, and stood there, and stood there for perhaps an hour. Nothing.  I didn’t want to leave quite yet, but it did seem like a waste of time, so I decided to check out other items in the area.  I spent an hour or so away from the machine and then went back. Upon returning to the unit there didn’t seem to be anything obviously out of order.  However, something seemed different, though I couldn’t put my finger on it.

After spending another hour away and then coming back again, this time I noticed what appeared to be a difference: slight, but I was pretty sure it was happening.  One more hour away and then back and sure enough something was happening over a long period of time.

Now I just needed to verify my suspicions.  Believing I knew the cause, I figured I had enough time to go to lunch and do some more office work before returning to the unit to check my theory and gather evidence.  I was correct.

The cause of the issue was that the paddle was rotating counter-clockwise on the shaft ever so slightly with every push.  It was taking more than six hours for it to rotate enough to push on the corner of the block, shove it sideways off the conveyor, and cause the jam.  So my chart looked like this after about six to seven hours:

At this point I alerted everyone to the issue, and the maintenance personnel came over and safely moved the paddle back so the shift could finish.  Our facility had a swing shift crew that worked in the area after the production was done, so they were assigned the task of fixing the unit.

That evening they removed the unit, checked everything against the drawings and specifications, and found that the tolerance on the bushing was incorrect.  It was close, but the tolerance was tight enough that each push that was not exactly dead-on caused a slight twisting force, moving the paddle off course and eventually causing a jam. The team fixed the tolerance issue and put it back in place by the next shift start.

So my chart now looked like this:


This whole process took less than eight hours to complete but was spread out over two days.  If you look at my total time involvement it was perhaps four hours. (I am not charging the process with time that I was multitasking by doing other things.)
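The finished chain of causes from this story can be sketched as a nested structure. The labels below paraphrase the narrative above (the exact chart wording is not shown), and the action/condition split at the bottom follows the Apollo convention that every effect has both:

```python
# The jammed-conveyor cause chain as nested dicts. Labels paraphrase
# the story; a real chart would attach evidence to each cause.

chart = {
    "effect": "block jammed",
    "caused_by": {
        "effect": "block shoved sideways off the conveyor",
        "caused_by": {
            "effect": "paddle rotated out of position",
            "caused_by": {
                "action": "off-center pushes applied a slight twisting force",
                "condition": "bushing tolerance incorrect (too tight)",
            },
        },
    },
}

def depth(node):
    """Count effects from the primary effect down to the root causes."""
    if "effect" not in node:
        return 0
    return 1 + depth(node.get("caused_by", {}))
```

Three levels of “why” were enough here, which is exactly why this RCA fit inside a single working day.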

So as you can see, an RCA investigation doesn’t always have to take days.  Of course, some will take several days and you could stretch even a simple investigation into a longer process if you wish. But if you are close to the problem, get accurate information, act quickly, and stick with the process, you can do an RCA quickly and get an effective solution.


In most cases, there is much to gain by working through maintenance strategy optimization. The primary question in diagnosing the health of your maintenance strategy is a simple one: does your maintenance strategy need optimizing? Ideally, your maintenance strategy is already optimized. Perhaps it was, but is in need of a tune-up. Or, as is the case in many companies, maybe you are experiencing endemic symptoms. If those symptoms are evident, there is a strong business case to invest in maintenance strategy optimization. Endemic symptoms typically lead to:

  • Recurring problems with equipment.
  • Budget blow-outs from costly fixes to broken equipment.
  • Unplanned downtime that has a flow-on effect on production.
  • Using equipment that is not performing at 100 percent.
  • Risk of safety and environmental incidents.
  • Risk of catastrophic failure and major events.

To identify where your company’s maintenance strategy sits on the spectrum, you can perform a simple self-assessment that looks for the most common symptoms.

  1. Increase in unplanned maintenance – A sure sign that your maintenance strategy is not working is the simple fact that you are performing more unplanned maintenance, which is caused by an increase in the occurrence of breakdowns.
  2. Rising maintenance costs – In companies that apply best-practice maintenance strategy optimization, total maintenance costs are flat or slightly decreasing month on month. These optimized strategies combine preventive tasks with various inspection and root cause elimination tasks, which in turn produces the lowest-cost solution.
  3. Excessive variation in output – A simple definition of the reliability of any process is that it does the same thing every day. In other words, equipment should run at nameplate capacity day in and day out. When it doesn’t, this is an indication that some portion of the maintenance strategy is misaligned and not fully effective.
  4. Strategy sticks to OEM recommendation – Sticking to the maintenance schedule prescribed by Original Equipment Manufacturers (OEMs) may seem like a good starting point for new equipment. But it’s only that: a starting point. There are many reasons why you should create your own optimized maintenance strategy soon after implementation.
  5. An inconsistent approach – Consistency implies a lack of deviation, and that implies standardization. When it comes to maintenance strategies, standardization is essential.
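As a rough illustration, the five-symptom self-assessment can be treated as a simple checklist score. The two-symptom threshold below is an assumption for the sketch, not something the guide prescribes:

```python
# Hedged sketch of the five-symptom self-assessment: mark each symptom
# as evident (True) or not, then compare against a threshold. The
# threshold value is an assumption.

SYMPTOMS = [
    "increase in unplanned maintenance",
    "rising maintenance costs",
    "excessive variation in output",
    "strategy sticks to OEM recommendation",
    "an inconsistent approach",
]

def needs_optimizing(observed, threshold=2):
    """Return True if enough symptoms are evident to make a business case."""
    present = sum(1 for s in SYMPTOMS if observed.get(s, False))
    return present >= threshold

observed = {
    "increase in unplanned maintenance": True,
    "rising maintenance costs": True,
    "excessive variation in output": False,
}
```

With two symptoms evident, this toy assessment would flag the strategy for optimization.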

For an in-depth look at these symptoms, download the complete guide “5 Symptoms Your Maintenance Strategy Needs Optimizing.”

If your maintenance activities have a large proportion of reactive repairs, then the costs of maintaining your assets are larger than they need to be, because the cost of performing unplanned maintenance is typically three times the cost of performing maintenance in a planned manner. Furthermore, if your system is reactive, it is a sign that you are not managing failures. Your biggest costs may be catastrophic failure, systemic failure, or equipment defects.

These major meltdowns or one-off events can cost millions of dollars in reactive repairs, lost production, and/or major safety and environmental impacts. If you need to lower the cost of maintenance, this is an area where you can make a significant impact on the P&L.

Proactive maintenance – which is aimed at avoiding such scenarios – is a much more cost-effective approach.

First, what is reactive maintenance? Put simply, it is any maintenance or repair done to a piece of equipment after a failure event. If a gearbox grinds to a halt and your maintenance team rushes to repair it, they are engaging in reactive maintenance.

While the immediate cost of such maintenance may seem low – a day of labor and the purchase of a new part for the machine – the flow-on costs associated with downtime and lost production can be much higher, and there is a greater risk of safety and environmental incidents during the shutting down or starting up of equipment.
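A quick back-of-the-envelope comparison using the roughly three-to-one cost ratio cited earlier makes the point. Every dollar figure here is invented for illustration:

```python
# Planned vs. reactive cost of the same job, using the article's rule
# of thumb that unplanned work costs about three times planned work.
# All specific figures are hypothetical.

PLANNED_COST = 10_000           # hypothetical cost of a planned gearbox job
UNPLANNED_MULTIPLIER = 3        # rule of thumb from the article
DOWNTIME_COST_PER_HOUR = 5_000  # hypothetical lost-production rate
UNPLANNED_DOWNTIME_HOURS = 8    # hypothetical extra downtime after a failure

planned_cost = PLANNED_COST
reactive_cost = (PLANNED_COST * UNPLANNED_MULTIPLIER
                 + DOWNTIME_COST_PER_HOUR * UNPLANNED_DOWNTIME_HOURS)
```

Even before counting safety and environmental risk, the reactive path in this toy example costs several times the planned one once downtime is included.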

In companies where reactive maintenance is a large proportion of the work performed, there are many hidden costs carried by the business, such as higher inventories; premium rates for purchasing spare parts; higher stocking levels for critical spares; more wasted time queuing for tools, materials, and labor; higher overtime levels; more plant downtime; interruptions to customer orders; stockouts; and off-spec quality.  The organization and management system takes on a short-term, busy focus, often under budget pressure, with variations in production and lots of “things to do.”


On the flip side, proactive maintenance takes a preventative approach. It involves making assets work more efficiently and effectively so that downtime and unexpected failures become a thing of the past. It’s also about trimming unnecessary expenditure from asset management budgets. From a bottom line perspective, it’s about boosting the assets’ contribution to earnings before interest and tax (EBIT).

Strategies associated with proactive maintenance involve understanding and managing the likelihood of failures. Some of the common analytical methods used to understand the impact of failures on the business include:

  • System Analysis – to understand the way equipment failures can impact the availability and production capacity of a system; it allows the analyst to identify and eliminate potential bottlenecks in a system, and thus increase plant capacity
  • Criticality Analysis – to rank equipment by the likelihood and severity of failure impact on key business objectives, so you can then channel maintenance resources into the more critical pieces of equipment
  • Maintenance Benefit Analysis – to evaluate a maintenance plan and identify areas where maintenance is either not needed or not optimal.
  • Spares Optimization – to find the optimum level of spares to hold in-stock, which balances the cost of not having spares available versus taking up storage space on-site
  • Repair Vs Replace Analysis – to predict or track the cost of repairs against the cost of replacement, so it becomes clear when to replace assets for best value
  • Root Cause Analysis – to analyze the root cause of failures and focus resources on eliminating their reoccurrence, not just fixing the symptoms time and time again.
  • Vulnerability Analysis – to systematically review all aspects of the operation in a way that uncovers tomorrow’s failures, so they can be eliminated in a planned fashion.
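The criticality analysis above, for example, often reduces to a likelihood-times-severity ranking. A minimal sketch, with hypothetical equipment names and scores:

```python
# Toy criticality analysis: rank equipment by likelihood x severity so
# maintenance resources go to the most critical items first. All names
# and scores are hypothetical.

equipment = [
    {"name": "conveyor", "likelihood": 4, "severity": 5},
    {"name": "pump A",   "likelihood": 2, "severity": 3},
    {"name": "gearbox",  "likelihood": 3, "severity": 4},
]

def rank_by_criticality(items):
    """Sort descending by the product of likelihood and severity."""
    return sorted(items,
                  key=lambda e: e["likelihood"] * e["severity"],
                  reverse=True)

ranked = rank_by_criticality(equipment)
```

Real criticality analyses use calibrated scales tied to business objectives, but the resource-channeling logic is the same: work the top of the ranked list first.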

As these strategies attest, proactive maintenance is about much more than building a schedule of ongoing maintenance tasks. By understanding and managing failure, maintenance resources can be directed in a planned manner to those areas that require attention, and you can save significant amounts of money over the long term.

And, above all, it is important to remember that a culture of reactive maintenance is not ideal. In fact, unplanned reactive maintenance is one of the key symptoms that your maintenance strategy isn’t working.

Learn more by downloading our guide: 5 Symptoms Your Maintenance Strategy Needs Optimizing

While there are three main reasons why organizations typically perform Root Cause Analysis (RCA) after a problem with an asset or piece of equipment, there is a whole range of other indicators that an RCA should be performed.

Chances are, you are recording a lot of valuable information about the performance of your equipment – information that could reveal opportunities to perform root cause analysis, find the causes, and implement solutions that will resolve recurring problems and improve operations. But are you using your recorded information to that extent?

First, let’s talk briefly about three reasons why root cause analysis is typically performed.

  1. Because you have to

There may be a regulatory requirement to demonstrate that you are doing something about a problem that occurred.

  2. You crossed a trigger point

Your own company has identified triggers for significant incidents that warrant Root Cause Analysis.

  3. Because you want to

An opportunity has presented itself to make changes for the better. Or perhaps you have decided that you simply don’t want to keep losing so much money.

At the heart of every industry is the desire to make money. Anything that negatively affects that goal is usually attacked by performing Root Cause Analysis.

I was having a conversation with a reliability engineer at an oil and gas site, and I asked him what lost opportunity or downtime could cost that company over the course of a year. He said it was in the neighborhood of three-quarters of a billion dollars: $750,000,000. Is that a good enough reason to perform Root Cause Analysis? Even a 10% change would have an enormous impact on the bottom line.

The monetary impact to the business was, of course, not due to any single event, but to a multitude of events large and small.

Each event presents itself as an opportunity to learn and to make the changes necessary to prevent its recurrence. Once in a while they can be put down to chance… things happen, serious or minor, and that’s life. But letting them happen continuously means that something is seriously wrong.

While all of these are valid reasons to perform Root Cause Analysis, there are at least ten more equipment-related telltale signs that an RCA needs to happen – most of which can be identified through information you are probably already recording.

Here are ten telltale signs that your organization needs to perform Root Cause Analysis:

  1. Increased downtime of plant, equipment, or process.
  2. An increase in recurring failures.
  3. Increased overtime due to unplanned failures.
  4. An increase in the number of trigger events.
  5. Lower equipment availability.
  6. A high level of reactive maintenance.
  7. Lack of time… you simply can’t do everything that needs to be done.
  8. An increase in the number of serious events… the ones approaching the top of the pyramid.
  9. Shutdowns that last longer than planned.
  10. Shutdowns that are required more frequently.

These indicators imply that we need to do more in the field of Root Cause Analysis before these issues snowball.

We occasionally receive questions to clarify the difference between our Incident Investigation training course and our RCA Facilitator course so we thought we would address some of the most commonly asked questions in this Q&A-style article. We hope you find it helpful. And if you have any questions, as always, please don’t hesitate to contact us.

How is the “Incident Investigation” course different from the “RCA Facilitator” course?

The Incident Investigation course covers the process of identifying, obtaining, documenting, and preserving the raw data related to an incident, then constructing a general timeline of the incident.

The RCA Facilitator course then trains our students on how to sort through this data using cause and effect principles to identify causes that are relevant and formulate workable solutions, or preventative measures, to prevent recurrence. There is also emphasis in the RCA Facilitator course on facilitation skills necessary to conduct the Apollo Root Cause Analysis methodology process.

What are the benefits of the Incident Investigation course?

Without truly understanding the key elements and possessing the necessary skills to conduct a thorough, effective investigation, people run the risk of missing key causal factors of an incident while conducting the actual analysis. This could potentially result in not identifying all possible solutions including those that may be more cost effective, easier to implement, or more effective at preventing recurrence. The Incident Investigation course equips students with the knowledge and skills to conduct a proper investigation to prevent this from happening.

Why is it important for people to attend the Incident Investigation course?

Students will learn the key elements and develop the skills necessary to conduct and document a thorough, effective investigation, ensuring all the pertinent information is available for the actual root cause analysis process.

Students will learn:

  • The nature of undesirable incidents and why they often repeat
  • The value of a thorough, effective investigation – Why spend the time?
  • Investigation lead and team selection – Matching individual traits and skill sets to the needs of the investigation
  • The roles, or functions that must be filled to ensure thoroughness and reliability of data
  • Possible sources of incident information and how to optimize the value and reliability of incident facts and evidence
  • Demonstrations of misconceptions about the reliability of evidence and how to avoid them
  • Critical interviewing skills for discovering valuable incident information without inadvertently tainting the outcome
  • Options to ensure timely incident response so that valuable evidence can be preserved and collected
  • The value of developing and using standard templates for use throughout the investigation process
  • How to create an incident timeline using multiple sources of information
  • Importantly, scaling the investigation effort based on the significance of the incident to avoid wasting precious resources while ensuring investigation thoroughness
  • Hands-on individual and group exercises for practicing the key elements of the knowledge and skills listed above

Are there any prerequisites for the Incident Investigation course?

No. Students can take the Incident Investigation course on its own, or combine it with the RCA Facilitator course if they wish to learn the ins-and-outs of the Apollo Root Cause Analysis methodology as well. The Incident Investigation course is designed to stand on its own and depending on a person’s role, they may only need to attend one or the other, or both.

Is there an option to train my team in Incident Investigation via a private, onsite course?

Yes. We can work with a team within a company and create a customized Incident Investigation training course that takes into account their specific processes, triggers, industry, regulations, goals, HSE incident statistics, and incident severity tiers, and develop a course around their definitions and templates that can then be used to train staff across the company.

For more information, please contact us.



Can you quantify the financial impact of your maintenance program on your business? Do you take into account not only the direct costs of maintaining equipment, such as labour and parts, but also the costs of not maintaining equipment effectively, such as unplanned downtime, equipment failures and production losses?

The total financial impact of maintenance can be difficult to measure, yet it is a very valuable task to undertake. It is the first step in finding ways to improve profit and loss. In other words, it is the first step towards an optimised maintenance strategy.

In a 2001 study of maintenance costs for six open pit mines in Chile [1], maintenance costs were found to average 44% of mining costs. It’s a significant figure, and it highlights the direct relationship between maintenance and the financial performance of mines. More recently, a 2013 Industry Mining Intelligence and Benchmarking study [2] reported that mining equipment productivity has decreased 18% since 2007; and it fell 5% in 2013 alone. Besides payload, operating time was a key factor.  

So how do you know if you are spending too much or too little on maintenance? Certainly, industry benchmarks provide a guide. In manufacturing, best-practice benchmarks are less than 10% of total manufacturing costs, or less than 3% of asset replacement value [3].
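Those two benchmarks are easy to check mechanically. A minimal sketch, with hypothetical spend figures:

```python
# Check maintenance spend against the manufacturing benchmarks cited in
# the article: under ~10% of total manufacturing cost, or under ~3% of
# asset replacement value. Input figures are hypothetical.

def within_benchmarks(maintenance_cost, manufacturing_cost, replacement_value):
    """True if spend falls inside either best-practice benchmark."""
    return (maintenance_cost / manufacturing_cost < 0.10
            or maintenance_cost / replacement_value < 0.03)

ok = within_benchmarks(maintenance_cost=8_000_000,
                       manufacturing_cost=100_000_000,
                       replacement_value=200_000_000)
```

As the article notes next, passing or failing a benchmark is only a guide; the symptoms of over- or under-spending matter more than the ratio itself.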

While these benchmarks may be useful, a more effective way to answer the question is to look at the symptoms of over- or under-spending in maintenance. After all, benchmarks cannot take into account your unique history and circumstance.

Symptoms of under-spending on maintenance include:

  • Rising ‘hidden failure costs’ due to lost production
  • Safety or environmental risks and events
  • Equipment damage
  • Reputation damage
  • Waiting time for spares
  • Higher spares logistics cost
  • Lower labour utilisation
  • Delays to product shipments
  • Stockpile depletion or stock outs

Other symptoms are explored in more detail in our guide: 5 Symptoms Your Maintenance Strategy Needs Optimizing.

Figure 1

In most cases, it is these ‘hidden failure costs’ that have the most impact on your bottom line. These costs can be many times higher than the direct cost of maintenance – causing significant and unanticipated business disruption. As such, it is very important to find ways to measure the effects of not spending enough on maintaining equipment.

Various tools and software exist to help simulate the scenarios that can play out when equipment is damaged, fails or, conversely, is proactively maintained. A Failure Modes Effects and Criticality Analysis (FMECA) is a proven methodology for evaluating all the likely failure modes for a piece of equipment, along with the consequences of those failure modes.

Extending the FMECA to Reliability Centred Maintenance (RCM) provides guidance on the optimum choice of maintenance task. Combining RCM with a simulation engine allows rapid feedback on the worth of maintenance and the financial impact of not performing maintenance.
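The “worth of maintenance” feedback such a simulation engine provides boils down to comparing expected annual costs with and without planned work. A deliberately simplistic sketch; all rates, costs, and the 90% prevention factor are assumptions, not outputs of any real RCM tool:

```python
# Toy worth-of-maintenance comparison: expected annual cost of
# run-to-failure versus time-based replacement. The failure model is
# deliberately simplistic and every figure is hypothetical.

FAILURES_PER_YEAR = 2.0         # assumed mean failure rate with no PM
CORRECTIVE_COST = 50_000        # assumed repair + downtime cost per failure
PM_PER_YEAR = 4                 # assumed planned replacements per year
PM_COST = 5_000                 # assumed cost per planned replacement
PM_FAILURE_REDUCTION = 0.9      # assumed fraction of failures PM prevents

run_to_failure = FAILURES_PER_YEAR * CORRECTIVE_COST
with_pm = (PM_PER_YEAR * PM_COST
           + FAILURES_PER_YEAR * (1 - PM_FAILURE_REDUCTION) * CORRECTIVE_COST)
```

A real RCM simulation replaces these constants with failure-mode distributions and task models, but the decision logic is the same: choose the policy with the lower expected total cost at acceptable risk.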

Armed with the information gathered in these analyses, you will gain a clear picture of the optimum costs of maintenance for particular equipment – and can use the data to test different ways to reduce costs. It may be that there are redundant maintenance plans that can be removed; or a maintenance schedule that can become more efficient and effective; or opportunity costs associated with a particular turnaround frequency and duration. Perhaps it is more beneficial to replace equipment rather than continue to maintain it.

It’s all about optimising plant performance for peak production; while minimising the risk of failure for key pieces of equipment. Get it right, and overall business costs will fall.

Want to read on? Download our guide: 5 Symptoms Your Maintenance Strategy Needs Optimizing.


[1] Knights, P.F. and Oyanander, P (2005, Jun) “Best-in-class maintenance benchmarks in Chilean open pit mines”, The CIM Bulletin, p 93

[2] PwC (2013, Dec) “PwC’s Mining Intelligence and Benchmarking, Service Overview”.


Figure 1:  This image shows Isograph’s RCMCost™ software module, which is part of their Availability Workbench™. Availability Workbench, Reliability Workbench, FaultTree+, Hazop+ and NAP are registered trademarks of Isograph Software. ARMS Reliability are authorized distributors, trainers, and implementers.

Autor: Kevin Stewart

Auditoria es definido por el diccionario como: “una examinación y revisión metódica.” Cuando hablamos de auditar sus investigaciones de Análisis de Causa Raíz (RCA, por sus siglas en Ingles), hablamos justamente de eso – de una examinación y revisión metódica. Esto está más fácil dicho que hecho, especialmente sin alguna medida especifica con la cual comparar. Si establecemos un estándar bajo el cual se puede medir y comparar la calidad del RCA, la auditoria entonces simplemente se convierte en una revisión del RCA comparado con el estándar aceptado y luego se determina que tan bien se sigue dicho estándar. Este artículo se trata de ayudarle a desarrollar un estándar, y además ofrecerle una plantilla gratuita para calificar su actual proceso para ayudarle a empezar con éxito.  RCAInvestigationScoreSheet_SP_Mock-up

¿Se puede tener el peor programa de RCA en el mundo y no alcanzar ningún criterio mencionado, pero tener una solución efectiva que:

  • Prevenga recurrencia,
  • Alcance las metas y objetivos,
  • Este bajo nuestro control, y
  • No causa algún otro problema?

Seguro, y es difícil argumentar en contra del éxito. Dudo que alguien diga: “Aunque esta solución prevenga que el problema recurra, viene de un RCA que no cumple con nuestros rigurosos métricos de alta calidad, entonces no lo podemos emplear.” Este ejemplo es completamente posible, aunque la probabilidad sea diminuta. Si tenemos un conjunto de medidas para comparar nuestro RCA para asegurar que nuestras alcance un estándar de calidad, entonces la probabilidad de que tengas una solución efectiva que provenga de aquel RCA incrementa considerablemente.

¿Entonces, cuales características del RCA con importantes?

Aquí se muestran algunas preguntas por considerar:

(Si necesitas recordar algunos de estos puntos, he incluido el número de página relevante del libro electrónico “RealityCharting: Seven Steps to Effective Problem-Solving and Strategies for Personal Success” por Dean L. Gano.)

  • ¿Las causas pasan la prueba de sustantivo y verbo? (página 83)
  • ¿Las causas tienen demasiadas palabras o descripciones innecesarias?
  • ¿Los elementos causales pasan todas las pruebas de lógica? (página 108)
    • Verificación lógica de tiempo-espacio
      • ¿Las causas de este efecto existen al mismo tiempo?
      • ¿Las causas de este efecto existen en el mismo lugar?
    • Verificación lógica causal
      • Si remueves la causa, ¿el efecto sigue existiendo?Si la respuesta es no, entonces la causa es necesaria para la relación causal y deberá permanecer en el gráfico. Si la respuesta es sí, deberá ser removida o reposicionada.
  • ¿Hay alguna violación a la regla? De ser así, ¿cuáles son y acaso pasan el estándar mínimo? Reglas por incluir:
    • ¿Alguna de las cajas de causas están vacías?
    • ¿Existen causas sin conexión en el grafico?
    • ¿Cada causa ha sido identifica como una acción o una condición?
    • ¿Cada efecto satisface el Segundo principio (causas existen en un continuo infinito, hay una acción y una condición para cada efecto)?
    • ¿Han sido eliminadas todas las conjunciones? Recuerda que “y” es a menudo interpretado como “causado,” que lleva a la mala interpretación y al error. (paginas 67-68)
    • ¿Cada causa tiene la evidencia adecuada que la respalde para justificas su inclusión en el grafico?
    • ¿Cada rama tiene un identificado un alto? Abajo se muestran altos potenciales: (paginas 88-89)
      • Signo de Interrogación – se necesita más información; una Acción es creada.
      • Condición Deseada – no se necesita seguir preguntando el porqué.
      • Falta de Control – algo de lo cual usted o su organización no tienen control, por ejemplo, “las leyes de la física.”
      • Nuevo Efecto Primario – un análisis separado es requerido.
      • Otros Caminos de Causa son Más Productivos – continuar por este camino sería una pérdida de tiempo.
  • ¿La matriz de solución recae en una caso típico?, tal como:

¿Las soluciones han sido comparadas con algún criterio estándar, con un rango estándar, para minimizar la posibilidad de que soluciones favoritas sean elegidas? (página 118-120)

  • ¿Cada solución ha sido asignada a un miembro del equipo y otorgada un plazo para finalizarlo?
  • ¿El grafico cumple con los cuatro principios de la causalidad? (página 36)
    • Las causas y los efectos son la misma cosa.
    • Las causas existen en un continuo infinito.
    • Cada efecto tiene al menos dos causas causes en la forma de acciones y condiciones.
    • Un efecto existe solamente si sus causas existen en el mismo tiempo y espacio.
  • ¿La definición del problema establece un valor financiero claro y significativo que permitirá a la gerencia hacer decisiones y aprobaciones adecuadas?
    • ¿Si el valor financiero no es apropiado (por seguridad o por una fatalidad potencial), entonces, la definición del problema requiere algún otro valor significativo?
  • ¿Todos los elementos de acción han sido resueltos? (Elementos de acción pueden incluir áreas en donde más información es requerida, hay problemas de evidencia, o algún elemento que ha sido incluido manualmente que necesita ser resuelto y borrado).

The next step in developing an audit is to generate a list against which your RCA will be compared.

This list can come from the items above, your own list, or some combination of the two. Once you have a list of items to audit against, you will need to create a ratings scale. This can be a pass/fail situation or a scale that rates each item from 0 to 5. The latter allows you to give partial credit for items that do not quite meet the full standard.

Develop a score sheet listing each item and score each one. Don’t forget to leave space for notes so the auditor can explain the reasons for awarding partial credit. It is helpful to add a guideline for each item so the auditor has criteria for how to score it. A sample guideline might look like this:

0 = Does not exist
1 = Some are in place but not correct
2 = Many are in place and some are correct
3 = All are in place but only some are correct
4 = All are in place and most are correct
5 = All are in place and all are correct

With guidelines like these readily available as a reference on the score sheet, you help ensure consistency in the scoring, especially if several people will be scoring an RCA.

Now all you have to do is review your RCA against your list, score it, and define some minimum score for passing.

This will ensure that each RCA is gauged against a consistent standard that can be repeated by multiple people, although there will always be differences if multiple people are auditing RCAs. Those differences can be minimized by having a single person conduct the audits, by calibrating the audit, or by bringing the personnel together to score several RCAs as a group so that all auditors understand the nuances of scoring.
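The scoring procedure above — rate each checklist item, average the ratings, and compare against a minimum passing score — can be sketched in a few lines. The checklist item names and the passing threshold below are illustrative placeholders, not values prescribed by the methodology:

```python
# Minimal sketch of scoring an RCA against an audit checklist on the
# 0-to-5 scale described above. The checklist items and the minimum
# passing average are illustrative placeholders, not prescribed values.

def audit_rca(scores, passing_average=3.5):
    """scores maps each checklist item to a rating from 0 to 5.

    Returns the average rating and whether it meets the threshold.
    """
    if any(not 0 <= rating <= 5 for rating in scores.values()):
        raise ValueError("each rating must be between 0 and 5")
    average = sum(scores.values()) / len(scores)
    return average, average >= passing_average

scores = {
    "conjunctions eliminated": 4,
    "evidence supports each cause": 3,
    "every branch has a stop point": 5,
    "four principles of causality met": 4,
}
average, passed = audit_rca(scores)
print(f"average = {average:.2f}, passed = {passed}")  # average = 4.00, passed = True
```

A pass/fail variant would simply map each rating to 0 or 5 before averaging; the threshold itself is whatever minimum your organization agrees on.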

While I have provided a fairly complete list of what to check when auditing an RCA, my experience is that an RCA can meet all of the above requirements and still have problems. The biggest issue is that the logic may be correct but the causes are not, so the RCA can pass the tests yet still fail to actually solve the problem. The fact is that human beings are involved, and we make mistakes. Sometimes the mistakes are made by inexperienced investigators who need more practice. Other sources of error are some of the filters we have discussed, for example, time constraints, preconceived notions or biases, language issues, and so on. This means there is still a component that needs to be reviewed by someone for overall integrity and for the things a computer cannot check. That person can be an outside corporate resource, a contractor, or an internal resource.

RealityCharting® provides tools to help the reviewer critique the analysis, such as the rules check, the action items report, the causal element view, and, most importantly, an interactive dashboard.




Author: Jack Jager

When it comes to problems with quality in your operation, there are the obvious red flags—unhappy clients, defective products, poor reputation, delays, and exorbitant costs, to name a few. But there are other more subtle signs that your quality control department has room for improvement.

Your QC Department Looks Like a Firehouse

Those of us who work in quality control can easily fall into the pattern of fire fighting—running from one issue to the next, solving each problem in the near-term as it crops up. This can work okay for a time, but it’s not a great long-term strategy. When you only focus on solutions and never get down to the root causes that are creating your issues, you will find that the same types of issues keep occurring. “An ounce of prevention is worth a pound of cure” should be the mantra of every QC department. It’s worth the extra time up front to get at the root causes of an issue.

Your Quality Folks Aren’t Talking Cents

The universal language of business is dollars and cents, so if your quality control department isn’t translating your issues into actual cost to the business, they might not be heard. For example, you might calculate the cost of the time it takes to close different types of exceptions and add that information to your efficiency evaluations.

There Is a Veil Over the QC Department

Sometimes the quality department is treated differently than manufacturing, engineering, or facilities when it comes to accountability. But it’s very important that QC personnel and their equipment are held to certain standards, too. While QC is often responsible for finding solutions, they also need to be held responsible for their share of the causes—for instance, the impact to the supply chain if raw materials or final product testing is not completed effectively. If there has never been an evaluation of your QC department’s process, it’s definitely time to QC your QC.

Your QC Department Sits in an Ivory Tower

Quality folks can do a much better job if they receive training in other areas, including manufacturing, validation, and project management. When a quality person is too specialized, it can prevent them from seeing the whole picture and finding more comprehensive solutions. If your QC department tends to be resistant to change, that might be a sign that it’s time to expand their horizons with some additional training outside their primary field of expertise.

Anything Short of Total Failure Is Considered Success

Let’s say you work for a chemical plant that manufactures plastic bags. You make a polymer that requires water, but the water you’re using contains harmful bacteria. There is a corporate requirement that the water be clean, so the bacteria are a problem. However, the finished material passes the test even though there was a deviation earlier in the manufacturing process. So is it really a problem after all? If your client sees a pattern of failure within your process, they will begin to believe that you aren’t truly concerned with quality, even if the final product technically meets the specifications. Make sure you take all issues seriously, even those that don’t seem to affect the final outcome at first glance.

If any of these scenarios sound familiar, download our eBook “11 Problems With Your RCA Process and How to Fix Them” in which we provide best practice advice on using Apollo Root Cause Analysis to help eliminate problems in your QC process and beyond.

Author: Kevin Stewart

I recently wrote an article about auditing root cause analysis (RCA) investigations, and it only seemed appropriate to follow up with advice on auditing your overall RCA program. Let’s go back to the dictionary definition of “audit” — a methodical examination and review. In my mind this definition has two parts: 1) the methodical examination and 2) the review.

It might help to compare this process to a medical examination. In that case, the doctor would examine the patient, trying to find anything he can, either good or bad. This would include blood work, reflex tests, blood pressure, etc. After that examination, he would review his findings against some standard to help determine whether any action should be taken. Auditing an RCA program is no different: first we must examine it and find out as much as we can about it, then we review it against some standard or measure.

In my other article, I discussed at length the measures against which an RCA investigation could be judged. Those still apply, and one of the program audit items can and should be the quality of the RCA investigation.

Now we are faced with determining the characteristics of a good program. A list of characteristics is given below:

  • Quality of RCA investigations
  • Trigger Levels are set and adhered to
  • Triggers are reviewed on a regular basis and adjusted as required to drive improvement
  • A program champion has been designated, trained and is functioning
  • Upper management has been trained and provides involved sponsorship of the program
  • Middle management has been trained and provides involved sponsorship of the program
  • The floor employees have been trained and are involved in the process
  • The solutions are implemented and tracked for completion
  • The RCA effectiveness is tracked via looking for repeat incidents
  • Dedicated investigators / facilitators are in place
    • Investigators are qualified and certified on an ongoing basis
  • All program characteristics are reviewed / defined / agreed to by management and include:
    • An audit system that is defined, funded, and adhered to
    • Resource requirements
    • Triggers
    • Training requirements that are in place and funded
    • Sponsorship statements and support
  • The RCA program is incorporated into the onboarding and continuous review training for new and existing employees

The next step in developing an audit is to generate a set of items that your program will be gauged against. This list can come from the items above, your own list, or a combination of the two. Once you have a final list of items to audit against, you need to generate a ratings scale. This can be a pass/fail situation or a scale that gives a rating from 0 to 5 for each item, which allows you to give partial credit for items that may not quite meet the full standard. You can also apply a weighting scale if deemed appropriate, meaning that some items in the list carry more importance or weight in the scoring based on the local priorities or culture of your facility. The scale can be anything you wish, but be cautious about making it too large. Can you really tell the difference between a 7 and an 8 on a 10-point scale? Perhaps a 1-to-4 scale would be better.
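A weighted scale like the one just described can be sketched as follows. The program items and weights here are hypothetical examples, not recommended values; in practice the weights would come from your facility’s own priorities:

```python
# Minimal sketch of a weighted 0-to-5 audit score, expressed as a
# percentage of the maximum possible score. The items and weights are
# hypothetical examples; choose weights that reflect local priorities.

def weighted_score(ratings, weights):
    """ratings maps each item to 0..5; weights give relative importance."""
    total_weight = sum(weights[item] for item in ratings)
    achieved = sum(ratings[item] * weights[item] for item in ratings)
    return 100 * achieved / (5 * total_weight)

ratings = {"trigger levels": 4, "management sponsorship": 2, "solutions tracked": 5}
weights = {"trigger levels": 1, "management sponsorship": 3, "solutions tracked": 2}
print(f"{weighted_score(ratings, weights):.1f}%")  # prints 66.7%
```

Note how the heavy weight on management sponsorship pulls the overall score down despite strong ratings elsewhere — exactly the effect weighting is meant to have.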

Next, develop a score sheet with each item listed and a place to put a score for each one. It’s handy to add some guidelines with each item to give the reviewer a gauge on how to score the item. A sample of such guidelines might look like:

0    Does not exist

1    Some are in place but not correct

2    Many are in place and some are correct

3    All are in place but only some are correct

4    All are in place and most are correct

5    All are in place and correct

Don’t forget to leave a space for notes so the reviewer can explain the reasons for partial credit. With these guidelines in place, either next to each item or easily available as a reference, you help ensure consistency in the scoring, especially if multiple people will be scoring your RCA program.

The goal for a standardized audit process would be that several different people could independently review and score a program and would come up with essentially the same score. This may seem like a simple thing, but it turns out to be the largest issue because everyone interprets the questions slightly differently. There are several things you can do to minimize discrepancies:

  1. Provide the information above to help.
  2. Require the auditors to be trained and certified by the same process/people and then have them provide a sample audit and check it against the standard. Review and adjust any discrepancies until you are sure they will apply the same thinking against the real audit.
  3. Always ensure that if multiple auditors are used in a program review, at least one has significant experience to provide continuity. In other words, don’t allow an audit to be done with all first-time auditors.

With these measures in place, all you have to do is review the RCA program against your list, score it, and have some sort of minimum for passing. Likewise, you’ll want to have some sort of findings report where the auditor can provide improvement opportunities against the individual items instead of simply saying: “did not pass.”

These measures will ensure that the program is gauged against a consistent standard and can be repeated by multiple auditors. There will always be differences if multiple people are auditing an RCA program, but by utilizing the steps above these differences can be minimized to provide the highest level of credibility for the audit.