Category Archives: Root Cause Analysis


In our previous blog article, “RCA Program Development: The Key Steps of Designing Your Program”, we provided a high-level outline of the eleven key elements that need to be defined in order to have an effective root cause analysis program in your organization. Now, in a series of articles, we’ll break down each of those eleven key elements in further detail, expanding on the important considerations that need to be taken into account, starting here with your goals and your current status.

  1. RCA Goals and Objective Alignment

So, the first question is, “What are we trying to achieve with the RCA effort?” The answer from an overarching perspective can be found in the organization’s goals and objectives. Every organization, and individual for that matter, has a set of goals and objectives that are used as yardsticks to measure both short and long term performance. It is critically important that the RCA effort be in complete alignment with the organization’s and individuals’ goals and objectives. We do this by using the goals and objectives to guide us in identifying the Key Performance Indicators (KPIs) of the RCA effort and setting the Threshold Criteria (Trigger Diagram) for determining when a formal RCA must be performed. (Learn more about setting Threshold Criteria.) If the alignment is true, then there will be tangible, measurable improvement in goals and objectives achievement over time.
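To make the idea of Threshold Criteria a little more concrete, here is a minimal Python sketch of how a trigger check might be written down. The field names and the limits in `THRESHOLDS` are hypothetical placeholders, not recommended values; each organization would derive its own from its goals and objectives.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Minimal incident record; all fields are illustrative assumptions."""
    description: str
    recordable_injury: bool = False
    downtime_hours: float = 0.0
    cost_usd: float = 0.0


# Hypothetical trigger limits; real values come from the organization's goals.
THRESHOLDS = {
    "downtime_hours": 8.0,
    "cost_usd": 50_000.0,
}


def triggered_criteria(incident: Incident) -> list[str]:
    """Return the names of the threshold criteria the incident meets."""
    hits = []
    if incident.recordable_injury:
        hits.append("safety")
    if incident.downtime_hours >= THRESHOLDS["downtime_hours"]:
        hits.append("downtime")
    if incident.cost_usd >= THRESHOLDS["cost_usd"]:
        hits.append("cost")
    return hits


def requires_formal_rca(incident: Incident) -> bool:
    """A formal RCA is triggered when any single criterion is met."""
    return bool(triggered_criteria(incident))
```

Keeping the limits in one place makes it easier to tighten them later as the program matures.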

  2. Status of Current RCA Effort

Every organization will have some form of RCA in practice whether it is formalized, or ad hoc. It is worthwhile spending some time assessing the status, or maturity level, of the existing RCA process. Maturity level can be categorized in one of four general categories.

  • Level 1: Learning and Development
  • Level 2: Efficient
  • Level 3: Self-Actualizing
  • Level 4: Pro-Active

Level 1, Learning and Development, is where most organizations without a formalized RCA program find themselves.  Management recognizes a need for a formal problem solving method but the focus is primarily on training. There is little or no structure in place to support the trained facilitators and no well-defined KPIs or threshold criteria guidance. At this stage the organization will usually gain some organizational improvements from the elimination of problems, but in an inefficient “learn as you go” manner.

At Level 2, Efficient, formal RCA triggers and KPIs are in place and are aligned with business goals and objectives in advance of RCA training. This would include clear definitions of RCA roles and responsibilities as well as identification of supporting infrastructure such as RCA status tracking, effectiveness of implemented corrective actions and the like.

In the Self-Actualizing level, the effectiveness of the trained problem-solvers has matured through experience. Thus, their ability to solve organizational problems has resulted in a documented achievement of the program KPIs and resulting improvements to the organization’s goals and objectives. The organization is now in the continuous process of tightening the bandwidth of the KPIs to yield greater return to the bottom line. The RCA facilitators are now highly confident, efficient, and effective in eliminating impediments to achieving goals and objectives.

In the Pro-Active level, your organization has now integrated the RCA process into its core culture. Effective problem elimination is the norm and expected at all levels of the organization. People no longer look to place blame for problems but instead are focused on prevention and elimination. Return on investment, both monetary and in health, safety and environmental terms, is extremely high, acting as both gratification and motivation. RCA has become a core competency within your culture whereby people are intolerant of ineffectively solving problems the first time and are finding pro-active ways to use RCA to prevent problems from occurring in the first place.

There are existing methods or surveys that can be used to determine an organization’s maturity level. Why is this important? Determining your current maturity level draws a line in the sand showing where you started, or took a renewed focus, in this journey of developing your RCA program. You can set goals around where you want to be over a period of time and look back to see how your program has actually evolved.
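If you record your baseline in a tool or spreadsheet, the four levels can be treated as an ordered scale. The sketch below is only an illustration of that idea; the function name `levels_to_target` is an assumption, and it does not replace a proper maturity survey.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The four maturity levels named in this article, in order."""
    LEARNING_AND_DEVELOPMENT = 1
    EFFICIENT = 2
    SELF_ACTUALIZING = 3
    PRO_ACTIVE = 4


def levels_to_target(current: MaturityLevel, target: MaturityLevel) -> list[str]:
    """List the levels still to be reached, from the next level up to the target."""
    return [lvl.name for lvl in MaturityLevel if current < lvl <= target]
```

For example, a program assessed at Level 1 with a goal of Level 3 still has two levels to work through.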

This article has given you a glimpse into the first two key elements but of course, there is more to setting up your RCA program for success. ARMS Reliability’s RCA experts can assist you with designing your complete RCA program or reinvigorating your current one. This of course includes assisting with determining the status of your current RCA effort and walking you through the process of establishing and aligning goals. Learn more about our recommended facilitated workshop and contact us for more information.


Without truly understanding the key elements (and possessing the necessary skills) to conduct a thorough, effective investigation, people run the risk of missing key causal factors of an incident while conducting the actual analysis. This could potentially result in not identifying all possible solutions including those that may be more cost effective, easier to implement, or more effective at preventing recurrence.

Here we outline the 5 key steps of an incident investigation which precede the actual analysis.

1. Secure the incident scene

  • Identify and preserve potential evidence
  • Control access to the scene
  • Document the scene using your ‘Incident Response Template’ (Do you have one?)

2. Select investigation team

The functions that must be filled are:

  • Incident Investigation Lead
  • Evidence Gatherer
  • Evidence Preservation Coordinator
  • Communications Coordinator
  • Interview Coordinator

Other important considerations for the selection of team members include:

  • Ensure team members have the desirable traits (What are they?)
  • The nature of the incident (How does this impact team selection?)
  • Choose the right people from inside and outside the organization (How do you decide?)
  • Appropriate size of the team (What is the optimum team size?)

*Our Incident Investigator training course examines each of these considerations and more, giving you the knowledge to select investigation team members wisely.

3. Plan the investigation

Upon receiving the initial call:

  • Get the preliminary What, When, Where, and Significance
  • Determine the status of the incident
  • Understand any sensitivities
  • If necessary and appropriate, issue a request to isolate the incident area
  • Escalate notifications as appropriate

The preliminary briefing:

  • Investigation Lead to present a preliminary briefing to the investigating team
  • Prepare a team investigation plan

4. Collect the facts supported by evidence

Tips:

  • Be prepared and ready to lead or participate in an investigation at all times to ensure timeliness and thoroughness.
  • Have your “Go Bag” ready with useful items to help you secure the scene, take photographs, document the details of the scene and collect physical evidence.
  • Collect as much information as possible…analyze later
  • Inspect the incident scene
  • Gather facts and evidence
  • Conduct interviews

*While every step in the Incident Prevention Process is crucial, step 4 requires a particularly distinct set of skills. A lot of time in our Incident Investigator training course is dedicated to learning the techniques and skills required to get this step done right.

5. Establish a timeline

This can be the quickest way to group information from many sources.

Tip:

  • Sticky notes can be used on poster paper to start rearranging information on a timeline. Use different colors for precise data versus imprecise, and list the source of the information on each note.
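The same idea carries over if you keep the timeline electronically. Here is a minimal Python sketch, assuming each note records a time, a description, its source, and whether the time is precise; the class and field names are illustrative, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TimelineNote:
    """One note on the timeline; field names are illustrative assumptions."""
    when: datetime
    what: str
    source: str           # where the information came from
    precise: bool = True  # False for estimated or recalled times


def build_timeline(notes: list[TimelineNote]) -> list[TimelineNote]:
    """Order notes chronologically, regardless of which source supplied them."""
    return sorted(notes, key=lambda note: note.when)


def format_note(note: TimelineNote) -> str:
    """Mark imprecise times with '~', the text equivalent of a second color."""
    marker = "" if note.precise else "~"
    return f"{marker}{note.when:%H:%M} {note.what} [{note.source}]"
```

Sorting by time merges information from interviews, logs and photographs into one sequence, while the source and precision stay attached to each note.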

After steps 1-5 comes the Root Cause Analysis of the incident, solution implementation and tracking, and reporting back to the organization:

6. Determine the root causes of the incident

7. Identify and recommend solutions to prevent recurrence of similar incidents

8. Implement the solutions

9. Track effectiveness of solutions

10. Communicate findings throughout the organization

*Steps 6-10 are taught in detail at our Root Cause Analysis Facilitator training course.

To learn more on the difference between our Incident Investigator versus RCA Facilitator training courses, check out our previous blog article and of course, if you would like to discuss how to implement or improve your organization’s incident prevention process, please contact us.

Author: Bruce Ballinger

To have a successful implementation and adoption of your new RCA program, it’s crucial to have all the elements of an effective and efficient program clearly identified and agreed upon in advance.

Here’s a high-level look at the elements that will need to be defined:

RCA Goals and Objective Alignment

Define the goals and objectives of the program and assure they are in alignment with corporate/facility/department goals and objectives

Status of Current RCA Effort

Perform a maturity assessment of existing RCA program to be used as a baseline to measure future improvements

Key Performance Indicators

Identify KPIs with baselines and future targets to be used for measuring progress towards meeting program goals and objectives

Formal RCA Threshold Criteria

Determine which incidents will trigger a formal RCA and estimate how many triggered events may occur in the upcoming year

RCA and Solution Tracking Systems

Identify which internal tracking systems will be used to track status/progress of open RCAs and implemented solutions

Roles and Responsibilities

Identify specifically who will have a role in the RCA effort, including the program sponsor, champion, and RCA facilitators

Training Strategy

Determine who will be trained in the chosen RCA methodology and to what level and in what time frame

RCA Effort Oversight and Management

Identify who (or what committees or groups) will be responsible for managing tracking systems, decisions on solution implementation, program modifications over time, and general program performance

Process Mapping

Process mapping exercise to document RCA management from the beginning of a triggered incident to completion of implemented solutions, including their impact on organization’s goals and objectives.

Human Change Management Plan

Develop a Change Management plan, including a detailed communication plan, that specifically targets those whose job duties will be affected by the RCA effort.

Implementation Tracking

Create a checklist to monitor RCA effort implementation including action items, responsible parties and due dates

We recommend conducting a workshop in order to define each of these crucial elements of your RCA program.

The workshop should be conducted for what we call a “functional unit” which ideally is no larger than a plant or facility, however, it can be modified to accommodate multiple facilities.

Common elements of a functional unit include:

  • A common trigger diagram
  • Common KPIs
  • The same Program Champion
  • Members depend on one another and share responsibility for functional unit performance

By structuring programs to fit within the goals and objectives of the business, or “functional unit”, rather than applying a ‘one size fits all’ solution, effective and long lasting results can be realized.

Implementing a new RCA program or need to reinvigorate your current one? ARMS can help you create a customized plan for its successful adoption. Contact us for more information.

Author: David Wilbur, CEO – Vetergy Group

To begin we must draw the distinction between error and failure. Error describes something that is not correct or a mistake; operationally this would be a wrong decision or action. Failure is the lack of success; operationally this is a measurable output where objectives were not met. Failures audit our operational performance, unfortunately quite often with catastrophic consequences: irredeemable financial impact, loss of equipment, irreversible environmental impact or loss of life. Failure occurs when an unrecognized and uninterrupted error becomes an incident that disrupts operations.

Individual-Centered Approach

The traditional approach to achieving reliable human performance centers on individuals and the elimination of error and waste. Human error is the basis of study with the belief that in order to prevent failures we must eliminate human error or the potential for it. Systems are designed to create predictability and reliability through skills training, equipment design, automation, supervision and process controls.

The fundamental assumptions are that people are erratic and unpredictable, that highly trained and experienced operators do not make mistakes and that tightly coupled complex systems with prescribed operations will keep performance within acceptable tolerances to eliminate error and create safety and viability.

This approach can only produce a limited return on investment. As a result, many organizations experience a plateau in performance and seek enhanced methods to improve and close gaps in performance.

An Alternative Philosophy

Error is embraced rather than evaded; sources of error are minimized and programs focus on recognition of error in order to disturb the pathway of error to becoming failure. 

Slight exception notwithstanding, we must understand people do not set out to cause failure, rather their desire is to succeed. People are a component of an integrated, multi-dimensional operating framework. In fact, human beings are the spring of resiliency in operations. Operators have an irreplaceable capacity to recognize and correct for error and adapt to changes in operating conditions, design variances and unanticipated circumstances.

In this approach, human error is accepted as ubiquitous and cannot be categorically eliminated through engineering, automation or process controls. Error is embraced as a system product rather than an obstacle; sources of error are minimized and programs focus on recognition of error in order to disturb its pathway to becoming failure. System complexity does not assure safety. While system safety components mitigate risk, as systems become more complex, error becomes obscure and difficult to recognize and manage.

Concentrating on individuals creates a culture of protectionism and blame, which worsens the obscurity of error. A better philosophy distributes accountability for variance and promotes a culture of transparency, problem solving and improvement. Leading this shift can only begin at the organizational level through leadership and example.

The Operational Juncture™

In contrast to the individual-centered view, a better approach to creating Operational Resilience is formed around the smallest unit of Human Factors Analysis called the Operational Juncture™. The Operational Juncture describes the concurrence of people, given a task to operate tools and equipment, guided by conflicting objectives, within an operational setting (including physical, technological, and regulatory pressures), and provided with information, where choices are made that lead to outcomes, both desirable and undesirable.

It is within this multidimensional concurrence we can influence the reliability of human performance. Understanding this concurrence directs us away from blaming individuals and towards determining why the system responded the way it did in order to modify the structure. Starting at this juncture, we can preemptively design operational systems and reactively probe causes of failure. We take a holistic view of accountability, shifting it away from merely the actions of individuals and towards all of the components that make up the Operational Juncture. This is not a wholesale change in the way safety systems function, but an enhanced viewpoint that captures deeper, more meaningful and more effective ways to generate profitable and safe operations.

A practical approach to analyzing human factors in designing and evaluating performance creates both reliability and resilience. Reliability is achieved by exposing system weaknesses and vulnerabilities that can be corrected to enhance reliability in future and adjacent operations. Resilience emerges when we expose and correct deep organizational philosophy and behaviors.

Resilience is born in the organizational culture where individuals feel supported and regarded. Teams operate with deep ownership of organizational values, recognize and respect the tension between productivity and protection, and seek to make right choices. Communication occurs with trust and transparency. Leadership respects and gives careful attention to insight and observation from all levels of the organization. In this culture, people will self-assess, teams will synergize and cooperate to develop new and creative solutions when unanticipated circumstances arise. Individuals will hold each other accountable.

Safety within Operational Resilience is something an organization does, not something that is created or attained. A successful program will deliver a top-down institutionalization of culture that produces a bottom-up emergence of resilience.


These days, many enterprise-level organizations are likely to have similar operations in multiple locations regionally or even worldwide. When a piece of equipment fails or a safety incident occurs at one site, the company investigates the problem and identifies solutions or corrective actions. Naturally, the team wants to capture the lessons learned and share them with other sites that have similar equipment, processes and potential incidents.

Advanced tools like the RealityCharting® software enable teams to share results of an Apollo Root Cause Analysis (RCA) across multiple layers of stakeholders. However, a large multinational enterprise might have dozens of different investigations going on at any given time. At the highest levels, decision-makers don’t necessarily want to see granular information about specific causes at any given plant. They need a top-down perspective of problems and patterns that are affecting the entire organization.

At ARMS Reliability, many of our clients have expressed a similar need. Our solution? Using classification tags to create and apply a consistent taxonomy to all root cause analyses performed for a given organization. Rolled up into a composite report, these tags reveal enterprise-wide trends and issues, allowing management to create action plans for tackling these systemic issues. For example, classification tags might uncover a large number of problems related to a lack of preventative maintenance on a certain type of pump, or a systemic non-compliance with a required safety process.

A classification taxonomy can be scalable and configured to an organization’s goals and processes. Think of these classifications like buckets that can be applied at any level of the RCA — e.g., to the root causes or solutions, to individual contributing causes, or simply to the RCA investigation in general.

Keep in mind: The Apollo Root Cause Analysis method is centered around a free-thinking approach to solving problems. That’s what makes the methodology so powerful — it doesn’t lead you down any generic predetermined pathways by asking leading questions or categorizing various causes or effects in any way. At ARMS Reliability, we advocate applying classification tags only after the root cause analysis investigation is completed, so you keep the free-thinking causal analysis and organize it later, for the purpose of rolling the findings up into a deeper systemic view.

Taxonomies can range from 5–20 categories into the hundreds. For example, here we’ve used a human factors taxonomy to tag causes as organizational influences and other people-centric issues.


Reports can provide a summary of how many causes were classified under the various tags.
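At its simplest, a summary like this comes down to counting tags across completed investigations. The Python sketch below shows the idea with made-up data; the RCA identifiers, causes and tag names are hypothetical, and this is not how RealityCharting® produces its reports.

```python
from collections import Counter

# Hypothetical tagged causes from completed RCAs: (rca_id, cause, tag).
tagged_causes = [
    ("RCA-001", "Lubrication schedule not followed", "Preventive maintenance"),
    ("RCA-001", "Procedure out of date", "Documentation"),
    ("RCA-002", "Pump inspection skipped", "Preventive maintenance"),
    ("RCA-003", "Lockout step omitted", "Safety process compliance"),
]


def summarize_tags(causes):
    """Count how many causes were classified under each tag, most common first."""
    return Counter(tag for _rca_id, _cause, tag in causes).most_common()
```

Because the tags are applied after each investigation is complete, the roll-up adds an enterprise-wide view without steering the causal analysis itself.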

In another example, an organization bases its taxonomy of reliability issues on ISO 14224, Collection and exchange of reliability and maintenance data for equipment.

 


The taxonomy options are endless. Most organizations we work with have their own unique systems of classifications. It’s really all about codifying the types of information your organization most needs to capture.

If adding classifications to your Root Cause Analyses would be useful for your organization, contact ARMS Reliability. We’d be glad to show you more about what we’re doing with other clients and help you develop a taxonomy that works best for your needs.

One of the four basic principles in the Apollo Root Cause Analysis methodology is that for each effect there are at least two causes and these causes are either actions or conditions.

This principle causes you to think more critically, challenge causal relationships more consistently, and to understand that things are rarely as simple as they may seem.

One implication of this principle is that there should never be a straight line, or even a partial straight line of causes within a cause and effect chart. A straight line tells us that there are other causes that still need to be found or identified, and more questions must be asked.

In each causal connection we should see at least one action cause and one condition cause.

So what are actions and conditions?

Conditions exist—they refer to the current state of things. Take gravity for instance—it is there all the time. Gravity exists. So this cause would be a conditional cause.

Conditions must exist. They always exist alongside any action.

An action cause is a cause which makes use of the available conditions. If the conditions didn’t exist, then the action would have no effect at all. The action cause is that moment in time when something happens. It is the thing that is different—the instigator or the catalyst of the effect that occurs.

Typically, there is one action and several conditions. Many of the action causes are also related to the things that people do. Action causes are readily seen and tend to be easily identified. When people tell the story of what happened they often list a series of actions, and relatively few conditions. When we create a timeline or sequence of events, the initial straight line will be constructed mostly of actions.

The Apollo Root Cause Analysis methodology demands an exhaustive search for both condition causes and action causes. If you only see half of the problem, will you really understand it? If you only find half of the causes, you will also only have half of the opportunities for controlling or mitigating the problem to an acceptable level.
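If a cause and effect chart is held as data, the "no straight lines" rule can be checked mechanically. The following Python sketch assumes a very simple representation (a dictionary mapping each effect to its labelled causes) and borrows the falling-object example that follows; the chart content and the function name are illustrative only.

```python
# Each effect maps to its direct causes as (description, kind) pairs,
# where kind is "action" or "condition". The chart content is hypothetical.
chart = {
    "Object fell": [
        ("Object kicked", "action"),
        ("Platform elevated", "condition"),
        ("Gravity present", "condition"),
    ],
    "Object kicked": [
        ("Worker walked across platform", "action"),
    ],
}


def incomplete_connections(chart):
    """Return effects whose connection lacks an action cause or a condition cause."""
    flagged = {}
    for effect, causes in chart.items():
        kinds = {kind for _description, kind in causes}
        missing = sorted({"action", "condition"} - kinds)
        if missing:
            flagged[effect] = missing
    return flagged
```

Here "Object kicked" would be flagged because it has only a single action cause, which is exactly the straight line that tells you more questions need to be asked.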

Let’s take a look at an example – “An Object Fell Off a Platform”

“What happened to make the object fall?” would be a good question to ask. Let’s say someone kicked it off the platform. This is the direct cause of why the object fell, so this is considered an action cause. It is the ‘something’ that happened. An action cause will typically be described using a noun/verb connection as in ‘object/kicked.’

But it’s not always that simple. There are other causes that have played a role in this scenario.  At this point in time it is important to challenge the concept of the linear connection of causes and keep searching for more.

The “Every Time Statement”

A useful tool to apply in this scenario is an “Every Time Statement.” The statement itself should be absolute in the sense that all causes in the connection need to be present. The same effect should happen each time the action occurs.

So, every time you kick the object off the platform it will fall? No, not every time.

Why not?  Because, the object in question must be elevated. If you kick it while it is on the ground it will not fall.

So is this an action cause or a condition cause?

It is a state of where the object was at the time it was kicked. So in this instance this cause would be labelled as a condition.

Now that another cause has been identified, you can repeat the “Every Time Statement.”

Every time you kick the object off the platform and the platform is elevated, the object will fall. Every time?  Well, it will only be true if there is gravity in play. If there is no gravity present, then this statement will not be true.

Is gravity an action or a condition? It’s not an event, it just exists. It was there when the problem occurred. This means that we would label this cause as a condition.

There are now three causes in this causal relationship, but have we identified every cause in that causal connection? At this point we have:

  1. Kicked object off platform
  2. The platform was elevated
  3. Gravity was present

Will the object fall every time? Only if the object is denser than air. If it were lighter than air, then it would not fall.

Is this cause an action or a condition? Again we observe that the object’s density didn’t change. It was what it was before the incident and had been so for some time. This makes this cause a condition.

Encourage people in the RCA group to actively look for the exception that makes a lie out of the “Every Time Statement.”  Every time you find an exception to this statement you have effectively identified another cause. Add it to your list of causes and repeat the “Every Time Statement.” When you can’t identify any other exceptions then you should have effectively identified every cause in that causal connection. The statement should now be absolute.
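The logic of the “Every Time Statement” can be sketched in a few lines of Python. The toy model below assumes we already know the four factors from the example, which an RCA group of course does not know in advance; it is meant only to show how each exception points to a cause that has not yet been listed. All names are illustrative.

```python
from itertools import product

# Factors that decide whether the object falls in this toy model. The model
# itself is an assumption standing in for the group's knowledge of the problem.
FACTORS = ["kicked", "elevated", "gravity", "heavier_than_air"]


def object_falls(state: dict) -> bool:
    """Toy model: the object falls only when all four factors are present."""
    return all(state[factor] for factor in FACTORS)


def exceptions(identified: list[str]) -> list[str]:
    """Return factors whose absence breaks the Every Time Statement.

    The statement claims: every time the identified causes are present,
    the object falls. Any scenario where they are present but the object
    does not fall is an exception pointing at a cause not yet identified.
    """
    others = [f for f in FACTORS if f not in identified]
    missing = set()
    for values in product([True, False], repeat=len(others)):
        state = {f: True for f in identified}
        state.update(zip(others, values))
        if not object_falls(state):
            missing.update(f for f in others if not state[f])
    return sorted(missing)
```

When `exceptions` returns an empty list for the causes identified so far, the statement is absolute for this model and the connection is complete.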

So what we have identified here is that there are at least four causes in this causal relationship that will influence whether an object will fall or not. In fact, every time something falls the same types of causes will be in play. The action cause will still need to occur, but this may come in different forms. The action can be different but it will still make use of the available conditions.


To Sum Things Up

It is valuable to be able to label causes as either actions or conditions. The process of labelling causes demands that you find multiple causes for each connection. This in itself will challenge your understanding of the problem.

Understanding what the conditional causes are will also lead you to finding the most effective solutions for your problem – the hard controls. By actively engaging in challenging the logic of each and every connection within the cause and effect chart consistently, many more conditional causes will be found and more options of control will present themselves. When you have the ability to eliminate a conditional cause, substitute it, or engineer it out, then your solutions and their outcomes will be more consistent, reliable, and predictable. You can therefore, with a fair degree of certainty, declare that the problem will not recur.


Many of us have them. The invisible “graveyard” where good intentions (AKA – corrective actions from your root cause analysis investigation) went to die.

How do they end up there?

We all know that all the time and money spent on a root cause analysis investigation and identifying solutions are worthless if the solutions are not implemented. An investigation can usually be done within a week but solutions can take much longer to implement. They sometimes require the involvement of multiple teams or departments, regulatory agencies, engineering, planning, budgeting, and the list goes on and on. For these reasons, it can be challenging to stay on top of all the corrective actions you identified in your investigation, who’s responsible, and the status of an action item at any given time.

We can offer a few basic tips that will give you a head start in tracking action items effectively:

  • Be clear about who is responsible for each corrective action. You don’t want to create the opportunity for people to be able to pass the buck with “I thought Bob was going to do it”.
  • Have a mechanism in place by which the implementation of corrective actions can be tracked.
  • Give ownership of a solution to an individual, not a group or department.
  • Assign a due-date for each corrective action.
  • Support people in their efforts to implement corrective actions.
  • Make sure you follow up on each corrective action – check back with the individual responsible to make sure that progress is being made.
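As a rough sketch of what such a tracking mechanism needs to capture, the Python below models a corrective action with a single named owner and a due date, and lists what is overdue. The class and function names are assumptions for illustration, not features of any specific product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CorrectiveAction:
    """One corrective action; field names are illustrative assumptions."""
    description: str
    owner: str   # a named individual, not a group or department
    due: date
    done: bool = False


def overdue(actions: list[CorrectiveAction], today: date) -> list[CorrectiveAction]:
    """Return open actions whose due date has passed, oldest first."""
    late = [a for a in actions if not a.done and a.due < today]
    return sorted(late, key=lambda a: a.due)


def follow_up_list(actions: list[CorrectiveAction], today: date) -> dict[str, list[str]]:
    """Group overdue actions by owner so follow-up goes to one person each."""
    by_owner: dict[str, list[str]] = {}
    for action in overdue(actions, today):
        by_owner.setdefault(action.owner, []).append(action.description)
    return by_owner
```

Grouping by owner keeps the follow-up conversation with one individual per action, which is the point of the tips above.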

But even these “basics” are easier said than done.

In reality, most likely you come out of your root cause analysis investigation with a list of action items for which various people are responsible. Then everyone goes about their regular workdays and may or may not remember to follow through on any additional tasks they were assigned. Even if you have an appointed person to follow up with the action items and make sure they’re on track, it can be difficult to keep up with who has done what. Many managers rely on an Excel spreadsheet to manually track what has and hasn’t been done, due dates, and so forth. But this puts a lot of pressure on one person to keep up with everything – to manually send reminders to folks who haven’t completed their tasks and to enter the information properly when it has been done.

Even when the Excel file has been carefully kept up-to-date, it often lives locally on the manager’s hard drive, and other members of the team don’t have any visibility as to what has and hasn’t been done.

Sound familiar?

If your RCA program is starting to mature it may be time to consider an enterprise solution to help you better manage all your investigations.

Corrective action tracking inside of an enterprise RCA tool can help you maintain visibility and accountability by tracking the status of action items and assigned solutions. Team members get sent automatic reminders of incomplete or overdue action items and they can easily update the status of their assigned tasks, instantly informing everyone when a task has been completed. You can also create personalized dashboards with reports showing open, completed, or overdue corrective actions.


In addition to effective action tracking, an enterprise RCA solution can more broadly help your company implement and manage an effective overall root cause analysis program.

Here are some of the main features to look for:

  • Enterprise-wide visibility of your RCA program
    • Expand the RCA knowledge base and accessibility across an organization.
  • Search across the database for past RCAs, solutions, causes, equipment items, etc.
    • Leverage information from previous investigations in your current investigation.
  • Classify problem-types by company or industry standards or by a pre-set list
    • Classify and tag files for easy search-ability. Create custom tags incorporating company or industry standards.
  • Create and share interactive KPI reports
    • Build reports on your chosen metrics and visually display key performance indicators in tables, charts and graphics.
  • Create personalized dashboards
    • Specify which reports are most important to you for immediate dashboard display on your homepage.
  • Save and embed reference files such as photos, equipment failure data, interviews, etc.
    • Preserve integrity by securely collecting and storing evidence and important reference files.
  • House internal company resource documents and tools
    • Store company corporate standards or reference files such as frequently referenced industry documents in a central location for immediate access when facilitating an RCA.
  • Progress updates
    • Communicate with all users through on-page messaging that lets you quickly share information, receive feedback and record comments.

Keeping your RCA investigation corrective actions out of the graveyard is a very common challenge in maturing RCA programs, but it’s just one of many. To see what you may be up against in the future, check out our free eBook, 7 Challenges to Implementing Root Cause Analysis Enterprise-Wide and How to Overcome Them.

Remember, in order to resurrect your RCA investigation corrective actions, start with the basics that we listed at the beginning of this article. But also keep in mind: the more mature your RCA program becomes, or the larger and more complex your organization, the larger and more complex your problems become. So when you’re ready to alleviate this pain point altogether, consider whether an enterprise RCA solution might be the next step in your program’s development.

 Click on the infographic for a PDF version. 

Practical_Tips_Thumbnail_Image.jpgPracticalTips-PreparingToLeadAnRCAInvestigation_FINAL_2.jpg


“How long should an RCA take?”

This question is similar to asking, “How long is a piece of string?”

I have heard of one plant manager who has stipulated a maximum of two hours for any RCA conducted in his organisation. Another expects at least “brainstormed” solutions before the conclusion of day one, that is, within 6 or 7 hours. It is not uncommon for a draft report to be required within 48 hours of the RCA.

The following three tips may help you meet tight deadlines when time expectations are short. One advantage of the Apollo Root Cause Analysis methodology is that it is a fast process, but the “driver” has to be on the ball to achieve the desired outcome: effective solutions.

Tip #1 You Define The Problem

Imagine the RCA has been triggered by an unplanned incident or event which falls into any of the safety, environment, production, quality, equipment failure or similar categories. You have been appointed as the facilitator by a superior/manager who is responding to the particular event. Your superior/manager may understand the trigger mechanism and may well nominate the problem title.

For example, “upper arm laceration”, “ammonia spill”, “production delay” and so forth could be the offering you make to the team as the starting point for the analysis. Typically, as facilitator you will have gathered some of the “facts” from first responder reports, interviews, data sheets, photographs and so on. So a good first step is to draft a problem definition statement, including the significance reflected by the consequences or impacts. The team then has a starting point to commence the analysis, although the problem statement may change as more detail is provided.

Ideally, you will have already created a file in RealityCharting™ and the Problem Definition table can be projected onto a screen or even onto the clear wall where your charting will be done with the Post-It™ notes. The team members’ information ought to have been entered and can be confirmed quickly in this display. You might even show the Incident Report format and focus on the disclaimer option you have selected deliberately: Purpose: To prevent recurrence, not place blame.

This preparatory work could save at least 20 minutes of the team members’ time and enable an immediate launch into the analysis phase. 

Important: Save yourself hours of re-work and potential embarrassment by saving the file as soon as this first process is complete, if you haven’t already done so, and thereafter on a regular basis. Maintain some form of version control so that the evolution of the chart in the following day/s can be tracked if necessary.

If you are particularly well-resourced, the chart development might be recorded in the software at the same time as the hard copy is created on the wall space. A small team might choose to create the chart directly in the software, using a decent projection medium.

 Tip #2 Direct The Analysis 

It is critical that your initiative in preparing the problem definition is not considered by the team members as disenfranchising them. The analysis step whereby all have an opportunity to contribute should ensure that they feel they have “ownership” of the problem.

To reinforce this, it is advisable to address each member in a set sequence, typically from left to right or vice-versa depending on the seating arrangements. This establishes, first, that only one person speaks at a time; second, that each and every statement will be documented; and third, that every person has an equal opportunity. Your prompt and verbatim recording of each piece of information will provide the discipline required to minimise idle chatter, which wastes time by distracting the team’s focus. When you have a series of “pass” comments from team members because the process has exhausted their immediate knowledge of events, launch the chart creation.

It is worthwhile reminding the team that the information items that have been recorded and posted in the parking area may not appear in their original form on the chart or, in some cases, at all. Because the information gathering casts a wide net to capture as much knowledge as possible regarding what happened, when and why, there will be no particular focus. But because the items come from people with experience, expertise or intimate knowledge of events and circumstances, they have some value. The precise value will be determined by where the information sits in the cause and effect logic that starts at the problem and is connected by “caused by” relationships.

Important: Cause text should be written in CAPITAL LETTERS. It will be easier to read/decipher for the team at the time and perhaps from photographs of the chart later. Similarly using caps in the software itself means that projection of the chart is more effective and the printing of various views is enhanced.

 Tip #3 The “How and If” of Creating a RealityChart

Many proponents tap the existing understanding of the event by capturing as many of the action causes as possible. These may arrive via a 5 WHYS process, for example, which starts at the Primary Effect.

            Plant Stopped (Problem or Primary Effect)

            Why? Feed pump not pumping

            Why? Broken Coupling

            Why? Motor Bearing Seized

            Why? Bearing race Collapsed

            Why? Fatigue

The Apollo RCA method requires use of the expression “caused by?” to connect cause and effect relationships. Understanding that there must be at least one action and one condition helps reveal the “hidden” causes, especially the condition causes which do not come to mind initially.
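As a rough illustration of that rule, the sketch below stores the 5 WHYS chain from the example as linked causes and reports every effect that is still missing an action or a condition. The `Cause` class and the helper functions are hypothetical names invented for this sketch; they are not part of RealityCharting™ or the Apollo method itself.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    text: str
    kind: str  # "action" or "condition"
    caused_by: list = field(default_factory=list)


def chain(texts):
    """Link a linear 5 WHYS list into nested causes, all treated as actions."""
    node = None
    for text in reversed(texts):
        node = Cause(text, "action", [node] if node else [])
    return node


def missing_kinds(effect):
    """Return which cause kinds are absent directly under an effect."""
    present = {c.kind for c in effect.caused_by}
    return sorted({"action", "condition"} - present)


def walk(effect):
    """Yield every node that has at least one recorded cause, top down."""
    if effect.caused_by:
        yield effect
        for cause in effect.caused_by:
            yield from walk(cause)


root = chain([
    "PLANT STOPPED",
    "FEED PUMP NOT PUMPING",
    "BROKEN COUPLING",
    "MOTOR BEARING SEIZED",
    "BEARING RACE COLLAPSED",
    "FATIGUE",
])

gaps = [(node.text, missing_kinds(node)) for node in walk(root)]
for text, missing in gaps:
    print(f"{text}: still needs {', '.join(missing)}")
```

Run against the chain above, every effect is flagged as lacking a condition, which is exactly the prompt the facilitator needs to go looking for the “hidden” causes.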

To support this expression and the essential “why”, consider asking “how”. This may be employed initially by the most impartial member of your team who has been engaged specifically because of his/her lack of association with the problem and can sincerely ask the supposedly “dumb” questions. Invariably these questions generate more causes or a more precise arrangement of the existing causes. A “How does that happen exactly?” question can drive the team to take the requisite “baby steps”. This also often exposes differences between “experts” and the resolution of these differences is always illuminating.

The facilitator needs to be aware of the need to softly “challenge” the team’s understanding while ensuring the application of sufficient rigour to generate the best representation of causal relationships. This can be done in a neutral manner by using the “IF” proposition.

Given that every effect requires at least two causes, you can then address the team with the proposition: “If ‘one’ exists and ‘three’ exists (two conditions), then with ‘four’ added (the action), will the effect be ‘eight’ every time?” Using this technique on each causal element will generate the clarity and certainty being sought to understand the causes of the problem. If every “equation” (causal element) in the chart is “real” and the causes themselves are “real” (substantiated by evidence), then the team is well-placed to consider the types of controls it could implement to prevent recurrence of the problem.
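For facilitators who keep their chart in a script or spreadsheet, the “IF” proposition can be phrased mechanically for each causal element. The helper below is only a sketch; `if_proposition` is a hypothetical name, and the wording simply mirrors the example above.

```python
def if_proposition(effect, conditions, action):
    """Phrase one causal element as a neutral 'IF' question for the team."""
    condition_part = " and ".join(f"'{c}' exists" for c in conditions)
    return (
        f"If {condition_part}, then with '{action}' added, "
        f"will the effect be '{effect}' every time?"
    )


question = if_proposition("eight", ["one", "three"], "four")
print(question)
```

Asking the same neutrally worded question for every element keeps the challenge directed at the logic of the chart rather than at any individual team member.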

The more causes that are revealed, the more opportunities the team has to identify possible solutions.

 Summary

To speed up the RCA process:

Step 1 The facilitator gathers event information and fills out the Problem Definition Statement.

Step 2 The facilitator directs the information gathering, casting a wide net and systematically requesting information from each participant.

Step 3 Use the information gathered to build a RealityChart™, starting with actions based on what happened and then looking for other causes, such as conditions, which may initially be hidden. Use “how” and “if” to help validate that the causal relationships are logical.

With a completed chart, the solution finding step can begin.
