Monthly Archives: December 2015

By Kevin Stewart

Many Apollo Root Cause Analysis methodology training instructors are asked the same question: “How long should it take to do a Root Cause Analysis (RCA) investigation?” This is a difficult question to answer because of the variables associated with each individual RCA. It’s a lot like asking someone, “How long will the trip take?” How do you begin to answer that? Some questions come to mind: To where? How will you be traveling? What route will you take? Will you be stopping anywhere? And so on.

If it is so variable, how can we even talk about whether an RCA should take several days or not? There are two general paths in the utilization of the Apollo methodology; let’s call them “long” and “short.” Since this article is about RCAs not taking several days, let’s focus on the short one.

Most people envision the Apollo Root Cause Analysis methodology as a large group of people in a conference room for several days as a necessary means to finding a valid solution.  It is true that many RCA investigations do take four to five solid, eight-hour days to determine an appropriate solution, but these should be problems that have a large significance where information may not be readily available.

I always point out to my students that not only is it possible to do an Apollo Root Cause Analysis in a short time, but I have personally done several that took less than a day.  How?

The Apollo Root Cause Analysis process involves a specific methodology of asking “why?” or “caused by ____?” and then identifying an appropriate answer, writing it down, and then asking “why” again. You do this until you are stymied with no answers or reach a point where it doesn’t make sense to ask “why” anymore. This process does not change regardless of the type or the size of the problem, or for any other reason.
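The ask-and-record loop described above can be sketched in a few lines of Python. This is only an illustration, not part of the Apollo methodology or any official tool: the function name `build_cause_chain` and the dictionary layout are hypothetical, and the sample causes are paraphrased from the conveyor example later in this article. A real Apollo chart branches, with several causes per effect; this sketch follows a single branch for simplicity.

```python
def build_cause_chain(problem, known_causes):
    """Ask "caused by ____?" repeatedly until no further answer is known.

    known_causes is a hypothetical lookup mapping each effect to the one
    cause that has been identified and written down for it.
    """
    chain = [problem]
    current = problem
    while current in known_causes:
        cause = known_causes[current]
        if cause in chain:
            # Stop if the answers loop back on themselves.
            break
        chain.append(cause)
        current = cause
    return chain


# Causes paraphrased from the conveyor jam story (single branch only).
known_causes = {
    "Block jammed": "Block shoved sideways off conveyor",
    "Block shoved sideways off conveyor": "Paddle pushed on corner of block",
    "Paddle pushed on corner of block": "Paddle rotated on shaft with every push",
    "Paddle rotated on shaft with every push": "Bushing tolerance incorrect",
}

chain = build_cause_chain("Block jammed", known_causes)
for depth, item in enumerate(chain):
    prefix = "" if depth == 0 else "caused by: "
    print("  " * depth + prefix + item)
```

The loop ends when an effect has no recorded cause, which corresponds to being stymied or reaching the point where asking “why” again no longer makes sense.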

Many of you may have heard of the “Five Whys” as an RCA process. This was designed for small problems experienced by operators on the line at Toyota facilities. These little RCAs were done in the moment by people involved in the incident. If you’re familiar with both the Apollo Root Cause Analysis methodology and the Five Whys process, you may notice that they are very similar. I often point out to students that you can see several “Five Whys” branches inside any Apollo RCA chart. So it stands to reason that the Apollo Root Cause Analysis methodology can be used in a similar fashion to the Five Whys.

Here’s an example. I was responsible for the reliability of a production area of a plant during my career. It was not uncommon to find me walking around looking for problems, and during one such time I discovered some people working hard to unplug a jammed conveyor. It was plugged with a 1,000-pound solid carbon block wedged in between some posts, and there was no good access to the block with a crane or other lifting device. When they spotted me I got an earful; apparently this had been happening on a regular basis. The specific frequency was unknown, but the emotion of the operator told me that it was at least once per shift. I promised to fix it for him and he calmed down. They got the unit unplugged and back on line, and he went back to his job just downstream of the jam.

Since I promised to fix this, I decided to spend some time at the unit to see if I could observe what was causing the jam.

The Apollo Root Cause Analysis process went like this:


If you start the RCA chart in your mind, you quickly get to a dead end because no one could see why the jam had happened. The operator in the area was busy doing his job, which required constant attention: pouring molten metal into a small cavity to “glue” a copper rod to the top of an anode. This was done while the line was moving; he poured one about every 15 seconds, so he really couldn’t be looking around. There were not a lot of other spare personnel in the area who could spend the time looking, so I decided that was my job.

These blocks were pushed onto an automated system by a large pusher that had a paddle hanging down from a cylindrical steel piece with a bushing, since the paddle was designed to float. It seemed pretty obvious that the pusher had something to do with it… but how? After they started up the system, it worked like a charm, just as designed, with no glitches. Intermittent problems are some of the hardest to fix because you need to be there when things go awry or gather data to identify the causes.

So there I was with one cause on my box – “Block jammed caused by ____?” I thought perhaps if I watched it I’d get lucky enough to catch the issue.  So I stood there, and stood there, and stood there for perhaps an hour. Nothing.  I didn’t want to leave quite yet but it did seem like a waste of time, so I decided to check out other items in the area.  I spent an hour or so away from the machine and then went back. Upon returning to the unit there didn’t seem to be anything obviously out of order.  However, something seemed different, though I couldn’t put my finger on it.

After spending another hour away and then coming back again, this time I noticed what appeared to be a difference: slight, but I was pretty sure it was happening.  One more hour away and then back and sure enough something was happening over a long period of time.

Now I just needed to verify my suspicions.  Believing I knew the cause, I figured I had enough time to go to lunch and do some more office work before returning to the unit to check my theory and gather evidence.  I was correct.

The cause of the issue was that the paddle was rotating counter-clockwise on the shaft ever so slightly with every push.  It was taking more than six hours for it to rotate enough to push on the corner of the block, shove it sideways off the conveyor, and cause the jam.  So my chart looked like this after about six to seven hours:

[Chart image: chart_1.png]

At this point I alerted everyone to the issue, and the maintenance personnel came over and safely moved the paddle back so the shift could finish. Our facility had a swing shift crew that worked in the area after the production was done, so they were assigned the task of fixing the unit.

That evening they removed the unit, checked everything against the drawings and specifications, and found that the tolerance on the bushing was incorrect.  It was close, but the tolerance was tight enough that each push that was not exactly dead-on caused a slight twisting force, moving the paddle off course and eventually causing a jam. The team fixed the tolerance issue and put it back in place by the next shift start.

So my chart now looked like this:


This whole process took less than eight hours to complete but was spread out over two days.  If you look at my total time involvement it was perhaps four hours. (I am not charging the process with time that I was multitasking by doing other things.)

So as you can see, an RCA investigation doesn’t always have to take days.  Of course, some will take several days and you could stretch even a simple investigation into a longer process if you wish. But if you are close to the problem, get accurate information, act quickly, and stick with the process, you can do an RCA quickly and get an effective solution.