Monday, June 3, 2013

Why is stability of the data so important?

Recently on a popular Six Sigma site the following question appeared:

 "I have a question: if I have variable data (that is not normally distributed), and I then transferred it to attribute data and worked out the DPMO from the opps/defects — if the DPMO is normally distributed, can I carry on using stats such as t-tests, etc.? Or because it is originally attribute data, should I use chi-squared, etc.? Any advice appreciated."

From this question, you could run a three-day workshop. My short attempt at an answer included:

"As others have said, stay with the continuous data. Before doing anything else, put the data on an appropriate control chart and learn from the special causes. As Shewhart noted, things in nature are stable; man-made processes are inherently unstable. I have taken this from Shewhart’s postulates. The t-test and other tests all rest on the assumption of IID: independent and identically distributed data. If special causes are present, these assumptions are violated and the tests are useless. Even though the “control chart” shows up in DMAIC under C for many novices, it should be used early. Getting the process that produced the data stable is an achievement. It is also where the learning should start. Calculating DPMO and other outcome measures can come later, after learning and some work. Best, Cliff"

Why the fixation on outcomes: calculating capability, DPMO, and the like? Without any knowledge about the stability of the data, such calculations are very misleading. In 1989, I sat in a workshop where Dr. W. Edwards Deming made the following comment: "It will take another 60 years before Shewhart's ideas are appreciated." At the time, I thought he was nuts. Control charts were everywhere. Then they disappeared. Now I see Deming as a prophet.

Historically, we are going through a period in improvement science that is not unlike the dark ages. We have people grasping for easy paths and quick answers generated by a computer that might as well be "unmanned." Getting the process stable is an achievement! Our first move with statistical software should not be a normality check, but a check of whether the data are stable and predictable. If we have such a state, then our quality, costs, and productivity are predictable. Without this evidence, we are flying blind.
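The stability check described above can be sketched in a few lines. This is a minimal, hypothetical example of the individuals-and-moving-range (XmR) chart limits in the Shewhart tradition; the data values are invented for illustration, and a real analysis would also plot the chart and apply further run rules:

```python
# Hypothetical measurements; in practice these come from your process,
# in time order (order matters for a control chart).
values = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.6]

mean_x = sum(values) / len(values)

# Moving ranges: absolute difference between consecutive points.
moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard XmR constants for subgroups of size 2:
# 2.66 scales the average moving range into individuals limits.
ucl_x = mean_x + 2.66 * mr_bar
lcl_x = mean_x - 2.66 * mr_bar

# Points outside the natural process limits signal special causes;
# only when this list is empty (and run rules pass) do capability
# or DPMO calculations begin to mean anything.
special_causes = [x for x in values if x > ucl_x or x < lcl_x]

print(f"mean={mean_x:.2f}, limits=({lcl_x:.2f}, {ucl_x:.2f})")
print(f"special causes: {special_causes}")
```

If `special_causes` is non-empty, the advice in the quoted answer applies: learn from those points and stabilize the process before reaching for t-tests or DPMO.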