Making decisions from your analysis is the most important part of any Six Sigma project. It’s great that you’ve collected lots of tasty data, but if you don’t take the correct decisions as a result, your Six Sigma initiative will suffer.
So, let’s think about data collection. Are you taking the population or a sample? A population is every occurrence of the data, for example all transactions on your website within the last 12 months. Sometimes you either don’t have all the data you need or there’s too much to analyse, so you take a sample instead: say, 1,000 transactions from random points during the 12-month period.
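As a rough illustration of the difference, the sketch below draws a simple random sample of 1,000 transactions from a made-up population. The population itself (50,000 processing times with a mean of about 8 seconds) is purely an assumption for the example, not data from a real website.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: one processing time (in seconds) per transaction.
population = [random.gauss(8.0, 1.5) for _ in range(50_000)]

# Simple random sample of 1,000 transactions, drawn without replacement.
sample = random.sample(population, k=1000)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"population mean: {population_mean:.3f}")
print(f"sample mean:     {sample_mean:.3f}")
```

The two means should be close but not identical, which is exactly why we need confidence intervals when working from a sample.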
Let’s say then that your customer demands a processing time of 10 seconds for their order and they’ll reject your product if that limit is exceeded. We know our mean and we know the standard deviation, but how confident can we be in them? After all, both are calculated from a sample, so both carry the inherent variability of the underlying data set.
Theory states that if you conduct the same experiment 100 times, you’re going to see some variability in the mean. A confidence interval for the mean enables us to understand how much each calculation of the average is likely to differ.
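To see that variability for yourself, here is a minimal simulation sketch. The processing-time distribution (mean 8 seconds, standard deviation 1.5 seconds) and the 50 points per experiment are assumptions chosen only for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def run_experiment(n_points=50):
    """Return the mean of one simulated sample of processing times."""
    sample = [random.gauss(8.0, 1.5) for _ in range(n_points)]
    return statistics.mean(sample)

# Repeat the same experiment 100 times and collect the 100 means.
means = [run_experiment() for _ in range(100)]
spread_of_means = statistics.stdev(means)

print(f"lowest mean:     {min(means):.3f}")
print(f"highest mean:    {max(means):.3f}")
print(f"spread of means: {spread_of_means:.3f}")
# Theory says the spread should be close to sigma / sqrt(n).
print(f"sigma / sqrt(n): {1.5 / math.sqrt(50):.3f}")
```

Every run of the experiment gives a slightly different average, and the spread of those averages shrinks as the number of points per experiment grows.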
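Before we get into the math, here is a quick lookup list of the symbols we’ll be using.
- $\bar{x}$: the mean of the sample
- $s$: the standard deviation of the sample
- $\sigma$: the true standard deviation of the population
- $N$: the number of data points in the sample
- $Z$: the Z score for the chosen confidence level
- $t$: the t score for the chosen confidence level, used for small samples
- $\chi^2$: the chi-squared value, used for standard deviation confidence
- $p$: a proportion, calculated as $Y$ successes out of $N$ attempts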
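When we have a sample size of greater than 30 points, we can use:

$$\bar{x} \pm Z \frac{s}{\sqrt{N}}$$

where $\bar{x}$ is the sample mean, $s$ is the standard deviation, $N$ is the number of data points and $Z$ is the Z score for the chosen confidence level (1.96 for 95%).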
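The large-sample calculation can be sketched in a few lines of Python using only the standard library. The function name and the example figures (1,000 transactions, mean 8.0 seconds, standard deviation 1.5 seconds) are hypothetical and are there only to show the arithmetic.

```python
import math
from statistics import NormalDist

def mean_confidence_interval_z(mean, std_dev, n, confidence=0.95):
    """Confidence interval for the mean using a Z score (large samples)."""
    # Two-sided Z score, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical figures: 1,000 sampled transactions.
low, high = mean_confidence_interval_z(mean=8.0, std_dev=1.5, n=1000)
print(f"95% interval for the mean: {low:.3f} to {high:.3f} seconds")
```

Note that this interval describes where the average processing time is likely to sit; it does not say that every individual order will finish inside the customer's limit.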
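If we have a small dataset (30 points or fewer), we use the formula below.

$$\bar{x} \pm t \frac{s}{\sqrt{N}}$$

T scores replace the Z scores because a smaller dataset gives us a less reliable estimate of the standard deviation and hence less confidence. The t score is looked up for $N - 1$ degrees of freedom; it is larger than the matching Z score, so the interval is wider.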
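Here is a minimal sketch of the small-sample version. Python's standard library has no t distribution, so the t score is passed in as a value looked up from a t table; the sample figures (5 measurements, mean 8.0, standard deviation 1.5) are hypothetical, and 2.776 is the two-sided 95% t score for 4 degrees of freedom.

```python
import math

def mean_confidence_interval_t(mean, std_dev, n, t_value):
    """Confidence interval for the mean using a t score (small samples).

    t_value must be looked up for n - 1 degrees of freedom.
    """
    margin = t_value * std_dev / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical figures: 5 measurements, 95% confidence, 4 degrees of freedom.
low, high = mean_confidence_interval_t(mean=8.0, std_dev=1.5, n=5, t_value=2.776)
print(f"95% interval for the mean: {low:.3f} to {high:.3f}")

# For comparison, the margin a Z score of 1.96 would have given.
z_margin = 1.96 * 1.5 / math.sqrt(5)
print(f"t margin: {(high - low) / 2:.3f}, Z margin: {z_margin:.3f}")
```

The t margin comes out noticeably larger than the Z margin, which reflects the extra uncertainty of working with only five points.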
A worked example of standard deviation confidence scores
To create a standard deviation confidence score, we need to use a chi-squared value ($\chi^2$, often written as ‘X squared’). The value of chi-squared depends on the number of data points in the sample and on the confidence level we choose; the more data we have, the narrower the resulting interval.
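So the formula is:

$$\sqrt{\frac{(N-1)\,s^2}{\chi^2_{\text{upper}}}} \le \sigma \le \sqrt{\frac{(N-1)\,s^2}{\chi^2_{\text{lower}}}}$$

where $N$ is the number of data points, $s$ is the sample standard deviation, $\sigma$ is the true standard deviation, and $\chi^2_{\text{lower}}$ and $\chi^2_{\text{upper}}$ are the lower and upper chi-squared values for the chosen confidence level.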
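Let’s say then that we have:
- 5 data points (N)
- A standard deviation (s) of 3.7
- 95% desired confidence level
- Chi-squared lower = 0.460
- Chi-squared upper = 11.365
The formula then becomes:

$$\sqrt{\frac{(5-1) \times 3.7^2}{11.365}} \le \sigma \le \sqrt{\frac{(5-1) \times 3.7^2}{0.460}}$$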
This equals [2.195, 10.907], which means we can be 95% confident that the true standard deviation lies somewhere between 2.195 and 10.907. Generally, when we have only a few data points, the interval is very wide, as it is in this situation.
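The arithmetic of this worked example can be checked with a short sketch. The function name is hypothetical, the chi-squared values are taken as given in the example rather than computed, and the standard deviation is assumed to be 3.7 because that is the value that reproduces the interval quoted above (up to rounding).

```python
import math

def std_dev_confidence_interval(std_dev, n, chi2_lower, chi2_upper):
    """Confidence interval for the standard deviation.

    chi2_lower and chi2_upper are looked up from a chi-squared table
    for n - 1 degrees of freedom and the chosen confidence level.
    """
    scaled_variance = (n - 1) * std_dev ** 2
    # Dividing by the larger chi-squared value gives the lower limit.
    low = math.sqrt(scaled_variance / chi2_upper)
    high = math.sqrt(scaled_variance / chi2_lower)
    return low, high

# Values from the worked example; std_dev = 3.7 is an assumption (see above).
low, high = std_dev_confidence_interval(
    std_dev=3.7, n=5, chi2_lower=0.460, chi2_upper=11.365
)
print(f"95% interval for the standard deviation: {low:.3f} to {high:.3f}")
```

Because the chi-squared values are passed in, the same function can be reused for other sample sizes or confidence levels once the matching table values have been looked up.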
A worked example of confidence scores of proportions
If four out of five doctors recommended that people use a fitness smart watch (like a Fitbit), then we’d want to know what proportion of successes we should expect from a larger number of attempts, and how confident we can be in that figure.
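So we can say that $p = Y/N = 4/5 = 0.8$, where $Y$ is the number of successes and $N$ is the number of attempts. The proportion confidence can then be calculated using:

$$p \pm Z \sqrt{\frac{p\,(1-p)}{N}}$$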
If we choose 90% confidence, our Z score will be 1.645. So, let’s put in the figures:

$$0.8 \pm 1.645 \sqrt{\frac{0.8 \times (1 - 0.8)}{5}}$$

This leads us to:

$$0.8 \pm 0.294$$
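So our confidence interval for the proportion runs from 0.506 (0.8 − 0.294) to 1.094 (0.8 + 0.294) at a 90% confidence level. A proportion can’t exceed 1, so in practice the upper limit is 1; an interval this wide is what we’d expect from only five data points.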
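The proportion example can be reproduced with the sketch below, again using only the standard library. The function name is hypothetical, and clamping the result to the range 0 to 1 is an addition made here, since a proportion can't fall outside that range.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.90):
    """Confidence interval for a proportion using a Z score."""
    p = successes / n
    # Two-sided Z score, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1]: this step is an assumption added for the sketch.
    return max(0.0, p - margin), min(1.0, p + margin)

# Four out of five doctors, 90% confidence.
low, high = proportion_confidence_interval(successes=4, n=5)
print(f"90% interval for the proportion: {low:.3f} to {high:.3f}")
```

With only five data points this Z-based approximation is rough, so treat the result as an illustration of the arithmetic rather than a precise estimate.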
Content based on study of the Six Sigma Black Belt course and Six Sigma for Dummies