# 6 Sigma Series: calculating the correlation coefficient

Finding the relationship between the input variables and the output of a process can be vital to implementing a successful Six Sigma strategy. In the chart below, the data appear strongly correlated. But how strongly?

To find out, we can calculate a coefficient that tells us how linear the relationship between the variables is. It only measures linear relationships, so it can’t be used for data that follow a non-linear pattern. That’s why it’s always worth graphing the data before starting your calculations.
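To illustrate why the graph matters, here is a minimal Python sketch with made-up data in which Y depends entirely on X, but along a curve. The function name `linear_r` and the data are assumptions for illustration only, and the function uses an algebraically equivalent form of the correlation coefficient that does not need the standard deviations computed separately.

```python
from math import sqrt

def linear_r(xs, ys):
    """Correlation coefficient for paired measurements xs and ys."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Sum of the products of the deviations from each mean
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y = x squared, a perfect but non-linear relationship
xs = [-3, -2, -1, 0, 1, 2, 3]
curved = [x ** 2 for x in xs]

r_curved = linear_r(xs, curved)
print(r_curved)
```

For this symmetric curve the coefficient comes out at zero even though Y is completely determined by X, which is something only a plot would reveal.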

So, we use the above formula to find R, the correlation coefficient.

- N = the number of data points.
- Xi = individual X measurement.
- Yi = individual Y measurement.
- X̄ and Ȳ (X bar and Y bar) = the mean of all X measurements and the mean of all Y measurements.
- The standard deviation of the X values is shown by the standard deviation symbol for X.
- The standard deviation of the Y values is shown by the standard deviation symbol for Y.
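
Written out with the terms listed here, one common form of the calculation is the following. This is a sketch that assumes sample standard deviations (dividing by N - 1); if population standard deviations are used instead, the leading factor becomes 1/N. The symbols s_X and s_Y for the two standard deviations are notation chosen for this sketch.

$$
R = \frac{1}{N - 1} \sum_{i=1}^{N} \left( \frac{X_i - \bar{X}}{s_X} \right) \left( \frac{Y_i - \bar{Y}}{s_Y} \right)
$$

Each bracket measures how many standard deviations a point lies from its mean, so R is large and positive when X and Y tend to sit above or below their means together.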

If R is less than zero, there is a negative correlation: as one variable goes up, the other goes down. If, however, R is greater than zero, there is a positive correlation: as one variable goes up, so does the other. An R close to zero means there is little or no linear relationship.

The closer R is to 1 or -1, the stronger the linear relationship. If R is exactly 1 or -1, the relationship is perfectly linear.
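
The calculation and the reading of its sign can be sketched in Python as follows. This is only an illustration: the function names, the variable names, and the data are hypothetical, and it assumes sample standard deviations (`statistics.stdev`, which divides by N - 1).

```python
from statistics import mean, stdev

def correlation_coefficient(xs, ys):
    """Return R for paired measurements, using sample standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("R is undefined when one variable is constant")
    # Sum of the products of each point's deviations from the two means
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / ((n - 1) * s_x * s_y)

def direction(r):
    """Describe the sign of R in words."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Hypothetical process data: one input and two outputs that are exact
# straight-line functions of it, one rising and one falling
temperature = [10, 20, 30, 40, 50]
output_up = [2 * t + 1 for t in temperature]
output_down = [100 - 3 * t for t in temperature]

r_up = correlation_coefficient(temperature, output_up)
r_down = correlation_coefficient(temperature, output_down)
print(round(r_up, 6), direction(r_up))
print(round(r_down, 6), direction(r_down))
```

Because both outputs are exact straight-line functions of the input, R should come out at 1 and -1 respectively, up to floating-point rounding. Real process data will contain noise, so R will fall somewhere between those extremes.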

You can see from the worked example below that we have a result of 0.99, which indicates a very strong positive correlation in the data set.