6 Sigma Series: calculating the correlation coefficient

Finding the relationship between the input variables and the output of a process can be vital to implementing a successful Six Sigma strategy. In the chart below, the data is clearly strongly correlated. But how strongly?


To find out, we can calculate the correlation coefficient, which tells us how linear the relationship between the two variables is. The calculation only measures linear relationships, so it can give a misleading answer for data that follows a non-linear pattern. It’s always worth graphing your data before starting your calculations.
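To illustrate why graphing first matters, here is a minimal Python sketch. The data is made up for illustration, and `pearson_r` is a hypothetical helper that uses the standard sum-of-squares form of the Pearson calculation. The curved data set depends entirely on X, yet its coefficient comes out at zero because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for X and for Y
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: the same X values with a straight-line Y and a curved Y
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # perfectly linear
curved = [x ** 2 for x in xs]      # perfectly dependent on X, but U-shaped

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)

print(f"linear: R = {r_linear:.2f}")
print(f"curved: R = {r_curved:.2f}")
```

A scatter plot of the curved data would show the U shape immediately, which the single number hides.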


So, we use the formula above to find R, the correlation coefficient.

  • N = the number of data points.
  • Xi = individual X measurement.
  • Yi = individual Y measurement.
  • X bar and Y bar = the mean of all X measurements and the mean of all Y measurements.
  • The standard deviation of the X values is shown by the standard deviation sign for X.
  • The standard deviation of the Y values is shown by the standard deviation sign for Y.
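
Written out with the symbols listed above, the standard Pearson correlation coefficient takes the following form. This is a reconstruction from those definitions, not a copy of the original figure. The names s_X and s_Y are assumed notation for the two standard deviations, taken here as sample standard deviations (N - 1 divisor). If population standard deviations (divisor N) are used instead, the N - 1 in front becomes N and R comes out the same.

```latex
R = \frac{1}{N - 1} \sum_{i=1}^{N}
    \left( \frac{X_i - \bar{X}}{s_X} \right)
    \left( \frac{Y_i - \bar{Y}}{s_Y} \right)
```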

If R is less than zero, there is a negative correlation: as one variable goes up, the other tends to go down. If it’s greater than zero, there is a positive correlation: as one variable goes up, so does the other.

The closer R is to 1 or -1, the stronger the linear relationship is, and the closer it is to zero, the weaker. If it’s equal to 1 or -1, then it’s a perfect linear relationship.
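
The sign and strength rules above can be sketched in a few lines of Python. This is a rough illustration, not the worked example from this post: the measurements are hypothetical, `pearson_r` is an illustrative helper name, and the calculation assumes sample standard deviations (which is what `statistics.stdev` computes).

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation coefficient from means and sample standard deviations."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (N - 1 divisor)
    # Sum the products of the standardised X and Y deviations, then divide by N - 1
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical measurements: Y rises with X in the first set, falls in the second
xs = [1, 2, 3, 4, 5]
ys_up = [2.1, 3.9, 6.2, 7.8, 10.1]
ys_down = list(reversed(ys_up))

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)

for label, r in (("rising", r_up), ("falling", r_down)):
    direction = "positive" if r > 0 else "negative" if r < 0 else "no linear"
    print(f"{label}: R = {r:.3f} ({direction} correlation)")
```

Both results sit close to 1 in magnitude, so both relationships are strong. Only the sign differs, which tells us the direction.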

You can see from the worked example below that we get a result of 0.99, which shows a very strongly, positively correlated data set.



Kieran Keene

