What is a Coefficient of Determination?

Posted by Muhammad Taheir

Coefficient of Determination

In statistics, the coefficient of determination, R², is a value used to interpret how much of the variation in one variable can be explained by variation in a second variable. It is directly related to the coefficient of correlation, R, which describes how strong the linear relationship between the two variables is.

Calculating the Coefficient of Determination
For a linear regression with a single predictor, the coefficient of determination is the square of the coefficient of correlation. Because the correlation always lies between -1 and 1, the coefficient of determination always lies between 0 and 1. Since the value is a square, it is never negative, regardless of the sign of R itself.
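
As a rough illustration of the squaring step, the sketch below computes R by hand and then squares it. The data values and the names `pearson_r`, `hours`, and `scores` are made up for this example and do not come from any real study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and the two sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (assumption, not from the article)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
r_squared = r ** 2
print(f"R = {r:.3f}, R^2 = {r_squared:.3f}")  # R = 0.775, R^2 = 0.600

# Flipping the sign of one variable flips the sign of R but leaves R^2 unchanged
r_neg = pearson_r(hours, [-s for s in scores])
print(f"R = {r_neg:.3f}, R^2 = {r_neg ** 2:.3f}")  # R = -0.775, R^2 = 0.600
```

The second call shows why the sign of R does not matter: a negative correlation of the same strength gives the same coefficient of determination.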

Meaning of the Coefficient of Determination
The coefficient of determination can be thought of as a percentage. It tells what proportion of the variation in the dependent variable is accounted for by the regression equation. It does not count how many data points the line passes through. If the coefficient is 0.80, then 80% of the variation in the dependent variable is explained by the regression line, and the remaining 20% is left unexplained. A value of 1 indicates that the regression line fits every data point exactly, while a value of 0 indicates that the line explains none of the variation and does no better than predicting the mean. A higher coefficient indicates a better goodness of fit for the observations.
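
One common way to make the percentage reading precise is as explained variation: R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals around the fitted line and SS_tot is the sum of squared deviations from the mean. The sketch below is a minimal illustration under that definition, using made-up data and hypothetical helper names (`fit_line`, `r_squared`).

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """R^2 = 1 - SS_res / SS_tot for the given line."""
    mean_y = sum(ys) / len(ys)
    # Unexplained variation: squared distances from the points to the line
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from the points to their mean
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical illustrative data (assumption, not from the article)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

slope, intercept = fit_line(hours, scores)
r2 = r_squared(hours, scores, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
# slope=0.60, intercept=2.20, R^2=0.60
```

In this example none of the five points lies exactly on the fitted line, yet the line still accounts for 60% of the variation in the scores.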

Usefulness of R²
The usefulness of R² in statistics lies in summarizing how well a fitted model accounts for the observed data, which gives a rough indication of how well it may predict future observations. It is not the probability that a new point will fall on the line. Because it is possible to collect more samples, it is possible to test how well the fitted line holds up as a prediction tool by checking how much of the variation in the new data it explains.
As with correlation, a strong connection between two variables does not prove causality, however high the coefficient of determination is. For instance, a study on birthdays may show that a large number of birthdays fall within a time frame of one or two months. This does not mean that the passage of time or the change of seasons causes pregnancy.
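
The earlier point about testing the model on additional samples can be sketched as follows: fit the line on the original data, then measure how much of the variation in newly collected points that same line explains. All data values and names here (`fit_line`, `r_squared_for_line`, `new_hours`, `new_scores`) are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared_for_line(xs, ys, slope, intercept):
    """1 - SS_res / SS_tot of a fixed line, evaluated on the given data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical original sample used to fit the line
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
slope, intercept = fit_line(hours, scores)

# Hypothetical samples collected later; the line is NOT refitted to them
new_hours = [6, 7, 8]
new_scores = [6, 6, 7]
r2_new = r_squared_for_line(new_hours, new_scores, slope, intercept)
print(f"R^2 on new samples: {r2_new:.2f}")
```

When the line is evaluated on data it was not fitted to, this quantity can even come out negative, which would mean the line predicts the new points worse than simply using their mean.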

Syntax
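The coefficient of determination is often reported together with a value p. Here p indicates the number of columns of data, that is, the number of predictor variables in the model. This is useful when comparing the R² of different data sets, because R² tends to rise as more predictors are added.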