Sum of Squares (shortcuts)
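The sum of the squares of the deviations from the mean is given a shortcut notation, commonly written SS(x), and has several alternative formulas. By definition, $SS(x) = \sum (x - \bar{x})^2$. A little algebraic simplification returns the shortcut formula $SS(x) = \sum x^2 - \frac{(\sum x)^2}{n}$.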
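For readers who want the algebra spelled out, the simplification can be sketched as follows. The symbols are assumptions chosen for illustration: $n$ is the sample size and $\bar{x} = \sum x / n$ is the sample mean.

```latex
\begin{align*}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - 2\,\frac{(\sum x)^2}{n} + \frac{(\sum x)^2}{n} \\
  &= \sum x^2 - \frac{(\sum x)^2}{n}
\end{align*}
```

The second line simply substitutes $\bar{x} = \sum x / n$ into the two terms that contain the mean.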
What's wrong with the first formula, you ask? Consider the following example, where the last row of the table holds the totals for the columns.
- Total the data values: 23
- Divide by the number of values to get the mean: 23/5 = 4.6
- Subtract the mean from each value to get the numbers in the second column.
- Square each number in the second column to get the values in the third column.
- Total the numbers in the third column: 5.2
- Divide this total by one less than the sample size to get the variance: 5.2 / 4 = 1.3
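The steps above can be sketched in Python. The data values here are an assumption: they are a hypothetical set of five numbers picked only because they reproduce the totals quoted in the steps (a sum of 23 and a sum of squared deviations of 5.2), not necessarily the values from the original example.

```python
# Hypothetical data: chosen only because it matches the totals in the text
# (sum = 23, n = 5, sum of squared deviations = 5.2).
data = [3, 4, 5, 5, 6]

n = len(data)
total = sum(data)                          # total of the data values
mean = total / n                           # 23 / 5 = 4.6
deviations = [x - mean for x in data]      # second column: value minus mean
squared = [d ** 2 for d in deviations]     # third column: squared deviations
ss = sum(squared)                          # sum of squares
variance = ss / (n - 1)                    # divide by one less than n

# Round for display, since floating-point subtraction leaves tiny errors.
print(round(ss, 10), round(variance, 10))
```

Note that the whole list has to be kept around: the mean is needed before any deviation can be computed, so the data are traversed twice.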
Not too bad, you think. But this can get pretty bad if the sample mean doesn't happen to be a "nice" rational number. Think about having a mean of 19/7 = 2.714285714285... Those subtractions get nasty, and when you square them, they're really bad. Another problem with the first formula is that it requires you to know the mean ahead of time. For a calculator, this would mean that you have to save all of the numbers that were entered. The TI-82 does this, but most scientific calculators don't.
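To see how quickly the arithmetic gets messy, here is a small sketch using Python's standard `fractions` module so that nothing is rounded. The seven data values are hypothetical, chosen only so that they total 19 and give the mean of 19/7 mentioned above.

```python
from fractions import Fraction

# Hypothetical data: seven values that total 19, so the mean is 19/7.
data = [1, 2, 2, 3, 3, 4, 4]

n = len(data)
mean = Fraction(sum(data), n)             # exactly 19/7
deviations = [x - mean for x in data]     # every deviation is in sevenths
squared = [d ** 2 for d in deviations]    # every square is in forty-ninths
ss = sum(squared)                         # sum of squares, kept exact
variance = ss / (n - 1)

for x, d, s in zip(data, deviations, squared):
    print(x, d, s)
print("SS =", ss, " variance =", variance)
```

Every row of the table now carries a fraction, which is the kind of hand arithmetic the paragraph above is warning about.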
Now, let's consider the shortcut formula. The only things that you need to find are the sum of the values and the sum of the squares of the values. There are no subtractions and no decimals or fractions until the end. The last row contains the sums of the columns, just like before.
- Record each number in the first column and the square of each number in the second column.
- Total the first column: 23
- Total the second column: 111
- Compute the sum of squares: 111 - 23*23/5 = 111 - 105.8 = 5.2
- Divide the sum of squares by one less than the sample size to get the variance: 5.2 / 4 = 1.3
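The shortcut steps can be sketched as a calculator-style accumulator that stores only three running numbers and never keeps the data. The class name `RunningSums` and its methods are hypothetical names introduced for this illustration, and the data are the same assumed values that match the totals 23 and 111.

```python
class RunningSums:
    """Keeps only n, the sum of x, and the sum of x squared."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0
        self.sum_x2 = 0

    def add(self, x):
        # Update the three running totals; the value itself is not stored.
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def sum_of_squares(self):
        # Shortcut formula: sum of x^2 minus (sum of x)^2 / n
        return self.sum_x2 - self.sum_x ** 2 / self.n

    def variance(self):
        # Sample variance: divide by one less than the sample size
        return self.sum_of_squares() / (self.n - 1)


acc = RunningSums()
for x in [3, 4, 5, 5, 6]:   # hypothetical data matching the totals in the text
    acc.add(x)

print(acc.sum_x, acc.sum_x2)                      # the two column totals
print(round(acc.sum_of_squares(), 10), round(acc.variance(), 10))
```

This is why the shortcut suits a calculator that cannot save its entries. One caveat for floating-point work: when the values are large compared with their spread, subtracting two nearly equal totals can lose precision, so the shortcut is best suited to hand calculation and small, well-scaled data.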