What is Mean, Variance, and Standard Deviation?

Posted by Muhammad Taheir | On: , |

Mean, Variance, and Standard Deviation


Consider the following.

The definitions for population mean and variance used with an ungrouped frequency distribution were:

Some of you might be confused by only dividing by N. Recall that this is the population variance. The sample variance, which is the unbiased estimator of the population variance, divides by n-1 instead.

Using algebra, this is equivalent to:

Recall that a probability is a long-term relative frequency, so every f/N can be replaced by p(x). This simplifies to:

What's even better is that the last portion of the variance is the mean squared. So, the two formulas that we will be using are:
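The chain of steps described above can be sketched in LaTeX as follows. This is a reconstruction from the text in the usual notation (x for a value, f for its frequency, N for the population size, p(x) = f/N), not a copy of the original formulas. The last expression on each line is the working formula.

```latex
\begin{align*}
\mu &= \frac{\sum x f}{N}
     = \sum x \cdot \frac{f}{N}
     = \sum x \, p(x) \\[4pt]
\sigma^2 &= \frac{\sum (x - \mu)^2 f}{N}
          = \sum x^2 \cdot \frac{f}{N} - \left( \sum x \cdot \frac{f}{N} \right)^2
          = \sum x^2 \, p(x) - \mu^2
\end{align*}
```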



Here's the example we were working on earlier: rolling a single fair die, where each value from 1 through 6 has probability 1/6.

The mean is 7/2 or 3.5
The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666...
The standard deviation is the square root of the variance = 1.7078

Do not use rounded off values in the intermediate calculations. Only round off the final answer.
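As a check on the arithmetic above, here is a minimal Python sketch for the fair die. The names `distribution`, `mean`, and `variance` are illustrative choices, not from the original. Exact fractions are used so that nothing is rounded until the final step.

```python
from fractions import Fraction
from math import sqrt

# Fair six-sided die: each value 1..6 has probability 1/6.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(dist):
    """mu = sum of x * p(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """sigma^2 = (sum of x^2 * p(x)) - mu^2."""
    mu = mean(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

mu = mean(distribution)        # exact: 7/2
var = variance(distribution)   # exact: 91/6 - 49/4 = 35/12
sd = sqrt(var)                 # converted to a float only here

print(mu, var, round(sd, 4))   # 7/2 35/12 1.7078
```

Keeping `mu` and `var` as fractions is one way to follow the rule about not rounding intermediate values.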

What are Probability Distributions?


Probability Distributions

A listing of all the values the random variable can assume, together with their corresponding probabilities, makes a probability distribution.

A note about random variables: a random variable does not mean that the values can be anything (a random number). Random variables have a well-defined set of outcomes and well-defined probabilities for the occurrence of each outcome. The "random" refers to the fact that the outcomes happen by chance -- that is, you don't know which outcome will occur next.

Here's an example probability distribution that results from the rolling of a single fair die.

   x        1     2     3     4     5     6     sum
   p(x)    1/6   1/6   1/6   1/6   1/6   1/6   6/6 = 1

What are Probability Functions?


Probability Functions

A probability function is a function which assigns probabilities to the values of a random variable. It must satisfy two conditions:


  • All the probabilities must be between 0 and 1 inclusive.
  • The sum of the probabilities of the outcomes must be 1.

If these two conditions aren't met, then the function isn't a probability function. There is no requirement that the values of the random variable only be between 0 and 1, only that the probabilities be between 0 and 1.
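The two conditions can be checked mechanically. The sketch below is illustrative only: `is_probability_function` is a hypothetical helper name, and the `shifted` and `bad` distributions are made-up inputs chosen to show each case.

```python
from fractions import Fraction

def is_probability_function(probabilities):
    """Return True when every p is in [0, 1] and the p's sum to exactly 1."""
    values = list(probabilities)
    in_range = all(0 <= p <= 1 for p in values)
    return in_range and sum(values) == 1

# Fair die: six values, each with probability 1/6.
fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
print(is_probability_function(fair_die.values()))   # True

# The values of the random variable (-10 and 50) are outside [0, 1];
# only the probabilities are checked, so this still passes.
shifted = {-10: Fraction(1, 4), 50: Fraction(3, 4)}
print(is_probability_function(shifted.values()))    # True

# Each probability is in range, but they sum to 5/4.
bad = {1: Fraction(1, 2), 2: Fraction(3, 4)}
print(is_probability_function(bad.values()))        # False
```

Exact fractions keep the sum test simple. With floating-point probabilities, a tolerance-based comparison such as `math.isclose` would be a safer choice.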


What is Bayes' Theorem?


Bayes' Theorem


However, just for the sake of argument, let's say that you want to know what Bayes' formula is.

Let's use the same example, but shorten each event to its one-letter initial, i.e., A, B, C, and D instead of Aberations, Brochmailians, Chompieliens, and Defective.

P(D|B) is not a Bayes problem. This is given in the problem. Bayes' formula finds the reverse conditional probability P(B|D).

It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in B, and the part of D in C.


                       P(B and D)
   P(B|D) =  -----------------------------------------
              P(A and D)  + P(B and D)  + P(C and D)

Inserting the multiplication rule for each of these joint probabilities gives

                          P(D|B)*P(B)
   P(B|D) =  -----------------------------------------
              P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)

However, and I hope you agree, it is much easier to take the joint probability divided by the marginal probability. The table does the adding for you and makes the problems doable without having to memorize the formulas.
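For readers who do want the formula in executable form, here is a minimal sketch. The function name `bayes_posterior` and the dictionary layout are assumptions made for illustration. The numbers are the market shares (50%, 30%, 20%) and defect rates (5%, 7%, 10%) from the three-company example.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods, target):
    """Return P(target | D), given P(group) and P(D | group) for each group."""
    # Multiplication rule: P(group and D) = P(D | group) * P(group)
    joint = {g: likelihoods[g] * priors[g] for g in priors}
    # The given event D is the sum of its parts across all groups.
    marginal = sum(joint.values())
    return joint[target] / marginal

priors = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
likelihoods = {"A": Fraction(5, 100), "B": Fraction(7, 100), "C": Fraction(10, 100)}

p_b_given_d = bayes_posterior(priors, likelihoods, "B")
print(p_b_given_d, round(float(p_b_given_d), 3))   # 7/22 0.318
```

The `joint` dictionary is the defective row of the joint probability table and `marginal` is its row total, so the function does the same "joint divided by marginal" step as the table method.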

What is Conditional Probability?


Conditional Probability



Recall that the probability of an event occurring given that another event has already occurred is called a conditional probability.

The probability that event B occurs, given that event A has already occurred is

P(B|A) = P(A and B) / P(A)

This formula comes from the general multiplication principle and a little bit of algebra.

Since we are given that event A has occurred, we have a reduced sample space. Instead of the entire sample space S, we now have a sample space of A, since we know A has occurred. So the old rule, that a probability is the number in the event divided by the number in the sample space, still applies. It is the number in A and B (must be in A since A has occurred) divided by the number in A. If you then divide the numerator and denominator of the right-hand side by the number in the sample space S, you have the probability of A and B divided by the probability of A.



Example 1:

The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.

                  Male   Female   Total
   Smoke           19      12       31
   Don't smoke     41      28       69
   Total           60      40      100

  • What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability. The number of "Male and Smoke" divided by the total = 19/100 = 0.19
  • What is the probability of a randomly selected individual being a male? This is the total for male divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases.
  • What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability, the total who smoke divided by the total = 31/100 = 0.31.
  • What is the probability of a randomly selected male smoking? This time, you're told that you have a male - think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so 19/60 = 0.31666...
  • What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so 19/31 = 0.6129 (approx)

After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it - that's the beauty of it. A Bayes' problem can be set up so it appears to be just another conditional probability. In this class we will treat Bayes' problems as another conditional probability and not involve the large messy formula given in the text (and every other text).
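The five answers above can be reproduced from the four counts stated in the example. This is a small sketch with illustrative variable names, using exact fractions.

```python
from fractions import Fraction

# Counts stated in Example 1.
total = 100
male = 60
smoke = 31
male_and_smoke = 19

p_male_and_smoke = Fraction(male_and_smoke, total)   # joint: 19/100
p_male = Fraction(male, total)                       # marginal: 60/100
p_smoke = Fraction(smoke, total)                     # marginal: 31/100

# Conditional = joint / marginal of the given event.
p_smoke_given_male = p_male_and_smoke / p_male       # 19/60
p_male_given_smoke = p_male_and_smoke / p_smoke      # 19/31

print(p_smoke_given_male, p_male_given_smoke)        # 19/60 19/31
print(round(float(p_male_given_smoke), 4))           # 0.6129
```

Dividing the joint probability by the marginal gives the same result as counting within the reduced sample space (19 out of 60, or 19 out of 31).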

Example 2:

There are three major manufacturing companies that make a product: Aberations, Brochmailians, and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and 10% of Chompieliens' product is defective.

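This information can be placed into a joint probability distribution:

                Aberations   Brochmailians   Chompieliens   Total
   Defective       0.025         0.021           0.020      0.066
   Good            0.475         0.279           0.180      0.934
   Total           0.500         0.300           0.200      1.000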

The market share for Chompieliens wasn't given, but since the marginals must add to 1.00, they have a 20% market share.

Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because they are conditional probabilities and the table is a joint probability table. These defective probabilities are conditional upon which company was given. That is, the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians).

The "good" probabilities can be found by subtraction as shown above, or by multiplication using conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good. 0.93(0.30)=0.279.

  • What is the probability a randomly selected product is defective? P(Defective) = 0.066
  • What is the probability that a defective product came from Brochmailians? P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) = 0.021/0.066 = 7/22 = 0.318 (approx).
  • Are these events independent? No. If they were, then P(Brochmailians|Defective) = 0.318 would have to equal P(Brochmailians) = 0.30, but it doesn't. Also, P(Aberations and Defective) = 0.025 would have to equal P(Aberations) * P(Defective) = 0.50 * 0.066 = 0.033, but it doesn't.
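The joint probability table for Example 2 can be built step by step from the given market shares and defect rates. The sketch below uses illustrative dictionary names and exact fractions. It is one way to organize the arithmetic, not the only one.

```python
from fractions import Fraction

# Given market shares; the third follows because the marginals must add to 1.
share = {"Aberations": Fraction(50, 100), "Brochmailians": Fraction(30, 100)}
share["Chompieliens"] = 1 - sum(share.values())

# Conditional probabilities P(Defective | company).
defect_rate = {
    "Aberations": Fraction(5, 100),
    "Brochmailians": Fraction(7, 100),
    "Chompieliens": Fraction(10, 100),
}

# Joint cells: P(Defective and company) = P(Defective | company) * P(company).
joint_defective = {c: defect_rate[c] * share[c] for c in share}
# "Good" cells by subtraction from the column total.
joint_good = {c: share[c] - joint_defective[c] for c in share}

p_defective = sum(joint_defective.values())                       # 0.066
p_b_given_defective = joint_defective["Brochmailians"] / p_defective

# Independence would require P(Aberations and Defective) = P(Aberations) * P(Defective).
independent = joint_defective["Aberations"] == share["Aberations"] * p_defective

print(float(p_defective), p_b_given_defective, independent)      # 0.066 7/22 False
```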

What is Independence Revisited?


Independence Revisited


The following four statements are equivalent:


  • A and B are independent events
  • P(A and B) = P(A) * P(B)
  • P(A|B) = P(A)
  • P(B|A) = P(B)

The last two hold because if two events are independent, the occurrence of one doesn't change the probability of the occurrence of the other. This means that the probability of B occurring, whether A has happened or not, is simply the probability of B occurring.
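A quick way to test independence is to compare P(A and B) with P(A) * P(B). In the sketch below, `are_independent` is a hypothetical helper name. The first call uses the smoking survey figures, and the second uses a made-up two-coin example that is not from the original.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """Independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

# Smoking survey: P(Male) = 0.60, P(Smoke) = 0.31, P(Male and Smoke) = 0.19.
# 0.60 * 0.31 = 0.186, which is not 0.19, so the events are dependent.
survey = are_independent(Fraction(60, 100), Fraction(31, 100), Fraction(19, 100))
print(survey)   # False

# Hypothetical: two fair coin flips, A = first is heads, B = second is heads.
p_a, p_b, p_a_and_b = Fraction(1, 2), Fraction(1, 2), Fraction(1, 4)
coins = are_independent(p_a, p_b, p_a_and_b)
print(coins)    # True

# The equivalent statements agree: P(A|B) = P(A) and P(B|A) = P(B).
print(p_a_and_b / p_b == p_a, p_a_and_b / p_a == p_b)   # True True
```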

What is the General Multiplication Rule?


General Multiplication Rule


The general multiplication rule always works, whether or not the events are independent.

P(A and B) = P(A) * P(B|A)

Example 4:

P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40

A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in event A is 8%, thus the intersection is 0.08.
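Example 4 as a short Python sketch, with illustrative variable names. The last two lines go one step beyond the example by applying the conditional probability formula in the other direction and checking independence.

```python
from fractions import Fraction

p_a = Fraction(20, 100)
p_b = Fraction(70, 100)
p_b_given_a = Fraction(40, 100)

# General multiplication rule: P(A and B) = P(A) * P(B|A)
p_a_and_b = p_a * p_b_given_a
print(float(p_a_and_b))            # 0.08

# Extra step, not in the original example: P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)                 # 4/35

# P(B|A) = 0.40 differs from P(B) = 0.70, so A and B are not independent here.
print(p_b_given_a == p_b)          # False
```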