27th January 2010
Session 1.4
Computing Probabilities Algebraically
Last Look at Long Run Interpretation / Perfect Samples
From here: http://www.mindspring.com/~cjalverson/_1sthourlyfall2008_VBKey.htm
Case Two | Long Run Argument, Perfect Samples | Birthweight
Low birthweight is a strong marker of complications in liveborn infants: it is strongly associated with infant mortality, incomplete and impaired organ development, and a number of birth defects. Suppose that the following probability model applies to year 2005 United States Resident Live Births:
Birthweight Status | Probability |
Very Low Birthweight (<1500g) | .016 |
Low Birthweight (1500g ≤ Birthweight < 2500g) | .067 |
Full Birthweight (≥ 2500g) | .917 |
Total | 1.00 |
Each row of the model yields a statement about an event within a population.
Interpret each probability using the Long Run Argument.
Clearly specify both the event and the population in an indefinite random sampling context.
In long runs of random sampling of US resident Live Births during year 2005, approximately 1.6% of sampled births present birthweights strictly below 1500 grams.
In long runs of random sampling of US resident Live Births during year 2005, approximately 6.7% of sampled births present birthweights of 1500 grams or greater, but strictly below 2500 grams.
In long runs of random sampling of US resident Live Births during year 2005, approximately 91.7% of sampled births present birthweights of 2500 grams or greater.
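The Long Run Argument can be illustrated with a small simulation. The following is a minimal sketch, not part of the original notes: it assumes the birthweight model above, and the function name `simulate_proportions` and the fixed seed are illustrative choices. As the number of sampled births grows, the observed proportions should settle near .016, .067 and .917.

```python
import random

# Birthweight model from the table above (probabilities as given in the notes).
MODEL = {
    "Very Low Birthweight": 0.016,
    "Low Birthweight": 0.067,
    "Full Birthweight": 0.917,
}


def simulate_proportions(n_draws, seed=2005):
    """Sample n_draws births at random from MODEL; return observed proportions."""
    rng = random.Random(seed)  # fixed seed (an assumption) so reruns are repeatable
    statuses = list(MODEL)
    weights = [MODEL[status] for status in statuses]
    draws = rng.choices(statuses, weights=weights, k=n_draws)
    return {status: draws.count(status) / n_draws for status in statuses}


if __name__ == "__main__":
    # Longer runs should track the model probabilities more closely.
    for n in (100, 10_000, 200_000):
        print(n, simulate_proportions(n))
```

Short runs can stray noticeably from the model; the long run claim is only about what happens as the run length grows.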
Compute and discuss Perfect Samples for n=2000.
Show full detail in computing an expected count for each event in the model.
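Very Low Birthweight: EVLB = 2000*Pr{Very Low Birthweight} = 2000*.016 = 32
Low Birthweight: ELB = 2000*Pr{Low Birthweight} = 2000*.067 = 134
Full Birthweight: EFB = 2000*Pr{Full Birthweight} = 2000*.917 = 1834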
Clearly specify both the event and the population in the specific random sampling context.
In random samples of 2000 US resident Live Births during year 2005, approximately 32 of the sampled births present birthweights strictly below 1500 grams.
In random samples of 2000 US resident Live Births during year 2005, approximately 134 of the sampled births present birthweights of 1500 grams or greater, but strictly below 2500 grams.
In random samples of 2000 US resident Live Births during year 2005, approximately 1834 of the sampled births present birthweights of 2500 grams or greater.
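The expected counts for a perfect sample are just n times each model probability. A minimal sketch of that arithmetic follows; the helper name `perfect_sample` is hypothetical, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Birthweight model from the table above, stored as exact fractions.
MODEL = {
    "Very Low Birthweight": Fraction("0.016"),
    "Low Birthweight": Fraction("0.067"),
    "Full Birthweight": Fraction("0.917"),
}


def perfect_sample(model, n):
    """Expected count for each event: n * Pr{event}."""
    return {event: n * p for event, p in model.items()}


counts = perfect_sample(MODEL, 2000)
for event, count in counts.items():
    print(f"{event}: 2000 * {float(MODEL[event])} = {count}")
```

The expected counts add up to the sample size, just as the probabilities add up to 1.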
Probability Rules
Computing Probabilities Algebraically
A Probability function Pr
Domain: D
Range: R
For each event E in D
Pr: D (Event Space) → PS (Probability Space)
Any event E with Pr
Events E with Pr
Any event E with Pr
Probability Rules: A Fair, Six-sided Die
Begin with a die with six sides: 1,2,3,4,5,6. Suppose that this die is fair - that each face has an equal chance of showing in tosses of the die. From earlier discussions, this table shouldn't require much explanation:
Face Value | Probability (Proportion) | Probability (Percentage) |
1 | 1/6 | 100*(1/6) ≈ 16.67% |
2 | 1/6 | 100*(1/6) ≈ 16.67% |
3 | 1/6 | 100*(1/6) ≈ 16.67% |
4 | 1/6 | 100*(1/6) ≈ 16.67% |
5 | 1/6 | 100*(1/6) ≈ 16.67% |
6 | 1/6 | 100*(1/6) ≈ 16.67% |
Total | 6/6 | 100*(6/6) = 100% |
The Fair d6 Model
FV: Face Values: 1, 2, 3, 4, 5, 6
Fair Model: Equally likely face values, 1/6 per face value
Pr{1 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 1.
Pr{2 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 2.
Pr{3 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 3.
Pr{4 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 4.
Pr{5 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 5.
Pr{6 shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows 6.
Basic Events
In repeated tosses of our die, the most basic possible outcomes are the faces themselves - the individual face values are the basic events. Each basic event has the same probability - (1/6).
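Additive Rule: first, write down the simple events which form the event. Then, add the probabilities for each of those simple events - this total is the probability for the event.
Pr{Event} = Sum of Pr{Simple Event} over the simple events which form the Event
Define the event EVEN as follows: "an even face (2,4,6) shows". Then the probability of the event EVEN can be computed as
Pr{EVEN} = Pr{2 shows} + Pr{4 shows} + Pr{6 shows} = (1/6) + (1/6) + (1/6) = 3/6 = .50 or as 50%.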
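The Additive Rule translates directly into a sum over simple events. The following is a minimal sketch for the fair d6 model; the names `FAIR_D6` and `pr` are illustrative, not from the notes.

```python
from fractions import Fraction

# Fair d6 model: each face value is a simple event with probability 1/6.
FAIR_D6 = {face: Fraction(1, 6) for face in range(1, 7)}


def pr(event, model=FAIR_D6):
    """Additive Rule: add the probabilities of the simple events forming the event."""
    return sum(model[outcome] for outcome in event)


EVEN = {2, 4, 6}  # "an even face (2,4,6) shows"
print(pr(EVEN), float(pr(EVEN)))
```

Representing an event as the set of simple events that form it keeps the code in step with the first instruction of the rule: write down the simple events, then add.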
Complementary Rule: first, write down the opposite of the event. Next, write down the simple events which form the opposite event. Then, add the probabilities for each of those simple events - this total is the probability for the opposite event. Finally, subtract the probability for the opposite event from 1. The result of this subtraction is the probability for the original event.
Event = E
Opposite Event = ~E
Compute Pr{~E}.
Then compute Pr{E} = 1 - Pr{~E}.
Define the event 2PLUS as "a face greater than or equal to 2 shows". Then its complementary event, Not2PLUS, is "a face strictly less than 2 shows", and its probability can be computed as
Pr{Not2PLUS} = Pr{1 shows} = 1/6
Then compute the probability for the event 2PLUS as:
Pr{2PLUS} = 1 - Pr{Not2PLUS} = 1 - (1/6) = 5/6
Working directly,
Pr{2PLUS} = Pr{One of 2, 3, 4, 5 or 6 shows}
= Pr{2 shows} + Pr{3 shows} + Pr{4 shows} + Pr{5 shows} + Pr{6 shows}
= (1/6) + (1/6) + (1/6) + (1/6) + (1/6) = 5/6 ≈ .8333 or as 83.33%.
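Both routes to the probability of 2PLUS can be checked side by side. This is a minimal sketch assuming the fair d6 model; `pr_direct` and `pr_by_complement` are hypothetical helper names.

```python
from fractions import Fraction

FACES = range(1, 7)
PR_FACE = Fraction(1, 6)  # fair d6: every face has probability 1/6


def pr_direct(event):
    """Additive Rule: add 1/6 for each face that belongs to the event."""
    return sum(PR_FACE for face in FACES if face in event)


def pr_by_complement(event):
    """Complementary Rule: 1 minus the probability of the opposite event."""
    opposite = set(FACES) - set(event)
    return 1 - pr_direct(opposite)


TWO_PLUS = {face for face in FACES if face >= 2}  # "a face >= 2 shows"
print(pr_by_complement(TWO_PLUS), pr_direct(TWO_PLUS))
```

The complement route pays off when the opposite event has fewer simple events than the original, as here, where Not2PLUS has only one.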
A Color Sequence Experiment
Suppose that we have a special box - each time we press a button on the box, it prints out a sequence of four colors, in order. Suppose the box prints each Color Sequence with the following probabilities:
The Model
Color Sequence (CS) | Probability (Proportion | Percent) |
BBBB | .10 | 100*(.10) = 10% |
BGGB | .25 | 100*(.25) = 25% |
RGGR | .05 | 100*(.05) = 5% |
YYYY | .30 | 100*(.30) = 30% |
BYRG | .15 | 100*(.15) = 15% |
RYYB | .15 | 100*(.15) = 15% |
Total | 1.00 | 100*(1.00) = 100% |
Let's define the experiment: We push the button, and then the box prints out exactly one (1) of the above listed color sequences. We then note the resulting (printed out) color sequence.
Let's discuss the simple (or basic) events.
The simple events are the color sequences. The probabilities for each color sequence are given in the table.
Suppose we define the event E=
The only color sequence meeting the definition of E is the sequence BBBB.
So, we write Pr{E} = Pr{BBBB} = .10
This means that in long runs of box-prints, approximately 10% of the prints will show as BBBB.
Suppose we define the event F = "Yellow shows".
The event F merely requires that Yellow be present.
Yellow is present in the following color sequences: YYYY, BYRG and RYYB.
So we write Pr{F} = Pr{YYYY} + Pr{BYRG} + Pr{RYYB} = .30 + .15 + .15 = .60
In long runs of box-prints, approximately 60% of prints will contain at least one Yellow in the color sequence.
Suppose we define the event G = "Green shows in the 2nd slot".
The event G requires that Green be present in the 2nd slot.
This requirement is met in the following color sequences: BGGB, RGGR.
So we write Pr{G} = Pr{BGGB} + Pr{RGGR} = .25 + .05 = .30
In long runs of box-prints, approximately 30% of prints will show green in the 2nd slot.
Suppose we define the event H = "Red does not show in the 1st slot".
The event notH requires that the sequence lead off with Red.
This requirement is met in the following color sequences: RGGR, RYYB.
Pr{notH} = Pr{One of RGGR or RYYB shows}
= Pr{RGGR} + Pr{RYYB}
= .05 + .15 = .20
So, Pr{H} = 1 - Pr{notH} = 1 - .20 = .80
So in long runs of box-prints, approximately 80% of color sequences will not show Red as the first color in the sequence.
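Suppose we define the event I = "Blue does not show in the last slot".
The event notI requires that the sequence end with Blue.
This requirement is met in the following color sequences: BBBB, BGGB and RYYB.
Pr{notI} = Pr{One of BBBB, BGGB or RYYB shows}
= Pr{BBBB} + Pr{BGGB} + Pr{RYYB}
= .10 + .25 + .15 = .50
So, Pr{I} = 1 - Pr{notI} = 1 - .50 = .50
So in long runs of box-prints, approximately 50% of color sequences will not show Blue as the last color in the sequence.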
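The event calculations for the color box can be checked mechanically by treating each event as a yes/no test applied to every color sequence. The following is a minimal sketch; the helper `pr` and the variable names are illustrative, and the event tests follow the requirements stated in the text (Yellow present, Green in the 2nd slot, Red not first, Blue not last).

```python
from fractions import Fraction

# Color sequence model from the table above, stored as exact fractions.
MODEL = {
    "BBBB": Fraction("0.10"),
    "BGGB": Fraction("0.25"),
    "RGGR": Fraction("0.05"),
    "YYYY": Fraction("0.30"),
    "BYRG": Fraction("0.15"),
    "RYYB": Fraction("0.15"),
}


def pr(meets_definition):
    """Additive Rule: add the probabilities of the sequences that meet the definition."""
    return sum(p for seq, p in MODEL.items() if meets_definition(seq))


pr_F = pr(lambda seq: "Y" in seq)          # Yellow is present
pr_G = pr(lambda seq: seq[1] == "G")       # Green shows in the 2nd slot
pr_H = 1 - pr(lambda seq: seq[0] == "R")   # Complementary Rule: notH leads off with Red
pr_I = 1 - pr(lambda seq: seq[-1] == "B")  # Complementary Rule: notI ends with Blue

for name, value in [("F", pr_F), ("G", pr_G), ("H", pr_H), ("I", pr_I)]:
    print(f"Pr{{{name}}} = {float(value):.2f}")
```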
Events and Random Variables
Color Sequence (CS) | Probability (Proportion | Percent) |
BBBB | .10 | 100*(.10) = 10% |
BGGB | .25 | 100*(.25) = 25% |
RGGR | .05 | 100*(.05) = 5% |
YYYY | .30 | 100*(.30) = 30% |
BYRG | .15 | 100*(.15) = 15% |
RYYB | .15 | 100*(.15) = 15% |
Total | 1.00 | 100*(1.00) = 100% |
Consider the random variable Blue Count, defined as the number of blue slots showing in the sequence. Blue Count groups the color sequences based on common value.
Blue Count | Probability (Proportion | Percent) |
4 | .10 | 100*(.10) = 10% |
2 | .25 | 100*(.25) = 25% |
0 | .05 + .30 = .35 | 35% |
1 | .15 + .15 = .30 | 30% |
Total | 1.00 | 100*(1.00) = 100% |
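Blue Count induces four events: Blue Count = 0, Blue Count = 1, Blue Count = 2, Blue Count = 4. If you're being picky, you might include an empty event for Blue Count = 3.
Pr{Blue Count = 0} = Pr{RGGR} + Pr{YYYY} = .05 + .30 = .35
Pr{Blue Count = 1} = Pr{BYRG} + Pr{RYYB} = .15 + .15 = .30
Pr{Blue Count = 2} = Pr{BGGB} = .25
Pr{Blue Count = 3} = 0
Pr{Blue Count = 4} = Pr{BBBB} = .10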
Color Sequence (CS) | Probability (Proportion | Percent) |
BBBB | .10 | 100*(.10) = 10% |
BGGB | .25 | 100*(.25) = 25% |
RGGR | .05 | 100*(.05) = 5% |
YYYY | .30 | 100*(.30) = 30% |
BYRG | .15 | 100*(.15) = 15% |
RYYB | .15 | 100*(.15) = 15% |
Total | 1.00 | 100*(1.00) = 100% |
Consider the random variable Green Count, defined as the number of green slots showing in the sequence. Green Count groups the color sequences based on common value.
Green Count | Probability (Proportion | Percent) |
0 | .10 + .30 + .15 = .55 | 55% |
2 | .25 + .05 = .30 | 30% |
1 | .15 | 100*(.15) = 15% |
Total | 1.00 | 100*(1.00) = 100% |
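Green Count induces three events: Green Count = 0, Green Count = 1 and Green Count = 2.
Pr{Green Count = 0} = Pr{BBBB} + Pr{YYYY} + Pr{RYYB} = .10 + .30 + .15 = .55
Pr{Green Count = 1} = Pr{BYRG} = .15
Pr{Green Count = 2} = Pr{BGGB} + Pr{RGGR} = .25 + .05 = .30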
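A count random variable groups the color sequences by a common value and adds the probabilities within each group. The following is a minimal sketch of that grouping for any color; the function name `count_distribution` is hypothetical, and the model is the color sequence table from the notes.

```python
from collections import defaultdict
from fractions import Fraction

# Color sequence model from the notes, stored as exact fractions.
MODEL = {
    "BBBB": Fraction("0.10"),
    "BGGB": Fraction("0.25"),
    "RGGR": Fraction("0.05"),
    "YYYY": Fraction("0.30"),
    "BYRG": Fraction("0.15"),
    "RYYB": Fraction("0.15"),
}


def count_distribution(color):
    """Group sequences by how many slots show `color`; add probabilities per group."""
    dist = defaultdict(Fraction)  # a missing count starts at probability 0
    for seq, p in MODEL.items():
        dist[seq.count(color)] += p
    return dict(sorted(dist.items()))


for color, label in (("B", "Blue Count"), ("G", "Green Count")):
    for value, p in count_distribution(color).items():
        print(f"Pr{{{label} = {value}}} = {float(p):.2f}")
```

Counts that no sequence produces, such as Blue Count = 3, simply do not appear in the result; they correspond to empty events with probability 0.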
Begin looking at the Color Sequence or Color Slot Machine Case Type, but avoid cases involving Conditional Probability for now.
Long Run Argument/Perfect Samples should be finished
Probability Rules
Color Slot Machine
start these
Pairs of Dice
Random Variables
Conditional Probability