31st August 2009
Session 1.4
Computing Probabilities Algebraically
Last Look at Long Run Interpretation / Perfect Samples
From here: http://www.mindspring.com/~cjalverson/_1sthourlyfall2008_VBKey.htm
Case Two | Long Run Argument, Perfect Samples | Birthweight
Low birthweight is a strong marker of complications in liveborn infants. It is strongly associated with a number of complications, including infant mortality, incomplete and impaired organ development and a number of birth defects. Suppose that the following probability model applies to US resident live births during year 2005:
Birthweight Status | Probability
Very Low Birthweight (<1500g) | .016
Low Birthweight (1500g ≤ Birthweight < 2500g) | .067
Full Birthweight (≥ 2500g) | .917
Total | 1.00
Each row of the model yields a statement about an event within a population. Interpret each probability using the Long Run Argument. Clearly specify both the event and the population in an indefinite random sampling context.
In long runs of random sampling of US resident Live Births during year 2005, approximately 1.6% of sampled births present birthweights strictly below 1500 grams.
In long runs of random sampling of US resident Live Births during year 2005, approximately 6.7% of sampled births present birthweights of 1500 grams or greater, but strictly below 2500 grams.
In long runs of random sampling of US resident Live Births during year 2005, approximately 91.7% of sampled births present birthweights of 2500 grams or greater.
Compute and discuss Perfect Samples for n=2000. Show full detail in computing an expected count for each event in the model.
Very Low Birthweight: EVLB = 2000*Pr{Very Low Birthweight} = 2000*.016 = 32
Low Birthweight: ELB = 2000*Pr{Low Birthweight} = 2000*.067 = 134
Full Birthweight: EFB = 2000*Pr{Full Birthweight} = 2000*.917 = 1834
Clearly specify both the event and the population in the specific random sampling context.
In random samples of 2000 US resident Live Births during year 2005, approximately 32 of the sampled births present birthweights strictly below 1500 grams.
In random samples of 2000 US resident Live Births during year 2005, approximately 134 of the sampled births present birthweights of 1500 grams or greater, but strictly below 2500 grams.
In random samples of 2000 US resident Live Births during year 2005, approximately 1834 of the sampled births present birthweights of 2500 grams or greater.
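The perfect-sample arithmetic (expected count = n times the event probability) can be checked with a short Python sketch. The probabilities are the ones in the birthweight table; the dictionary layout and the helper name perfect_sample are illustrative assumptions, not part of the course materials.

```python
from fractions import Fraction

# Probability model from the birthweight table (year 2005).
# Fractions are used so the products come out exact.
model = {
    "Very Low Birthweight": Fraction("0.016"),
    "Low Birthweight": Fraction("0.067"),
    "Full Birthweight": Fraction("0.917"),
}

def perfect_sample(model, n):
    """Hypothetical helper: expected count n * Pr{event} for each event."""
    return {event: n * p for event, p in model.items()}

expected = perfect_sample(model, 2000)
for event, count in expected.items():
    print(f"{event}: {count}")
```

The expected counts add back up to the sample size, which is a quick check that no event in the model was skipped.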
Probability Rules
Computing Probabilities Algebraically
A Probability Function Pr
Domain: D (Event Space)
Range: R (Probability Space)
For each event E in D, Pr assigns a probability Pr{E}.
Pr: D (Event Space) → PS (Probability Space)
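Any event E with Pr{E} = 0 does not occur in long runs of sampling.
Events E with 0 < Pr{E} < 1 occur in some, but not all, trials.
Any event E with Pr{E} = 1 occurs in every trial.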
Probability Rules: A Fair, Six-sided Die
Begin with a die with six sides: 1,2,3,4,5,6. Suppose that this die is fair - that each face has an equal chance of showing in tosses of the die. From earlier discussions, this table shouldn't require much explanation:
Face Value | Probability (Proportion) | Probability (Percentage)
1 | 1/6 | 100*(1/6) ≈ 16.67%
2 | 1/6 | 100*(1/6) ≈ 16.67%
3 | 1/6 | 100*(1/6) ≈ 16.67%
4 | 1/6 | 100*(1/6) ≈ 16.67%
5 | 1/6 | 100*(1/6) ≈ 16.67%
6 | 1/6 | 100*(1/6) ≈ 16.67%
Total | 6/6 | 100*(6/6) = 100%
The Fair d6 Model
FV: Face Values: 1, 2, 3, 4, 5, 6
Fair Model: Equally likely face values - 1/6 per face value
Pr{1 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “1”.
Pr{2 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “2”.
Pr{3 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “3”.
Pr{4 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “4”.
Pr{5 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “5”.
Pr{6 Shows} = 1/6
In long runs of tosses, approximately 1 toss in 6 shows “6”.
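The long-run statements above can be illustrated with a small simulation. This is only a sketch: the function name toss_proportions, the seed, and the run length of 60,000 tosses are arbitrary assumptions, and Python's pseudo-random generator stands in for a physical fair die.

```python
import random
from collections import Counter

def toss_proportions(n_tosses, seed=0):
    """Hypothetical helper: simulate a fair d6 and return the
    observed proportion of tosses showing each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_tosses))
    return {face: counts[face] / n_tosses for face in range(1, 7)}

props = toss_proportions(60_000)
for face, p in props.items():
    # Each observed proportion should sit near 1/6 (about 0.1667).
    print(face, round(p, 4))
```

Shorter runs tend to wander further from 1/6, which is why the interpretation is stated for long runs and hedged with "approximately".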
Basic Events
In repeated tosses of our die, the most basic possible outcomes are the faces themselves - the individual face values are the basic events. Each basic event has the same probability - (1/6).
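Additive Rule: first, write down the simple events which form the event. Then, add the probabilities for each of those simple events - this total is the probability for the event.
Pr{Event} = Sum of Pr{Simple Event} over the simple events which form the event
Define the event EVEN as follows: "an even face (2,4,6) shows". Then the probability of the event EVEN can be computed as
Pr{EVEN} = Pr{2} + Pr{4} + Pr{6} = (1/6) + (1/6) + (1/6) = 3/6 = .50 or as 50%.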
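A minimal Python sketch of the Additive Rule for the fair d6 follows. The names FAIR_D6 and pr are illustrative assumptions; exact fractions are used so that the result reads as 1/2 rather than a rounded decimal.

```python
from fractions import Fraction

# Fair d6 model: each face value is a simple event with probability 1/6.
FAIR_D6 = {face: Fraction(1, 6) for face in range(1, 7)}

def pr(event, model=FAIR_D6):
    """Additive Rule: add the probabilities of the simple events
    that form the event."""
    return sum(model[outcome] for outcome in event)

EVEN = {2, 4, 6}   # "an even face (2,4,6) shows"
print(pr(EVEN))    # 1/2
```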
Complementary Rule: first, write down the opposite of the event. Next, write down the simple events which form the opposite event. Then, add the probabilities for each of those simple events - this total is the probability for the opposite event. Finally, subtract the probability for the opposite event from 1. The result of this subtraction is the probability for the original event.
Event = E
Opposite Event = ~E
Compute Pr{~E}
Then compute Pr{E} = 1 - Pr{~E}
Define the event 2PLUS as "a face greater than or equal to 2 shows". Then its complementary event, Not2PLUS, is "a face strictly less than 2 shows", and its probability can be computed as
Pr{Not2PLUS} = Pr{1} = 1/6
Then compute the probability for the event 2PLUS as:
Pr{2PLUS} = 1 - Pr{Not2PLUS} = 1 - (1/6) = 5/6
Working directly,
Pr{2PLUS} = Pr{2, 3, 4, 5 or 6 Shows}
= Pr{2} + Pr{3} + Pr{4} + Pr{5} + Pr{6}
= (1/6) + (1/6) + (1/6) + (1/6) + (1/6) = 5/6 ≈ .8333 or as 83.33%.
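The two routes to Pr{2PLUS}, through the complement and by direct addition, can be compared in a few lines of Python. This is a sketch under the fair d6 model; the names TWO_PLUS, NOT_TWO_PLUS and pr are assumptions chosen for illustration.

```python
from fractions import Fraction

FACES = range(1, 7)
FAIR_D6 = {face: Fraction(1, 6) for face in FACES}

def pr(event):
    """Add the probabilities of the simple events forming the event."""
    return sum(FAIR_D6[face] for face in event)

TWO_PLUS = {face for face in FACES if face >= 2}   # faces 2,3,4,5,6
NOT_TWO_PLUS = set(FACES) - TWO_PLUS               # face 1 only

via_complement = 1 - pr(NOT_TWO_PLUS)   # Complementary Rule
direct = pr(TWO_PLUS)                   # Additive Rule
print(via_complement, direct)
```

The complement route is the shorter one here because the opposite event contains a single face.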
A Color Sequence Experiment
Suppose that we have a special box - each time we press a button on the box, it prints out a sequence of colors, in order - it prints four colors at a time. Suppose the box prints each Color Sequence with the following probabilities:
The Model
Color Sequence (CS) | Probability (Proportion) | Probability (Percent)
BBBB | .10 | 100*(.10) = 10%
BGGB | .25 | 100*(.25) = 25%
RGGR | .05 | 100*(.05) = 5%
YYYY | .30 | 100*(.30) = 30%
BYRG | .15 | 100*(.15) = 15%
RYYB | .15 | 100*(.15) = 15%
Total | 1.00 | 100*(1.00) = 100%
Let's define the experiment: We push the button, and then the box prints out exactly one (1) of the above listed color sequences. We then note the resulting (printed out) color sequence.
Let's discuss the simple (or basic) events. The simple events are the color sequences. The probabilities for each color sequence are given in the table.
Suppose
we define the event E=
The only color sequence meeting the definition of E is the sequence BBBB.
So, we write Pr{E} = Pr{BBBB} = .10
This means that in long runs of box-prints, approximately 10% of the prints will show as BBBB.
Suppose we define the event F = "Yellow shows".
The event F merely requires that Yellow be present.
Yellow is present in the following color sequences: YYYY, BYRG and RYYB.
So we write Pr{F} = Pr{One of YYYY, BYRG or RYYB Shows}
= Pr{YYYY} + Pr{BYRG} + Pr{RYYB} = .30 + .15 + .15 = .60
In long runs of box-prints, approximately 60% of prints will contain at least one Yellow in the color sequence.
Suppose we define the event G = "Green shows in the 2nd slot".
The event G requires that Green be present in the 2nd slot.
This requirement is met in the following color sequences: BGGB, RGGR.
So we write Pr{G} = Pr{One of BGGB or RGGR Shows}
= Pr{BGGB} + Pr{RGGR} = .25 + .05 = .30
In long runs of box-prints, approximately 30% of prints will show green in the 2nd slot.
Suppose we define the event H = "Red does not show in the 1st slot".
The event notH requires that the sequence lead off with Red.
This requirement is met in the following color sequences: RGGR, RYYB.
Pr{notH} = Pr{One of RGGR or RYYB Shows}
= Pr{RGGR} + Pr{RYYB}
= .05 + .15
= .20
So, Pr{H} = 1 - Pr{notH} = 1 - .20 = .80
So in long runs of box-prints, approximately 80% of color sequences will not show Red as the first color in the sequence.
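Suppose we define the event I = "Blue does not show in the last slot".
The event notI requires that the sequence end with Blue.
This requirement is met in the following color sequences: BBBB, BGGB and RYYB.
Pr{notI} = Pr{One of BBBB, BGGB or RYYB Shows}
= Pr{BBBB} + Pr{BGGB} + Pr{RYYB}
= .10 + .25 + .15
= .50
So, Pr{I} = 1 - Pr{notI} = 1 - .50 = .50
So in long runs of box-prints, approximately 50% of color sequences will not show Blue as the last color in the sequence.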
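The event calculations for F, G, H and I can be sketched in Python by treating each event as a yes/no test applied to a color sequence. The probabilities come from the model table; the names MODEL and pr, and the use of lambda tests, are illustrative assumptions rather than anything prescribed by the notes.

```python
from fractions import Fraction

# Color sequence model from the table (probabilities as exact hundredths).
MODEL = {
    "BBBB": Fraction(10, 100),
    "BGGB": Fraction(25, 100),
    "RGGR": Fraction(5, 100),
    "YYYY": Fraction(30, 100),
    "BYRG": Fraction(15, 100),
    "RYYB": Fraction(15, 100),
}

def pr(meets_definition):
    """Add the probabilities of the sequences that meet the event definition."""
    return sum(p for seq, p in MODEL.items() if meets_definition(seq))

pr_F = pr(lambda seq: "Y" in seq)           # Yellow shows
pr_G = pr(lambda seq: seq[1] == "G")        # Green in the 2nd slot
pr_H = 1 - pr(lambda seq: seq[0] == "R")    # complement of "leads off with Red"
pr_I = 1 - pr(lambda seq: seq[-1] == "B")   # complement of "ends with Blue"

for name, value in [("F", pr_F), ("G", pr_G), ("H", pr_H), ("I", pr_I)]:
    print(name, f"{float(value):.2f}")
```

H and I are computed through their opposite events, mirroring the Complementary Rule steps used in the text.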
Events and Random Variables
Color Sequence (CS) | Probability (Proportion) | Probability (Percent)
BBBB | .10 | 100*(.10) = 10%
BGGB | .25 | 100*(.25) = 25%
RGGR | .05 | 100*(.05) = 5%
YYYY | .30 | 100*(.30) = 30%
BYRG | .15 | 100*(.15) = 15%
RYYB | .15 | 100*(.15) = 15%
Total | 1.00 | 100*(1.00) = 100%
Consider the random variable Blue Count, defined as the number of blue slots showing in the sequence. Blue Count groups the color sequences that share a common value.
Blue Count | Probability (Proportion) | Probability (Percent)
4 | .10 | 100*(.10) = 10%
2 | .25 | 100*(.25) = 25%
0 | .05 + .30 = .35 | 100*(.35) = 35%
1 | .15 + .15 = .30 | 100*(.30) = 30%
Total | 1.00 | 100*(1.00) = 100%
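Blue Count induces four events: Blue Count = 0, Blue Count = 1, Blue Count = 2, Blue Count = 4. If you’re being picky, you might include an empty event for Blue Count = 3.
Pr{Blue Count = 0} = Pr{RGGR} + Pr{YYYY} = .05 + .30 = .35
Pr{Blue Count = 1} = Pr{BYRG} + Pr{RYYB} = .15 + .15 = .30
Pr{Blue Count = 2} = Pr{BGGB} = .25
Pr{Blue Count = 3} = 0
Pr{Blue Count = 4} = Pr{BBBB} = .10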
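The grouping step behind Blue Count can be sketched in Python: compute the count for each color sequence, then add the probabilities of the sequences that share a value. The helper name distribution and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Color sequence model from the table (probabilities as exact hundredths).
MODEL = {
    "BBBB": Fraction(10, 100),
    "BGGB": Fraction(25, 100),
    "RGGR": Fraction(5, 100),
    "YYYY": Fraction(30, 100),
    "BYRG": Fraction(15, 100),
    "RYYB": Fraction(15, 100),
}

def distribution(random_variable):
    """Hypothetical helper: group sequences by the value of the random
    variable and add the probabilities within each group."""
    dist = defaultdict(Fraction)  # missing values start at 0
    for seq, p in MODEL.items():
        dist[random_variable(seq)] += p
    return dict(dist)

blue_count = distribution(lambda seq: seq.count("B"))
for value in sorted(blue_count):
    print(value, f"{float(blue_count[value]):.2f}")
```

A value that no sequence produces, such as Blue Count = 3, simply never appears as a key, which corresponds to the empty event with probability 0. The same helper applies to any other count on the sequences, for example the number of green slots.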
Color Sequence (CS) | Probability (Proportion) | Probability (Percent)
BBBB | .10 | 100*(.10) = 10%
BGGB | .25 | 100*(.25) = 25%
RGGR | .05 | 100*(.05) = 5%
YYYY | .30 | 100*(.30) = 30%
BYRG | .15 | 100*(.15) = 15%
RYYB | .15 | 100*(.15) = 15%
Total | 1.00 | 100*(1.00) = 100%
Consider the random variable Green Count, defined as the number of green slots showing in the sequence. Green Count groups the color sequences that share a common value.
Green Count | Probability (Proportion) | Probability (Percent)
0 | .10 + .30 + .15 = .55 | 100*(.55) = 55%
2 | .25 + .05 = .30 | 100*(.30) = 30%
1 | .15 | 100*(.15) = 15%
Total | 1.00 | 100*(1.00) = 100%
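Green Count induces three events: Green Count = 0, Green Count = 1 and Green Count = 2.
Pr{Green Count = 0} = Pr{BBBB} + Pr{YYYY} + Pr{RYYB} = .10 + .30 + .15 = .55
Pr{Green Count = 1} = Pr{BYRG} = .15
Pr{Green Count = 2} = Pr{BGGB} + Pr{RGGR} = .25 + .05 = .30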
Begin looking at the Color Sequence Case Type - we haven’t covered Conditional Probability yet.
Long Run Argument/Perfect Samples – should be finished
Probability Rules
Color Slot Machine
Pairs of Dice
Random Variables
Conditional Probability