10th November 2010
Session 3.6
Categorical Goodness of Fit

Case List and Expected Progress
Descriptive Statistics – Complete
Summary/Descriptive Intervals – Complete
Confidence Interval – Population Mean – Complete
Confidence Interval – Population Proportion – Complete
Hypothesis Test – Population Median – Complete

Remaining Case Work
Hypothesis Test – Population Category / Goodness of Fit – Begin Work
Hypothesis Testing:
Goodness of Fit
One Last Coin
Be able to perform the
goodness-of-fit
test for categorical data.
Be able to fully discuss the
testing process and results. This discussion must include a clear discussion of the population
and the null hypothesis, the categories of data, the family of samples, the
family of errors and the interpretation of the p-value.
500 tosses of our coin yields the
following:
260 Blue;
240 Green.
Solution:
Identify each category of
classification for our Coin.
We
have two categories: Blue and Green.
Test the null hypothesis that the
Coin is Fair. Follow the steps:
Goodness-of-Fit Test
The purpose of this Goodness of
Fit test is to evaluate the evidence in a random sample against a proposed
categorical breakdown of a population.
We
have two categories: Blue and Green.
Our
population consists of all possible coin tosses.
Each
member of our Family of Samples (FoS) is a single sample of n=500 tosses of our
coin. FoS consists of all samples of this type.
Compute the following: the sample size (n) and the count for each category (n_i). Identify each category of classification.
There are n = 500 coin tosses in our sample, with:
n_blue = 260 and n_green = 240.
We have two
categories: Blue and Green.
Null Hypothesis
Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.
Our Null Hypothesis is that the coin is fair; this requires:
p_blue = 0.50 and p_green = 0.50. We then compute:
e_blue = 500*0.50 = 250 and e_green = 500*0.50 = 250.
These are the expected counts for a sample of size 500, given the truth of the Null Hypothesis.
The Error Estimate
Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.
error_blue = (260 - 250)^2 / 250 = 100/250 = 0.40 and
error_green = (240 - 250)^2 / 250 = 100/250 = 0.40. The total error is then:
error = 0.40 + 0.40 = 0.80.
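The expected counts and error terms above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the notes; the function name goodness_of_fit_error and the dictionary layout are assumptions made for this example, not part of the course material.

```python
def goodness_of_fit_error(counts, null_probs):
    """Return (per-category errors, total error) for observed category counts."""
    n = sum(counts.values())                      # sample size
    errors = {}
    for category, observed in counts.items():
        expected = n * null_probs[category]       # e_i = n * p_i
        errors[category] = (observed - expected) ** 2 / expected
    return errors, sum(errors.values())


# Coin example from the notes: 260 Blue and 240 Green in 500 tosses.
coin_counts = {"blue": 260, "green": 240}
fair_coin = {"blue": 0.50, "green": 0.50}

coin_errors, coin_total = goodness_of_fit_error(coin_counts, fair_coin)
print(coin_errors, coin_total)
```

The same function accepts any number of categories, as long as the null probabilities are keyed by the same category names as the counts.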
P-Value/Table
Consult a Chi-Square Table. Obtain the approximate p-value.
We need the following rows from the Pearson table:
Categories  ERROR    p-value
2           0.70833  0.400
2           0.87346  0.350
Since our error (0.80) is between 0.70833 and 0.87346, our p-value is somewhere between 35% and 40%.
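As a cross-check on the table lookup, the p-value for two categories can be computed directly. The sketch below assumes the table is the usual chi-square table with (categories - 1) degrees of freedom; the notes do not say so explicitly, but the tabled rows are consistent with it. The function name is hypothetical.

```python
import math


def p_value_two_categories(total_error):
    # Assumption: with 2 categories the error follows a chi-square
    # distribution with 1 degree of freedom, whose upper tail is erfc(sqrt(x/2)).
    return math.erfc(math.sqrt(total_error / 2.0))


table_check = p_value_two_categories(0.70833)   # tabled p-value is 0.400
coin_p_value = p_value_two_categories(0.80)     # our coin sample
print(round(table_check, 3), round(coin_p_value, 3))
```

The computed value for the coin sample falls inside the 35% to 40% bracket read from the table.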
Discussion/Interpretation
Population and Sampling Clearly
identify the
population
and the population
category proportions.
Describe the family
of samples.
We
have two categories: Blue and Green.
Our
population consists of all possible coin tosses.
p_blue is the probability of observing a blue face.
p_green is the probability of observing a green face.
Each member of our Family of Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of all samples of this type.
Family of Errors and P-Value
Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.
Each member sample of FoS yields an error as:
error_blue = (n_blue - e_blue)^2 / e_blue and
error_green = (n_green - e_green)^2 / e_green.
The total error is then:
error = error_blue + error_green.
Since
our error is between .70833 and .87346, our p-value is somewhere between 35%
and 40%.
So if the coin is really fair, then something between 35% and 40% of the samples in the Family of Samples yield errors as large as or larger than ours.
So the sample doesn't present significant evidence against the Null Hypothesis.
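The Family of Samples and Family of Errors can be made concrete by simulation. The sketch below draws many samples of 500 tosses of a fair coin, computes the error for each, and reports the share of simulated errors at least as large as ours. The number of simulated samples, the seed and the function names are arbitrary choices for this illustration. Because counts are discrete and the table is a continuous approximation, the simulated share should land near the tabled bracket rather than match it exactly.

```python
import random


def coin_error(n_blue, n=500):
    """Total goodness-of-fit error for a two-category sample under a fair coin."""
    expected = n * 0.50
    n_green = n - n_blue
    return (n_blue - expected) ** 2 / expected + (n_green - expected) ** 2 / expected


def simulate_family_of_errors(num_samples=20000, n=500, seed=2010):
    rng = random.Random(seed)
    errors = []
    for _ in range(num_samples):
        # The number of 1 bits among n random bits plays the role of the
        # blue count in n tosses of a fair coin.
        n_blue = bin(rng.getrandbits(n)).count("1")
        errors.append(coin_error(n_blue, n))
    return errors


observed_error = coin_error(260)
family_of_errors = simulate_family_of_errors()
# Small tolerance so that ties with the observed error count as "as large as".
share = sum(e >= observed_error - 1e-12 for e in family_of_errors) / len(family_of_errors)
print(observed_error, share)
```

The printed share is an estimate of the p-value: the fraction of the (simulated) Family of Errors that is as bad as or worse than the error from our single sample.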
Hypothesis Testing
Goodness of Fit
Color Bowl Reduit II
We have an actual bowl, filled with blue, green, red and yellow chips. We will use a sample of n=50 draws with replacement from the color bowl.
We think that the colors in the
bowl might be equally distributed. Use the sample to test this hypothesis.
Sample I
color    sample count
yellow   3
green    11
blue     21
red      15

Sample II
color    sample count
yellow   5
green    11
blue     20
red      14

Sample III
color    sample count
yellow   5
green    6
blue     29
red      10
Solution:
Test Results
Sample I
Obs  color   count  expected  error  totalerror
1    yellow  3      12.5      7.22   7.22
2    green   11     12.5      0.18   7.40
3    blue    21     12.5      5.78   13.18
4    red     15     12.5      0.50   13.68

Sample II
Obs  color   count  expected  error  totalerror
1    yellow  5      12.5      4.50   4.50
2    green   11     12.5      0.18   4.68
3    blue    20     12.5      4.50   9.18
4    red     14     12.5      0.18   9.36

Sample III
Obs  color   count  expected  error  totalerror
1    yellow  5      12.5      4.50   4.50
2    green   6      12.5      3.38   7.88
3    blue    29     12.5      21.78  29.66
4    red     10     12.5      0.50   30.16
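The three result tables, including the running totalerror column, can be reproduced with a short script. This is a sketch for checking the arithmetic; the names SAMPLES, COLORS and error_table are made up for this example.

```python
from itertools import accumulate

COLORS = ["yellow", "green", "blue", "red"]
SAMPLES = {
    "I": [3, 11, 21, 15],
    "II": [5, 11, 20, 14],
    "III": [5, 6, 29, 10],
}


def error_table(counts, p=0.25):
    """Rows of (color, count, expected, error, running total) for one sample."""
    n = sum(counts)
    expected = n * p                                  # 50 * 0.25 = 12.5
    errors = [(c - expected) ** 2 / expected for c in counts]
    running = list(accumulate(errors))                # the totalerror column
    return list(zip(COLORS, counts, [expected] * len(counts), errors, running))


bowl_totals = {}
for name, counts in SAMPLES.items():
    rows = error_table(counts)
    bowl_totals[name] = rows[-1][-1]                  # last running total
    print("Sample", name)
    for color, count, expected, error, total in rows:
        print(f"{color:<7}{count:>4}{expected:>7.1f}{error:>7.2f}{total:>7.2f}")
```

The final entry of the running total in each table is the total error used for the p-value lookup.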
Identify each category of classification for our Bowl.
We have four
categories: Blue, Green, Yellow and Red.
Test the null hypothesis that the
colors are equally likely. Follow the steps:
Goodness-of-Fit Test
The purpose of this Goodness of
Fit test is to evaluate the evidence in a random sample against a proposed
categorical breakdown of a population.
We
have four categories: Blue, Green, Yellow and Red.
Our
population consists of all possible draws with replacement from the bowl.
Each
member of our Family of Samples (FoS) is a single sample of n=50 draws with
replacement from our bowl. FoS consists of all samples of this type.
Null Hypothesis
Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.
Our Null Hypothesis is that colors are equally likely; this requires:
p_blue = .25
p_green = .25
p_red = .25 and
p_yellow = .25.
We then compute:
e_blue = 50*p_blue = 12.5
e_green = 50*p_green = 12.5
e_yellow = 50*p_yellow = 12.5 and
e_red = 50*p_red = 12.5.
These are the expected counts for a sample of size 50, given the truth of the Null Hypothesis.
The Error Estimate
Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.
error_blue = (n_blue - e_blue)^2 / e_blue
error_green = (n_green - e_green)^2 / e_green
error_red = (n_red - e_red)^2 / e_red
error_yellow = (n_yellow - e_yellow)^2 / e_yellow
The total error is then: error = error_blue + error_green + error_red + error_yellow.
P-Value/Table
Consult a Chi-Square Table. Obtain the approximate p-value.
Categories  ERROR    p-value
4           8.9473   0.030
4           9.8374   0.020
4           11.3449  0.010

Sample | Total Error | P-Value |
I | 13.68 | < .01 |
II | 9.36 | Between .02 and .03 |
III | 30.16 | < .01 |
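The table lookup itself can be sketched as a search over the tabled ERROR cutoffs. The rows below are the 4-category rows of Table 3 from these notes; the function name p_value_bracket and the convention of returning 0.0 as the lower bound past the last row are assumptions of this example.

```python
from bisect import bisect_right

# (ERROR, p-value) rows for 4 categories, copied from Table 3.
TABLE_4 = [
    (0.0000, 1.000), (0.5844, 0.900), (1.0052, 0.800), (1.4237, 0.700),
    (1.8692, 0.600), (2.3660, 0.500), (2.6430, 0.450), (2.9462, 0.400),
    (3.2831, 0.350), (3.6649, 0.300), (4.1083, 0.250), (4.6416, 0.200),
    (4.9566, 0.175), (5.3170, 0.150), (5.7394, 0.125), (6.2514, 0.100),
    (6.4915, 0.090), (6.7587, 0.080), (7.0603, 0.070), (7.4069, 0.060),
    (7.8147, 0.050), (8.3112, 0.040), (8.9473, 0.030), (9.8374, 0.020),
    (11.3449, 0.010),
]


def p_value_bracket(total_error, table=TABLE_4):
    """Return (low, high) bounds on the p-value for a non-negative total error."""
    cutoffs = [error for error, _ in table]
    i = bisect_right(cutoffs, total_error)        # number of rows with ERROR <= total_error
    high = table[i - 1][1]                        # p-value of the last row we passed
    low = table[i][1] if i < len(table) else 0.0  # next row, or 0.0 past the table
    return low, high


for sample, total in [("I", 13.68), ("II", 9.36), ("III", 30.16)]:
    print(sample, total, p_value_bracket(total))
```

An error past the last tabled cutoff (11.3449) gives the bracket (0.0, 0.01), which corresponds to the "< .01" entries for Samples I and III.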
Discussion/Interpretation
Population and Sampling Clearly
identify the
population
and the population
category proportions.
Describe the family
of samples.
We
have four categories: Blue, Green, Yellow and Red.
Our
population consists of all possible draws with replacement from our bowl.
p_blue is the probability of observing a blue chip.
p_green is the probability of observing a green chip.
p_yellow is the probability of observing a yellow chip.
p_red is the probability of observing a red chip.
Each member of our Family of Samples (FoS) is a single sample of n=50 draws with replacement from our bowl. FoS consists of all samples of this type.
Family of Errors and P-Value
Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.
Each member sample of FoS yields an error as:
error_blue = (n_blue - e_blue)^2 / e_blue
error_green = (n_green - e_green)^2 / e_green
error_yellow = (n_yellow - e_yellow)^2 / e_yellow and
error_red = (n_red - e_red)^2 / e_red.
The total error is then:
error = error_blue + error_green + error_yellow + error_red.
For Samples I and III (sections 06 and 08), our p-value is less than 1%. For Sample II (section 07), our p-value is between 2% and 3%.
If the colors are equally likely, the conditional probability of obtaining a sample with an error as large as or larger than the error obtained in section 06, or in section 08, is less than 1%. The conditional probability of obtaining a sample with an error as large as or larger than the error obtained in section 07 is between 2% and 3%.
The samples in all three sections present significant evidence against the Null Hypothesis.
From: http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionSkey.htm
Case Six
Hypothesis Test –
Categorical Goodness-of-Fit
LeRoy’s Sarcoma and Survival Time
Categories
We are studying survival time in
patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We
track the survival time of entirely fictitious patients who are diagnosed with
LS.
Survival times are grouped as follows:
Very Short Survival: 6 weeks or less;
Abbreviated Survival: 7 weeks to 12 weeks;
Regular Survival: 13 weeks to 72 weeks and
Long Term Survival: 73 or more weeks.
We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma:
Pr{Very Short Survival: (6 weeks or less)} = .25,
Pr{Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{Regular Survival: (13 weeks to 72 weeks)} = .45, and
Pr{Long Term Survival: (73 or more weeks)} = .25.
Consider a sample of patients
with LeRoy’s Sarcoma (LS) with these survival times (in weeks):
3, 3,
3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8, 10, 12, 13, 13,
14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70, 80, 85, 86, 95, 110, 135, 150,
185, 253, 350, 750.
Test the Hypothesis that the
survival times for patients with LeRoy’s Sarcoma are distributed as indicated
in the probability model. Show
your work. Completely discuss and interpret your test results, as indicated in
class and case study summaries. Fully
discuss the testing procedure and results. This discussion must include a clear discussion of
the population and the null hypothesis, the family of samples, the family of
errors and the interpretation of the p-value.
Numbers
Very Short Survival: 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6
observed = o = 20
expected = e = 50*.25 = 12.5
error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5
Abbreviated Survival: 7, 7, 8, 10, 12
o = 5
e = 50*.05 = 2.5
error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5
Regular Survival: 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70
o = 14
e = 50*.45 = 22.5
error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.21111
Long Term Survival: 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750
o = 11
e = 50*.25 = 12.5
error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18000
Total Error = 4.5 + 2.5 + 3.21111 + 0.18000 ≈ 10.3911 over 4 categories
From rows 4 9.8374 0.020 and 4 11.3449 0.010, .01 < p-value < .02
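The step from raw survival times to category counts can be sketched as follows. The category boundaries and model probabilities are those given in the case; the names SURVIVAL_WEEKS, MODEL and survival_category are hypothetical and exist only for this example.

```python
SURVIVAL_WEEKS = [
    3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
    7, 7, 8, 10, 12,
    13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70,
    80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750,
]

# Null model from the case: category name -> probability.
MODEL = {"Very Short": 0.25, "Abbreviated": 0.05, "Regular": 0.45, "Long Term": 0.25}


def survival_category(weeks):
    """Classify a whole-week survival time using the case's boundaries."""
    if weeks <= 6:
        return "Very Short"
    if weeks <= 12:
        return "Abbreviated"
    if weeks <= 72:
        return "Regular"
    return "Long Term"


observed = {name: 0 for name in MODEL}
for weeks in SURVIVAL_WEEKS:
    observed[survival_category(weeks)] += 1

n = len(SURVIVAL_WEEKS)
sarcoma_total = sum((observed[c] - n * p) ** 2 / (n * p) for c, p in MODEL.items())
print(observed, round(sarcoma_total, 4))
```

Counting from the raw data this way guards against tallying mistakes before the error terms are computed.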
Interpretation:
Population: Cases of
LeRoy’s Sarcoma
Population Proportions:
Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7
weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45
and Long Term Survival: (73 weeks or more) = .25
Family of Samples:
Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.
For each member of the FoS, compute:
Total Error =
{(observed[ST ≤6 Weeks] - expected[ST ≤6 Weeks])^2 / expected[ST ≤6 Weeks]} +
{(observed[ST 7-12 Weeks] - expected[ST 7-12 Weeks])^2 / expected[ST 7-12 Weeks]} +
{(observed[ST 13-72 Weeks] - expected[ST 13-72 Weeks])^2 / expected[ST 13-72 Weeks]} +
{(observed[ST 73+ Weeks] - expected[ST 73+ Weeks])^2 / expected[ST 73+ Weeks]}.
Doing so for every
member of the FoS yields the Family of Errors.
If the null proportions hold for LeRoy’s Sarcoma survival times, then the probability of getting a sample as bad as or worse than our sample is between 1% and 2%. This sample seems to present significant evidence against the Null Hypothesis. We reject the Null Hypothesis at 5% significance, but not at 1% significance, since the p-value exceeds 1%.
Table 3. Categories/Goodness of Fit
Categories  ERROR    p-value
4           0.0000   1.000
4           0.5844   0.900
4           1.0052   0.800
4           1.4237   0.700
4           1.8692   0.600
4           2.3660   0.500
4           2.6430   0.450
4           2.9462   0.400
4           3.2831   0.350
4           3.6649   0.300
4           4.1083   0.250
4           4.6416   0.200
4           4.9566   0.175
4           5.3170   0.150
4           5.7394   0.125
4           6.2514   0.100
4           6.4915   0.090
4           6.7587   0.080
4           7.0603   0.070
4           7.4069   0.060
4           7.8147   0.050
4           8.3112   0.040
4           8.9473   0.030
4           9.8374   0.020
4           11.3449  0.010

Categories  ERROR    p-value
5           0.0000   1.000
5           1.0636   0.900
5           1.6488   0.800
5           2.1947   0.700
5           2.7528   0.600
5           3.3567   0.500
5           3.6871   0.450
5           4.0446   0.400
5           4.4377   0.350
5           4.8784   0.300
5           5.3853   0.250
5           5.9886   0.200
5           6.3423   0.175
5           6.7449   0.150
5           7.2140   0.125
5           7.7794   0.100
5           8.0434   0.090
5           8.3365   0.080
5           8.6664   0.070
5           9.0444   0.060
5           9.4877   0.050
5           10.0255  0.040
5           10.7119  0.030
5           11.6678  0.020
5           13.2767  0.010

Categories  ERROR    p-value
6           0.0000   1.000
6           1.6103   0.900
6           2.3425   0.800
6           2.9999   0.700
6           3.6555   0.600
6           4.3515   0.500
6           4.7278   0.450
6           5.1319   0.400
6           5.5731   0.350
6           6.0644   0.300
6           6.6257   0.250
6           7.2893   0.200
6           7.6763   0.175
6           8.1152   0.150
6           8.6248   0.125
6           9.2364   0.100
6           9.5211   0.090
6           9.8366   0.080
6           10.1910  0.070
6           10.5962  0.060
6           11.0705  0.050
6           11.6443  0.040
6           12.3746  0.030
6           13.3882  0.020
6           15.0863  0.010
From: http://www.mindspring.com/~cjalverson/3rdhourlyspring2008versionBkey.htm
Case Three | Categorical Goodness of Fit | Angry Barrels of Monkeys
A company, BarrelCorp™, manufactures barrels and wishes to ensure the strength and quality of its barrels. Chimpanzees traumatized the company owner as a youth; so the company uses the following test (Angry_Barrel_of_Monkeys_Test) of its barrels:
Ten (10) chimpanzees are loaded into the barrel. The chimpanzees are exposed to Angry!Monkey!Gas!™, an agent guaranteed to drive the chimpanzees to a psychotic rage. The angry, raging, psychotic chimpanzees then destroy the barrel from the inside in an angry, raging, psychotic fashion. The survival time, in minutes, of the barrel is noted.
A random sample of 50 BarrelCorp™ barrels is evaluated using the Angry_Barrel_of_Monkeys_Test, and the survival time (in ***MINUTES***) of each barrel is noted. The survival time of each barrel is listed below:
2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
50, 72, 77, 84, 88, 93, 95, 97, 116, 120
An endurance scale is
defined as: Really Weak: strictly less than 5 minutes survival
time, Weak: [5,15) minutes survival time, Adequate: [15,
30) minutes survival time, Good: [30, 50) minutes survival time and Super
Good: 50 or more minutes survival time
Test the hypothesis that the
survival times are equally distributed among the five survival categories. Show your work.
Completely discuss and interpret your test results, as indicated in class and
case study summaries.
Numbers
E_ReallyWeak = N*P_ReallyWeak = 50*.20 = 10
O_ReallyWeak = 4
Error_ReallyWeak = (O_ReallyWeak - E_ReallyWeak)^2 / E_ReallyWeak = (4 - 10)^2/10 = 3.6
E_Weak = N*P_Weak = 50*.20 = 10
O_Weak = 4
Error_Weak = (O_Weak - E_Weak)^2 / E_Weak = (4 - 10)^2/10 = 3.6
E_Adequate = N*P_Adequate = 50*.20 = 10
O_Adequate = 8
Error_Adequate = (O_Adequate - E_Adequate)^2 / E_Adequate = (8 - 10)^2/10 = 0.4
E_Good = N*P_Good = 50*.20 = 10
O_Good = 23
Error_Good = (O_Good - E_Good)^2 / E_Good = (23 - 10)^2/10 = 16.9
E_SuperGood = N*P_SuperGood = 50*.20 = 10
O_SuperGood = 11
Error_SuperGood = (O_SuperGood - E_SuperGood)^2 / E_SuperGood = (11 - 10)^2/10 = 0.10
Total Error = Error_ReallyWeak + Error_Weak + Error_Adequate + Error_Good + Error_SuperGood = 3.6 + 3.6 + 0.4 + 16.9 + 0.10 = 24.60 over 5 categories.
From row 5 13.2767 0.010, p < .01 since the total error exceeds 13.2767.
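The endurance scale uses half-open intervals such as [5,15), so a value of exactly 5, 15, 30 or 50 minutes belongs to the higher category. One way to sketch that binning is with bisect_right on the lower bounds. The names MINUTES, LOWER_BOUNDS and LABELS are assumptions of this example.

```python
from bisect import bisect_right

MINUTES = [
    2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
    22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
    34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
    42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
    50, 72, 77, 84, 88, 93, 95, 97, 116, 120,
]

# Lower bounds of every category except the first: <5, [5,15), [15,30), [30,50), 50+.
LOWER_BOUNDS = [5, 15, 30, 50]
LABELS = ["Really Weak", "Weak", "Adequate", "Good", "Super Good"]

barrel_counts = [0] * len(LABELS)
for minutes in MINUTES:
    # bisect_right counts the bounds <= minutes, so a time of exactly 5
    # lands in index 1 (Weak) and a time of exactly 50 in index 4 (Super Good).
    barrel_counts[bisect_right(LOWER_BOUNDS, minutes)] += 1

expected = len(MINUTES) * 0.20                     # 50 * 0.20 = 10 per category
barrel_total = sum((o - expected) ** 2 / expected for o in barrel_counts)
print(dict(zip(LABELS, barrel_counts)), round(barrel_total, 2))
```

The resulting counts and total error agree with the hand computation above.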
Interpretation
Our population is the population of BarrelCorp™ barrels.
Our categories
are based on an endurance scale of survival under the Angry Barrel of
Monkeys Test: Really Weak: strictly less than 5 minutes
survival time, Weak: [5,15) minutes survival time, Adequate:
[15, 30) minutes survival time, Good: [30, 50) minutes survival time and
Super Good: 50 or more minutes survival time.
Our null hypothesis is that the categories are equally likely: 20% Really Weak, 20% Weak, 20% Adequate, 20% Good and 20% Super Good.
Our Family of Samples (FoS) consists of every possible random sample of 50 BarrelCorp™ barrels. Under the null hypothesis, within each member of the FoS, we expect approximately 10 barrels per survival category:
E_ReallyWeak = N*P_ReallyWeak = 50*.20 = 10
E_Weak = N*P_Weak = 50*.20 = 10
E_Adequate = N*P_Adequate = 50*.20 = 10
E_Good = N*P_Good = 50*.20 = 10
E_SuperGood = N*P_SuperGood = 50*.20 = 10
From each member sample of the FoS, we compute sample counts and errors for each level of survival:
E_ReallyWeak = N*P_ReallyWeak = 50*.20 = 10
Error_ReallyWeak = (O_ReallyWeak - E_ReallyWeak)^2 / E_ReallyWeak
E_Weak = N*P_Weak = 50*.20 = 10
Error_Weak = (O_Weak - E_Weak)^2 / E_Weak
E_Adequate = N*P_Adequate = 50*.20 = 10
Error_Adequate = (O_Adequate - E_Adequate)^2 / E_Adequate
E_Good = N*P_Good = 50*.20 = 10
Error_Good = (O_Good - E_Good)^2 / E_Good
E_SuperGood = N*P_SuperGood = 50*.20 = 10
Error_SuperGood = (O_SuperGood - E_SuperGood)^2 / E_SuperGood
Then add the individual errors for the total error. Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).
If the survival categories are equally likely, then fewer than 1% of the member samples of the Family of Samples yield errors as large as or larger than that of our single sample. Our sample presents highly significant evidence against the null hypothesis.
From: http://www.mindspring.com/~cjalverson/CompFinalSummer2008verBkey.htm
Case Four | Goodness
of Fit | Y2K GA Res LB Prenatal
Care
A random sample of
Year 2000 Georgia resident live births are checked for prenatal care status, in
the following categories:
Prenatal Care Status | Number in Sample |
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | 415 |
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | 62 |
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | 14 |
No Prenatal Care (1st Care at Delivery) | 9 |
Total | 500 |
Our null hypothesis is that the following probability model for year 2000 Georgia resident live births is correct:
Prenatal Care Status | Probability |
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | .75 |
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | .15 |
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | .05 |
No Prenatal Care (1st Care at Delivery) | .05 |
Total | 1.00 |
Test this Hypothesis. Show your
work. Completely discuss and interpret your test results, as indicated in class
and case study summaries.
Numbers
stage  o    e    p     error    errorT
1st    415  375  0.75  4.2667   4.2667
2nd    62   75   0.15  2.2533   6.5200
3rd    14   25   0.05  4.8400   11.3600
No     9    25   0.05  10.2400  21.6000
From row 4 11.3449 0.010, p < .01
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)
Observed = 415
Probability from Model = .75
Expected = N*P = 500*.75 = 375
Error = (Observed - Expected)^2/Expected = (415 - 375)^2/375 ≈ 4.2667
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)
Observed = 62
Probability from Model = .15
Expected = N*P = 500*.15 = 75
Error = (Observed - Expected)^2/Expected = (62 - 75)^2/75 ≈ 2.2533
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)
Observed = 14
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected = (14 - 25)^2/25 = 4.84
No Prenatal Care (1st Care at Delivery)
Observed = 9
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected = (9 - 25)^2/25 = 10.24
Total Error = Error_1st + Error_2nd + Error_3rd + Error_No ≈ 4.2667 + 2.2533 + 4.84 + 10.24 ≈ 21.6 over four categories.
p-value from row 4 11.3449 0.010, p < .01
Interpretation
Our population
consists of year 2000 Georgia resident live born infants.
Each infant’s prenatal care status falls into a single prenatal care category:
Prenatal Care Began
1st Trimester (Months 1-3 of Pregnancy)
Prenatal Care Began
2nd Trimester (Months 4-6 of Pregnancy)
Prenatal Care Began
3rd Trimester (Months 7-9 of Pregnancy)
No Prenatal Care (1st
Care at Delivery).
Our model presents the
following probabilities for each category of care:
Pr{Prenatal Care Began
1st Trimester (Months 1-3 of Pregnancy)} = .75
Pr{Prenatal Care Began
2nd Trimester (Months 4-6 of Pregnancy)} = .15
Pr{Prenatal Care Began
3rd Trimester (Months 7-9 of Pregnancy)} = .05
Pr{No Prenatal Care
(1st Care at Delivery)} = .05
Each member of the
family of samples is a single random sample of 500 year 2000 Georgia resident
live born infants. The family contains all possible samples of this type.
From each member of
the family of samples, compute the observed and expected category counts, then
compute an error for each category:
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)
Observed
Probability from Model = .75
Expected = N*P = 500*.75 = 375
Error = (Observed - Expected)^2/Expected
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)
Observed
Probability from Model = .15
Expected = N*P = 500*.15 = 75
Error = (Observed - Expected)^2/Expected
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)
Observed
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected
No Prenatal Care (1st Care at Delivery)
Observed
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected
Total Error = Error_1st + Error_2nd + Error_3rd + Error_NoPNC over four categories.
If our model for
prenatal care status is correct, then less than 1% of the members of the family
of samples yield errors as bad as or worse than our error. Our sample presents
highly significant evidence against the model.
From http://www.cjalverson.com/_3rdhourlyspring2009verBkey.htm :
Case Four | Hypothesis Test:
Categorical Goodness of Fit | Prenatal Care
A random sample of Year
2000 Georgia resident live births are checked for prenatal care status, in the
following categories – consider the portion of the sample reporting prenatal care:
Prenatal Care Status | Number in Sample |
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | 410 |
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | 52 |
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | 12 |
Test the hypothesis that:
Pr{Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75
Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15
Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .10.
Show your work. Completely discuss
and interpret your test results, as indicated in class and case study
summaries. Fully discuss the testing procedure and results. This
discussion must include a clear discussion of the population and the null
hypothesis, the family of samples, the family of errors and the interpretation
of the p-value. Show all work and detail for full credit.
Numbers
Prenatal Care Status | n | P | E=nP | Error |
PNC T1 | 410 | 0.75 | 355.5 | 8.355133615 |
PNC T2 | 52 | 0.15 | 71.1 | 5.130942335 |
PNC T3 | 12 | 0.10 | 47.4 | 26.43797468 |
Total Sample with PNC | 474 | 1 | 474 | 39.92405063 |
n = 474
Prenatal Starts 1st Trimester
Observed = 410
Expected = n*P_T1 = 474*.75 = 355.5
Error = (Observed - Expected)^2/Expected = (410 - 355.5)^2/355.5 ≈ 8.355133615
Prenatal Starts 2nd Trimester
Observed = 52
Expected = n*P_T2 = 474*.15 = 71.1
Error = (Observed - Expected)^2/Expected = (52 - 71.1)^2/71.1 ≈ 5.130942335
Prenatal Starts 3rd Trimester
Observed = 12
Expected = n*P_T3 = 474*.10 = 47.4
Error = (Observed - Expected)^2/Expected = (12 - 47.4)^2/47.4 ≈ 26.43797468
Total Error ≈ 8.355133615 + 5.130942335 + 26.43797468 ≈ 39.92405063 over three categories.
From row: 3 9.2103 0.010, p-value < .01, since our error exceeds 9.2103.
Interpretation
Our population is the
population of Year 2000 Georgia resident live births. Our categories are based
on those who reported receiving Prenatal Care and include: (T1)Prenatal Care
Began 1st Trimester (Months 1-3 of Pregnancy), (T2)Prenatal Care Began 2nd Trimester
(Months 4-6 of Pregnancy) and (T3)Prenatal Care Began 3rd Trimester (Months 7-9
of Pregnancy). Our null hypothesis is that the categories are distributed as:
75% T1, 15% T2 and 10% T3.
Our Family of Samples (FoS) consists of every possible random sample of 474 Year 2000 Georgia resident live births. Under the null hypothesis, within each member of the FoS, we expect approximately:
E_T1 = N*P_T1 = 474*.75 = 355.5
E_T2 = N*P_T2 = 474*.15 = 71.1
E_T3 = N*P_T3 = 474*.10 = 47.4
From each member sample of the FoS, we compute sample counts and errors for each prenatal care category:
E_T1 = N*P_T1 = 474*.75 = 355.5
Error_T1 = (O_T1 - E_T1)^2 / E_T1
E_T2 = N*P_T2 = 474*.15 = 71.1
Error_T2 = (O_T2 - E_T2)^2 / E_T2
E_T3 = N*P_T3 = 474*.10 = 47.4
Error_T3 = (O_T3 - E_T3)^2 / E_T3
Then add the individual errors for the total error as Total Error = Error_T1 + Error_T2 + Error_T3. Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).
If the prenatal care
categories are distributed as: 75% T1, 15% T2 and 10% T3, then fewer than 1% of
the member samples of the Family of Samples yields errors as large as or larger
than that of our single sample. Our sample presents highly significant evidence
against the null hypothesis.
From http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionAkey.htm :
Case Six
Hypothesis Test –
Categorical Goodness-of-Fit
LeRoy’s Sarcoma and Survival Time
Categories
We are studying survival time in
patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We
track the survival time of entirely fictitious patients who are diagnosed with
LS. Survival times are grouped as follows:
Very Short Survival:
6 weeks or less;
Abbreviated Survival:
7 weeks to 12 weeks;
Regular Survival:
13 weeks to 72 weeks and
Long Term Survival:
73 or more weeks.
We wish to evaluate the following
model for survival time in patients with LeRoy’s Sarcoma:
Pr{Very Short Term Survival: (6 weeks or less)} = .25,
Pr{ Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{ Regular Survival: (13 weeks to 72 weeks)} = .45, and
Pr{ Long Term Survival: (73 weeks or more weeks)} = .25.
Consider a sample of patients with LeRoy’s
Sarcoma (LS) with these survival times (in weeks):
3, 3, 3, 3, 4, | 4, 4, 4, 5, 5 |, 5, 5, 5, 5, 5, | 6, 6, 6,
6, 6, (20)
7, 7, 8, 10, 12, (5)
13, 13, 14, 15, 15, | 16, 17, 18, 23, 34 |, 37, 45, 60, 70,
(14)
80, 85, 86, 95, 110, | 135, 150, 185, 253, 350,| 750. (11)
Test the Hypothesis that the
survival times for patients with LeRoy’s Sarcoma are distributed as indicated
in the probability model. Show your work.
Completely discuss and interpret your test results, as indicated in class and
case study summaries. Fully discuss the testing procedure and results. This
discussion must include a clear discussion of the population and the null
hypothesis, the family of samples, the family of errors and the interpretation
of the p-value.
Numbers
Very Short Survival
3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6
o = 20
e = 50*.25 = 12.5
error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5
Abbreviated Survival
7, 7, 8, 10, 12
o = 5
e = 50*.05 = 2.5
error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5
Regular Survival
13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70
o = 14
e = 50*.45 = 22.5
error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.2111
Long Term Survival
80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750
o = 11
e = 50*.25 = 12.5
error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18
Total Error = 4.5 + 2.5 + 3.2111 + 0.18 ≈ 10.3911 over 4 categories
From rows 4 9.8374 0.020 and 4 11.3449 0.010, .01 < p-value < .02
Interpretation:
Population: Cases of
LeRoy’s Sarcoma
Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25
Family of Samples: Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.
For each member of the FoS, compute:
Total Error =
{(observed[ST ≤6 Weeks] - expected[ST ≤6 Weeks])^2 / expected[ST ≤6 Weeks]} +
{(observed[ST 7-12 Weeks] - expected[ST 7-12 Weeks])^2 / expected[ST 7-12 Weeks]} +
{(observed[ST 13-72 Weeks] - expected[ST 13-72 Weeks])^2 / expected[ST 13-72 Weeks]} +
{(observed[ST 73+ Weeks] - expected[ST 73+ Weeks])^2 / expected[ST 73+ Weeks]}.
Doing so for every member
of the FoS yields the Family of Errors.
If the null proportions
hold for LeRoy’s Sarcoma survival times, then the probability of getting a
sample as bad or worse than our sample is between 1% and 2%. This sample seems
to present significant evidence against the Null Hypothesis. We reject the Null
Hypothesis at 5% significance.
From here http://www.cjalverson.com/_compfinalspring2009verMkey.htm :
Case Four | Hypothesis
Test – Categorical Goodness-of-Fit | Traumatic Brain Injury
The Glasgow Coma Scale (GCS)
is the most widely used system for scoring the level of consciousness of a
patient who has had a traumatic brain injury. GCS is based on the patient's
best eye-opening, verbal, and motor responses. Each response is scored and then
the sum of the three scores is computed. That is,
Augmented Glasgow Coma Scale Categories
Mild = 13, 14, 15
Moderate = 9, 10, 11, 12
Severe/Coma = 3, 4, 5, 6, 7, 8
Pre-admission Death/PAD/DOA = 0
Traumatic brain injury (TBI) is an insult to the brain from an external mechanical
force, possibly leading to permanent or temporary impairments of cognitive,
physical, and psychosocial functions with an associated diminished or altered
state of consciousness. Consider a random sample of patients with TBI, with GCS
at initial treatment and diagnosis listed below:
0, 0, 0, 0, 0, | 0, 0, 0, 0, 0, | 0, 0, 0, 0, (14)
3, 3, 3, 3, 3, | 4, 4, 4, 4, 4, | 4, 4, 5, 5, 5, | 5, 6, 6, 6, 6,
| 6, 7, 7, 7, 7, |
8, 8, 8, (28)
9, 9, 9, 9, 9, | 9, 9, 9, 10, 10, | 10, 10, 11, 11, 11, | 12, 12,
12, (18)
13, 13, 13, 14, 14, | 14, 14, 14, 14, 15,| 15 (11)
Our null hypothesis is that TBI case
outcomes are 20% Pre-admission Deaths, 50% Severe, 20% Moderate and 10% Mild.
Test this Hypothesis. Show your work. Completely discuss and interpret your
test results, as indicated in class and case study summaries.
Numbers
Pre-admission Death (PAD)
Observed_PAD = 14
Expected_PAD = n*P_PAD = 71*.20 = 14.2
Error_PAD = (Observed_PAD - Expected_PAD)^2/Expected_PAD = (14 - 14.2)^2/14.2 ≈ 0.00282
Severe (GCS in 3, 4, 5, 6, 7, 8)
Observed_Severe = 28
Expected_Severe = n*P_Severe = 71*.50 = 35.5
Error_Severe = (Observed_Severe - Expected_Severe)^2/Expected_Severe = (28 - 35.5)^2/35.5 ≈ 1.58451
Moderate (GCS in 9, 10, 11, 12)
Observed_Moderate = 18
Expected_Moderate = n*P_Moderate = 71*.20 = 14.2
Error_Moderate = (Observed_Moderate - Expected_Moderate)^2/Expected_Moderate = (18 - 14.2)^2/14.2 ≈ 1.01690
Mild (GCS in 13, 14, 15)
Observed_Mild = 11
Expected_Mild = n*P_Mild = 71*.10 = 7.1
Error_Mild = (Observed_Mild - Expected_Mild)^2/Expected_Mild = (11 - 7.1)^2/7.1 ≈ 2.14225
Total Error = Error_PAD + Error_Severe + Error_Moderate + Error_Mild ≈ 0.00282 + 1.58451 + 1.01690 + 2.14225 ≈ 4.74648 over 4 categories.
From rows: 4 4.6416 0.200 and 4 4.9566 0.175, .175 < p < .200.
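For readers who want a p-value rather than a bracket, the tabled cutoffs are consistent with a chi-square distribution with (categories - 1) degrees of freedom. Under that assumption, which the notes do not state explicitly, the upper-tail probability has a closed form for whole-number degrees of freedom. The function name chi_square_upper_tail is hypothetical, and the sketch uses only the standard library.

```python
import math


def chi_square_upper_tail(x, categories):
    """P(chi-square with categories - 1 degrees of freedom >= x), for x >= 0."""
    df = categories - 1
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)   # first series term
    for k in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * k + 1)                              # next series term
    return tail


# Spot checks against tabled cutoffs, then the TBI sample (4 categories).
check_4 = chi_square_upper_tail(7.8147, 4)     # tabled p-value 0.050
check_5 = chi_square_upper_tail(13.2767, 5)    # tabled p-value 0.010
check_6 = chi_square_upper_tail(15.0863, 6)    # tabled p-value 0.010
tbi_p_value = chi_square_upper_tail(4.74648, 4)
print(round(check_4, 4), round(check_5, 4), round(check_6, 4), round(tbi_p_value, 4))
```

The value computed for the TBI sample falls inside the .175 to .200 bracket read from the table, so the table-based conclusion is unchanged.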
Each member of the Family
of Samples(FoS) is a random sample of 71 Traumatic Brain Injury patients
– the FoS consists of all possible samples of this type.
From each member sample
of the FoS, compute the following items at each level of severity:
Pre-admission Death (PAD)
Observed_PAD
Expected_PAD = n*P_PAD = 71*.20 = 14.2
Error_PAD = (Observed_PAD - Expected_PAD)^2/Expected_PAD
Severe (GCS in 3, 4, 5, 6, 7, 8)
Observed_Severe
Expected_Severe = n*P_Severe = 71*.50 = 35.5
Error_Severe = (Observed_Severe - Expected_Severe)^2/Expected_Severe
Moderate (GCS in 9, 10, 11, 12)
Observed_Moderate
Expected_Moderate = n*P_Moderate = 71*.20 = 14.2
Error_Moderate = (Observed_Moderate - Expected_Moderate)^2/Expected_Moderate
Mild (GCS in 13, 14, 15)
Observed_Mild
Expected_Mild = n*P_Mild = 71*.10 = 7.1
Error_Mild = (Observed_Mild - Expected_Mild)^2/Expected_Mild
Then compute Total Error = Error_PAD + Error_Severe + Error_Moderate + Error_Mild. Repeating these calculations for each member sample of the FoS yields a Family of Errors (FoE).
If the population proportions for TBI Severity are P_PAD = .20, P_Severe = .50, P_Moderate = .20 and P_Mild = .10, then between 17.5% and 20% of the member samples of the FoS yield errors as large as or larger than our single computed error. Our sample does not seem to present significant evidence against the null hypothesis.