12th April 2010
Categorical Goodness of Fit
Hypothesis Testing: Goodness of Fit
One Last Coin
Be able to perform the goodness-of-fit test for categorical data.
Be able to fully
discuss the testing process and results.
This discussion must include a clear discussion of the population and the null
hypothesis, the categories of data, the family of samples, the family of errors
and the interpretation of the p-value.
500 tosses of our coin yield the following:
260 Blue;
240 Green.
Solution:
Identify
each category of classification for our Coin.
We have two categories: Blue
and Green.
Test
the null hypothesis that the Coin is Fair. Follow the steps:
Goodness-of-Fit Test
The
purpose of this Goodness of Fit test is to evaluate the evidence in a random
sample against a proposed categorical breakdown of a population.
We have two categories: Blue
and Green.
Our population consists of all
possible coin tosses.
Each member of our Family of
Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of
all samples of this type.
Compute the following: sample size (n) and counts for each category (n_i). Identify each category of classification.
There are n = 500 coin tosses in our sample, with:
n_blue = 260 and n_green = 240.
We have two categories: Blue and Green.
Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.
Our Null Hypothesis is that the coin is fair; this requires:
p_blue = 0.50 and p_green = 0.50. We then compute:
e_blue = 500*0.50 = 250 and e_green = 500*0.50 = 250.
These are the expected counts for a sample of size 500,
given the truth of the Null Hypothesis.
The Error Estimate: Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.
error_blue = (260 - 250)^2 / 250 = 100/250 = .40 and
error_green = (240 - 250)^2 / 250 = 100/250 = .40. The total error is then:
error = .40 + .40 = .80.
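The error computation can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name `total_error` is a hypothetical helper, not something from the course materials.

```python
def total_error(observed, probs):
    """Return (per-category errors, total error) for a goodness-of-fit test.

    observed: list of category counts n_i
    probs:    list of null-hypothesis probabilities p_i (same order)
    """
    n = sum(observed)                       # sample size
    expected = [n * p for p in probs]       # e_i = n * p_i
    errors = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return errors, sum(errors)


# Coin example: 260 Blue and 240 Green, fair-coin null hypothesis.
coin_errors, coin_total = total_error([260, 240], [0.50, 0.50])
print(coin_errors, coin_total)
```

The same helper applies to any number of categories, as long as the counts and probabilities are listed in the same order.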
P-Value/Table
Consult a Chi-Square Table. Obtain the approximate p-value.
We need the following rows from the Pearson table (columns: categories, error, p-value):
2 0.70833 0.400
2 0.87346 0.350
Since our error is between
.70833 and .87346, our p-value is somewhere between 35% and 40%.
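As a cross-check on the table lookup, the two quoted rows agree with the upper tail of a chi-square distribution with one degree of freedom (categories minus one). Assuming that is how the table was built, a short Python sketch gives the tail area directly; the function name is hypothetical.

```python
import math


def p_value_two_categories(total_error):
    # Assumption: with two categories the reference distribution is
    # chi-square with 1 degree of freedom, whose upper tail at x is
    # erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(total_error / 2.0))


p_coin = p_value_two_categories(0.80)
print(round(p_coin, 4))
```

The result should fall inside the 35% to 40% bracket read from the table.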
Discussion/Interpretation
Population
and Sampling Clearly identify the population and the population
category proportions. Describe the family of samples.
We have two categories: Blue
and Green.
Our population consists of all
possible coin tosses.
pblue is the probability of observing a blue face.
pgreen is the probability of observing a green face.
Each member of our Family of
Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of
all samples of this type.
Family
of Errors and P-Value Describe the family of samples and how each
error is computed. Apply the p-value to the family
of errors.
Each member sample of FoS yields an error as:
error_blue = (n_blue - e_blue)^2 / e_blue and
error_green = (n_green - e_green)^2 / e_green.
The total error is then:
error = error_blue + error_green.
Since our error is between
.70833 and .87346, our p-value is somewhere between 35% and 40%.
So if the coin is really fair,
then something between 35% and 40% of the Family of Samples yield errors equal to
or worse than ours.
So the sample doesn't present significant evidence against
the Null Hypothesis.
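For the coin, the Family of Samples can be enumerated exactly, because a sample of 500 tosses is summarized by its blue count. The sketch below weights each possible blue count by its probability under a fair coin and adds up the samples whose error is at least 0.80. It is an illustration under the fair-coin assumption, and the names are hypothetical.

```python
from math import comb

N_TOSSES = 500
OBSERVED_ERROR = 0.80
TOTAL_OUTCOMES = 2 ** N_TOSSES  # number of equally likely toss sequences


def coin_error(n_blue, n=N_TOSSES):
    """Total error for a sample with n_blue blue faces, fair-coin null."""
    expected = n * 0.5
    n_green = n - n_blue
    return ((n_blue - expected) ** 2 + (n_green - expected) ** 2) / expected


# Fraction of the Family of Samples with error equal to or worse than ours.
fraction = sum(
    comb(N_TOSSES, k) / TOTAL_OUTCOMES
    for k in range(N_TOSSES + 1)
    if coin_error(k) >= OBSERVED_ERROR - 1e-12  # small guard for float rounding
)
print(fraction)
```

Because counts are discrete, the exact fraction can differ slightly from the table value, which comes from a continuous approximation.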
Hypothesis Testing
Goodness of Fit
Color Bowl Reduit II
We have an actual bowl, filled with blue, green, red and yellow chips. We will
use a sample of n=50 draws with replacement from the color bowl.
We
think that the colors in the bowl might be equally distributed. Use the sample
to test this hypothesis.
Sample I
color   sample count
yellow  3
green   11
blue    21
red     15
Sample II
color   sample count
yellow  5
green   11
blue    20
red     14
Sample III
color   sample count
yellow  5
green   6
blue    29
red     10
Solution:
Test Results
Sample I
Obs  color   count  expected  error  totalerror
1    yellow  3      12.5      7.22   7.22
2    green   11     12.5      0.18   7.40
3    blue    21     12.5      5.78   13.18
4    red     15     12.5      0.50   13.68
Sample II
Obs  color   count  expected  error  totalerror
1    yellow  5      12.5      4.50   4.50
2    green   11     12.5      0.18   4.68
3    blue    20     12.5      4.50   9.18
4    red     14     12.5      0.18   9.36
Sample III
Obs  color   count  expected  error  totalerror
1    yellow  5      12.5      4.50   4.50
2    green   6      12.5      3.38   7.88
3    blue    29     12.5      21.78  29.66
4    red     10     12.5      0.50   30.16
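The three result tables can be reproduced with a short Python sketch. It assumes the equally-likely null hypothesis (expected count n/4 per color) and keeps a running total like the totalerror column; the names `SAMPLES` and `error_table` are hypothetical.

```python
SAMPLES = {
    "I": {"yellow": 3, "green": 11, "blue": 21, "red": 15},
    "II": {"yellow": 5, "green": 11, "blue": 20, "red": 14},
    "III": {"yellow": 5, "green": 6, "blue": 29, "red": 10},
}


def error_table(counts):
    """Return rows of (color, count, expected, error, running total)."""
    n = sum(counts.values())
    expected = n / len(counts)  # equal probabilities under the null
    rows = []
    running = 0.0
    for color, count in counts.items():
        err = (count - expected) ** 2 / expected
        running += err
        rows.append((color, count, expected, round(err, 2), round(running, 2)))
    return rows


# Final running total for each sample.
totals = {name: error_table(counts)[-1][-1] for name, counts in SAMPLES.items()}
for name, counts in SAMPLES.items():
    print("Sample", name)
    for row in error_table(counts):
        print(*row)
```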
Identify each category of classification for our color bowl.
We have four
categories: Blue, Green, Yellow and Red.
Test
the null hypothesis that the colors are equally likely. Follow the steps:
Goodness-of-Fit Test
The
purpose of this Goodness of Fit test is to evaluate the evidence in a random
sample against a proposed categorical breakdown of a population.
We have four categories: Blue,
Green, Yellow and Red.
Our population consists of all
possible draws with replacement from the bowl.
Each member of our Family of
Samples (FoS) is a single sample of n=50 draws with replacement from our bowl.
FoS consists of all samples of this type.
Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.
p_blue = .25
p_green = .25
p_red = .25 and
p_yellow = .25.
We then compute:
e_blue = 50*p_blue = 12.5
e_green = 50*p_green = 12.5
e_yellow = 50*p_yellow = 12.5 and
e_red = 50*p_red = 12.5.
These are the expected counts
for a sample of size 50, given the truth of the Null Hypothesis.
The Error Estimate: Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.
error_blue = (n_blue - e_blue)^2 / e_blue
error_green = (n_green - e_green)^2 / e_green
error_red = (n_red - e_red)^2 / e_red
error_yellow = (n_yellow - e_yellow)^2 / e_yellow.
The total error is then:
error = error_blue + error_green + error_red + error_yellow
P-Value/Table
Consult a Chi-Square Table. Obtain the approximate p-value.
4 8.9473 0.030
4 9.8374 0.020
4 11.3449 0.010
Sample | Total Error | P-Value
I | 13.68 | <.01
II | 9.36 | Between .02 and .03
III | 30.16 | <.01
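The table lookup itself can be sketched as a small bracketing function. The rows below are the four-category rows copied from the Categories/Goodness of Fit table; the function name `bracket_p_value` is hypothetical, and the lower bound 0.0 simply means "below the smallest tabled p-value".

```python
# (error, p-value) rows for 4 categories, in order of increasing error.
FOUR_CATEGORY_ROWS = [
    (0.0000, 1.000), (0.5844, 0.900), (1.0052, 0.800), (1.4237, 0.700),
    (1.8692, 0.600), (2.3660, 0.500), (2.6430, 0.450), (2.9462, 0.400),
    (3.2831, 0.350), (3.6649, 0.300), (4.1083, 0.250), (4.6416, 0.200),
    (4.9566, 0.175), (5.3170, 0.150), (5.7394, 0.125), (6.2514, 0.100),
    (6.4915, 0.090), (6.7587, 0.080), (7.0603, 0.070), (7.4069, 0.060),
    (7.8147, 0.050), (8.3112, 0.040), (8.9473, 0.030), (9.8374, 0.020),
    (11.3449, 0.010),
]


def bracket_p_value(total_error, rows=FOUR_CATEGORY_ROWS):
    """Return (lower, upper) bounds on the p-value from the table rows."""
    upper = 1.0
    for error, p in rows:
        if total_error >= error:
            upper = p          # our error is at least this tabled error
        else:
            return (p, upper)  # first tabled error larger than ours
    return (0.0, upper)        # beyond the last row


for name, total in [("I", 13.68), ("II", 9.36), ("III", 30.16)]:
    print(name, bracket_p_value(total))
```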
Discussion/Interpretation
Population and Sampling Clearly identify the
population and the population category proportions.
Describe the family of samples.
We have four categories: Blue,
Green, Yellow and Red.
Our population consists of all
possible draws with replacement from our bowl.
pblue is the probability of observing a blue chip.
pgreen is the probability of observing a green chip.
pyellow is the probability of observing a yellow chip.
pred is the probability of observing a red chip.
Each member of our Family of
Samples (FoS) is a single sample of n=50 draws with replacement from our bowl.
FoS consists of all samples of this type.
Family
of Errors and P-Value Describe the family of samples and how each
error is computed. Apply the p-value to the family
of errors.
Each member sample of FoS yields an error as:
error_blue = (n_blue - e_blue)^2 / e_blue
error_green = (n_green - e_green)^2 / e_green
error_yellow = (n_yellow - e_yellow)^2 / e_yellow
error_red = (n_red - e_red)^2 / e_red.
The total error is then:
error = error_blue + error_green + error_yellow + error_red
For Samples I and III (sections 06 and 08), our p-value is less than 1%. For Sample II (section 07), our p-value is between 2% and 3%.
The conditional probability of obtaining samples as bad as or worse than those obtained in sections 06 or 08, given equally distributed colors, is less than 1%. The conditional probability of obtaining samples as bad as or worse than the sample obtained in section 07 is between 2% and 3%.
Samples in all sections present
significant evidence against the Null Hypothesis.
From:
http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionSkey.htm
Case Six
Hypothesis
Test – Categorical Goodness-of-Fit
LeRoy’s Sarcoma and Survival
Time Categories
We are studying survival time in patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We track the survival time of entirely fictitious patients who are diagnosed with LS. Survival times are grouped as follows:
Very Short Survival: 6 weeks or less;
Abbreviated Survival: 7 weeks to 12 weeks;
Regular Survival: 13 weeks to 72 weeks and
Long Term Survival: 73 or more weeks.
We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma:
Pr{Very Short Term Survival: (6 weeks or less)} = .25,
Pr{Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{Regular Survival: (13 weeks to 72 weeks)} = .45, and
Pr{Long Term Survival: (73 or more weeks)} = .25.
Consider a sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in weeks):
3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8, 10, 12, 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70, 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750.
Test the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value.
Numbers
Very Short Survival: 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6
observed = o = 20
expected = e = 50*.25 = 12.5
error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5
Abbreviated Survival: 7, 7, 8, 10, 12
o = 5
e = 50*.05 = 2.5
error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5
Regular Survival: 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70
o = 14
e = 50*.45 = 22.5
error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.21111
Long Term Survival: 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750
o = 11
e = 50*.25 = 12.5
error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18000
Total Error = 4.5 + 2.5 + 3.21111 + 0.18000 ≈ 10.3911 over 4 categories
From rows 4 11.3449 0.010 and 4 9.8374 0.020, .01 < p-value < .02
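The grouping step, from raw survival times to category counts, can be sketched as follows. The category bounds and probabilities come from the stated model; the names `MODEL` and `categorize` are hypothetical.

```python
from collections import Counter

TIMES = [3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
         7, 7, 8, 10, 12,
         13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70,
         80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750]

# (category, largest week included, model probability); None = no upper limit.
MODEL = [
    ("Very Short", 6, 0.25),
    ("Abbreviated", 12, 0.05),
    ("Regular", 72, 0.45),
    ("Long Term", None, 0.25),
]


def categorize(weeks):
    """Return the survival category for a survival time in whole weeks."""
    for name, upper, _ in MODEL:
        if upper is None or weeks <= upper:
            return name


counts = Counter(categorize(t) for t in TIMES)
n = len(TIMES)
total = 0.0
for name, _, p in MODEL:
    expected = n * p
    error = (counts[name] - expected) ** 2 / expected
    total += error
    print(name, counts[name], expected, round(error, 5))
print("Total Error:", round(total, 4))
```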
Interpretation:
Population:
Cases of LeRoy’s Sarcoma
Population
Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated
Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72
weeks) = .45 and Long Term Survival: (73 weeks or more) = .25
Family
of Samples: Each member is a single random sample of 50 cases with LeRoy’s
Sarcoma.
For each member of the FoS, compute:
Total Error = {(expected_{ST <6 Weeks} - observed_{ST <6 Weeks})^2 / expected_{ST <6 Weeks}}
+ {(expected_{ST 7-12 Weeks} - observed_{ST 7-12 Weeks})^2 / expected_{ST 7-12 Weeks}}
+ {(expected_{ST 13-72 Weeks} - observed_{ST 13-72 Weeks})^2 / expected_{ST 13-72 Weeks}}
+ {(expected_{ST 73+ Weeks} - observed_{ST 73+ Weeks})^2 / expected_{ST 73+ Weeks}}.
Doing so for every member of the FoS yields the Family of Errors.
If the null proportions hold for LeRoy’s Sarcoma survival times, then the
probability of getting a sample as bad as or worse than our sample is between
1% and 2%. This sample seems to present significant evidence
against the Null Hypothesis. We reject the Null Hypothesis at 5% significance,
but not at 1%, since the p-value exceeds 1%.
Table 3. Categories/Goodness of Fit
Categories ERROR p-value | Categories ERROR p-value | Categories ERROR p-value
4 0.0000 1.000 | 5 0.0000 1.000 | 6 0.0000 1.000
4 0.5844 0.900 | 5 1.0636 0.900 | 6 1.6103 0.900
4 1.0052 0.800 | 5 1.6488 0.800 | 6 2.3425 0.800
4 1.4237 0.700 | 5 2.1947 0.700 | 6 2.9999 0.700
4 1.8692 0.600 | 5 2.7528 0.600 | 6 3.6555 0.600
4 2.3660 0.500 | 5 3.3567 0.500 | 6 4.3515 0.500
4 2.6430 0.450 | 5 3.6871 0.450 | 6 4.7278 0.450
4 2.9462 0.400 | 5 4.0446 0.400 | 6 5.1319 0.400
4 3.2831 0.350 | 5 4.4377 0.350 | 6 5.5731 0.350
4 3.6649 0.300 | 5 4.8784 0.300 | 6 6.0644 0.300
4 4.1083 0.250 | 5 5.3853 0.250 | 6 6.6257 0.250
4 4.6416 0.200 | 5 5.9886 0.200 | 6 7.2893 0.200
4 4.9566 0.175 | 5 6.3423 0.175 | 6 7.6763 0.175
4 5.3170 0.150 | 5 6.7449 0.150 | 6 8.1152 0.150
4 5.7394 0.125 | 5 7.2140 0.125 | 6 8.6248 0.125
4 6.2514 0.100 | 5 7.7794 0.100 | 6 9.2364 0.100
4 6.4915 0.090 | 5 8.0434 0.090 | 6 9.5211 0.090
4 6.7587 0.080 | 5 8.3365 0.080 | 6 9.8366 0.080
4 7.0603 0.070 | 5 8.6664 0.070 | 6 10.1910 0.070
4 7.4069 0.060 | 5 9.0444 0.060 | 6 10.5962 0.060
4 7.8147 0.050 | 5 9.4877 0.050 | 6 11.0705 0.050
4 8.3112 0.040 | 5 10.0255 0.040 | 6 11.6443 0.040
4 8.9473 0.030 | 5 10.7119 0.030 | 6 12.3746 0.030
4 9.8374 0.020 | 5 11.6678 0.020 | 6 13.3882 0.020
4 11.3449 0.010 | 5 13.2767 0.010 | 6 15.0863 0.010
From:
http://www.mindspring.com/~cjalverson/3rdhourlyspring2008versionBkey.htm
Case Three | Categorical Goodness of Fit | Angry Barrels of
Monkeys
A company, BarrelCorp™, manufactures
barrels and wishes to ensure the strength and quality of its barrels.
Chimpanzees traumatized the company owner as a youth; so the company uses the
following test (Angry_Barrel_of_Monkeys_Test) of its barrels:
Ten (10) chimpanzees are loaded into
the barrel. The chimpanzees are exposed to Angry!Monkey!Gas!™,
an agent guaranteed to drive the chimpanzees to a psychotic rage. The angry,
raging, psychotic chimpanzees then destroy the barrel from the inside in an
angry, raging, psychotic fashion. The survival time, in minutes, of the barrel
is noted.
2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
50, 72, 77, 84, 88, 93, 95, 97, 116, 120
An endurance scale is defined as: Really Weak:
strictly less than 5 minutes survival time, Weak: [5,15) minutes
survival time, Adequate: [15, 30) minutes survival time, Good:
[30, 50) minutes survival time and Super Good: 50 or more minutes
survival time
Test
the hypothesis that the survival times are equally distributed among the five
survival categories. Show your work. Completely
discuss and interpret your test results, as indicated in class and case study
summaries.
Numbers
E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10
O_{Really Weak} = 4
Error_{Really Weak} = (O_{Really Weak} - E_{Really Weak})^2 / E_{Really Weak} = (4 - 10)^2/10 = 3.6
E_{Weak} = N*P_{Weak} = 50*.20 = 10
O_{Weak} = 4
Error_{Weak} = (O_{Weak} - E_{Weak})^2 / E_{Weak} = (4 - 10)^2/10 = 3.6
E_{Adequate} = N*P_{Adequate} = 50*.20 = 10
O_{Adequate} = 8
Error_{Adequate} = (O_{Adequate} - E_{Adequate})^2 / E_{Adequate} = (8 - 10)^2/10 = 0.4
E_{Good} = N*P_{Good} = 50*.20 = 10
O_{Good} = 23
Error_{Good} = (O_{Good} - E_{Good})^2 / E_{Good} = (23 - 10)^2/10 = 16.9
E_{Super Good} = N*P_{Super Good} = 50*.20 = 10
O_{Super Good} = 11
Error_{Super Good} = (O_{Super Good} - E_{Super Good})^2 / E_{Super Good} = (11 - 10)^2/10 = 0.10
Total Error = Error_{Really Weak} + Error_{Weak} + Error_{Adequate} + Error_{Good} + Error_{Super Good}
= 3.6 + 3.6 + 0.4 + 16.9 + 0.10 = 24.60 over 5 categories. From row 5 13.2767 0.010,
p < .01 since total error exceeds 13.2767.
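The endurance scale uses half-open intervals such as [5, 15), so a time of exactly 5 minutes is Weak and exactly 50 minutes is Super Good. One way to sketch that binning in Python is with `bisect_right` on the cut points; the names `CUTS` and `LABELS` are hypothetical.

```python
from bisect import bisect_right

TIMES = [2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
         22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
         34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
         42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
         50, 72, 77, 84, 88, 93, 95, 97, 116, 120]

CUTS = [5, 15, 30, 50]  # lower edges of Weak, Adequate, Good, Super Good
LABELS = ["Really Weak", "Weak", "Adequate", "Good", "Super Good"]

# bisect_right counts how many cut points are <= t, which is the category index.
counts = [0] * len(LABELS)
for t in TIMES:
    counts[bisect_right(CUTS, t)] += 1

n = len(TIMES)
expected = n / len(LABELS)  # equally likely categories: 50 * .20 = 10
errors = [(o - expected) ** 2 / expected for o in counts]
total = sum(errors)

for label, o, err in zip(LABELS, counts, errors):
    print(label, o, expected, round(err, 2))
print("Total Error:", round(total, 2))
```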
Interpretation
Our population is the population of BarrelCorp™
barrels. Our categories are based on an endurance scale of survival under
the Angry Barrel of Monkeys Test: Really Weak: strictly
less than 5 minutes survival time, Weak: [5,15) minutes survival
time, Adequate: [15, 30) minutes survival time, Good: [30, 50)
minutes survival time and Super Good: 50 or more minutes survival time.
Our null hypothesis is that the categories are equally likely: 20% Really
Weak, 20% Weak, 20% Adequate, 20% Good and 20% Super Good.
Our Family of Samples (FoS) consists of every
possible random sample of 50 BarrelCorp™ barrels. Under the null
hypothesis, within each member of the FoS, we expect approximately 10
barrels per survival category:
E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10
E_{Weak} = N*P_{Weak} = 50*.20 = 10
E_{Adequate} = N*P_{Adequate} = 50*.20 = 10
E_{Good} = N*P_{Good} = 50*.20 = 10
E_{Super Good} = N*P_{Super Good} = 50*.20 = 10
From each member sample of the FoS, we compute sample counts
and errors for each level of survival:
E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10
Error_{Really Weak} = (O_{Really Weak} - E_{Really Weak})^2 / E_{Really Weak}
E_{Weak} = N*P_{Weak} = 50*.20 = 10
Error_{Weak} = (O_{Weak} - E_{Weak})^2 / E_{Weak}
E_{Adequate} = N*P_{Adequate} = 50*.20 = 10
Error_{Adequate} = (O_{Adequate} - E_{Adequate})^2 / E_{Adequate}
E_{Good} = N*P_{Good} = 50*.20 = 10
Error_{Good} = (O_{Good} - E_{Good})^2 / E_{Good}
E_{Super Good} = N*P_{Super Good} = 50*.20 = 10
Error_{Super Good} = (O_{Super Good} - E_{Super Good})^2 / E_{Super Good}
Then add the individual errors for the total error. Computing
this error for each member sample of the FoS, we obtain a Family of Errors
(FoE).
If the survival categories are equally likely, then fewer
than 1% of the member samples of the Family of Samples yield errors as large
as or larger than that of our single sample. Our sample presents highly
significant evidence against the null hypothesis.
From:
http://www.mindspring.com/~cjalverson/CompFinalSummer2008verBkey.htm
Case Four | Goodness of Fit | Y2K GA Res LB Prenatal
Care
A random sample of Year 2000 Georgia resident live births
are checked for prenatal care status, in the following categories:
Prenatal Care Status | Number in Sample
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | 415
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | 62
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | 14
No Prenatal Care (1st Care at Delivery) | 9
Total | 500
Our null hypothesis is that the following probability
model for year 2000 Georgia Resident Live Births is correct:
Prenatal Care Status | Probability
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | .75
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | .15
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | .05
No Prenatal Care (1st Care at Delivery) | .05
Total | 1.00
Test
this Hypothesis. Show your work. Completely discuss and interpret your test
results, as indicated in class and case study summaries.
Numbers
stage  O    e    p     error    errorT
1st    415  375  0.75  4.2667   4.2667
2nd    62   75   0.15  2.2533   6.5200
3rd    14   25   0.05  4.8400   11.3600
No     9    25   0.05  10.2400  21.6000
From row 4 11.3449 0.010, p < .01
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)
Observed = 415
Probability from Model = .75
Expected = N*P = 500*.75 = 375
Error = (Observed - Expected)^2/Expected = (415 - 375)^2/375 ≈ 4.2667
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)
Observed = 62
Probability from Model = .15
Expected = N*P = 500*.15 = 75
Error = (Observed - Expected)^2/Expected = (62 - 75)^2/75 ≈ 2.2533
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)
Observed = 14
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected = (14 - 25)^2/25 = 4.84
No Prenatal Care (1st Care at Delivery)
Observed = 9
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected = (9 - 25)^2/25 = 10.24
Total Error = Error_1st + Error_2nd + Error_3rd + Error_No ≈ 4.2667 + 2.2533 + 4.84 + 10.24 ≈ 21.60 over four categories.
p-value from row 4 11.3449 0.010, p < .01, since our error exceeds 11.3449.
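The four-category rows of the table agree with the upper tail of a chi-square distribution with three degrees of freedom. Assuming that correspondence, the tail area has a closed form that needs only Python's `math` module, which gives a rough check on how far below 1% the p-value sits. The function name is hypothetical.

```python
import math


def chi_square_tail_df3(x):
    """Upper-tail area of a chi-square distribution with 3 degrees of freedom.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    half = x / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half)


observed = [415, 62, 14, 9]
model = [0.75, 0.15, 0.05, 0.05]

n = sum(observed)
total = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, model))
p_value = chi_square_tail_df3(total)

print("Total Error:", round(total, 4))
print("p-value below 1%:", p_value < 0.01)
```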
Interpretation
Our population consists of year 2000 Georgia resident live
born infants.
Each infant’s prenatal care status falls into a single prenatal care
category:
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)
No Prenatal Care (1st Care at Delivery).
Our model presents the following probabilities for each
category of care:
Pr{Prenatal Care Began 1st Trimester (Months 1-3 of
Pregnancy)} = .75
Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of
Pregnancy)} = .15
Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of
Pregnancy)} = .05
Pr{No Prenatal Care (1st Care at Delivery)} = .05
Each member of the family of samples is a single random
sample of 500 year 2000 Georgia resident live born infants. The family contains
all possible samples of this type.
From each member of the family of samples, compute the
observed and expected category counts, then compute an error for each category:
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)
Observed
Probability from Model = .75
Expected = N*P = 500*.75 = 375
Error = (Observed - Expected)^2/Expected
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)
Observed
Probability from Model = .15
Expected = N*P = 500*.15 = 75
Error = (Observed - Expected)^2/Expected
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)
Observed
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected
No Prenatal Care (1st Care at Delivery)
Observed
Probability from Model = .05
Expected = N*P = 500*.05 = 25
Error = (Observed - Expected)^2/Expected
Total Error = Error_1st + Error_2nd + Error_3rd + Error_NoPNC, over four categories.
If our model for prenatal care status is correct, then
less than 1% of the members of the family of samples yield errors as bad as or
worse than our error. Our sample presents highly significant evidence against
the model.
From http://www.cjalverson.com/_3rdhourlyspring2009verBkey.htm
:
Case Four | Hypothesis Test:
Categorical Goodness of Fit | Prenatal Care
A random sample of Year
2000 Georgia resident live births are checked for prenatal care status, in the
following categories – consider the portion of the sample reporting prenatal
care:
Prenatal Care Status | Number in Sample
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy) | 410
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) | 52
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy) | 12
Test the hypothesis that:
Pr{Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75
Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15
Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .10.
Show your work. Completely discuss
and interpret your test results, as indicated in class and case study
summaries. Fully discuss the testing procedure and results. This discussion
must include a clear discussion of the population and the null hypothesis, the
family of samples, the family of errors and the interpretation of the p-value.
Show all work and detail for full credit.
Numbers
Prenatal Care Status | n | P | E=nP | Error
PNC T1 | 410 | 0.75 | 355.5 | 8.355133615
PNC T2 | 52 | 0.15 | 71.1 | 5.130942335
PNC T3 | 12 | 0.1 | 47.4 | 26.43797468
Total Sample with PNC | 474 | 1 | 474 | 39.92405063
n = 474
Prenatal Starts 1st Trimester
Observed = 410
Expected = n*P_T1 = 474*.75 = 355.5
Error = (Observed - Expected)^2/Expected = (410 - 355.5)^2/355.5 ≈ 8.355133615
Prenatal Starts 2nd Trimester
Observed = 52
Expected = n*P_T2 = 474*.15 = 71.1
Error = (Observed - Expected)^2/Expected = (52 - 71.1)^2/71.1 ≈ 5.130942335
Prenatal Starts 3rd Trimester
Observed = 12
Expected = n*P_T3 = 474*.10 = 47.4
Error = (Observed - Expected)^2/Expected = (12 - 47.4)^2/47.4 ≈ 26.43797468
Total Error ≈ 8.355133615 + 5.130942335 + 26.43797468 ≈ 39.92405063 over three categories.
From row: 3 9.2103 0.010, p-value < .01, since our error exceeds 9.2103.
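The quoted three-category row agrees with a chi-square distribution with two degrees of freedom, whose upper tail is simply exp(-x/2). Assuming that correspondence, a short Python sketch reproduces the total error and shows that the p-value is far below 1%. Variable names are hypothetical.

```python
import math

observed = [410, 52, 12]        # T1, T2, T3 counts
model = [0.75, 0.15, 0.10]      # null-hypothesis probabilities

n = sum(observed)
total = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, model))

# Assumption: chi-square with 2 degrees of freedom (3 categories),
# for which the upper-tail area at x is exp(-x / 2).
p_value = math.exp(-total / 2.0)

print("n:", n)
print("Total Error:", round(total, 6))
print("p-value below 1%:", p_value < 0.01)
```

As a consistency check, exp(-9.2103/2) is approximately 0.010, matching the table row used above.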
Interpretation
Our population is the
population of Year 2000 Georgia resident live births. Our categories are based
on those who reported receiving Prenatal Care and include: (T1)Prenatal Care
Began 1st Trimester (Months 1-3 of Pregnancy), (T2)Prenatal Care Began 2nd
Trimester (Months 4-6 of Pregnancy) and (T3)Prenatal Care Began 3rd Trimester
(Months 7-9 of Pregnancy). Our null hypothesis is that the categories are
distributed as: 75% T1, 15% T2 and 10% T3.
Our Family of Samples
(FoS) consists of every possible random sample of 474 Year 2000 Georgia
resident live births. Under the null hypothesis, within each member of the FoS,
we expect approximately:
E_T1 = N*P_T1 = 474*.75 = 355.5
E_T2 = N*P_T2 = 474*.15 = 71.1
E_T3 = N*P_T3 = 474*.10 = 47.4
From each member sample of the FoS, we compute sample counts and errors for each prenatal care category:
E_T1 = N*P_T1 = 474*.75 = 355.5
Error_T1 = (O_T1 - E_T1)^2 / E_T1
E_T2 = N*P_T2 = 474*.15 = 71.1
Error_T2 = (O_T2 - E_T2)^2 / E_T2
E_T3 = N*P_T3 = 474*.10 = 47.4
Error_T3 = (O_T3 - E_T3)^2 / E_T3
Then add the individual errors for the total error as Total Error = Error_T1 + Error_T2 + Error_T3.
Computing this error for each
member sample of the FoS, we obtain a Family of Errors (FoE).
If the prenatal care
categories are distributed as: 75% T1, 15% T2 and 10% T3, then fewer than 1% of
the member samples of the Family of Samples yield errors as large as or larger
than that of our single sample. Our sample presents highly significant evidence
against the null hypothesis.
From http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionAkey.htm :
Case Six
Hypothesis Test –
Categorical Goodness-of-Fit
LeRoy’s Sarcoma and Survival Time
Categories
We are
studying survival time in patients with LeRoy’s Sarcoma (LS), an entirely
fictitious disease. We track the survival time of entirely fictitious patients
who are diagnosed with LS. Survival times are grouped as follows:
Very Short Survival:
6 weeks or less;
Abbreviated Survival:
7 weeks to 12 weeks;
Regular Survival:
13 weeks to 72 weeks and
Long Term Survival:
73 or more weeks.
We wish to
evaluate the following model for survival time in patients with LeRoy’s
Sarcoma:
Pr{Very Short Term Survival: (6 weeks or less)} = .25,
Pr{ Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{ Regular Survival: (13 weeks to 72 weeks)} = .45, and
Pr{ Long Term Survival: (73 or more weeks)} = .25.
Consider a
sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in
weeks):
3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6 (20)
7, 7, 8, 10, 12 (5)
13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70 (14)
80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750 (11)
Test
the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are
distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test
results, as indicated in class and case study summaries. Fully discuss the
testing procedure and results. This discussion must include a clear discussion
of the population and the null hypothesis, the family of samples, the family of
errors and the interpretation of the p-value.
Numbers
Very Short Survival
3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6
o = 20
e = 50*.25 = 12.5
error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5
Abbreviated Survival
7, 7, 8, 10, 12
o = 5
e = 50*.05 = 2.5
error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5
Regular Survival
13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70
o = 14
e = 50*.45 = 22.5
error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.2111
Long Term Survival
80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750
o = 11
e = 50*.25 = 12.5
error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18
Total Error = 4.5 + 2.5 + 3.2111 + 0.18 ≈ 10.3911 over 4 categories
From rows 4 9.8374 0.020 and 4 11.3449 0.010, .01 < p-value < .02
Interpretation:
Population: Cases of LeRoy’s Sarcoma
Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25
Family of Samples: Each
member is a single random sample of 50 cases with LeRoy’s Sarcoma.
For each member of the FoS, compute:
Total Error = {(expected_{ST <6 Weeks} - observed_{ST <6 Weeks})^2 / expected_{ST <6 Weeks}}
+ {(expected_{ST 7-12 Weeks} - observed_{ST 7-12 Weeks})^2 / expected_{ST 7-12 Weeks}}
+ {(expected_{ST 13-72 Weeks} - observed_{ST 13-72 Weeks})^2 / expected_{ST 13-72 Weeks}}
+ {(expected_{ST 73+ Weeks} - observed_{ST 73+ Weeks})^2 / expected_{ST 73+ Weeks}}.
Doing so for every member of the FoS yields the Family of Errors.
If the null proportions hold for LeRoy’s Sarcoma survival times,
then the probability of getting a sample as bad or worse than our sample is
between 1% and 2%. This sample seems to present significant evidence against
the Null Hypothesis.
We reject the Null Hypothesis at 5% significance.
From here http://www.cjalverson.com/_compfinalspring2009verMkey.htm
:
Case Four | Hypothesis
Test – Categorical Goodness-of-Fit | Traumatic Brain Injury
The Glasgow Coma Scale (GCS)
is the most widely used system for scoring the level of consciousness of a patient
who has had a traumatic brain injury. GCS is based on the patient's best
eye-opening, verbal, and motor responses. Each response is scored and then the
sum of the three scores is computed. That is,
Augmented Glasgow Coma Scale Categories
Mild = 13, 14, 15
Moderate = 9, 10, 11, 12
Severe/Coma = 3, 4, 5, 6, 7, 8
Pre-admission Death/PAD/DOA = 0
Traumatic brain injury (TBI) is an insult to the brain from an external mechanical force,
possibly leading to permanent or temporary impairments of cognitive, physical,
and psychosocial functions with an associated diminished or altered state of
consciousness. Consider a random sample of patients with TBI, with GCS at
initial treatment and diagnosis listed below:
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 (14)
3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8 (28)
9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 12, 12, 12 (18)
13, 13, 13, 14, 14, 14, 14, 14, 14, 15, 15 (11)
Our null hypothesis is that TBI case
outcomes are 20% Pre-admission Deaths, 50% Severe, 20% Moderate and 10% Mild.
Test this Hypothesis. Show your work. Completely discuss and interpret your
test results, as indicated in class and case study summaries.
Numbers
Pre-admission Death (PAD)
Observed_{PAD} = 14
Expected_{PAD} = n*P_{PAD} = 71*.20 = 14.2
Error_{PAD} = (Observed_{PAD} - Expected_{PAD})^2/Expected_{PAD} = (14 - 14.2)^2/14.2 ≈ 0.00282
Severe (GCS in 3, 4, 5, 6, 7, 8)
Observed_{Severe} = 28
Expected_{Severe} = n*P_{Severe} = 71*.50 = 35.5
Error_{Severe} = (Observed_{Severe} - Expected_{Severe})^2/Expected_{Severe} = (28 - 35.5)^2/35.5 ≈ 1.58451
Moderate (GCS in 9, 10, 11, 12)
Observed_{Moderate} = 18
Expected_{Moderate} = n*P_{Moderate} = 71*.20 = 14.2
Error_{Moderate} = (Observed_{Moderate} - Expected_{Moderate})^2/Expected_{Moderate} = (18 - 14.2)^2/14.2 ≈ 1.01690
Mild (GCS in 13, 14, 15)
Observed_{Mild} = 11
Expected_{Mild} = n*P_{Mild} = 71*.10 = 7.1
Error_{Mild} = (Observed_{Mild} - Expected_{Mild})^2/Expected_{Mild} = (11 - 7.1)^2/7.1 ≈ 2.14225
Total Error = Error_{PAD} + Error_{Severe} + Error_{Moderate} + Error_{Mild}
≈ 0.00282 + 1.58451 + 1.01690 + 2.14225 ≈ 4.74648 over 4 categories.
From rows: 4 4.6416 0.200 and 4 4.9566 0.175, .175 ≤ p ≤ .200.
Each member of the Family
of Samples(FoS) is a random sample of 71 Traumatic Brain Injury patients
– the FoS consists of all possible samples of this type.
From each member sample of the FoS, compute the following items at each level of severity:
Pre-admission Death (PAD)
Observed_{PAD}
Expected_{PAD} = n*P_{PAD} = 71*.20 = 14.2
Error_{PAD} = (Observed_{PAD} - Expected_{PAD})^2/Expected_{PAD}
Severe (GCS in 3, 4, 5, 6, 7, 8)
Observed_{Severe}
Expected_{Severe} = n*P_{Severe} = 71*.50 = 35.5
Error_{Severe} = (Observed_{Severe} - Expected_{Severe})^2/Expected_{Severe}
Moderate (GCS in 9, 10, 11, 12)
Observed_{Moderate}
Expected_{Moderate} = n*P_{Moderate} = 71*.20 = 14.2
Error_{Moderate} = (Observed_{Moderate} - Expected_{Moderate})^2/Expected_{Moderate}
Mild (GCS in 13, 14, 15)
Observed_{Mild}
Expected_{Mild} = n*P_{Mild} = 71*.10 = 7.1
Error_{Mild} = (Observed_{Mild} - Expected_{Mild})^2/Expected_{Mild}
Then compute Total Error = Error_{PAD} + Error_{Severe} + Error_{Moderate} + Error_{Mild}
Repeating these
calculations for each member sample of the FoS yields a Family of Errors (FoE).
If the population
proportions for TBI Severity are P_{PAD} = .20, P_{Severe} = .50, P_{Moderate}
= .20 and P_{Mild} = .10, then between 17.5% and 20% of the member
samples of the FoS yield errors as severe or more extreme than our single
computed error. Our sample does not seem to present significant evidence
against the null hypothesis.
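The Family of Samples and Family of Errors can also be approximated by simulation. The sketch below draws many random samples of 71 patients under the null proportions, computes each sample's total error, and reports the fraction with error at least as large as ours. It is a Monte Carlo illustration, not the table method used above; the seed, the number of simulated samples, and all names are arbitrary choices made here.

```python
import random

PROBS = [0.20, 0.50, 0.20, 0.10]   # PAD, Severe, Moderate, Mild under the null
OBSERVED = [14, 28, 18, 11]
N = sum(OBSERVED)                  # 71 patients per sample


def total_error(counts):
    """Sum of (observed - expected)^2 / expected over the four categories."""
    return sum((o - N * p) ** 2 / (N * p) for o, p in zip(counts, PROBS))


def simulate_fraction(n_samples, seed=0):
    """Fraction of simulated samples with error >= the observed error."""
    rng = random.Random(seed)
    threshold = total_error(OBSERVED) - 1e-9  # guard against float rounding
    hits = 0
    for _ in range(n_samples):
        draws = rng.choices(range(4), weights=PROBS, k=N)
        counts = [draws.count(c) for c in range(4)]
        if total_error(counts) >= threshold:
            hits += 1
    return hits / n_samples


fraction = simulate_fraction(20000)
print("Observed total error:", round(total_error(OBSERVED), 5))
print("Simulated fraction at least as extreme:", fraction)
```

The simulated fraction should land near the 17.5% to 20% bracket from the table, with some variation from sampling noise and from the discreteness of the counts.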