11th November 2009

Categorical Goodness of Fit

Hypothesis Testing: Goodness of Fit

One Last Coin

Be able to perform the goodness-of-fit test for categorical data.

Be able to fully discuss the testing process and results. This discussion must include a clear discussion of the population and the null hypothesis, the categories of data, the family of samples, the family of errors and the interpretation of the p-value.

500 tosses of our coin yield the following:

260 Blue;

240 Green.

Solution:

Identify each category of classification for our Coin.

We have two categories: Blue and Green.

Test the null hypothesis that the Coin is Fair. Follow the steps:

Goodness-of-Fit Test

The purpose of this Goodness of Fit test is to evaluate the evidence in a random sample against a proposed categorical breakdown of a population.

We have two categories: Blue and Green.

Our population consists of all possible coin tosses.

Each member of our Family of Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of all samples of this type.

Compute the following: sample size (n) and counts for each category (n_i). Identify each category of classification.

There are n=500 coin tosses in our sample, with:

nblue = 260 and ngreen = 240.

We have two categories: Blue and Green.

Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.

Our Null Hypothesis is that the coin is fair; this requires:

pblue = 0.50 and pgreen = 0.50 . We then compute:

eblue = 500*0.50 = 250 and egreen = 500*0.50 = 250

These are the expected counts for a sample of size 500, given the truth of the Null Hypothesis.

The Error Estimate: Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.

errorblue = (260-250)^2 / 250 = 100/250 = .40 and

errorgreen = (240-250)^2 / 250 = 100/250 = .40 . The total error is then:

error = .40 + .40 = .80 .
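The arithmetic above can be sketched in Python. This is a minimal illustration rather than part of the original notes; the function name `total_error` is an assumption introduced here.

```python
def total_error(counts, probs):
    """Return per-category errors (n_i - e_i)^2 / e_i and their sum, with e_i = n * p_i."""
    n = sum(counts)
    errors = []
    for n_i, p_i in zip(counts, probs):
        e_i = n * p_i                        # expected count under the null hypothesis
        errors.append((n_i - e_i) ** 2 / e_i)
    return errors, sum(errors)

# Coin example: 260 Blue and 240 Green, tested against a fair coin.
errors, error = total_error([260, 240], [0.50, 0.50])
print(errors, error)
```

The same function applies to any number of categories, as long as the counts and null probabilities are listed in the same order.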

P-Value/Table: Consult a Chi-Square Table. Obtain the approximate p-value.

We need the following rows from the Pearson chi-square table (columns: categories, error, p-value):

2 0.70833 0.400

2 0.87346 0.350

Since our error is between .70833 and .87346, our p-value is somewhere between 35% and 40%.
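As a cross-check on the table lookup, the tail probability can be computed directly. The sketch below assumes the table is the usual chi-square table, so two categories correspond to one degree of freedom, where the upper tail has the closed form erfc(sqrt(x/2)). The function name is hypothetical.

```python
import math

def p_value_two_categories(error):
    """Upper-tail chi-square probability with 1 degree of freedom (two categories)."""
    return math.erfc(math.sqrt(error / 2.0))

p = p_value_two_categories(0.80)
print(round(p, 4))

# The two table rows quoted above should come back as roughly 0.400 and 0.350.
print(round(p_value_two_categories(0.70833), 3), round(p_value_two_categories(0.87346), 3))
```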

Discussion/Interpretation

Population and Sampling: Clearly identify the population and the population category proportions. Describe the family of samples.

We have two categories: Blue and Green.

Our population consists of all possible coin tosses.

pblue is the probability of observing a blue face.

pgreen is the probability of observing a green face.

Each member of our Family of Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of all samples of this type.

Family of Errors and P-Value: Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.

Each member sample of FoS yields an error as:

errorblue = (nblue - eblue)^2 / eblue and

errorgreen = (ngreen - egreen)^2 / egreen.

The total error is then: error = errorblue + errorgreen .

Since our error is between .70833 and .87346, our p-value is somewhere between 35% and 40%.

So if the coin is really fair, then something between 35% and 40% of the Family of Samples yield errors equal to or worse than ours.

So the sample doesn't present significant evidence against the Null Hypothesis.
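For the coin, the Family of Samples is small enough to enumerate exactly. The sketch below is an illustration under the fair-coin null: it groups all samples of 500 tosses by their Blue count, weights each count by C(500, k) / 2^500, and adds up the share whose error is at least as large as ours. Exact fractions are used so that ties with the observed error are not lost to rounding. Because this is an exact count rather than the table's approximation, the result need not match the table bracket to the last digit.

```python
from fractions import Fraction
from math import comb

n = 500

def coin_error(n_blue, n):
    """Total error for one sample of n tosses with n_blue Blue faces, fair-coin null."""
    e = Fraction(n, 2)                       # expected count for each face
    n_green = n - n_blue
    return (n_blue - e) ** 2 / e + (n_green - e) ** 2 / e

observed_error = coin_error(260, n)          # our sample: 260 Blue, 240 Green

# Number of toss sequences whose error is as bad as or worse than ours.
as_bad_or_worse = sum(comb(n, k) for k in range(n + 1)
                      if coin_error(k, n) >= observed_error)
share = Fraction(as_bad_or_worse, 2 ** n)
print(float(observed_error), float(share))
```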

Hypothesis Testing

Goodness of Fit

Color Bowl Reduit II

We have an actual bowl, filled with blue, green, red and yellow chips. We will use a sample of n=50 draws with replacement from the color bowl.

We think that the colors in the bowl might be equally distributed. Use the sample to test this hypothesis.

Sample I

color     sample count

yellow             3

green              11

blue                21

red                  15

 

Sample II

color     sample count

yellow             5

green              11

blue                20

red                  14

 

Sample III

color     sample count

yellow             5

green               6

blue                29

red                  10

Solution:

Test Results

 

Sample I 

         Obs    color     count    expected    error    totalerror

          1     yellow       3       12.5       7.22        7.22

          2     green       11       12.5       0.18        7.40

          3     blue        21       12.5       5.78       13.18

          4     red         15       12.5       0.50       13.68

 

Sample II

 

         Obs    color     count    expected    error    totalerror

          1     yellow       5       12.5       4.50       4.50

          2     green       11       12.5       0.18       4.68

          3     blue        20       12.5       4.50       9.18

          4     red         14       12.5       0.18       9.36

 

Sample III

                                           

         Obs    color     count    expected    error    totalerror

          1     yellow       5       12.5       4.50        4.50

          2     green        6       12.5       3.38        7.88

          3     blue        29       12.5      21.78       29.66

          4     red         10       12.5       0.50       30.16

Identify each category of classification for our bowl.

We have four categories: Blue, Green, Yellow and Red.

Test the null hypothesis that the colors are equally likely. Follow the steps:

Goodness-of-Fit Test

The purpose of this Goodness of Fit test is to evaluate the evidence in a random sample against a proposed categorical breakdown of a population.

We have four categories: Blue, Green, Yellow and Red.

Our population consists of all possible draws with replacement from the bowl.

Each member of our Family of Samples (FoS) is a single sample of n=50 draws with replacement from our bowl. FoS consists of all samples of this type.

Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = n*p_i.

Our Null Hypothesis is that colors are equally likely; this requires:

pblue = .25

pgreen = .25

pred = .25 and

pyellow = .25 .

We then compute:

eblue = 50* pblue = 12.5

egreen = 50* pgreen = 12.5

eyellow = 50* pyellow = 12.5 and

ered = 50* pred = 12.5 .

These are the expected counts for a sample of size 50, given the truth of the Null Hypothesis.

The Error Estimate: Compute the error term for each category of data as (n_i - n*p_i)^2 / (n*p_i). Add the error terms together for a total error term.

errorblue = (nblue - eblue)^2 / eblue

errorgreen = (ngreen - egreen)^2 / egreen

errorred = (nred - ered)^2 / ered

erroryellow = (nyellow - eyellow)^2 / eyellow

The total error is then:

error = errorblue + errorgreen + errorred + erroryellow
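Applied to the three bowl samples, the error computation can be sketched as follows. The dictionary layout and the name `total_error` are assumptions made for this illustration; the counts come from Samples I, II and III above.

```python
def total_error(counts, probs):
    """Sum of (n_i - n*p_i)^2 / (n*p_i) over the categories in counts."""
    n = sum(counts.values())
    return sum((counts[c] - n * probs[c]) ** 2 / (n * probs[c]) for c in counts)

null_probs = {"yellow": 0.25, "green": 0.25, "blue": 0.25, "red": 0.25}
samples = {
    "I":   {"yellow": 3, "green": 11, "blue": 21, "red": 15},
    "II":  {"yellow": 5, "green": 11, "blue": 20, "red": 14},
    "III": {"yellow": 5, "green": 6,  "blue": 29, "red": 10},
}

totals = {name: total_error(counts, null_probs) for name, counts in samples.items()}
for name, total in totals.items():
    print(name, round(total, 2))
```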

P-Value/Table: Consult a Chi-Square Table. Obtain the approximate p-value.

Categories ERROR p-value

4 8.9473 0.030

4 9.8374 0.020

4 11.3449 0.010

Sample    Total Error    P-Value

I            13.68       <.01

II            9.36       Between .02 and .03

III          30.16       <.01
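The table lookup itself can be sketched as a small bracketing routine. The rows below are the four-category rows of the Categories/Goodness of Fit table in these notes; the function name `bracket_p_value` and the (low, high) return convention are assumptions made for this illustration.

```python
# Four-category rows copied from the table in these notes: (error, p-value).
ROWS_4 = [
    (0.0000, 1.000), (0.5844, 0.900), (1.0052, 0.800), (1.4237, 0.700),
    (1.8692, 0.600), (2.3660, 0.500), (2.6430, 0.450), (2.9462, 0.400),
    (3.2831, 0.350), (3.6649, 0.300), (4.1083, 0.250), (4.6416, 0.200),
    (4.9566, 0.175), (5.3170, 0.150), (5.7394, 0.125), (6.2514, 0.100),
    (6.4915, 0.090), (6.7587, 0.080), (7.0603, 0.070), (7.4069, 0.060),
    (7.8147, 0.050), (8.3112, 0.040), (8.9473, 0.030), (9.8374, 0.020),
    (11.3449, 0.010),
]

def bracket_p_value(error, rows):
    """Return (low, high) bounds on the p-value; low is None past the last row."""
    low, high = None, None
    for row_error, row_p in rows:            # rows are sorted by increasing error
        if row_error <= error:
            high = row_p                     # the p-value is at most this
        else:
            low = row_p                      # first row beyond our error
            break
    return low, high

for name, error in [("I", 13.68), ("II", 9.36), ("III", 30.16)]:
    print(name, bracket_p_value(error, ROWS_4))
```

A result of `(None, 0.01)` reads as "p-value below .01", and `(0.02, 0.03)` reads as "between .02 and .03".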

Discussion/Interpretation

Population and Sampling: Clearly identify the population and the population category proportions. Describe the family of samples.

We have four categories: Blue, Green, Yellow and Red.

Our population consists of all possible draws with replacement from our bowl.

pblue is the probability of observing a blue chip.

pgreen is the probability of observing a green chip.

pyellow is the probability of observing a yellow chip.

pred is the probability of observing a red chip.

Each member of our Family of Samples (FoS) is a single sample of n=50 draws with replacement from our bowl. FoS consists of all samples of this type.

Family of Errors and P-Value: Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.

Each member sample of FoS yields an error as:

errorblue = (nblue - eblue)^2 / eblue

errorgreen = (ngreen - egreen)^2 / egreen

erroryellow = (nyellow - eyellow)^2 / eyellow

errorred = (nred - ered)^2 / ered.

The total error is then:

error = errorblue + errorgreen + erroryellow + errorred

For Samples I and III (sections 06 and 08), our p-value is less than 1%. For Sample II (section 07), our p-value is between 2% and 3%.

Given equally distributed colors, the conditional probability of obtaining a sample with an error as bad as or worse than the errors obtained in Samples I or III is less than 1%. The conditional probability of obtaining a sample as bad as or worse than Sample II is between 2% and 3%.

All three samples present significant evidence against the Null Hypothesis.

From: http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionSkey.htm

Case Six

Hypothesis Test – Categorical Goodness-of-Fit

LeRoy’s Sarcoma and Survival Time Categories

We are studying survival time in patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We track the survival time of entirely fictitious patients who are diagnosed with LS. Survival times are grouped as follows: Very Short Survival: 6 weeks or less; Abbreviated Survival: 7 weeks to 12 weeks; Regular Survival: 13 weeks to 72 weeks and Long Term Survival: 73 or more weeks. We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma: Pr{Very Short Term Survival: (6 weeks or less)} = .25, Pr{ Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{ Regular Survival: (13 weeks to 72 weeks)} = .45, and Pr{ Long Term Survival: (73 or more weeks)} = .25.
Consider a sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in weeks): 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8, 10, 12, 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70, 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750.  Test the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value.

Numbers

Very Short Survival: 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6

observed = o = 20

expected = e = 50*.25 = 12.5

error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5

 

Abbreviated Survival: 7, 7, 8, 10, 12

o = 5

e=50*.05=2.5

error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5

 

Regular Survival: 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70

o = 14

e=50*.45=22.5

error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.21111

 

Long Term Survival: 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750

o = 11

e=50*.25=12.5

error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18000

 

Total Error = 4.5 + 2.5 + 3.21111 + 0.18000 ≈ 10.3911 over 4 categories

From 4 11.3449 0.010 and 4 9.8374 0.020, .01 < p-value < .02
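The counting and error steps for this case can be sketched in Python. The category tuples below (label, lowest week, highest week, null probability) are a layout chosen for this illustration; the survival times and probabilities are those given in the case.

```python
times = [3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
         7, 7, 8, 10, 12,
         13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70,
         80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750]

# (label, lowest week, highest week, probability under the null model)
categories = [
    ("Very Short", 0, 6, 0.25),
    ("Abbreviated", 7, 12, 0.05),
    ("Regular", 13, 72, 0.45),
    ("Long Term", 73, float("inf"), 0.25),
]

n = len(times)
observed = []
total = 0.0
for label, low, high, p in categories:
    o = sum(low <= t <= high for t in times)   # observed count in this category
    e = n * p                                  # expected count under the null
    err = (o - e) ** 2 / e
    observed.append(o)
    total += err
    print(f"{label:12s} o={o:2d} e={e:5.1f} error={err:.4f}")
print(f"Total Error = {total:.4f} over {len(categories)} categories")
```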

 

Interpretation:

 

Population: Cases of LeRoy’s Sarcoma

Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05,  Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25

Family of Samples: Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.

For each member of the FoS, compute:

Total Error = {(observed[ST ≤6 Weeks] - expected[ST ≤6 Weeks])^2 / expected[ST ≤6 Weeks]} + {(observed[ST 7-12 Weeks] - expected[ST 7-12 Weeks])^2 / expected[ST 7-12 Weeks]} +

{(observed[ST 13-72 Weeks] - expected[ST 13-72 Weeks])^2 / expected[ST 13-72 Weeks]} + {(observed[ST 73+ Weeks] - expected[ST 73+ Weeks])^2 / expected[ST 73+ Weeks]}.

Doing so for every member of the FoS yields the Family of Errors.

If the null proportions hold for LeRoy’s Sarcoma survival times, then the probability of getting a sample as bad as or worse than our sample is between 1% and 2%. This sample presents significant evidence against the Null Hypothesis. We reject the Null Hypothesis at 5% significance, but not at 1% significance, since the p-value exceeds 1%.

Table 3. Categories/Goodness of Fit

 

Categories ERROR p-value

4 0.0000 1.000

4 0.5844 0.900

4 1.0052 0.800

4 1.4237 0.700

4 1.8692 0.600

4 2.3660 0.500

4 2.6430 0.450

4 2.9462 0.400

4 3.2831 0.350

4 3.6649 0.300

4 4.1083 0.250

4 4.6416 0.200

4 4.9566 0.175

4 5.3170 0.150

4 5.7394 0.125

4 6.2514 0.100

4 6.4915 0.090

4 6.7587 0.080

4 7.0603 0.070

4 7.4069 0.060

4 7.8147 0.050

4 8.3112 0.040

4 8.9473 0.030

4 9.8374 0.020

4 11.3449 0.010

Categories ERROR p-value

5 0.0000 1.000

5 1.0636 0.900

5 1.6488 0.800

5 2.1947 0.700

5 2.7528 0.600

5 3.3567 0.500

5 3.6871 0.450

5 4.0446 0.400

5 4.4377 0.350

5 4.8784 0.300

5 5.3853 0.250

5 5.9886 0.200

5 6.3423 0.175

5 6.7449 0.150

5 7.2140 0.125

5 7.7794 0.100

5 8.0434 0.090

5 8.3365 0.080

5 8.6664 0.070

5 9.0444 0.060

5 9.4877 0.050

5 10.0255 0.040

5 10.7119 0.030

5 11.6678 0.020

5 13.2767 0.010

 

Categories ERROR p-value

6 0.0000 1.000

6 1.6103 0.900

6 2.3425 0.800

6 2.9999 0.700

6 3.6555 0.600

6 4.3515 0.500

6 4.7278 0.450

6 5.1319 0.400

6 5.5731 0.350

6 6.0644 0.300

6 6.6257 0.250

6 7.2893 0.200

6 7.6763 0.175

6 8.1152 0.150

6 8.6248 0.125

6 9.2364 0.100

6 9.5211 0.090

6 9.8366 0.080

6 10.1910 0.070

6 10.5962 0.060

6 11.0705 0.050

6 11.6443 0.040

6 12.3746 0.030

6 13.3882 0.020

6 15.0863 0.010

 

 

From: http://www.mindspring.com/~cjalverson/3rdhourlyspring2008versionBkey.htm

Case Three | Categorical Goodness of Fit | Angry Barrels of Monkeys

A company, BarrelCorp™, manufactures barrels and wishes to ensure the strength and quality of its barrels. Chimpanzees traumatized the company owner as a youth; so the company uses the following test (Angry_Barrel_of_Monkeys_Test) of its barrels:

Ten (10) chimpanzees are loaded into the barrel. The chimpanzees are exposed to Angry!Monkey!Gas!™, an agent guaranteed to drive the chimpanzees to a psychotic rage. The angry, raging, psychotic chimpanzees then destroy the barrel from the inside in an angry, raging, psychotic fashion. The survival time, in minutes, of the barrel is noted.

A random sample of 50 BarrelCorp™ barrels is evaluated using the Angry_Barrel_of_Monkeys_Test, and the survival time (in ***MINUTES***) of each barrel is noted. The survival time of each barrel is listed below:

2                 3                   3                   4                   5                   8                   12                   14                   16                   18     

22               23                   25                   26                   27                   29                   30                   32                   32                   33     

34               35                   36                   37               35               35                   36                   38                   40                   42     

42               42                   43                   44                   45                   45                   48                   48                   49                   50     

50               72                   77                   84                   88                   93                   95                   97             116            120

An endurance scale is defined as: Really Weak:  strictly less than 5 minutes survival time, Weak:  [5,15) minutes survival time, Adequate: [15, 30) minutes survival time, Good: [30, 50) minutes survival time and Super Good: 50 or more minutes survival time

Test the hypothesis that the survival times are equally distributed among the five survival categories. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

Numbers

EReally Weak = N*PReally Weak = 50*.20 = 10

OReally Weak = 4

ErrorReally Weak = (OReally Weak - EReally Weak)^2 / EReally Weak = (4 - 10)^2 / 10 = 3.6

EWeak = N*PWeak = 50*.20 = 10

OWeak = 4

ErrorWeak = (OWeak - EWeak)^2 / EWeak = (4 - 10)^2 / 10 = 3.6

EAdequate = N*PAdequate = 50*.20 = 10

OAdequate = 8

ErrorAdequate = (OAdequate - EAdequate)^2 / EAdequate = (8 - 10)^2 / 10 = 0.4

EGood = N*PGood = 50*.20 = 10

OGood = 23

ErrorGood = (OGood - EGood)^2 / EGood = (23 - 10)^2 / 10 = 16.9

ESuper Good = N*PSuper Good = 50*.20 = 10

OSuper Good = 11

ErrorSuper Good = (OSuper Good - ESuper Good)^2 / ESuper Good = (11 - 10)^2 / 10 = 0.10

Total Error = ErrorReally Weak + ErrorWeak + ErrorAdequate + ErrorGood + ErrorSuper Good = 3.6 + 3.6 + 0.4 + 16.9 + 0.10 = 24.60 over 5 categories. From 5 13.2767 0.010, p<.01 since total error exceeds 13.2767.
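Sorting the 50 survival times into the half-open endurance intervals is the step most prone to off-by-one mistakes, so it is sketched here in Python. Using `bisect_right` against the lower edges of the upper four categories is a choice made for this illustration, not something stated in the answer key.

```python
from bisect import bisect_right

times = [2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
         22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
         34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
         42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
         50, 72, 77, 84, 88, 93, 95, 97, 116, 120]

labels = ["Really Weak", "Weak", "Adequate", "Good", "Super Good"]
cut_points = [5, 15, 30, 50]   # lower edges of Weak, Adequate, Good, Super Good

counts = [0] * len(labels)
for t in times:
    # bisect_right sends a time equal to a cut point into the upper category,
    # which matches the intervals [5,15), [15,30), [30,50) and 50 or more.
    counts[bisect_right(cut_points, t)] += 1

n = len(times)
expected = n * 0.20            # equally likely categories under the null
errors = [(o - expected) ** 2 / expected for o in counts]
total = sum(errors)

for label, o, err in zip(labels, counts, errors):
    print(f"{label:12s} O={o:2d} E={expected:4.1f} Error={err:.2f}")
print(f"Total Error = {total:.2f} over {len(labels)} categories")
```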

Interpretation

Our population is the population of BarrelCorp™ barrels. Our categories are based on an endurance scale of survival under the Angry Barrel of Monkeys Test: Really Weak: strictly less than 5 minutes survival time, Weak: [5,15) minutes survival time, Adequate: [15, 30) minutes survival time, Good: [30, 50) minutes survival time and Super Good: 50 or more minutes survival time. Our null hypothesis is that the categories are equally likely: 20% Really Weak, 20% Weak, 20% Adequate, 20% Good and 20% Super Good.

Our Family of Samples (FoS) consists of every possible random sample of 50 BarrelCorp™ barrels. Under the null hypothesis, within each member of the FoS, we expect 10 barrels per survival category:

EReally Weak = N*PReally Weak = 50*.20 = 10

EWeak = N*PWeak = 50*.20 = 10

EAdequate = N*PAdequate = 50*.20 = 10

EGood = N*PGood = 50*.20 = 10

ESuper Good = N*PSuper Good = 50*.20 = 10

From each member sample of the FoS, we compute sample counts and errors for each level of survival:

 

EReally Weak = N*PReally Weak = 50*.20 = 10

ErrorReally Weak = (OReally Weak - EReally Weak)^2 / EReally Weak

EWeak = N*PWeak = 50*.20 = 10

ErrorWeak = (OWeak - EWeak)^2 / EWeak

EAdequate = N*PAdequate = 50*.20 = 10

ErrorAdequate = (OAdequate - EAdequate)^2 / EAdequate

EGood = N*PGood = 50*.20 = 10

ErrorGood = (OGood - EGood)^2 / EGood

ESuper Good = N*PSuper Good = 50*.20 = 10

ErrorSuper Good = (OSuper Good - ESuper Good)^2 / ESuper Good

 

Then add the individual errors for the total error. Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).

If the survival categories are equally likely, then fewer than 1% of the member samples of the Family of Samples yield errors as large as or larger than that of our single sample. Our sample presents highly significant evidence against the null hypothesis.
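The Family of Errors can be approximated by simulation. The sketch below is an illustration only: it draws 20,000 simulated samples of 50 barrels with equally likely categories and reports the share whose total error is at least 24.6. The seed, the number of trials and the variable names are arbitrary choices made here.

```python
import random

random.seed(2009)              # arbitrary seed so the run is repeatable
n = 50                         # barrels per sample
num_categories = 5
expected = n / num_categories  # 10 barrels per category under the null
observed_total = 24.6          # total error of our single sample
trials = 20000

hits = 0
for _ in range(trials):
    # One member of the Family of Samples: each barrel lands in a random category.
    draws = random.choices(range(num_categories), k=n)
    counts = [draws.count(c) for c in range(num_categories)]
    total = sum((o - expected) ** 2 / expected for o in counts)
    if total >= observed_total:
        hits += 1

share = hits / trials
print(f"share of simulated samples with error >= {observed_total}: {share}")
```

The simulated share is only an estimate of the tail probability, but it should sit well below 1%, in line with the table lookup.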


From: http://www.mindspring.com/~cjalverson/CompFinalSummer2008verBkey.htm

Case Four | Goodness of Fit  | Y2K GA Res LB Prenatal Care

A random sample of Year 2000 Georgia resident live births is checked for prenatal care status, in the following categories:

Prenatal Care Status                                            Number in Sample

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)          415

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)           62

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)           14

No Prenatal Care (1st Care at Delivery)                                9

Total                                                                500

Our null hypothesis is that the following probability model for year 2000 Georgia resident live births is correct:

Prenatal Care Status                                            Probability

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)          .75

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)          .15

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)          .05

No Prenatal Care (1st Care at Delivery)                              .05

Total                                                               1.00

 

Test this Hypothesis. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

 

Numbers

stage      O     e       p      error      errorT

1st     415    375    0.75     4.2667     4.2667

2nd      62     75    0.15     2.2533     6.5200

3rd      14     25    0.05     4.8400    11.3600

No        9     25    0.05    10.2400    21.6000

From 4 11.3449 0.010, p < .01

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Observed = 415

Probability from Model = .75

Expected = N*P = 500*.75 = 375

Error = (Observed - Expected)^2/Expected = (415 - 375)^2/375 ≈ 4.2667

 

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Observed = 62

Probability from Model = .15

Expected = N*P = 500*.15 = 75

Error = (Observed - Expected)^2/Expected = (62 - 75)^2/75 ≈ 2.2533

 

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

Observed = 14

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)^2/Expected = (14 - 25)^2/25 = 4.84

 

No Prenatal Care (1st Care at Delivery)

Observed = 9

Probability from Model = .05

Expected = N*P = 500*.05  = 25

Error = (Observed - Expected)^2/Expected = (9 - 25)^2/25 = 10.24

 

Total Error = Error1st + Error2nd + Error3rd + ErrorNo ≈ 4.2667 + 2.2533 + 4.84 + 10.24 = 21.6 over four categories.

 

p-value from row 4 11.3449 0.010, p < .01
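The summary table with its running total (errorT) can be reproduced with a short loop. This is a sketch for checking the arithmetic; the tuple layout and column widths are choices made here.

```python
# (stage, observed count, probability under the null model)
rows = [("1st", 415, 0.75), ("2nd", 62, 0.15), ("3rd", 14, 0.05), ("No", 9, 0.05)]
n = sum(o for _, o, _ in rows)

running = 0.0                  # cumulative error, the errorT column
print("stage      O      e      p     error    errorT")
for stage, o, p in rows:
    e = n * p                  # expected count under the null
    error = (o - e) ** 2 / e
    running += error
    print(f"{stage:5s} {o:6d} {e:6.0f} {p:6.2f} {error:9.4f} {running:9.4f}")
```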

 

Interpretation

 

Our population consists of year 2000 Georgia resident live born infants.

 

Each infant’s prenatal care status falls into a single category:

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

No Prenatal Care (1st Care at Delivery).

 

Our model presents the following probabilities for each category of care:

Pr{Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75

Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15

Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .05

Pr{No Prenatal Care (1st Care at Delivery)} = .05

 

Each member of the family of samples is a single random sample of 500 year 2000 Georgia resident live born infants. The family contains all possible samples of this type.

 

From each member of the family of samples, compute the observed and expected category counts, then compute an error for each category:

 

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Observed

Probability from Model = .75

Expected = N*P = 500*.75 = 375

Error = (Observed - Expected)^2/Expected

 

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Observed

Probability from Model = .15

Expected = N*P = 500*.15 = 75

Error = (Observed - Expected)^2/Expected

 

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

Observed

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)^2/Expected

 

No Prenatal Care (1st Care at Delivery)

Observed

Probability from Model = .05

Expected = N*P = 500*.05  = 25

Error = (Observed - Expected)^2/Expected

 

Total Error = Error1st + Error2nd + Error3rd + ErrorNoPNC over four categories.

 

If  our model for prenatal care status is correct, then less than 1% of the members of the family of samples yield errors as bad as or worse than our error. Our sample presents highly significant evidence against the model.

 


 


From http://www.cjalverson.com/_3rdhourlyspring2009verBkey.htm :

 

Case Four | Hypothesis Test: Categorical Goodness of Fit | Prenatal Care

 

A random sample of Year 2000 Georgia resident live births is checked for prenatal care status, in the following categories. Consider the portion of the sample reporting prenatal care:

Prenatal Care Status                                            Number in Sample

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)          410

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)           52

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)           12

Test the hypothesis that:

Pr{ Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75

Pr{ Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15

Pr{ Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .10 .

Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value. Show all work and detail for full credit.

 

Numbers

 

Prenatal Care Status       n       P      E=nP      Error

PNC T1                    410    0.75    355.5    8.355133615

PNC T2                     52    0.15     71.1    5.130942335

PNC T3                     12    0.10     47.4    26.43797468

Total Sample with PNC     474    1.00    474.0    39.92405063

 

n=474

 

Prenatal Starts 1st Trimester

Observed=410

Expected = n*PT1 = 474*.75 = 355.5

Error = (Observed - Expected)^2/Expected = (410 - 355.5)^2/355.5 ≈ 8.355133615

Prenatal Starts 2nd Trimester

Observed=52

Expected = n*PT2 = 474*.15 = 71.1

Error = (Observed - Expected)^2/Expected = (52 - 71.1)^2/71.1 ≈ 5.130942335

Prenatal Starts 3rd Trimester

Observed=12

Expected = n*PT3 = 474*.10 = 47.4

Error = (Observed - Expected)^2/Expected = (12 - 47.4)^2/47.4 ≈ 26.43797468

Total Error ≈ 8.355133615 + 5.130942335 + 26.43797468 ≈ 39.92405063 over three categories.

 

From row: 3 9.2103 0.010, p-value < .01, since our error exceeds 9.2103. 
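The three-category computation can be checked with the sketch below. It also assumes the table is the usual chi-square table, so three categories correspond to two degrees of freedom, where the upper tail has the closed form exp(-x/2); the table row 9.2103 at p = .010 is consistent with that assumption. Variable names are illustrative.

```python
import math

observed = {"T1": 410, "T2": 52, "T3": 12}
null_probs = {"T1": 0.75, "T2": 0.15, "T3": 0.10}

n = sum(observed.values())     # 474 births in the sample reporting prenatal care
total = 0.0
for key, o in observed.items():
    e = n * null_probs[key]    # expected count under the null
    total += (o - e) ** 2 / e

# Chi-square upper tail for 2 degrees of freedom (three categories).
p = math.exp(-total / 2.0)
print(f"n = {n}, Total Error = {total:.6f}, p-value = {p:.3g}")
```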

 

 

 

Interpretation

Our population is the population of Year 2000 Georgia resident live births. Our categories are based on those who reported receiving Prenatal Care and include: (T1)Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy), (T2)Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) and (T3)Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy). Our null hypothesis is that the categories are distributed as: 75% T1, 15% T2  and 10% T3.

Our Family of Samples (FoS) consists of every possible random sample of 474 Year 2000 Georgia resident live births. Under the null hypothesis, within each member of the FoS, we expect approximately:

ET1 = N*PT1 = 474*.75 = 355.5

ET2 = N*PT2 = 474*.15 = 71.1

ET3 = N*PT3 = 474*.10 = 47.4

From each member sample of the FoS, we compute sample counts and errors for each prenatal care category:

ET1 = N*PT1 = 474*.75 = 355.5

ErrorT1 = (OT1 - ET1)^2 / ET1

ET2 = N*PT2 = 474*.15 = 71.1

ErrorT2 = (OT2 - ET2)^2 / ET2

ET3 = N*PT3 = 474*.10 = 47.4

ErrorT3 = (OT3 - ET3)^2 / ET3

Then add the individual errors for the total error as Total Error = ErrorT1 + ErrorT2 + ErrorT3.

Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).

If the prenatal care categories are distributed as: 75% T1, 15% T2 and 10% T3, then fewer than 1% of the member samples of the Family of Samples yields errors as large as or larger than that of our single sample. Our sample presents highly significant evidence against the null hypothesis.

 

From http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionAkey.htm  :

 

 

 

 

 

 

Case Six

Hypothesis Test – Categorical Goodness-of-Fit

LeRoy’s Sarcoma and Survival Time Categories

We are studying survival time in patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We track the survival time of entirely fictitious patients who are diagnosed with LS. Survival times are grouped as follows:

Very Short Survival: 6 weeks or less;

Abbreviated Survival: 7 weeks to 12 weeks;

Regular Survival: 13 weeks to 72 weeks and

Long Term Survival: 73 or more weeks.

We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma:

 

 

Pr{Very Short Term Survival: (6 weeks or less)} = .25,

Pr{ Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{ Regular Survival: (13 weeks to 72 weeks)} = .45, and

Pr{ Long Term Survival: (73 or more weeks)} = .25.

Consider a sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in weeks):

3, 3, 3, 3, 4, | 4, 4, 4, 5, 5 |, 5, 5, 5, 5, 5, | 6, 6, 6, 6, 6, (20)

7, 7, 8, 10, 12, (5)

13, 13, 14, 15, 15, | 16, 17, 18, 23, 34 |, 37, 45, 60, 70, (14)

80, 85, 86, 95, 110, | 135, 150, 185, 253, 350,| 750. (11)  

Test the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value.

Numbers

Very Short Survival

3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6

o = 20

e=50*.25=12.5

error = (o-e)^2/e = (20-12.5)^2/12.5 = 4.5

Abbreviated Survival

7, 7, 8, 10, 12

o = 5

e=50*.05=2.5

error = (o-e)^2/e = (5-2.5)^2/2.5 = 2.5

Regular Survival

13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70

o = 14

e=50*.45=22.5

error = (o-e)^2/e = (14-22.5)^2/22.5 ≈ 3.2111

Long Term Survival

80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750

o = 11

e=50*.25=12.5

error = (o-e)^2/e = (11-12.5)^2/12.5 = 0.18

Total Error = 4.5 + 2.5 + 3.2111 + 0.18 ≈ 10.3911 over 4 categories

From 4 9.8374 0.020 and 4 11.3449 0.010, .01 < p-value < .02
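The bracket from the table can be compared with a directly computed tail probability. The sketch below assumes the table is the usual chi-square table with (categories - 1) degrees of freedom and uses the closed-form tail for integer degrees of freedom, built from `math.erfc`, `math.exp` and `math.gamma`. The function name `chi_square_tail` is hypothetical.

```python
import math

def chi_square_tail(error, categories):
    """P(chi-square with categories - 1 degrees of freedom >= error)."""
    df = categories - 1
    h = error / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum of h^j / j! for j = 0 .. df/2 - 1
        terms = (h ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-h) * sum(terms)
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum of h^(j + 1/2) / Gamma(j + 3/2)
    terms = (h ** (j + 0.5) / math.gamma(j + 1.5) for j in range(df // 2))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * sum(terms)

p = chi_square_tail(10.3911, 4)
print(round(p, 4))

# Spot checks against table rows: 4 categories at .050 and 5 categories at .010.
print(round(chi_square_tail(7.8147, 4), 3), round(chi_square_tail(13.2767, 5), 3))
```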

Interpretation:

Population: Cases of LeRoy’s Sarcoma

Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25

Family of Samples: Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.

For each member of the FoS, compute:

Total Error = {(observed[ST ≤6 Weeks] - expected[ST ≤6 Weeks])^2 / expected[ST ≤6 Weeks]} + {(observed[ST 7-12 Weeks] - expected[ST 7-12 Weeks])^2 / expected[ST 7-12 Weeks]} +

{(observed[ST 13-72 Weeks] - expected[ST 13-72 Weeks])^2 / expected[ST 13-72 Weeks]} + {(observed[ST 73+ Weeks] - expected[ST 73+ Weeks])^2 / expected[ST 73+ Weeks]}.

Doing so for every member of the FoS yields the Family of Errors.
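The Family of Errors can be approximated by simulation. The sketch below draws many random samples of 50 cases under the null proportions, computes the Total Error for each, and reports the fraction that is at least as large as the Total Error from our sample. The number of simulated samples (20000), the seed, and all function names are arbitrary illustrative choices. A simulation only approximates the full family, so the fraction will not match the table value exactly.

```python
import random

categories = ["Very Short", "Abbreviated", "Regular", "Long Term"]
null_props = [0.25, 0.05, 0.45, 0.25]  # proportions under the null hypothesis
n = 50                                 # cases per sample
observed_total_error = 10.3911         # Total Error from our sample


def total_error(counts, n, props):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(counts, props))


def simulate_family_of_errors(num_samples, seed=0):
    """Simulate Total Errors for random samples drawn under the null."""
    rng = random.Random(seed)
    errors = []
    for _ in range(num_samples):
        # One member of the Family of Samples: n cases, each assigned a category.
        draws = rng.choices(range(len(categories)), weights=null_props, k=n)
        counts = [draws.count(i) for i in range(len(categories))]
        errors.append(total_error(counts, n, null_props))
    return errors


family_of_errors = simulate_family_of_errors(20000)
fraction = sum(e >= observed_total_error for e in family_of_errors) / len(family_of_errors)
print(f"Fraction of simulated errors >= {observed_total_error}: {fraction:.4f}")
```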

If the null proportions hold for LeRoy’s Sarcoma survival times, then the probability of getting a sample as bad as or worse than our sample is between 1% and 2%. This sample seems to present significant evidence against the Null Hypothesis.

We reject the Null Hypothesis at 5% significance.

From here http://www.cjalverson.com/_compfinalspring2009verMkey.htm :

 

Case Four | Hypothesis Test – Categorical Goodness-of-Fit | Traumatic Brain Injury

 

The Glasgow Coma Scale (GCS) is the most widely used system for scoring the level of consciousness of a patient who has had a traumatic brain injury. GCS is based on the patient's best eye-opening, verbal, and motor responses. Each response is scored and then the sum of the three scores is computed. That is, GCS = eye-opening score + verbal score + motor score.

 

Augmented Glasgow Coma Scale Categories

Mild  = 13, 14, 15

Moderate = 9, 10, 11, 12

Severe/Coma =  3, 4, 5, 6, 7, 8

Pre-admission Death/PAD/DOA = 0

 

Traumatic brain injury (TBI) is an insult to the brain from an external mechanical force, possibly leading to permanent or temporary impairments of cognitive, physical, and psychosocial functions with an associated diminished or altered state of consciousness. Consider a random sample of patients with TBI, with GCS at initial treatment and diagnosis listed below:

 

0, 0, 0, 0, 0, | 0, 0, 0, 0, 0, | 0, 0, 0, 0, (14)

3, 3, 3, 3, 3, | 4, 4, 4, 4, 4, | 4, 4, 5, 5, 5, | 5, 6, 6, 6, 6, | 6, 7, 7, 7, 7, |

8, 8,  8, (28)

9, 9, 9, 9, 9, | 9, 9, 9, 10, 10, | 10, 10, 11, 11, 11, | 12, 12, 12, (18)

13, 13, 13, 14, 14, | 14, 14, 14, 14, 15,| 15 (11)

 

Our null hypothesis is that TBI case outcomes are 20% Pre-admission Deaths, 50% Severe, 20% Moderate and 10% Mild. Test this Hypothesis. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

 

Numbers

 

Pre-admission Death (PAD)

 

Observed_PAD = 14

Expected_PAD = n*P_PAD = 71*.20 = 14.2

Error_PAD = (Observed_PAD - Expected_PAD)^2/Expected_PAD = (14 - 14.2)^2/14.2 ≈ 0.00282

 

Severe (GCS in 3, 4, 5, 6, 7, 8)

 

Observed_Severe = 28

Expected_Severe = n*P_Severe = 71*.50 = 35.5

Error_Severe = (Observed_Severe - Expected_Severe)^2/Expected_Severe = (28 - 35.5)^2/35.5 ≈ 1.58451

 

Moderate (GCS in 9, 10, 11, 12)

 

Observed_Moderate = 18

Expected_Moderate = n*P_Moderate = 71*.20 = 14.2

Error_Moderate = (Observed_Moderate - Expected_Moderate)^2/Expected_Moderate = (18 - 14.2)^2/14.2 ≈ 1.01690

 

Mild (GCS in 13, 14, 15)

 

Observed_Mild = 11

Expected_Mild = n*P_Mild = 71*.10 = 7.1

Error_Mild = (Observed_Mild - Expected_Mild)^2/Expected_Mild = (11 - 7.1)^2/7.1 ≈ 2.14225

 

Total Error = Error_PAD + Error_Severe + Error_Moderate + Error_Mild ≈ 0.00282 + 1.58451 + 1.01690 + 2.14225 ≈ 4.74648 over 4 categories.
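The counting and error steps for the TBI sample can be sketched in Python, starting from the raw GCS scores rather than from the tallied counts. The scores are transcribed from the sample listed above; the function name `severity` and the dictionary layout are illustrative choices.

```python
from collections import Counter

# GCS scores at initial treatment, transcribed from the sample (n = 71).
gcs_scores = (
    [0] * 14
    + [3] * 5 + [4] * 7 + [5] * 4 + [6] * 5 + [7] * 4 + [8] * 3
    + [9] * 8 + [10] * 4 + [11] * 3 + [12] * 3
    + [13] * 3 + [14] * 6 + [15] * 2
)

# Proportions under the null hypothesis.
null_props = {"PAD": 0.20, "Severe": 0.50, "Moderate": 0.20, "Mild": 0.10}


def severity(score):
    """Map a GCS score to its Augmented Glasgow Coma Scale category."""
    if score == 0:
        return "PAD"
    if 3 <= score <= 8:
        return "Severe"
    if 9 <= score <= 12:
        return "Moderate"
    if 13 <= score <= 15:
        return "Mild"
    raise ValueError(f"not a valid GCS score: {score}")


observed = Counter(severity(s) for s in gcs_scores)
n = len(gcs_scores)

total_error = 0.0
for name, p in null_props.items():
    expected = n * p
    error = (observed[name] - expected) ** 2 / expected
    total_error += error
    print(f"{name}: observed = {observed[name]}, expected = {expected:.1f}, error = {error:.5f}")
print(f"Total Error = {total_error:.5f}")
```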

 

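From the table rows for 4 categories, an error of 4.6416 gives p = 0.200 and an error of 4.9566 gives p = 0.175. Since 4.6416 < 4.74648 < 4.9566, .175 ≤ p ≤ .200.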
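The table lookup amounts to finding the two neighboring rows whose error values bracket the computed Total Error. A minimal sketch follows, using only the four (error, p-value) rows for 4 categories that are quoted in these case studies. The full table is not reproduced here, so brackets for other error values would be looser than a real lookup. The names `ROWS_4_CATEGORIES` and `bracket_p_value` are illustrative.

```python
# (error, p-value) rows for 4 categories quoted in these case studies.
# This is only a fragment of the full table.
ROWS_4_CATEGORIES = [
    (4.6416, 0.200),
    (4.9566, 0.175),
    (9.8374, 0.020),
    (11.3449, 0.010),
]


def bracket_p_value(total_error, rows):
    """Return (lower p, upper p) from the neighboring rows that bracket total_error."""
    ordered = sorted(rows)  # ascending error, so p-values descend
    for (low_err, upper_p), (high_err, lower_p) in zip(ordered, ordered[1:]):
        if low_err <= total_error <= high_err:
            return lower_p, upper_p
    raise ValueError("total error falls outside the rows supplied")


low, high = bracket_p_value(4.74648, ROWS_4_CATEGORIES)
print(f"{low} <= p <= {high}")
```

Larger errors correspond to smaller p-values, which is why the row with the larger error supplies the lower bound.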

 

Each member of the Family of Samples(FoS)  is a random sample of 71 Traumatic Brain Injury patients – the FoS consists of all possible samples of this type.

 

From each member sample of the FoS, compute the following items at each level of severity:

 

Pre-admission Death (PAD)

Observed_PAD

Expected_PAD = n*P_PAD = 71*.20 = 14.2

Error_PAD = (Observed_PAD - Expected_PAD)^2/Expected_PAD

 

Severe (GCS in 3, 4, 5, 6, 7, 8)

 

Observed_Severe

Expected_Severe = n*P_Severe = 71*.50 = 35.5

Error_Severe = (Observed_Severe - Expected_Severe)^2/Expected_Severe

 

Moderate (GCS in 9, 10, 11, 12)

 

Observed_Moderate

Expected_Moderate = n*P_Moderate = 71*.20 = 14.2

Error_Moderate = (Observed_Moderate - Expected_Moderate)^2/Expected_Moderate

 

Mild (GCS in 13, 14, 15)

 

Observed_Mild

Expected_Mild = n*P_Mild = 71*.10 = 7.1

Error_Mild = (Observed_Mild - Expected_Mild)^2/Expected_Mild

 

Then compute Total Error = Error_PAD + Error_Severe + Error_Moderate + Error_Mild

Repeating these calculations for each member sample of the FoS yields a Family of Errors (FoE).

If the population proportions for TBI Severity are P_PAD = .20, P_Severe = .50, P_Moderate = .20 and P_Mild = .10, then between 17.5% and 20% of the member samples of the FoS yield errors as large as or larger than our single computed error. Our sample does not seem to present significant evidence against the null hypothesis.