11th November 2009

Categorical Goodness of Fit

Hypothesis Testing: Goodness of Fit

One Last Coin

Be able to perform the goodness-of-fit test for categorical data.

Be able to fully discuss the testing process and results. This discussion must include a clear discussion of the population and the null hypothesis, the categories of data, the family of samples, the family of errors and the interpretation of the p-value.

500 tosses of our coin yield the following:

260 Blue;

240 Green.

Solution:

Identify each category of classification for our Coin.

We have two categories: Blue and Green.

Test the null hypothesis that the Coin is Fair. Follow the steps:

Goodness-of-Fit Test

The purpose of this Goodness of Fit test is to evaluate the evidence in a random sample against a proposed categorical breakdown of a population.

We have two categories: Blue and Green.

Our population consists of all possible coin tosses.

Each member of our Family of Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of all samples of this type.

Compute the following: sample size (n) and counts for each category (n_i). Identify each category of classification.

There are n=500 coin tosses in our sample, with:

n_blue = 260 and n_green = 240.

We have two categories: Blue and Green.

Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = np_i.

Our Null Hypothesis is that the coin is fair; this requires:

p_blue = 0.50 and p_green = 0.50 . We then compute:

e_blue = 500*0.50 = 250 and e_green = 500*0.50 = 250

These are the expected counts for a sample of size 500, given the truth of the Null Hypothesis.

The Error Estimate: Compute the error term for each category of data as ( (n_i - np_i)² ) / (np_i). Add the error terms together for a total error term.

error_blue = (260-250)² / 250 = 100/250 = .40 and

error_green = (240-250)² / 250 = 100/250 = .40 . The total error is then:

error = .40 + .40 = .80 .
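The arithmetic above can be checked with a short Python sketch. The function name `gof_error` is an illustrative choice and is not part of the course materials.

```python
def gof_error(observed, probs):
    """Return (expected counts, per-category errors, total error)."""
    n = sum(observed)
    expected = [n * p for p in probs]
    errors = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, errors, sum(errors)


# Coin sample: 260 Blue, 240 Green; a fair coin under the null hypothesis.
expected, errors, total = gof_error([260, 240], [0.50, 0.50])
print(expected)  # [250.0, 250.0]
print(errors)    # [0.4, 0.4]
print(total)     # 0.8
```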

P-Value/Table: Consult a Chi-Square Table. Obtain the approximate p-value.

We need the following rows from the Pearson chi-square table:

2 0.70833 0.400

2 0.87346 0.350

Since our error is between .70833 and .87346, our p-value is somewhere between 35% and 40%.
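As a cross-check on the table lookup, here is a minimal sketch that computes the p-value directly. It assumes the tabled errors follow a chi-square distribution with (categories - 1) degrees of freedom; the source does not state this, but the two tabled rows above agree with it. With two categories the upper tail reduces to the complementary error function. The function name is hypothetical.

```python
import math


def p_value_two_categories(total_error):
    """Upper-tail chi-square probability with 1 degree of freedom."""
    return math.erfc(math.sqrt(total_error / 2.0))


p = p_value_two_categories(0.80)
print(0.35 < p < 0.40)  # True: the direct value falls inside the tabled bracket

# The two tabled rows are recovered to about three decimals.
print(round(p_value_two_categories(0.70833), 3))  # 0.4
print(round(p_value_two_categories(0.87346), 3))  # 0.35
```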

Discussion/Interpretation

Population and Sampling: Clearly identify the population and the population category proportions. Describe the family of samples.

We have two categories: Blue and Green.

Our population consists of all possible coin tosses.

p_blue is the probability of observing a blue face.

p_green is the probability of observing a green face.

Each member of our Family of Samples (FoS) is a single sample of n=500 tosses of our coin. FoS consists of all samples of this type.

Family of Errors and P-Value: Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.

Each member sample of FoS yields an error as:

error_blue = (n_blue - e_blue )² / e_blue and

error_green = (n_green - e_green )² / e_green.

The total error is then: error = error_blue + error_green.

Since our error is between .70833 and .87346, our p-value is somewhere between 35% and 40%.

So if the coin is really fair, then something between 35% and 40% of the Family of Samples yield errors equal to or worse than ours.

So the sample doesn't present significant evidence against the Null Hypothesis.
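The Family of Samples idea can be illustrated by simulation. The sketch below draws many samples of 500 tosses from a fair coin, computes the total error for each, and reports the fraction with an error at least as large as ours. The seed, the number of simulated samples, and the helper name `coin_error` are arbitrary illustrative choices. Because counts are discrete, the simulated fraction only approximates the tabled bracket.

```python
import random


def coin_error(n_blue, n=500, p_blue=0.5):
    """Total goodness-of-fit error for a two-category coin sample."""
    e_blue, e_green = n * p_blue, n * (1 - p_blue)
    n_green = n - n_blue
    return (n_blue - e_blue) ** 2 / e_blue + (n_green - e_green) ** 2 / e_green


rng = random.Random(2009)          # fixed seed so the run is repeatable
observed_error = coin_error(260)   # our sample: 260 Blue, 240 Green

n_samples = 20000
as_bad_or_worse = 0
for _ in range(n_samples):
    # 500 fair tosses: count the 1-bits among 500 random bits.
    n_blue = bin(rng.getrandbits(500)).count("1")
    if coin_error(n_blue) >= observed_error:
        as_bad_or_worse += 1

fraction = as_bad_or_worse / n_samples
print(round(fraction, 3))  # a fraction in the neighborhood of the 35%-40% bracket
```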

Hypothesis Testing

Goodness of Fit

Color Bowl Reduit II

We have an actual bowl, filled with blue, green, red and yellow chips. We will use a sample of n=50 draws with replacement from the color bowl.

We think that the colors in the bowl might be equally distributed. Use the sample to test this hypothesis.

Sample I

color sample count

yellow 3

green 11

blue 21

red 15

Sample II

color sample count

yellow 5

green 11

blue 20

red 14

Sample III

color sample count

yellow 5

green 6

blue 29

red 10

Solution:

Test Results

Sample I

Obs color count expected error totalerror

1 yellow 3 12.5 7.22 7.22

2 green 11 12.5 0.18 7.40

3 blue 21 12.5 5.78 13.18

4 red 15 12.5 0.50 13.68

Sample II

Obs color count expected error totalerror

1 yellow 5 12.5 4.50 4.50

2 green 11 12.5 0.18 4.68

3 blue 20 12.5 4.50 9.18

4 red 14 12.5 0.18 9.36

Sample III

Obs color count expected error totalerror

1 yellow 5 12.5 4.50 4.50

2 green 6 12.5 3.38 7.88

3 blue 29 12.5 21.78 29.66

4 red 10 12.5 0.50 30.16

Identify each category of classification for our bowl.

We have four categories: Blue, Green, Yellow and Red.

Test the null hypothesis that the colors are equally likely. Follow the steps:

Goodness-of-Fit Test

The purpose of this Goodness of Fit test is to evaluate the evidence in a random sample against a proposed categorical breakdown of a population.

We have four categories: Blue, Green, Yellow and Red.

Our population consists of all possible draws with replacement from the bowl.

Each member of our Family of Samples (FoS) is a single sample of n=50 draws with replacement from our bowl. FoS consists of all samples of this type.

Null Hypothesis: Note the probability (p_i) for each category specified under the null hypothesis. Compute the expected count for each category as e_i = np_i.

Our Null Hypothesis is that colors are equally likely; this requires:

p_blue = .25

p_green = .25

p_red = .25 and

p_yellow = .25 .

We then compute:

e_blue = 50* p_blue = 12.5

e_green = 50* p_green = 12.5

e_yellow = 50* p_yellow = 12.5 and

e_red = 50* p_red = 12.5 .

These are the expected counts for a sample of size 50, given the truth of the Null Hypothesis.

The Error Estimate: Compute the error term for each category of data as ( (n_i - np_i)² ) / (np_i). Add the error terms together for a total error term.

error_blue = (n_blue- e_blue)² / e_blue

error_green = (n_green- e_green)² / e_green

error_red = (n_red- e_red)² / e_red

error_yellow = (n_yellow- e_yellow)² / e_yellow

The total error is then:

error = error_blue + error_green + error_red + error_yellow

P-Value/Table: Consult a Chi-Square Table. Obtain the approximate p-value.

4 8.9473 0.030

4 9.8374 0.020

4 11.3449 0.010

Sample	Total Error	P-Value
I	13.68	<.01
II	9.36	Between .02 and .03
III	30.16	<.01
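The totals and p-value brackets for the three samples can be reproduced with a small table-lookup sketch. The rows are the four-category rows of the Categories/Goodness of Fit table; the function name `bracket_p_value` and the convention of returning 0.0 as the lower bound when the error exceeds the last tabled row are illustrative assumptions.

```python
# Rows for 4 categories, copied from the table: (error, p-value).
ROWS_4 = [
    (0.0000, 1.000), (0.5844, 0.900), (1.0052, 0.800), (1.4237, 0.700),
    (1.8692, 0.600), (2.3660, 0.500), (2.6430, 0.450), (2.9462, 0.400),
    (3.2831, 0.350), (3.6649, 0.300), (4.1083, 0.250), (4.6416, 0.200),
    (4.9566, 0.175), (5.3170, 0.150), (5.7394, 0.125), (6.2514, 0.100),
    (6.4915, 0.090), (6.7587, 0.080), (7.0603, 0.070), (7.4069, 0.060),
    (7.8147, 0.050), (8.3112, 0.040), (8.9473, 0.030), (9.8374, 0.020),
    (11.3449, 0.010),
]


def bracket_p_value(total_error, rows):
    """Return (low, high) so that low < p-value <= high, from tabled rows."""
    low, high = 0.0, rows[0][1]
    for error, p in rows:
        if total_error >= error:
            high = p      # last tabled row not exceeding our error
        else:
            low = p       # first tabled row beyond our error
            break
    return low, high


expected = 50 * 0.25  # 12.5 chips per color under the null hypothesis
samples = {           # counts in the order yellow, green, blue, red
    "I": [3, 11, 21, 15],
    "II": [5, 11, 20, 14],
    "III": [5, 6, 29, 10],
}
totals = {}
for name, counts in samples.items():
    totals[name] = sum((c - expected) ** 2 / expected for c in counts)
    low, high = bracket_p_value(totals[name], ROWS_4)
    print(name, round(totals[name], 2), f"{low} < p <= {high}")
```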

Discussion/Interpretation

Population and Sampling: Clearly identify the population and the population category proportions. Describe the family of samples.

We have four categories: Blue, Green, Yellow and Red.

Our population consists of all possible draws with replacement from our bowl.

p_blue is the probability of observing a blue chip.

p_green is the probability of observing a green chip.

p_yellow is the probability of observing a yellow chip.

p_red is the probability of observing a red chip.

Each member of our Family of Samples (FoS) is a single sample of n=50 draws with replacement from our bowl. FoS consists of all samples of this type.

Family of Errors and P-Value: Describe the family of samples and how each error is computed. Apply the p-value to the family of errors.

Each member sample of FoS yields an error as:

error_blue = (n_blue - e_blue )² / e_blue and

error_green = (n_green - e_green )² / e_green

error_yellow = (n_yellow - e_yellow )² / e_yellow

error_red = (n_red - e_red )² / e_red.

The total error is then:

error = error_blue + error_green+ error_yellow + error_red

For Samples I and III (sections 06 and 08), our p-value is less than 1%. For Sample II (section 07), our p-value is between 2% and 3%.

If the colors are equally likely, the conditional probability of obtaining a sample with an error as bad as or worse than those of Samples I or III is less than 1%. The conditional probability of obtaining a sample as bad as or worse than Sample II is between 2% and 3%.

The samples from all three sections present significant evidence against the Null Hypothesis.

From: http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionSkey.htm

Case Six

Hypothesis Test – Categorical Goodness-of-Fit

LeRoy’s Sarcoma and Survival Time Categories

We are studying survival time in patients with LeRoy’s Sarcoma (LS), an entirely fictitious disease. We track the survival time of entirely fictitious patients who are diagnosed with LS. Survival times are grouped as follows:

Very Short Survival: 6 weeks or less;

Abbreviated Survival: 7 weeks to 12 weeks;

Regular Survival: 13 weeks to 72 weeks and

Long Term Survival: 73 or more weeks.

We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma:

Pr{Very Short Term Survival: (6 weeks or less)} = .25,

Pr{Abbreviated Survival: (7 weeks to 12 weeks)} = .05,

Pr{Regular Survival: (13 weeks to 72 weeks)} = .45, and

Pr{Long Term Survival: (73 or more weeks)} = .25.

Consider a sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in weeks):

3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8, 10, 12, 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70, 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750.

Test the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value.

Numbers

Very Short Survival: 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6

observed = o = 20

expected = e = 50*.25 = 12.5

error = (o-e)²/e = (20-12.5)²/12.5 ≈ 4.5

Abbreviated Survival: 7, 7, 8, 10, 12

o = 5

e=50*.05=2.5

error = (o-e)²/e = (5-2.5)²/2.5 = 2.5

Regular Survival: 13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70

o = 14

e=50*.45=22.5

error = (o-e)²/e = (14-22.5)²/22.5 ≈ 3.21111

Long Term Survival: 80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750

o = 11

e=50*.25=12.5

error = (o-e)²/e = (11-12.5)²/12.5 ≈ 0.18000

Total Error = 4.5 + 2.5 + 3.21111 + 0.18000 ≈ 10.3911 over 4 categories

From 4 11.3449 0.010 and 4 9.8374 0.020, .01 < p-value < .02
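The sorting of raw survival times into categories, and the resulting total error, can be sketched as follows. The tuple layout and the helper name `classify` are illustrative assumptions; the category bounds and probabilities are those of the model above.

```python
times = [3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
         7, 7, 8, 10, 12,
         13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70,
         80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750]

# (label, inclusive upper bound in weeks, probability under the null model)
categories = [
    ("Very Short", 6, 0.25),
    ("Abbreviated", 12, 0.05),
    ("Regular", 72, 0.45),
    ("Long Term", float("inf"), 0.25),
]


def classify(weeks):
    """Return the label of the first category whose upper bound covers weeks."""
    for label, upper, _ in categories:
        if weeks <= upper:
            return label


counts = {label: 0 for label, _, _ in categories}
for t in times:
    counts[classify(t)] += 1

n = len(times)
total_error = 0.0
for label, _, p in categories:
    expected = n * p
    total_error += (counts[label] - expected) ** 2 / expected

print(counts)                 # 20, 5, 14 and 11 cases in the four categories
print(round(total_error, 4))  # 10.3911
```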

Interpretation:

Population: Cases of LeRoy’s Sarcoma

Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25

Family of Samples: Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.

For each member of the FoS, compute:

Total Error = {(expected_{ST ≤6 Weeks} - observed_{ST ≤6 Weeks})²/expected_{ST ≤6 Weeks}} +

{(expected_{ST 7-12 Weeks} - observed_{ST 7-12 Weeks})²/expected_{ST 7-12 Weeks}} +

{(expected_{ST 13-72 Weeks} - observed_{ST 13-72 Weeks})²/expected_{ST 13-72 Weeks}} +

{(expected_{ST 73+ Weeks} - observed_{ST 73+ Weeks})²/expected_{ST 73+ Weeks}}.

Doing so for every member of the FoS yields the Family of Errors.

If the null proportions hold for LeRoy’s Sarcoma survival times, then the probability of getting a sample as bad as or worse than our sample is between 1% and 2%. This sample presents significant evidence against the Null Hypothesis. We reject the Null Hypothesis at 5% significance, but not at 1%, since the p-value exceeds .01.

Table 3. Categories/Goodness of Fit

Categories ERROR p-value

4 0.0000 1.000

4 0.5844 0.900

4 1.0052 0.800

4 1.4237 0.700

4 1.8692 0.600

4 2.3660 0.500

4 2.6430 0.450

4 2.9462 0.400

4 3.2831 0.350

4 3.6649 0.300

4 4.1083 0.250

4 4.6416 0.200

4 4.9566 0.175

4 5.3170 0.150

4 5.7394 0.125

4 6.2514 0.100

4 6.4915 0.090

4 6.7587 0.080

4 7.0603 0.070

4 7.4069 0.060

4 7.8147 0.050

4 8.3112 0.040

4 8.9473 0.030

4 9.8374 0.020

4 11.3449 0.010

Categories ERROR p-value

5 0.0000 1.000

5 1.0636 0.900

5 1.6488 0.800

5 2.1947 0.700

5 2.7528 0.600

5 3.3567 0.500

5 3.6871 0.450

5 4.0446 0.400

5 4.4377 0.350

5 4.8784 0.300

5 5.3853 0.250

5 5.9886 0.200

5 6.3423 0.175

5 6.7449 0.150

5 7.2140 0.125

5 7.7794 0.100

5 8.0434 0.090

5 8.3365 0.080

5 8.6664 0.070

5 9.0444 0.060

5 9.4877 0.050

5 10.0255 0.040

5 10.7119 0.030

5 11.6678 0.020

5 13.2767 0.010

Categories ERROR p-value

6 0.0000 1.000

6 1.6103 0.900

6 2.3425 0.800

6 2.9999 0.700

6 3.6555 0.600

6 4.3515 0.500

6 4.7278 0.450

6 5.1319 0.400

6 5.5731 0.350

6 6.0644 0.300

6 6.6257 0.250

6 7.2893 0.200

6 7.6763 0.175

6 8.1152 0.150

6 8.6248 0.125

6 9.2364 0.100

6 9.5211 0.090

6 9.8366 0.080

6 10.1910 0.070

6 10.5962 0.060

6 11.0705 0.050

6 11.6443 0.040

6 12.3746 0.030

6 13.3882 0.020

6 15.0863 0.010

From: http://www.mindspring.com/~cjalverson/3rdhourlyspring2008versionBkey.htm

Case Three | Categorical Goodness of Fit | Angry Barrels of Monkeys

A company, BarrelCorp™, manufactures barrels and wishes to ensure the strength and quality of its barrels. Chimpanzees traumatized the company owner as a youth; so the company uses the following test (Angry_Barrel_of_Monkeys_Test) of its barrels:

Ten (10) chimpanzees are loaded into the barrel. The chimpanzees are exposed to Angry!Monkey!Gas!™, an agent guaranteed to drive the chimpanzees to a psychotic rage. The angry, raging, psychotic chimpanzees then destroy the barrel from the inside in an angry, raging, psychotic fashion. The survival time, in minutes, of the barrel is noted.

A random sample of 50 BarrelCorp™ barrels is evaluated using the Angry_Barrel_of_Monkeys_Test, and the survival time (in ***MINUTES***) of each barrel is noted. The survival time of each barrel is listed below:

2 3 3 4 5 8 12 14 16 18

22 23 25 26 27 29 30 32 32 33

34 35 36 37 35 35 36 38 40 42

42 42 43 44 45 45 48 48 49 50

50 72 77 84 88 93 95 97 116 120

An endurance scale is defined as: Really Weak: strictly less than 5 minutes survival time, Weak: [5,15) minutes survival time, Adequate: [15, 30) minutes survival time, Good: [30, 50) minutes survival time and Super Good: 50 or more minutes survival time

Test the hypothesis that the survival times are equally distributed among the five survival categories. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

Numbers

E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10

O_{Really Weak} = 4

Error_{Really Weak} = (O_{Really Weak} - E_{Really Weak})²/ E_{Really Weak} = (4 - 10)²/ 10 = 3.6

E_Weak = N*P_Weak = 50*.20 = 10

O_Weak = 4

Error_Weak = (O_Weak - E_Weak)²/ E_Weak = (4 - 10)²/ 10 = 3.6

E_Adequate = N*P_Adequate = 50*.20 = 10

O_Adequate = 8

Error_Adequate = (O_Adequate - E_Adequate)²/ E_Adequate = (8 - 10)²/ 10 = 0.4

E_Good = N*P_Good = 50*.20 = 10

O_Good = 23

Error_Good = (O_Good - E_Good)²/ E_Good = (23 - 10)²/ 10 = 16.9

E_{Super Good} = N*P_{Super Good} = 50*.20 = 10

O_{Super Good} = 11

Error_{Super Good} = (O_{Super Good} - E_{Super Good})²/ E_{Super Good} = (11 - 10)²/ 10 = 0.10

Total Error = Error_{Really Weak} + Error_Weak + Error_Adequate + Error_Good + Error_{Super Good} = 3.6 + 3.6 + 0.4 + 16.9 + 0.10 = 24.60 over 5 categories. From 5 13.2767 0.010, p<.01 since total error exceeds 13.2767.
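The half-open endurance intervals make the category counts easy to get wrong by hand, so here is a minimal sketch that bins the 50 survival times and recomputes the total error. Using `bisect_right` on the interval cut points is an illustrative choice, not the method used in class.

```python
from bisect import bisect_right

minutes = [2, 3, 3, 4, 5, 8, 12, 14, 16, 18,
           22, 23, 25, 26, 27, 29, 30, 32, 32, 33,
           34, 35, 36, 37, 35, 35, 36, 38, 40, 42,
           42, 42, 43, 44, 45, 45, 48, 48, 49, 50,
           50, 72, 77, 84, 88, 93, 95, 97, 116, 120]

labels = ["Really Weak", "Weak", "Adequate", "Good", "Super Good"]
cuts = [5, 15, 30, 50]  # lower edges of Weak, Adequate, Good, Super Good

counts = [0] * len(labels)
for m in minutes:
    # bisect_right sends a value equal to a cut into the higher category,
    # matching the half-open intervals [5,15), [15,30), [30,50), [50, ...).
    counts[bisect_right(cuts, m)] += 1

expected = len(minutes) * 0.20  # equally likely categories: 10 barrels each
total = sum((o - expected) ** 2 / expected for o in counts)

print(dict(zip(labels, counts)))
print(round(total, 2))  # 24.6
```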

Interpretation

Our population is the population of BarrelCorp™ barrels. Our categories are based on an endurance scale of survival under the Angry Barrel of Monkeys Test: Really Weak: strictly less than 5 minutes survival time, Weak: [5,15) minutes survival time, Adequate: [15, 30) minutes survival time, Good: [30, 50) minutes survival time and Super Good: 50 or more minutes survival time. Our null hypothesis is that the categories are equally likely: 20% Really Weak, 20% Weak, 20% Adequate, 20% Good and 20% Super Good.

Our Family of Samples (FoS) consists of every possible random sample of 50 BarrelCorp™ barrels. Under the null hypothesis, within each member of the FoS, we expect approximately 10 barrels per survival category:

E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10

E_Weak = N*P_Weak = 50*.20 = 10

E_Adequate = N*P_Adequate = 50*.20 = 10

E_Good = N*P_Good = 50*.20 = 10

E_{Super Good} = N*P_{Super Good} = 50*.20 = 10

From each member sample of the FoS, we compute sample counts and errors for each level of survival:

E_{Really Weak} = N*P_{Really Weak} = 50*.20 = 10

Error_{Really Weak} = (O_{Really Weak} - E_{Really Weak})²/ E_{Really Weak}

E_Weak = N*P_Weak = 50*.20 = 10

Error_Weak = (O_Weak - E_Weak)²/ E_Weak

E_Adequate = N*P_Adequate = 50*.20 = 10

Error_Adequate = (O_Adequate - E_Adequate)²/ E_Adequate

E_Good = N*P_Good = 50*.20 = 10

Error_Good = (O_Good - E_Good)²/ E_Good

E_{Super Good} = N*P_{Super Good} = 50*.20 = 10

Error_{Super Good} = (O_{Super Good} - E_{Super Good})²/ E_{Super Good}

Then add the individual errors for the total error. Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).

If the survival categories are equally likely, then fewer than 1% of the member samples of the Family of Samples yield errors as large as or larger than that of our single sample. Our sample presents highly significant evidence against the null hypothesis.

From: http://www.mindspring.com/~cjalverson/CompFinalSummer2008verBkey.htm

Case Four | Goodness of Fit | Y2K GA Res LB Prenatal Care

A random sample of Year 2000 Georgia resident live births is checked for prenatal care status, in the following categories:

Prenatal Care Status	Number in Sample
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)	415
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)	62
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)	14
No Prenatal Care (1st Care at Delivery)	9
Total	500

Our null hypothesis is that the following probability model for year 2000 Georgia resident live births is correct:

Prenatal Care Status	Probability
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)	.75
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)	.15
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)	.05
No Prenatal Care (1st Care at Delivery)	.05
Total	1.00

Test this Hypothesis. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

Numbers

stage O e p error errorT

1st 415 375 0.75 4.2667 4.2667

2nd 62 75 0.15 2.2533 6.5200

3rd 14 25 0.05 4.8400 11.3600

No 9 25 0.05 10.2400 21.6000

From 4 11.3449 0.010, p < .01

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Observed = 415

Probability from Model = .75

Expected = N*P = 500*.75 = 375

Error = (Observed - Expected)²/Expected = (415 - 375)²/375 ≈ 4.2667

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Observed = 62

Probability from Model = .15

Expected = N*P = 500*.15 = 75

Error = (Observed - Expected)²/Expected = (62 - 75)²/75 ≈ 2.2533

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

Observed = 14

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)²/Expected = (14 - 25)²/25 = 4.84

No Prenatal Care (1st Care at Delivery)

Observed = 9

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)²/Expected = (9 - 25)²/25 = 10.24

Total Error = Error_1st + Error_2nd + Error_3rd + Error_No ≈ 4.2667 + 2.2533 + 4.84 + 10.24 ≈ 21.6 over four categories.

p-value from row 4 11.3449 0.010, p < .01
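For readers without the table at hand, the sketch below computes the upper-tail probability directly. It assumes the tabled errors are chi-square cutoffs with (categories - 1) degrees of freedom, an assumption the source does not state but which reproduces the quoted row. The function name `chi_square_upper_tail` is hypothetical, and the closed-form sums apply to integer degrees of freedom only.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with integer df >= x), via closed-form sums."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum of y**j / j! for j < df/2.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) plus half-integer correction terms.
    total = math.erfc(math.sqrt(y))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (j - 0.5) / math.gamma(j + 0.5)
    return total


categories = 4
p = chi_square_upper_tail(21.6, categories - 1)
print(p < 0.01)  # True, in agreement with the table lookup

# The quoted row "4 11.3449 0.010" is recovered to three decimals.
print(round(chi_square_upper_tail(11.3449, categories - 1), 3))  # 0.01
```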

Interpretation

Our population consists of year 2000 Georgia resident live born infants.

Each infant’s prenatal care status falls into a single category:

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

No Prenatal Care (1st Care at Delivery).

Our model presents the following probabilities for each category of care:

Pr{Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75

Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15

Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .05

Pr{No Prenatal Care (1st Care at Delivery)} = .05

Each member of the family of samples is a single random sample of 500 year 2000 Georgia resident live born infants. The family contains all possible samples of this type.

From each member of the family of samples, compute the observed and expected category counts, then compute an error for each category:

Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)

Observed

Probability from Model = .75

Expected = N*P = 500*.75 = 375

Error = (Observed - Expected)²/Expected

Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)

Observed

Probability from Model = .15

Expected = N*P = 500*.15 = 75

Error = (Observed - Expected)²/Expected

Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)

Observed

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)²/Expected

No Prenatal Care (1st Care at Delivery)

Observed

Probability from Model = .05

Expected = N*P = 500*.05 = 25

Error = (Observed - Expected)²/Expected

Total Error = Error_1st + Error_2nd + Error_3rd + Error_NoPNC over four categories.

If our model for prenatal care status is correct, then less than 1% of the members of the family of samples yield errors as bad as or worse than our error. Our sample presents highly significant evidence against the model.

From http://www.cjalverson.com/_3rdhourlyspring2009verBkey.htm :

Case Four | Hypothesis Test: Categorical Goodness of Fit | Prenatal Care

A random sample of Year 2000 Georgia resident live births are checked for prenatal care status, in the following categories – consider the portion of the sample reporting prenatal care:

Prenatal Care Status	Number in Sample
Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)	410
Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)	52
Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)	12

Test the hypothesis that:

Pr{Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy)} = .75

Pr{Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy)} = .15

Pr{Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy)} = .10 .

Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value. Show all work and detail for full credit.

Numbers

Prenatal Care Status	n	P	E=nP	Error
PNC T1	410	0.75	355.5	8.355133615
PNC T2	52	0.15	71.1	5.130942335
PNC T3	12	0.1	47.4	26.43797468
Total Sample with PNC	474	1	474	39.92405063

n=474

Prenatal Starts 1st Trimester

Observed=410

Expected = n*P_T1 = 474*.75 = 355.5

Error = (Observed – Expected)²/Expected = (410 – 355.5)²/355.5 ≈ 8.355133615

Prenatal Starts 2nd Trimester

Observed=52

Expected = n*P_T2 = 474*.15 = 71.1

Error = (Observed – Expected)²/Expected = (52 – 71.1)²/71.1 ≈ 5.130942335

Prenatal Starts 3rd Trimester

Observed=12

Expected = n*P_T3 = 474*.10 = 47.4

Error = (Observed – Expected)²/Expected = (12 – 47.4)²/47.4 ≈ 26.43797468

Total Error ≈ 8.355133615 + 5.130942335 + 26.43797468 ≈ 39.92405063 over three categories.

From row: 3 9.2103 0.010, p-value < .01, since our error exceeds 9.2103.
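The table above can be rebuilt from the three counts alone, which also makes explicit that n = 474 is the number of births reporting prenatal care. The sketch assumes, as the tabled row suggests, that the three-category error follows a chi-square distribution with two degrees of freedom, for which the upper tail is exp(-x/2). The dictionary keys are illustrative labels.

```python
import math

observed = {"T1": 410, "T2": 52, "T3": 12}
null_probs = {"T1": 0.75, "T2": 0.15, "T3": 0.10}

n = sum(observed.values())  # 474 births reporting prenatal care
total_error = 0.0
for status, count in observed.items():
    expected = n * null_probs[status]
    total_error += (count - expected) ** 2 / expected

# Three categories, two degrees of freedom: upper tail is exp(-x/2).
p_value = math.exp(-total_error / 2.0)

print(n, round(total_error, 4))  # 474 39.9241
print(p_value < 0.01)            # True

# The quoted row "3 9.2103 0.010" is consistent with this formula.
print(round(math.exp(-9.2103 / 2.0), 3))  # 0.01
```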

Interpretation

Our population is the population of Year 2000 Georgia resident live births. Our categories are based on those who reported receiving Prenatal Care and include: (T1)Prenatal Care Began 1st Trimester (Months 1-3 of Pregnancy), (T2)Prenatal Care Began 2nd Trimester (Months 4-6 of Pregnancy) and (T3)Prenatal Care Began 3rd Trimester (Months 7-9 of Pregnancy). Our null hypothesis is that the categories are distributed as: 75% T1, 15% T2 and 10% T3.

Our Family of Samples (FoS) consists of every possible random sample of 474 Year 2000 Georgia resident live births. Under the null hypothesis, within each member of the FoS, we expect approximately:

E_T1 = N*P_T1 = 474*.75 = 355.5

E_T2 = N*P_T2 = 474*.15 = 71.1

E_T3 = N*P_T3 = 474*.10 = 47.4

From each member sample of the FoS, we compute sample counts and errors for each prenatal care category:

E_T1 = N*P_T1 = 474*.75 = 355.5

Error_T1 = (O_T1 - E_T1)²/ E_T1

E_T2 = N*P_T2 = 474*.15 = 71.1

Error_T2 = (O_T2 - E_T2)²/ E_T2

E_T3 = N*P_T3 = 474*.10 = 47.4

Error_T3 = (O_T3 - E_T3)²/ E_T3

Then add the individual errors for the total error as Total Error = Error_T1 + Error_T2 + Error_T3

Computing this error for each member sample of the FoS, we obtain a Family of Errors (FoE).

If the prenatal care categories are distributed as: 75% T1, 15% T2 and 10% T3, then fewer than 1% of the member samples of the Family of Samples yield errors as large as or larger than that of our single sample. Our sample presents highly significant evidence against the null hypothesis.

From http://www.mindspring.com/~cjalverson/_2ndhourlysummer2007versionAkey.htm :

Case Six

Hypothesis Test – Categorical Goodness-of-Fit

LeRoy’s Sarcoma and Survival Time Categories

Very Short Survival: 6 weeks or less;

Abbreviated Survival: 7 weeks to 12 weeks;

Regular Survival: 13 weeks to 72 weeks and

Long Term Survival: 73 or more weeks.

We wish to evaluate the following model for survival time in patients with LeRoy’s Sarcoma:

Pr{Very Short Term Survival: (6 weeks or less)} = .25,

Pr{ Abbreviated Survival: (7 weeks to 12 weeks)} = .05,
Pr{ Regular Survival: (13 weeks to 72 weeks)} = .45, and

Pr{ Long Term Survival: (73 or more weeks)} = .25.

Consider a sample of patients with LeRoy’s Sarcoma (LS) with these survival times (in weeks):

3, 3, 3, 3, 4, | 4, 4, 4, 5, 5 |, 5, 5, 5, 5, 5, | 6, 6, 6, 6, 6, (20)

7, 7, 8, 10, 12, (5)

13, 13, 14, 15, 15, | 16, 17, 18, 23, 34 |, 37, 45, 60, 70, (14)

80, 85, 86, 95, 110, | 135, 150, 185, 253, 350,| 750. (11)

Test the Hypothesis that the survival times for patients with LeRoy’s Sarcoma are distributed as indicated in the probability model. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of the population and the null hypothesis, the family of samples, the family of errors and the interpretation of the p-value.

Numbers

Very Short Survival

3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6

o = 20

e=50*.25=12.5

error = (o-e)²/e = (20-12.5)²/12.5 = 4.5

Abbreviated Survival

7, 7, 8, 10, 12

o = 5

e=50*.05=2.5

error = (o-e)²/e = (5-2.5)²/2.5 = 2.5

Regular Survival

13, 13, 14, 15, 15, 16, 17, 18, 23, 34, 37, 45, 60, 70

o = 14

e=50*.45=22.5

error = (o-e)²/e = (14-22.5)²/22.5 ≈ 3.2111

Long Term Survival

80, 85, 86, 95, 110, 135, 150, 185, 253, 350, 750

o = 11

e=50*.25=12.5

error = (o-e)²/e = (11-12.5)²/12.5 = 0.18

Total Error = 4.5 + 2.5 + 3.2111 + 0.18 ≈ 10.3911 over 4 categories

From 4 9.8374 0.020 and 4 11.3449 0.010, .01 < p-value < .02

Interpretation:

Population: Cases of LeRoy’s Sarcoma

Population Proportions: Very Short Term Survival: (6 weeks or less) = .25, Abbreviated Survival: (7 weeks to 12 weeks) = .05, Regular Survival: (13 weeks to 72 weeks) = .45 and Long Term Survival: (73 weeks or more) = .25

Family of Samples: Each member is a single random sample of 50 cases with LeRoy’s Sarcoma.

For each member of the FoS, compute:

Total Error = {(expected_{ST ≤6 Weeks} - observed_{ST ≤6 Weeks})²/expected_{ST ≤6 Weeks}} +

{(expected_{ST 7-12 Weeks} - observed_{ST 7-12 Weeks})²/expected_{ST 7-12 Weeks}} +

{(expected_{ST 13-72 Weeks} - observed_{ST 13-72 Weeks})²/expected_{ST 13-72 Weeks}} +

{(expected_{ST 73+ Weeks} - observed_{ST 73+ Weeks})²/expected_{ST 73+ Weeks}}.

Doing so for every member of the FoS yields the Family of Errors.

We reject the Null Hypothesis at 5% significance.

From here http://www.cjalverson.com/_compfinalspring2009verMkey.htm :

Case Four | Hypothesis Test – Categorical Goodness-of-Fit | Traumatic Brain Injury

The Glasgow Coma Scale (GCS) is the most widely used system for scoring the level of consciousness of a patient who has had a traumatic brain injury. GCS is based on the patient's best eye-opening, verbal, and motor responses. Each response is scored and then the sum of the three scores is computed. That is,

Augmented Glasgow Coma Scale Categories

Mild = 13, 14, 15

Moderate = 9, 10, 11, 12

Severe/Coma = 3, 4, 5, 6, 7, 8

Pre-admission Death/PAD/DOA = 0

Traumatic brain injury (TBI) is an insult to the brain from an external mechanical force, possibly leading to permanent or temporary impairments of cognitive, physical, and psychosocial functions with an associated diminished or altered state of consciousness. Consider a random sample of patients with TBI, with GCS at initial treatment and diagnosis listed below:

0, 0, 0, 0, 0, | 0, 0, 0, 0, 0, | 0, 0, 0, 0, (14)

3, 3, 3, 3, 3, | 4, 4, 4, 4, 4, | 4, 4, 5, 5, 5, | 5, 6, 6, 6, 6, | 6, 7, 7, 7, 7, |

8, 8, 8, (28)

9, 9, 9, 9, 9, | 9, 9, 9, 10, 10, | 10, 10, 11, 11, 11, | 12, 12, 12, (18)

13, 13, 13, 14, 14, | 14, 14, 14, 14, 15,| 15 (11)

Our null hypothesis is that TBI case outcomes are 20% Pre-admission Deaths, 50% Severe, 20% Moderate and 10% Mild. Test this Hypothesis. Show your work. Completely discuss and interpret your test results, as indicated in class and case study summaries.

Numbers

Pre-admission Death (PAD)

Observed_PAD = 14

Expected_PAD = n*P_PAD= 71*.20 = 14.2

Error_PAD = (Observed_PAD – Expected_PAD)²/Expected_PAD = (14 – 14.2)²/14.2 ≈ 0.00282

Severe (GCS in 3, 4, 5, 6, 7, 8)

Observed_Severe = 28

Expected_Severe = n*P_Severe = 71*.50 = 35.5

Error_Severe = (Observed_Severe – Expected_Severe)²/Expected_Severe = (28 – 35.5)²/35.5 ≈ 1.58451

Moderate (GCS in 9, 10, 11, 12)

Observed_Moderate = 18

Expected_Moderate = n*P_Moderate = 71*.20 = 14.2

Error_Moderate = (Observed_Moderate – Expected_Moderate)²/Expected_Moderate = (18 – 14.2)²/14.2 ≈ 1.01690

Mild (GCS in 13, 14, 15)

Observed_Mild = 11

Expected_Mild = n*P_Mild = 71*.10 = 7.1

Error_Mild = (Observed_Mild – Expected_Mild)²/Expected_Mild = (11 – 7.1)²/7.1 ≈ 2.14225

Total Error = Error_PAD + Error_Severe + Error_Moderate + Error_Mild ≈ 0.00282 + 1.58451 + 1.01690 + 2.14225 ≈ 4.74648 over 4 categories.

From rows: 4 4.6416 0.200 and 4 4.9566 0.175, .175 ≤ p ≤ .200.
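The tallies and total error above can be verified from the raw scores. In this sketch the short labels ("PAD", "Severe", and so on) and the helper name `severity` are illustrative; the score ranges come from the Augmented Glasgow Coma Scale categories listed earlier.

```python
from collections import Counter

gcs = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
       6, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8,
       9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 12, 12, 12,
       13, 13, 13, 14, 14, 14, 14, 14, 14, 15, 15]


def severity(score):
    """Map a GCS score to its augmented severity category."""
    if score == 0:
        return "PAD"
    if 3 <= score <= 8:
        return "Severe"
    if 9 <= score <= 12:
        return "Moderate"
    if 13 <= score <= 15:
        return "Mild"
    raise ValueError(f"score {score} is outside the augmented scale")


counts = Counter(severity(s) for s in gcs)
null_probs = {"PAD": 0.20, "Severe": 0.50, "Moderate": 0.20, "Mild": 0.10}

n = len(gcs)
total = sum((counts[c] - n * p) ** 2 / (n * p) for c, p in null_probs.items())

print(n, dict(counts))
print(round(total, 5))  # 4.74648
```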

Each member of the Family of Samples(FoS) is a random sample of 71 Traumatic Brain Injury patients – the FoS consists of all possible samples of this type.

From each member sample of the FoS, compute the following items at each level of severity:

Pre-admission Death (PAD)

Observed_PAD

Expected_PAD = n*P_PAD= 71*.20 = 14.2

Error_PAD = (Observed_PAD – Expected_PAD)²/Expected_PAD

Severe (GCS in 3, 4, 5, 6, 7, 8)

Observed_Severe

Expected_Severe = n*P_Severe = 71*.50 = 35.5

Error_Severe = (Observed_Severe – Expected_Severe)²/Expected_Severe

Moderate (GCS in 9, 10, 11, 12)

Observed_Moderate

Expected_Moderate = n*P_Moderate = 71*.20 = 14.2

Error_Moderate = (Observed_Moderate – Expected_Moderate)²/Expected_Moderate

Mild (GCS in 13, 14, 15)

Observed_Mild

Expected_Mild = n*P_Mild = 71*.10 = 7.1

Error_Mild = (Observed_Mild – Expected_Mild)²/Expected_Mild

Then compute Total Error = Error_PAD+ Error_Severe + Error_Moderate + Error_Mild

Repeating these calculations for each member sample of the FoS yields a Family of Errors (FoE).

If the population proportions for TBI Severity are P_PAD=.20, P_Severe =.50, P_Moderate = .20 and P_Mild = .10, then between 17.5% and 20% of the member samples of the FoS yield errors as large as or larger than our single computed error. Our sample does not seem to present significant evidence against the null hypothesis.
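The statement about the Family of Errors can be illustrated by simulating member samples under the null proportions. This is a sketch only: the seed and the number of simulated samples are arbitrary, and because the counts are discrete the simulated fraction need not fall exactly inside the tabled 17.5% to 20% bracket.

```python
import random
from collections import Counter

labels = ["PAD", "Severe", "Moderate", "Mild"]
probs = [0.20, 0.50, 0.20, 0.10]
n = 71  # patients per member sample


def total_error(counts):
    """Sum of (observed - expected)^2 / expected over the four categories."""
    return sum((counts[c] - n * p) ** 2 / (n * p) for c, p in zip(labels, probs))


observed = total_error({"PAD": 14, "Severe": 28, "Moderate": 18, "Mild": 11})

rng = random.Random(71)  # fixed seed so the run is repeatable
n_samples = 5000
hits = 0
for _ in range(n_samples):
    # One member of the Family of Samples: 71 patients drawn under the null.
    draw = Counter(rng.choices(labels, weights=probs, k=n))
    if total_error(draw) >= observed:
        hits += 1

fraction = hits / n_samples
print(round(observed, 5), round(fraction, 3))
```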