Key | The 3rd Hourly | Math 1107
| Fall 2009
Protocol
You will use only the
following resources: Your individual calculator; individual tool-sheet (one (1)
8.5 by 11 inch sheet), writing utensils, blank paper
(provided by me) and this copy of the hourly. Do not share these
resources with anyone else. Show complete detail
and work for full credit. Follow case
study solutions and sample hourly keys in presenting your solutions. Work
all four cases. Using only one side of the blank sheets provided, present
your work. Do not write on both sides of the sheets provided, and present your
work only on these sheets. Do not share information with any other students
during this hourly.
When you
are finished: Prepare a Cover Sheet: Print your name on an otherwise blank
sheet of paper. Then stack your stuff as follows: Cover Sheet (Top),
Your Work Sheets, The Test Papers, Your Toolsheet. Then hand all of this in to me.
Sign and
Acknowledge: I agree to follow
this protocol.
________________________________________________________________________
Name
(PRINTED)
Signature
Date
Case One | Confidence Interval: Population Proportion | Glioblastoma Multiforme
Glioblastoma multiforme
(GBM) is the highest grade glioma
tumor and is the most malignant form of astrocytomas.
These tumors originate in the brain. GBM tumors grow rapidly, invade nearby
tissue and contain cells that are very malignant. GBM are among the most common
and devastating primary brain tumors in adults.
Suppose that we have a
random sample of GBM patients, with survival time (in months) listed below:
0, 1, 2, 2, 3 | 3, 3, 3, 4, 4 | 4, 4, 4, 5, 5 | 5, 5, 5, 5, 6
6, 6, 7, 7, 8 | 8, 8, 9, 10, 10 | 11, 11, 11, 12, 12 | 12, 13, 13, 13, 14
14, 15, 16, 17, 17 | 18, 18, 19, 23, 24 | 24, 25, 27, 30, 36 | 38, 40, 58, 60, 61
Consider the proportion of GBM patients who
survive strictly less than 24 months. Compute and interpret a 95% confidence
interval for this population proportion. Show your work. Completely discuss and interpret your test
results, as indicated in class and case study summaries.
0, 1, 2, 2, 3 | 3, 3, 3, 4, 4 | 4, 4, 4, 5, 5 | 5, 5, 5, 5, 6
6, 6, 7, 7, 8 | 8, 8, 9, 10, 10 | 11, 11, 11, 12, 12 | 12, 13, 13, 13, 14
14, 15, 16, 17, 17 | 18, 18, 19, 23, 24 | 24, 25, 27, 30, 36 | 38, 40, 58, 60, 61
n    e    p         Z    sdp        lower95   upper95
60   49   0.81667   2    0.049954   0.71676   0.91657
event = GBM patient survives strictly less than 24 months
e = event count = 49
n = sample size = 60
p = sample proportion of GBM patients surviving strictly less than
24 months = 49/60 = .81667
sdp = square root of (p*(1 – p)/n) = sqrt((.81667*.18333)/60)
= 0.049954
From Table 1, row 2.00 0.022750 0.95450: Z = 2
lower95 = p – (Z*sdp) = .81667 – (2*0.049954) = 0.71676
upper95 = p + (Z*sdp) = .81667 + (2*0.049954) = 0.91657
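The arithmetic above can be checked with a short Python sketch. The function name proportion_interval and the variable names are illustrative assumptions, not part of the course materials; Z = 2 is taken from the Table 1 row used above.

```python
import math

# Survival times in months, copied from the Case One sample.
survival = [0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6,
            6, 6, 7, 7, 8, 8, 8, 9, 10, 10, 11, 11, 11, 12, 12,
            12, 13, 13, 13, 14, 14, 15, 16, 17, 17, 18, 18, 19, 23, 24,
            24, 25, 27, 30, 36, 38, 40, 58, 60, 61]

def proportion_interval(e, n, z):
    """Return (p, sdp, lower, upper) for event count e in a sample of size n."""
    p = e / n                          # sample proportion
    sdp = math.sqrt(p * (1 - p) / n)   # estimated standard deviation of p
    return p, sdp, p - z * sdp, p + z * sdp

n = len(survival)
e = sum(1 for t in survival if t < 24)   # strictly less than 24 months
p, sdp, lower95, upper95 = proportion_interval(e, n, z=2.0)
print(n, e, round(p, 5), round(sdp, 6), round(lower95, 5), round(upper95, 5))
```

Changing the cutoff in the comprehension (for example t < 12) gives the interval for a different event with the same steps.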
Our population consists of patients who
have been diagnosed with and who have died with glioblastoma
multiforme (GBM). Our population proportion is the
population proportion of GBM patients surviving strictly less than 24 months.
Each member of the family of samples (FoS) is a single random sample of 60 patients who have been
diagnosed with and who have died with glioblastoma multiforme (GBM). The FoS
consists of all possible samples of this type.
From each member of the (FoS), compute:
e = sample count of GBM patients surviving strictly
less than 24 months
p = sample proportion of GBM patients surviving strictly
less than 24 months = e/60
sdp = square root of (p*(1 – p)/n )
From Table 1, row 2.00 0.022750 0.95450: Z = 2
and
then compute the interval as: lower95 =
p – (2*sdp), upper95 = p + (2*sdp).
Computing this interval for each member of
the FoS forms a family of intervals (FoI).
Approximately 95% of the FoI captures the true population proportion of GBM patients
surviving strictly less than 24 months. If our
interval resides in this 95% supermajority, then between 71.7% and 91.7% of GBM
patients survive strictly less than 24 months past diagnosis.
Case Two | Hypothesis Test – Population
Median | Glioblastoma Multiforme
Using the Glioblastoma multiforme (GBM)
data from Case One, test the following: null (H0): The
median GBM survival time is 20 months (η = 20)
against the alternative (H1): η < 20.
Show your work.
Completely discuss and interpret your test results, as indicated in class and
case study summaries.
Null Hypothesis: Median Survival Time = 20
Months
Alternative
Hypothesis: Median Survival Time < 20 Months (Guess is too Large)
Error Function: Number of Sample Patients
Surviving Strictly Less Than 20 Months
0, 1, 2, 2, 3 | 3, 3, 3, 4, 4 | 4, 4, 4, 5, 5 | 5, 5, 5, 5, 6
6, 6, 7, 7, 8 | 8, 8, 9, 10, 10 | 11, 11, 11, 12, 12 | 12, 13, 13, 13, 14
14, 15, 16, 17, 17 | 18, 18, 19, ||||20||||, 23, 24 | 24, 25, 27, 30, 36 | 38, 40, 58, 60, 61
sample error = Number of Sample Patients
Surviving Strictly Less Than 20 Months = 48
n = sample size = 60
From Table 2, row 60 48 <0.00001:
p < 0.00001 < .01 (p is strictly less than 1/100,000)
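As a cross-check on the table lookup, the sketch below computes an exact upper-tail probability. It assumes that, under the null hypothesis, the count of sample patients below the hypothesized median behaves like a Binomial(n, 1/2) count. That model, the function name upper_tail, and the choice of tail P(X >= error) are assumptions; the exact convention used to build Table 2 is not stated in this key.

```python
from math import comb

# Survival times in months, copied from the Case One sample.
survival = [0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6,
            6, 6, 7, 7, 8, 8, 8, 9, 10, 10, 11, 11, 11, 12, 12,
            12, 13, 13, 13, 14, 14, 15, 16, 17, 17, 18, 18, 19, 23, 24,
            24, 25, 27, 30, 36, 38, 40, 58, 60, 61]

def upper_tail(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2) (assumed null model)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

n = len(survival)
error = sum(1 for t in survival if t < 20)   # strictly less than 20 months
p_value = upper_tail(n, error)
print(n, error, p_value)
```

Under this assumed model the tail probability for the sample error falls below 0.00001, which agrees with the conclusion drawn from the table.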
Our population consists of patients who
have been diagnosed with and who have died with glioblastoma
multiforme (GBM). Our null hypothesis is that the
population median survival time for this population is 20 months.
Each member of the family of samples (FoS) is a single random sample of 60 patients who have been
diagnosed with and who have died with glioblastoma multiforme (GBM). The FoS
consists of all possible samples of this type.
From each member of the (FoS), compute an error as the number of sample GBM patients
surviving strictly less than 20 months. Computing this error for each member of
the FoS forms a family of errors (FoE).
If the true population median survival time
for GBM patients is 20 months, then fewer than 0.001% of member samples from the FoS
yield errors as bad as or worse than our error. The
sample presents highly significant evidence against the null hypothesis.
Case Three | Confidence
Interval: Population Mean | Gestational Age
Gestational age is the time spent between conception
and birth, usually measured in weeks. In general, infants born after 36 or
fewer weeks of gestation are defined as premature, and may face significant
challenges in health and development. Infants born after 37-40 weeks of
gestation are generally viewed as full term, and those born after 41 or more
weeks of gestation are generally viewed as post term. Suppose that a random
sample of 2006 US resident live born infants yields the following gestational
ages (in weeks):
31, 31, 32, 32, 33, 34, 34,
34, 35, 35, 36, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 38, 38,
38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 41, 41, 41, 41, 42, 42,
42, 42, 42, 44, 47
Estimate the population
mean gestational age for 2006 US resident live born infants with 94%
confidence. That is, compute and
discuss a 94% confidence interval for this population mean. Provide concise and
complete details and discussion as demonstrated in the case study summaries.
n    m       sd        se        Z     lower94   upper94
72   38.25   2.83713   0.33436   1.9   37.6147   38.8853
From Table 1, row 1.90 0.028717 0.94257: Z = 1.9
se = sd/sqrt(n) = 2.83713/sqrt(72) = 0.33436
lower94 = m – (Z*se) = 38.25 – (1.9*0.33436) = 37.6147
upper94 = m + (Z*se) = 38.25 + (1.9*0.33436) = 38.8853
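The same calculation can be sketched in Python with the standard statistics module. The sample is entered as a hypothetical value-to-count mapping (age_counts) tallied from the Case Three data; Z = 1.9 is the Table 1 multiplier used above.

```python
import math
import statistics

# Gestational ages (weeks) from Case Three, stored as value -> count.
age_counts = {31: 2, 32: 2, 33: 1, 34: 3, 35: 2, 36: 1,
              37: 14, 38: 14, 39: 8, 40: 14,
              41: 4, 42: 5, 44: 1, 47: 1}
ages = [age for age, count in age_counts.items() for _ in range(count)]

n = len(ages)
m = statistics.mean(ages)      # sample mean
sd = statistics.stdev(ages)    # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)         # standard error of the sample mean
z = 1.9                        # Table 1 multiplier for 94% confidence
lower94, upper94 = m - z * se, m + z * se
print(n, m, round(sd, 5), round(se, 5), round(lower94, 4), round(upper94, 4))
```

Note that statistics.stdev uses the n - 1 divisor, which matches the sd value reported in this key.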
Our population consists of US resident live
births occurring in 2006. Our population mean is the population mean
gestational age in weeks.
Each member of the family of samples (FoS) is a single random sample of 72 year 2006 US resident
live born infants. The FoS consists of all possible
samples of this type.
From each member of the (FoS), compute:
m = sample mean gestational age in weeks
sd = sample standard deviation of gestational age in weeks
se = sample standard error = sd/sqrt(72)
From Table 1, row 1.90 0.028717 0.94257: Z = 1.9
and
then compute the interval as: lower94 = m – (1.9*se),
upper94 = m + (1.9*se).
Computing this interval for each member of
the FoS forms a family of intervals (FoI).
Approximately 94% of the FoI captures the true population mean gestational age for
year 2006 US resident live born infants. If our interval resides in this 94%
supermajority, then the population mean gestational age for year 2006 US
resident live born infants is between 37.6 and 38.9 weeks.
Case Four | Hypothesis Test: Categorical Goodness of Fit | Gestational Age
Gestational age is the time spent
between conception and birth, usually measured in weeks. In general, infants
born after 36 or fewer weeks of gestation are defined as premature,
and may face significant challenges in health and development. Infants born
after 37-40 weeks of gestation are generally viewed as full term,
and those born after 41 or more weeks of gestation are generally viewed as post
term. Using the data and context from Case Three, test the
null hypothesis that the 2006 US resident live births are
distributed as 10% Premature, 80% Full Term and 10% Post Term. Show your work. Completely discuss and
interpret your test results, as indicated in class and case study summaries. Fully discuss the testing procedure and results. This discussion must include a clear discussion of
the population and the null hypothesis, the family of samples, the family of
errors and the interpretation of the p-value.
Prematurity (< 37 weeks): 31, 31, 32, 32, 33| 34, 34, 34, 35,
35| 36 (11)
Full Term (37-40 weeks): 37,
37, 37, 37, 37| 37, 37, 37, 37, 37| 37, 37, 37, 37, 38| 38, 38, 38, 38, 38|
38, 38, 38, 38, 38| 38, 38, 38, 39, 39| 39, 39, 39, 39, 39| 39, 40,
40, 40, 40| 40, 40, 40, 40, 40|
40, 40, 40, 40, 40| (50)
Post Term (41+ weeks): 41, 41, 41, 41, 42 | 42, 42, 42, 42, 44 |
47 (11)
Total = 11 + 50
+ 11 = 61 + 11 = 72
Expected Counts
from the Null Hypothesis for n=72
ePremature = 72*PPremature =
72*.10 = 7.2
eFull = 72*PFull = 72*.80 = 57.6
ePost = 72*PPost = 72*.10 = 7.2
Error
Calculations
errorPremature = (nPremature – ePremature)^2 / ePremature = (11 – 7.2)^2 / 7.2 = 2.00556
errorFull = (nFull – eFull)^2 / eFull = (50 – 57.6)^2 / 57.6 = 1.00278
errorPost = (nPost – ePost)^2 / ePost = (11 – 7.2)^2 / 7.2 = 2.00556
errorTotal = errorPremature + errorFull + errorPost
= 2.00556 + 1.00278 + 2.00556 = 5.0139
From Table 3, rows 3 4.8159 0.090
and 3 5.0515 0.080: .08 < p < .09, so the p-value is between 8% and 9%
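The expected counts and errors above can be recomputed with a short Python sketch. The dictionary names are illustrative. The p-value line rests on an assumption: that Table 3 follows a chi-square distribution with (categories - 1) degrees of freedom. For three categories that distribution's upper tail is exactly exp(-error/2), and the Table 3 rows for three categories are consistent with it (for example, 4.6052 pairs with 0.100).

```python
import math

# Observed counts from the Case Three sample, by gestational age category.
observed = {"Premature": 11, "Full Term": 50, "Post Term": 11}
# Category probabilities stated by the null hypothesis.
null_probs = {"Premature": 0.10, "Full Term": 0.80, "Post Term": 0.10}

n = sum(observed.values())
expected = {k: n * null_probs[k] for k in observed}
errors = {k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed}
error_total = sum(errors.values())

# Assumed reference distribution: chi-square with 2 degrees of freedom,
# whose upper-tail probability is exp(-x / 2).
p_value = math.exp(-error_total / 2)
print(expected, errors, round(error_total, 4), round(p_value, 4))
```

The closed-form tail only applies to the three-category case; with four or five categories a table lookup or a statistics library would be needed.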
Interpretation
Our
population consists of US resident live births occurring in 2006. Our categories are based on gestational
age: Prematurity (<37 weeks), Full Term (37-40 weeks) and Post Term (41 or
more weeks). Our null hypothesis is
that the categories are distributed as: 10% Prematurity, 80% Full Term and 10%
Post Term.
Our
Family of Samples (FoS)
consists of every possible random sample of 72 US resident live births
occurring in 2006. Under the null
hypothesis, within each member of the FoS, we
expect approximately:
ePremature = 72*PPremature =
72*.10 = 7.2
eFull = 72*PFull = 72*.80 = 57.6
ePost = 72*PPost = 72*.10 = 7.2
From each member sample of the FoS, we compute sample counts and errors for each
gestational age category:
errorPremature = (nPremature – ePremature)^2 / ePremature
errorFull = (nFull – eFull)^2 / eFull
errorPost = (nPost – ePost)^2 / ePost
Then add the individual errors for the
total error as errorTotal = errorPremature + errorFull + errorPost.
Computing this error for each member sample
of the FoS, we obtain a Family of Errors (FoE).
If
the gestational age categories for US resident live births occurring in 2006
are distributed as 10% Prematurity, 80%
Full Term and 10% Post Term, then between 8% and 9% of the member
samples of the Family of Samples yield errors as large as or larger than that
of our single sample. Our sample does not present significant evidence against
the null hypothesis.
Work all four (4)
cases.
Table 1. Means and Proportions
Z(k)   PROBRT      PROBCENT
0.05   0.48006     0.03988
0.10   0.46017     0.07966
0.15   0.44038     0.11924
0.20   0.42074     0.15852
0.25   0.40129     0.19741
0.30   0.38209     0.23582
0.35   0.36317     0.27366
0.40   0.34458     0.31084
0.45   0.32636     0.34729
0.50   0.30854     0.38292
0.55   0.29116     0.41768
0.60   0.27425     0.45149
0.65   0.25785     0.48431
0.70   0.24196     0.51607
0.75   0.22663     0.54675
0.80   0.21186     0.57629
0.85   0.19766     0.60467
0.90   0.18406     0.63188
0.95   0.17106     0.65789
1.00   0.15866     0.68269
1.05   0.14686     0.70628
1.10   0.13567     0.72867
1.15   0.12507     0.74986
1.20   0.11507     0.76986
1.25   0.10565     0.78870
1.30   0.09680     0.80640
1.35   0.088508    0.82298
1.40   0.080757    0.83849
1.45   0.073529    0.85294
1.50   0.066807    0.86639
1.55   0.060571    0.87886
1.60   0.054799    0.89040
1.65   0.049471    0.90106
1.70   0.044565    0.91087
1.75   0.040059    0.91988
1.80   0.035930    0.92814
1.85   0.032157    0.93569
1.90   0.028717    0.94257
1.95   0.025588    0.94882
2.00   0.022750    0.95450
2.05   0.020182    0.95964
2.10   0.017864    0.96427
2.15   0.015778    0.96844
2.20   0.013903    0.97219
2.25   0.012224    0.97555
2.30   0.010724    0.97855
2.35   0.009387    0.98123
2.40   0.008198    0.98360
2.45   0.007143    0.98571
2.50   0.006210    0.98758
2.55   0.005386    0.98923
2.60   0.004661    0.99068
2.65   0.004025    0.99195
2.70   0.0034670   0.99307
2.75   0.0029798   0.99404
2.80   0.0025551   0.99489
2.85   0.0021860   0.99563
2.90   0.0018658   0.99627
2.95   0.0015889   0.99682
3.00   0.0013499   0.99730
Table 2. Medians
n    error   base p-value
60   0       1.00000
60   1       1.00000
60   2       1.00000
60   3       1.00000
60   4       1.00000
60   5       1.00000
60   6       1.00000
60   7       1.00000
60   8       1.00000
60   9       1.00000
60   10      1.00000
60   11      1.00000
60   12      1.00000
60   13      0.99999
60   14      0.99998
60   15      0.99993
60   16      0.99980
60   17      0.99947
60   18      0.99866
60   19      0.99689
60   20      0.99326
60   21      0.98633
60   22      0.97405
60   23      0.95377
60   24      0.92250
60   25      0.87747
60   26      0.81685
60   27      0.74052
60   28      0.65056
60   29      0.55129
60   30      0.44871
60   31      0.34944
60   32      0.25948
60   33      0.18315
60   34      0.12253
60   35      0.07750
60   36      0.04623
60   37      0.02595
60   38      0.01367
60   39      0.00674
60   40      0.00311
60   41      0.00134
60   42      0.00053
60   43      0.00020
60   44      0.00007
60   45      0.00002
60   46      0.00001
60   47      <0.00001
60   48      <0.00001
60   49      <0.00001
Table 3. Categories/Goodness of Fit
Categories   ERROR     p-value
3            0.0000    1.000
3            0.2107    0.900
3            0.4463    0.800
3            0.7133    0.700
3            1.0217    0.600
3            1.3863    0.500
3            1.5970    0.450
3            1.8326    0.400
3            2.0996    0.350
3            2.4079    0.300
3            2.7726    0.250
3            3.2189    0.200
3            4.6052    0.100
3            4.8159    0.090
3            5.0515    0.080
3            5.3185    0.070
3            5.6268    0.060
3            5.9915    0.050
3            6.4378    0.040
3            7.0131    0.030
3            7.8240    0.020
3            9.2103    0.010

Categories   ERROR     p-value
4            0.0000    1.000
4            0.5844    0.900
4            1.0052    0.800
4            1.4237    0.700
4            1.8692    0.600
4            2.3660    0.500
4            2.6430    0.450
4            2.9462    0.400
4            3.2831    0.350
4            3.6649    0.300
4            4.1083    0.250
4            4.6416    0.200
4            4.9566    0.175
4            5.3170    0.150
4            5.7394    0.125
4            6.2514    0.100
4            6.4915    0.090
4            6.7587    0.080
4            7.0603    0.070
4            7.4069    0.060
4            7.8147    0.050
4            8.3112    0.040
4            8.9473    0.030
4            9.8374    0.020
4            11.3449   0.010

Categories   ERROR     p-value
5            0.0000    1.000
5            1.0636    0.900
5            1.6488    0.800
5            2.1947    0.700
5            2.7528    0.600
5            3.3567    0.500
5            3.6871    0.450
5            4.0446    0.400
5            4.4377    0.350
5            4.8784    0.300
5            5.3853    0.250
5            5.9886    0.200
5            6.3423    0.175
5            6.7449    0.150
5            7.2140    0.125
5            7.7794    0.100
5            8.0434    0.090
5            8.3365    0.080
5            8.6664    0.070
5            9.0444    0.060
5            9.4877    0.050
5            10.0255   0.040
5            10.7119   0.030
5            11.6678   0.020
5            13.2767   0.010