Summaries

Session 2.1

24th June 2009

 

Descriptive Statistics

 

TI-83 Notes

 

Making Friends with Your Calculator

http://www.geocities.com/calculatorhelp/ti83

http://www.lrc.edu/mat/ti83_statistics.htm

http://www.math.tamu.edu/~khalman/calculator.htm

http://east.chclc.org/russo/ti3801.htm

http://www.willamette.edu/~mjaneba/help/TI-82-stats.htm

http://faculty.purduenc.edu/jkuhn/courses/previous/workbooks/301/lab1.pdf

http://instruct1.cit.cornell.edu/courses/arme210/TI83.pdf

http://www.math.oregonstate.edu/home/programs/undergrad/TI_Manuals/ti83Guidebook.pdf

 

http://education.ti.com/us/product/tech/83p/guide/83pguideus.html

http://education.ti.com/guidebooks/graphing/84p/TI84PlusGuidebook_Part2_EN.pdf

 

Key Strokes for TI83, TI84

 

Key List

 

Power/ON: Last Key on Left, Bottom Row

STAT: Center Key, 3rd Row

▲►▼◄: Toggle Keys, 2nd and 3rd Rows

ENTER: Enter/Return Key, Last Key on Right, Bottom Row

CLEAR: Clear Key, Last Key on Right, 4th Row

DEL: Delete Key, Center Key, 2nd Row

 

Stroke Lists for Tasks

 

Set Up Data Lists: STAT, ▼▼▼▼, ENTER, ENTER

Clear Primary List L1: STAT, ENTER, ▲, CLEAR, ▼

Edit Primary List L1: STAT, ENTER, Enter Number, then ▼ or ENTER

Calculate Statistics for Primary List L1: STAT, ►, ENTER, ENTER

 

Use Toggle Keys ▲▼ to Navigate the Statistics Screens
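For readers who want to check calculator output on a computer, the values listed on the statistics screens can be approximated in Python. This is a minimal sketch, not the calculator's own implementation: the function name `one_var_stats`, the dictionary keys, and the short sample list are made up for illustration, and the quartiles are assumed to be medians of the lower and upper halves of the sorted data.

```python
import statistics

def one_var_stats(data):
    """Approximate the values listed on the statistics screens (needs n >= 2)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]              # points below the middle position
    upper = xs[half + n % 2:]      # points above it (middle point skipped when n is odd)
    return {
        "mean": statistics.mean(xs),
        "sum_x": sum(xs),
        "sum_x2": sum(x * x for x in xs),
        "Sx": statistics.stdev(xs),        # sample sd, divides by n - 1
        "sigma_x": statistics.pstdev(xs),  # population sd, divides by n
        "n": n,
        "minX": xs[0],
        "Q1": statistics.median(lower),
        "Med": statistics.median(xs),
        "Q3": statistics.median(upper),
        "maxX": xs[-1],
    }

# Hypothetical list, standing in for whatever was typed into L1.
stats = one_var_stats([2, 4, 4, 5, 7, 9, 10, 12])
for name, value in stats.items():
    print(f"{name:8s} {value}")
```

Other software may report slightly different quartiles for the same list, because several quartile conventions are in common use.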

 

Descriptive Statistics – Symbols

 

n – sample size, number of data points in the sample

mean (m) – sample mean, sum of the data points divided by sample size

 

px – xth percentile: approximately x% of the sample points are at or below px; approximately (100-x)% of the sample points are at or above px.

 

p0 – minimum, 0th percentile, q0 – smallest value for any data point in the sample

p25 – 25th percentile, q1 – lower quartile, approximately 25% of the sample points are at or below p25

p50 – median, 50th percentile, q2 – middle quartile, approximately 50% of the sample points are at or below p50

p75 – 75th percentile, q3 – upper quartile, approximately 75% of the sample points are at or below p75

p100 – maximum, 100th percentile, q4 – largest value for any data point in the sample

 

Ranges and Samples

 

Total Sample, Total Range: range = max – min = q4 – q0 = p100 – p0

 

Upper Three-quarter Sample, Upper Three-quarter Range = q4 – q1 = p100 – p25

Lower Three-quarter Sample, Lower Three-quarter Range = q3 – q0 = p75 – p0

 

Upper Half Sample, Upper Half Range = q4 – q2 = p100 – p50

(IQR) Middle Half Sample, Middle Half Range = q3 – q1 = p75 – p25

Lower Half Sample, Lower Half Range = q2 – q0 = p50 – p0

 

Upper Quarter Sample, Upper Quarter Range = q4 – q3 = p100 – p75

Upper Middle Quarter Sample, Upper Middle Quarter Range = q3 – q2 = p75 – p50

Lower Middle Quarter Sample, Lower Middle Quarter Range = q2 – q1 = p50 – p25

Lower Quarter Sample, Lower Quarter Range = q1 – q0 = p25 – p0
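The ten ranges listed above are simply every difference between a higher and a lower quartile. A small Python sketch can generate them all from one five-number summary; the function name `all_ranges` is hypothetical, and the example summary (3, 26, 30, 38, 62) is the one reported for the barrel survival times in Case 3.2 of these notes.

```python
from itertools import combinations

def all_ranges(q):
    """q = [q0, q1, q2, q3, q4]; map each pair (hi, lo) with hi > lo to q[hi] - q[lo]."""
    return {(hi, lo): q[hi] - q[lo] for lo, hi in combinations(range(5), 2)}

# Five-number summary of the barrel survival times (minutes) from Case 3.2.
ranges = all_ranges([3, 26, 30, 38, 62])
for (hi, lo), width in sorted(ranges.items()):
    print(f"q{hi} - q{lo} = {width}")
```

The pair (4, 0) is the total range, (3, 1) is the IQR, and the remaining eight pairs are the half, quarter, and three-quarter ranges.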

 

Example from http://www.mindspring.com/~cjalverson/_2ndhourlyfall2006versionA_key.htm

 

Case One

Descriptive Statistics

Serum Creatinine and Kidney (Renal) Function

Healthy kidneys remove wastes and excess fluid from the blood. Blood tests show whether the kidneys are failing to remove wastes. Urine tests can show how quickly body wastes are being removed and whether the kidneys are also leaking abnormal amounts of protein. The nephron is the basic structure in the kidney that produces urine. In a healthy kidney there may be as many as 1,000,000 nephrons. Loss of nephrons reduces the ability of the kidney to function by reducing the kidney’s ability to produce urine. Progressive loss of nephrons leads to kidney failure. Serum creatinine: Creatinine is a waste product that comes from meat protein in the diet and also comes from the normal wear and tear on muscles of the body. Creatinine is produced at a continuous rate and is excreted only through the kidneys. When renal dysfunction occurs, the kidneys are impaired in their ability to excrete creatinine and the serum creatinine rises. As kidney disease progresses, the level of creatinine in the blood increases.

Suppose that we sample serum creatinine levels in a random sample of adults. Serum creatinine (as mg/dL) for each sampled subject follows:

15.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8, 8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2, 3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6, 1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1, 1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6

Compute and interpret the following statistics: sample size (n), p00, p25, p50, p75, p100, (p75-p00), (p100-p25), (p75-p50), (p50-p25). Be specific and complete. Show your work, and discuss completely for full credit.

Numbers

 

n=69

p0 = 0.6

p25 = 0.8

p50 = 1.3

p75 = 3.5

p100 = 15.0

 

p75-p0 = 3.5 – 0.6 = 2.9

p100-p25 = 15.0 – 0.8 = 14.2

p75-p50 = 3.5 – 1.3 = 2.2

p50-p25 = 1.3 – 0.8 = 0.5

 

Note: Another acceptable estimate for p75 is 3.75, because quartile conventions differ slightly. The numbers below use this alternative value.

 

n=69

p0 = 0.6

p25 = 0.8

p50 = 1.3

p75 = 3.75

p100 = 15.0

 

p75-p0 = 3.75 – 0.6 = 3.15

p100-p25 = 15.0 – 0.8 = 14.2

p75-p50 = 3.75 – 1.3 = 2.45

p50-p25 = 1.3 – 0.8 = 0.5
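The two acceptable values for p75 come from two common quartile conventions. The sketch below, offered as an illustration rather than as the method used to prepare this key, applies both to the creatinine sample: interpolating by position (Python's `statistics.quantiles` with `method="inclusive"`) and taking medians of the lower and upper halves. The helper name `median_of_halves` is hypothetical.

```python
import statistics

creatinine = [
    15.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8,
    8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2,
    3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6,
    1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1,
    1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8,
    0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
    0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6,
]

def median_of_halves(data):
    """Quartiles as medians of the lower half, the whole sample, and the upper half."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    return (
        statistics.median(xs[:half]),
        statistics.median(xs),
        statistics.median(xs[half + n % 2:]),  # middle point skipped when n is odd
    )

interp = statistics.quantiles(creatinine, n=4, method="inclusive")
halves = median_of_halves(creatinine)
print("by position     :", interp)
print("median of halves:", halves)
```

Both conventions agree on p25 and p50 for this sample and differ only on p75 (3.5 versus 3.75), which is why either answer is acceptable.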

 

Interpretation

 

There are 69 subjects in the sample. Each subject yields a serum creatinine level.

 

The subject in the sample with the lowest level of serum creatinine has 0.6 mg creatinine per deciliter serum.

Approximately 25% of the subjects in the sample have 0.8 mg or less creatinine per deciliter serum.

Approximately 50% of the subjects in the sample have 1.3 mg or less creatinine per deciliter serum.

Approximately 75% of the subjects in the sample have 3.5 mg or less creatinine per deciliter serum.

The subject in the sample with the highest level of serum creatinine has 15.0 mg creatinine per deciliter serum.

Approximately 75% of the subjects in the sample have between 0.6 and 3.5 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this lower three-quarter-sample is 2.9 mg creatinine per deciliter serum.
Approximately 75% of the subjects in the sample have between 0.8 and 15.0 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this upper three-quarter-sample is 14.2 mg creatinine per deciliter serum.
Approximately 25% of the subjects in the sample have between 1.3 and 3.5 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this upper-middle-quarter-sample is 2.2 mg creatinine per deciliter serum.
Approximately 25% of the subjects in the sample have between 0.8 and 1.3 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this lower-middle-quarter-sample is 0.5 mg creatinine per deciliter serum.

 

The Other Ranges

 

p100-p0 = 15.0 – 0.6 = 14.4

100% of the subjects in the sample have between 0.6 and 15.0 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in the total sample is 14.4 mg creatinine per deciliter serum.

 

p100-p50 = 15.0 – 1.3 = 13.7

Approximately 50% of the subjects in the sample have between 1.3 and 15.0 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this upper-half sample is 13.7 mg creatinine per deciliter serum.

 

p75-p25 = 3.50 – 0.8 = 2.70

Approximately 50% of the subjects in the sample have between 0.8 and 3.5 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this middle-half sample is 2.7 mg creatinine per deciliter serum.

 

p50-p0 = 1.3 – 0.6 = 0.70

Approximately 50% of the subjects in the sample have between 0.6 and 1.3 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this lower-half sample is 0.7 mg creatinine per deciliter serum.

 

p100-p75 = 15.0 – 3.5 = 11.5

Approximately 25% of the subjects in the sample have between 3.5 and 15.0 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this upper-quarter sample is 11.5 mg creatinine per deciliter serum.

 

p25-p0 = 0.8 – 0.6 = 0.2

Approximately 25% of the subjects in the sample have between 0.6 and 0.8 mg creatinine per deciliter serum. The largest difference in serum creatinine between any two subjects in this lower-quarter sample is 0.2 mg creatinine per deciliter serum.

 

Example from here: http://www.mindspring.com/~cjalverson/_2nd_Hourly_Spring_2006_Key.htm

 

Case One

Descriptive Statistics

Maternal Body Mass Index (BMI)

 

BMI is defined as the ratio Weight/(Height^2), and is one of several measures of body size used in medicine and in public health. Consider a random sample of mothers, US residents, all aged 35 years or older at the time of the pregnancy, whose BMI, in kilograms per meter squared (kg/m2), is measured at the beginning of the pregnancy:

 

19.6 25.7 19.8 20.4 22.9 26.6 19.0 30.2 20.7 21.6 21.1 27.5 19.8 23.1 23.2 20.7 23.6 24.2 26.3 42.6 23.9 17.4 20.5 20.8 19.5 21.8 27.4 21.5 17.2 27.5 22.5 19.6 20.5 24.3 24.8 26.6 20.8 24.2 22.5 31.3 22.3 25.1 23.2 20.5 22.7 25.0 23.4 19.5 20.0 20.5

Compute and interpret the following statistics: sample size, p00, p25, p50, p75, p100, (p75-p00), (p75-p25), (p100-p50), (p100-p75).

Numbers     

 

n=50; p0=17.2; p25=20.5; p50=22.5; p75=24.8; p100=42.6; p75 - p0=7.6; p75 - p25=4.3; p100 - p50=20.1;

p100 - p75 = 42.6 - 24.8 = 17.8

 

Discussion

 

n=50: There are 50 mothers in the sample, US residents, all aged 35 years or older at the time of the pregnancy, whose BMI, measured as kilograms per meter squared (kg/m2) is measured at the beginning of the pregnancy.

 

p0=17.2: The mother in the sample with the lowest BMI had an initial BMI of 17.2 kg/m2.

 

p25=20.5: Approximately 25% of the mothers in the sample have initial BMIs of 20.5 kg/m2 or lower.

 

p50=22.5: Approximately 50% of the mothers in the sample have initial BMIs of 22.5 kg/m2 or lower.

 

p75=24.8: Approximately 75% of the mothers in the sample have initial BMIs of 24.8 kg/m2 or lower.

 

p100=42.6: The mother in the sample with the highest BMI had an initial BMI of 42.6 kg/m2.

 

p75 - p0=7.6: Approximately 75% of the mothers in the sample had initial BMIs between 17.2 and 24.8 kg/m2. The largest possible difference in initial BMI between any two mothers in this lower three-quarter sample is 7.6.

 

p75 - p25=4.3: Approximately 50% of the mothers in the sample had initial BMIs between 20.5 and 24.8 kg/m2. The largest possible difference in initial BMI between any two mothers in this middle half sample is 4.3.

 

p100 - p50=20.1: Approximately 50% of the mothers in the sample had initial BMIs between 22.5 and 42.6 kg/m2. The largest possible difference in initial BMI between any two mothers in this upper half sample is 20.1.

 

p100 - p75 = 42.6 - 24.8 = 17.8: Approximately 25% of the mothers in the sample had initial BMIs between 24.8 and 42.6 kg/m2. The largest possible difference in initial BMI between any two mothers in this upper quarter sample is 17.8 .

 

The Other Ranges

 

p100 - p0 = 42.6 - 17.2 = 25.4: 100% of the mothers in the sample had initial BMIs between 17.2 and 42.6 kg/m2. The largest possible difference in initial BMI between any two mothers in the total sample is 25.4.

 

p100 - p25 = 42.6 - 20.5 = 22.1: Approximately 75% of the mothers in the sample had initial BMIs between 20.5 and 42.6 kg/m2. The largest possible difference in initial BMI between any two mothers in this upper-three-quarter sample is 22.1

 

p50 - p0 = 22.5 - 17.2 = 5.3: Approximately 50% of the mothers in the sample had initial BMIs between 17.2 and 22.5 kg/m2. The largest possible difference in initial BMI between any two mothers in this lower half sample is 5.3.

 

p75 - p50 = 24.8 - 22.5 = 2.3: Approximately 25% of the mothers in the sample had initial BMIs between 22.5 and 24.8 kg/m2. The largest possible difference in initial BMI between any two mothers in this upper-middle-quarter sample is 2.3.

 

p50 - p25 = 22.5 - 20.5 = 2.0: Approximately 25% of the mothers in the sample had initial BMIs between 20.5 and 22.5 kg/m2. The largest possible difference in initial BMI between any two mothers in this lower-middle-quarter sample is 2.0 .

 

p25 - p0 = 20.5 - 17.2 = 3.3: Approximately 25% of the mothers in the sample had initial BMIs between 17.2 and 20.5 kg/m2. The largest possible difference in initial BMI between any two mothers in this lower-quarter sample is 3.3 .

 

Case 3.1

Descriptive Statistics

Serum Creatinine and Kidney (Renal) Function

Healthy kidneys remove wastes and excess fluid from the blood. Blood tests show whether the kidneys are failing to remove wastes. Urine tests can show how quickly body wastes are being removed and whether the kidneys are also leaking abnormal amounts of protein. The nephron is the basic structure in the kidney that produces urine. In a healthy kidney there may be as many as 1,000,000 nephrons. Loss of nephrons reduces the ability of the kidney to function by reducing the kidney’s ability to produce urine. Progressive loss of nephrons leads to kidney failure. Serum creatinine: Creatinine is a waste product that comes from meat protein in the diet and also comes from the normal wear and tear on muscles of the body. Creatinine is produced at a continuous rate and is excreted only through the kidneys. When renal dysfunction occurs, the kidneys are impaired in their ability to excrete creatinine and the serum creatinine rises. As kidney disease progresses, the level of creatinine in the blood increases.

Suppose that we sample serum creatinine levels in a random sample of adults. Serum creatinine (as mg/dL) for each sampled subject follows:

35.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8, 8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2, 3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6, 1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1, 1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.3, 0.2

Compute and interpret the following statistics: sample size (n), p00, p25, p50, p75, p100, (p75-p00), (p100-p50), (p75-p25), (p50-p00).

Numbers

       

  N        Q0        Q1          Q2          Q3         Q4

   69       0.2       0.8         1.3         3.5        35

 

p75 – p00 = 3.5 – .2 = 3.3

p100 – p50 = 35 – 1.3 = 33.7

p75 – p25 = 3.5 – .8 = 2.7

p50 – p00 = 1.3 – .2 = 1.1
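The Case 3.1 data match the earlier creatinine sample except at the extremes (35.0 in place of 15.0, and 0.3 and 0.2 in place of two of the 0.6 values). The sketch below rebuilds the earlier sample from the Case 3.1 list and compares the two summaries; it assumes quartiles are interpolated by position, and the name `five_numbers` is hypothetical.

```python
import statistics

case_31 = [
    35.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8,
    8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2,
    3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6,
    1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1,
    1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8,
    0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
    0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.3, 0.2,
]
# Earlier sample: same values, but max 15.0 and two more 0.6 readings at the bottom.
case_one = [15.0] + case_31[1:-2] + [0.6, 0.6]

def five_numbers(data):
    """Return (q0, q1, q2, q3, q4), with quartiles interpolated by position."""
    xs = sorted(data)
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return xs[0], q1, q2, q3, xs[-1]

for label, sample in (("Case One", case_one), ("Case 3.1", case_31)):
    q0, q1, q2, q3, q4 = five_numbers(sample)
    print(f"{label}: range = {q4 - q0:.1f}, IQR = {q3 - q1:.1f}, median = {q2:.1f}")
```

Only the extreme values changed, so the quartiles and the IQR are the same in both samples while the total range is much larger in Case 3.1.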

Discussion

There are 69 subjects in the sample.

The subject in the sample with the lowest serum creatinine level has .2 mg creatinine per dL serum.

Approximately 25% of the subjects in the sample have serum creatinine levels of .8 mg creatinine per dL serum or less.

Approximately 50% of the subjects in the sample have serum creatinine levels of 1.3 mg creatinine per dL serum or less.

Approximately 75% of the subjects in the sample have serum creatinine levels of 3.5 mg creatinine per dL serum or less.

The subject in the sample with the highest serum creatinine level has 35 mg creatinine per dL serum.

Approximately 75% of the subjects in the sample have serum creatinine levels between .2 and 3.5 mg creatinine per dL serum, and the largest possible difference in serum creatinine level between any pair of subjects in this lower three-quarter-sample is 3.3 mg creatinine per dL serum.

Approximately 50% of the subjects in the sample have serum creatinine levels between .8 and 3.5 mg creatinine per dL serum, and the largest possible difference in serum creatinine level between any pair of subjects in this middle-half-sample is 2.7 mg creatinine per dL serum.

Approximately 50% of the subjects in the sample have serum creatinine levels between 1.3 and 35 mg creatinine per dL serum, and the largest possible difference in serum creatinine level between any pair of subjects in this upper-half-sample is 33.7 mg creatinine per dL serum.

Approximately 50% of the subjects in the sample have serum creatinine levels between .2 and 1.3 mg creatinine per dL serum, and the largest possible difference in serum creatinine level between any pair of subjects in this lower-half-sample is 1.1 mg creatinine per dL serum.

Part Three

Case 3.2

Descriptive Statistics                    

Angry Barrels of Monkeys

A company, BarrelCorp™, manufactures barrels and wishes to ensure the strength and quality of its barrels. Chimpanzees traumatized the company owner as a youth, so the company uses the following test (Angry_Barrel_of_Monkeys_Test) of its barrels:

Ten (10) chimpanzees are loaded into the barrel.

The chimpanzees are exposed to Angry!Monkey!Gas!™, an agent guaranteed to drive the chimpanzees to a psychotic rage.

The angry, raging, psychotic chimpanzees then destroy the barrel from the inside in an angry, raging, psychotic fashion.

The survival time, in minutes, of the barrel is noted.

A random sample of 50 BarrelCorp™ barrels is evaluated using the Angry_Barrel_of_Monkeys_Test, and the survival time (in ***MINUTES***) of each barrel is noted. The survival time of each barrel is listed below:

03, 05, 07, 12, 12, 14, 17, 19, 22, 23, 25, 25, 26, 26, 26, 27, 27,

28, 28, 29, 29, 30, 30, 30, 30, 30, 30, 31, 31, 32, 32, 34, 34, 35,

36, 37, 38, 38, 40, 43, 48, 51, 53, 54, 56, 57, 58, 58, 60, 62 

Compute and interpret the following measures of location or dispersion: sample size; mean; median; percentiles: 0th, 25th, 50th, 75th, 100th; (p100 - p75); IQR; range.

Numbers

  n      Q0        Q1          Q2          Q3         Q4

   50        3        26          30          38         62

p100 – p75 = 62 – 38 = 24

p75 – p25 = 38 – 26 = 12

p100 – p00 = 62 – 3 = 59

Discussion

There are 50 barrels in the sample.

The barrel in the sample with the briefest survival survived 3 minutes of aggravated monkey damage.

Approximately 25% of the barrels in the sample survived 26 minutes of aggravated monkey damage or less.

Approximately 50% of the barrels in the sample survived 30 minutes of aggravated monkey damage or less.

Approximately 75% of the barrels in the sample survived 38 minutes of aggravated monkey damage or less.

The barrel in the sample with the longest survival survived 62 minutes of aggravated monkey damage.

Approximately 25% of the barrels in the sample survived between 38 and 62 minutes of aggravated monkey damage, and the largest possible difference in survival time between any pair of barrels in this upper-quarter-sample is 24 minutes.

Approximately 50% of the barrels in the sample survived between 26 and 38 minutes of aggravated monkey damage, and the largest possible difference in survival time between any pair of barrels in this middle-half-sample is 12 minutes.

100% of the barrels in the sample survived between 3 and 62 minutes of aggravated monkey damage, and the largest possible difference in survival time between any pair of barrels in the sample is 59 minutes.
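The word "approximately" in these interpretations matters, because tied values mean the share of barrels at or below a quartile is rarely exactly 25%, 50%, or 75%. The sketch below counts those shares for the barrel data and also computes the mean and median requested in the problem statement; the helper name `share_at_or_below` is hypothetical.

```python
import statistics

minutes = [
    3, 5, 7, 12, 12, 14, 17, 19, 22, 23, 25, 25, 26, 26, 26, 27, 27,
    28, 28, 29, 29, 30, 30, 30, 30, 30, 30, 31, 31, 32, 32, 34, 34, 35,
    36, 37, 38, 38, 40, 43, 48, 51, 53, 54, 56, 57, 58, 58, 60, 62,
]

def share_at_or_below(data, cutoff):
    """Fraction of data points less than or equal to cutoff."""
    return sum(x <= cutoff for x in data) / len(data)

mean = statistics.mean(minutes)
median = statistics.median(minutes)
shares = {cutoff: share_at_or_below(minutes, cutoff) for cutoff in (26, 30, 38)}

print(f"n = {len(minutes)}, mean = {mean:.2f}, median = {median}")
for cutoff, share in shares.items():
    print(f"share surviving {cutoff} minutes or less: {share:.0%}")
```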

Descriptive Summary Intervals

Links

http://www.pages.drexel.edu/~tpm23/Stat201Spr04/EmpiricalTchebysheff.pdf

http://knowledgerush.com/kr/encyclopedia/Tchebysheff's_theorem/

http://faculty.roosevelt.edu/currano/M347/Lectures/3.11.Example.pdf

http://www.mathstat.carleton.ca/~lhaque/2507-chap2a.pdf

http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section4/roughcuts/roughcuts.htm

From http://www.mindspring.com/~cjalverson/_2ndhourlyfall2008verB_key.htm:

Case Four | Summary Intervals | Fictitious Striped Lizard

 

The Fictitious Striped Lizard is a native species of Lizard Island, and is noteworthy for both the quantity and quality of its stripes. Consider a random sample of Fictitious Striped Lizards, in which the number of stripes per lizard is noted:

 

1, 2, 3, 3, 4, 5, 6, 6, 7, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11, 11, 11, 12, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17, 17, 17, 17, 18, 21, 21, 21, 22, 24, 24, 24, 25, 25, 27

 

Let m denote the sample mean, and sd the sample standard deviation. Compute and interpret the intervals m±2sd and m±3sd, using Tchebysheff’s Inequalities and the Empirical Rule. Be specific and complete. Show your work, and discuss completely for full credit.

 

Numbers

 

n       m             sd             lower2     upper2     lower3      upper3

51    13.5294    6.49724    0.53493    26.5239    -5.96231    33.0211

 

We’re working with counts….

 

Short Interval, Raw: [0.53493    26.5239], restricted to [1, 26].

 

0 [ ||1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26|| ] 27 28 29 30

 

Long Interval, Raw: [ -5.96231    33.0211], restricted to [0, 33].

 

-6 [ -5 -4 -3 -2 -1 ||0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33|| ] 34

 

Short Interval: m ± (2*sd)

 

Lower Bound = m – (2*sd) ≈ 13.5294 – (2*6.49724) ≈ 0.53493 [1]

Upper Bound = m + (2*sd) ≈ 13.5294 + (2*6.49724) ≈ 26.5239 [26]

Long Interval: m ± (3*sd)

 

Lower Bound = m – (3*sd) ≈ 13.5294 – (3*6.49724) ≈ -5.96231 [0]

Upper Bound = m + (3*sd) ≈ 13.5294 + (3*6.49724) ≈ 33.0211 [33]

 

Interpretation

 

There are 51 Fictitious Striped lizards in our sample.

 

At least 75% of the lizards in our sample have between 1 and 26 stripes.

At least 89% of the lizards in our sample have between 0 and 33 stripes.

 

If the Fictitious Striped lizard stripe counts cluster symmetrically around a central value, becoming rare with increasing distance from the central value, then:

 

approximately 95% of the lizards in our sample have between 1 and 26 stripes.

 

and approximately 100% of the lizards in our sample have between 0 and 33 stripes.
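The interval arithmetic and the restriction to whole-number counts can be checked with a short Python sketch. It is an illustration under stated assumptions: `count_interval` is a hypothetical helper that rounds the lower bound up, rounds the upper bound down, and never goes below zero, and the "at least" percentage is Tchebysheff's bound 1 - 1/k^2 for m ± k*sd.

```python
import math
import statistics

stripes = [
    1, 2, 3, 3, 4, 5, 6, 6, 7, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11,
    11, 11, 12, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17,
    17, 17, 17, 18, 21, 21, 21, 22, 24, 24, 24, 25, 25, 27,
]

m = statistics.mean(stripes)
sd = statistics.stdev(stripes)  # sample standard deviation (n - 1 divisor)

def count_interval(k):
    """Whole-number stripe counts inside m ± k*sd, never below zero."""
    return max(0, math.ceil(m - k * sd)), math.floor(m + k * sd)

for k in (2, 3):
    low, high = count_interval(k)
    inside = sum(low <= x <= high for x in stripes) / len(stripes)
    guaranteed = 1 - 1 / k**2  # Tchebysheff's lower bound
    print(f"k={k}: [{low}, {high}] guaranteed >= {guaranteed:.0%}, observed {inside:.0%}")
```

The observed share can exceed the guaranteed share by a wide margin; Tchebysheff's bound is a floor that holds for any sample, not an estimate.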

 

From http://www.mindspring.com/~cjalverson/_2ndhourlyfall2006versionA_key.htm:

 

Case One

Descriptive Statistics

Serum Creatinine and Kidney (Renal) Function

Healthy kidneys remove wastes and excess fluid from the blood. Blood tests show whether the kidneys are failing to remove wastes. Urine tests can show how quickly body wastes are being removed and whether the kidneys are also leaking abnormal amounts of protein. The nephron is the basic structure in the kidney that produces urine. In a healthy kidney there may be as many as 1,000,000 nephrons. Loss of nephrons reduces the ability of the kidney to function by reducing the kidney’s ability to produce urine. Progressive loss of nephrons leads to kidney failure. Serum creatinine: Creatinine is a waste product that comes from meat protein in the diet and also comes from the normal wear and tear on muscles of the body. Creatinine is produced at a continuous rate and is excreted only through the kidneys. When renal dysfunction occurs, the kidneys are impaired in their ability to excrete creatinine and the serum creatinine rises. As kidney disease progresses, the level of creatinine in the blood increases.

Suppose that we sample serum creatinine levels in a random sample of adults. Serum creatinine (as mg/dL) for each sampled subject follows:

15.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8, 8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2, 3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6, 1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1, 1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6

Compute and interpret the following statistics: sample size (n), p00, p25, p50, p75, p100, (p75-p00), (p100-p25), (p75-p50), (p50-p25). Be specific and complete. Show your work, and discuss completely for full credit.

Case Two

Summary Intervals

Serum Creatinine and Kidney (Renal) Function

 

Using the context and data from Case One, let m denote the sample mean, and sd the sample standard deviation. Compute and interpret the intervals  m ± 2sd and m ± 3sd, using Tchebysheff’s Inequalities and the Empirical Rule. Be specific and complete. Show your work, and discuss completely for full credit.

 

Numbers

 

Summary of sercreat (n = number of nonmissing values, m = mean, sd = standard deviation):

n       m       sd      m-3*sd    m+3*sd    m-2*sd    m+2*sd

69      3.4     4.2     -9.2      16.0      -5.0      11.8

 

n=69

m=3.4

sd=4.2

 

 

“Short Interval”

Lower2 = m – 2*sd = 3.4 – 2*4.2 = -5.0 [0] (Negative concentrations don’t make sense here.)

Upper2 = m + 2*sd = 3.4 + 2*4.2 = 11.8

 

“Long Interval”

Lower3 = m – 3*sd = 3.4 – 3*4.2 = -9.2 [0] (Negative concentrations don’t make sense here.)

Upper3 = m + 3*sd = 3.4 + 3*4.2 = 16.0

 

Interpretation

 

Tchebysheff’s Inequalities

 

At least 75% of the subjects in the sample have serum creatinine levels between 0 and 11.8 mg creatinine per deciliter serum.

 

At least 89% of the subjects in the sample have serum creatinine levels between 0 and 16.0 mg creatinine per deciliter serum.

 

Empirical Rule

 

If the serum creatinine levels cluster symmetrically around a central value, with values becoming progressively and symmetrically rarer with increasing distance from the central value, then …

 

approximately 95% of the subjects in the sample have serum creatinine levels between 0 and 11.8 mg creatinine per deciliter serum and

 

approximately 100% of the subjects in the sample have serum creatinine levels between 0 and 16.0 mg creatinine per deciliter serum.
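The Empirical Rule statements are conditional on the data being roughly symmetric and mound-shaped, and the creatinine sample is strongly right-skewed, so it is worth counting how many subjects actually fall in each interval. The sketch below does that count, replacing negative lower bounds with 0 as in the key; the helper names `clamped_interval` and `share_inside` are hypothetical.

```python
import statistics

creatinine = [
    15.0, 14.5, 14.2, 13.8, 13.5, 13.1, 12.2, 11.1, 10.1, 9.8,
    8.1, 7.3, 5.1, 5.0, 4.9, 4.8, 4.0, 3.5, 3.3, 3.2,
    3.2, 2.9, 2.5, 2.3, 2.1, 2.0, 1.9, 1.9, 1.8, 1.6,
    1.5, 1.5, 1.4, 1.4, 1.3, 1.3, 1.3, 1.2, 1.2, 1.1,
    1.12, 1.09, 1.05, 0.95, 0.92, 0.9, 0.9, 0.9, 0.9, 0.8,
    0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
    0.7, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6,
]

m = statistics.mean(creatinine)
sd = statistics.stdev(creatinine)  # sample standard deviation (n - 1 divisor)

def clamped_interval(k):
    """m ± k*sd, with the lower bound raised to 0 (no negative concentrations)."""
    return max(0.0, m - k * sd), m + k * sd

def share_inside(k):
    """Fraction of subjects whose level lies inside the clamped interval."""
    low, high = clamped_interval(k)
    return sum(low <= x <= high for x in creatinine) / len(creatinine)

for k in (2, 3):
    low, high = clamped_interval(k)
    print(f"k={k}: [{low:.1f}, {high:.1f}] observed share {share_inside(k):.1%}")
```

For this sample the share inside the short interval satisfies Tchebysheff's 75% guarantee but falls short of the Empirical Rule's 95%, which is consistent with the symmetry condition not being met.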

Diseased Monkeys

A random sample of Lab Monkeys is infected with the agent that causes Disease X. The time (in hours) from infection to the appearance of symptoms of Disease X is measured for each monkey. The sample of monkeys yields the following times (in hours):

12, 26, 36, 38, 40, 42, 44, 48, 52, 62, 13, 27, 37, 38, 41, 42, 44, 49, 55, 65, 15, 30, 37, 39, 41, 44, 46, 50, 56, 70

16, 32, 38, 40, 42, 44, 48, 50, 58, 72, 18, 35, 40, 41, 42, 45, 48, 52, 58, 75

Edit the data into your calculator, and compute the following statistics: sample size (n), sample mean (m) and sample standard deviation (sd).

Compute the intervals m ± 2sd and m ± 3sd.

Apply and discuss the Empirical Rule for these intervals. Interpret each interval, using the context of the data. Do not simply state the value of the interval, interpret it. Be specific and complete.

Apply and discuss Tchebysheff’s Theorem for these intervals. Interpret each interval, using the context of the data. Do not simply state the value of the interval, interpret it. Be specific and complete.

Short Interval: m ± (2*sd)

 

Lower Bound = m – (2*sd) ≈ 42.66 – (2*14.0968) ≈ 14.5

Upper Bound = m + (2*sd) ≈ 42.66 + (2*14.0968) ≈ 70.8

 

Long Interval: m ± (3*sd)

 

Lower Bound = m – (3*sd) ≈ 42.66 – (3*14.0968) ≈ 0.37

Upper Bound = m + (3*sd) ≈ 42.66 + (3*14.0968) ≈ 84.9

 

At least 75% of the monkeys in the sample showed symptoms between 14.5 and 70.8 hours after exposure.

 

At least 89% of the monkeys in the sample showed symptoms between 0.37 and 84.9 hours after exposure.

 

If the monkey times-to-symptom cluster symmetrically around a central value, becoming rare with increasing distance from the central value, then:

 

Approximately 95% of the monkeys in the sample showed symptoms between 14.5 and 70.8 hours after exposure, and

 

Approximately 100% of the monkeys in the sample showed symptoms between 0.37 and 84.9 hours after exposure.
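The same steps (sample size, mean, standard deviation, then the two intervals) apply to every summary-interval problem in these notes, so they can be wrapped in one function. This is a sketch only; `summary_intervals` and its dictionary keys are hypothetical names, and the data are the times to symptoms listed above.

```python
import statistics

hours = [
    12, 26, 36, 38, 40, 42, 44, 48, 52, 62, 13, 27, 37, 38, 41, 42, 44, 49, 55, 65,
    15, 30, 37, 39, 41, 44, 46, 50, 56, 70, 16, 32, 38, 40, 42, 44, 48, 50, 58, 72,
    18, 35, 40, 41, 42, 45, 48, 52, 58, 75,
]

def summary_intervals(data):
    """Return n, m, sd and the intervals m ± 2sd ('short') and m ± 3sd ('long')."""
    m = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    return {
        "n": len(data),
        "m": m,
        "sd": sd,
        "short": (m - 2 * sd, m + 2 * sd),
        "long": (m - 3 * sd, m + 3 * sd),
    }

row = summary_intervals(hours)
print(f"n = {row['n']}, m = {row['m']:.2f}, sd = {row['sd']:.4f}")
print("short interval: {:.2f} to {:.2f}".format(*row["short"]))
print("long interval : {:.2f} to {:.2f}".format(*row["long"]))
```

Printed bounds may differ in the last digit from hand-rounded values, depending on whether a figure is rounded or truncated.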

 

Barrel of Monkeys™

A random sample of people are selected, and their performance on the Barrel of Monkeys™ game is measured.

Here are the instructions for this game: "Dump monkeys onto table. Pick up one monkey by an arm. Hook other arm through a second monkey's arm. Continue making a chain. Your turn is over when a monkey is dropped."

Each person makes one chain of monkeys, and the number of monkeys in each chain is recorded:

1, 2, 5, 2, 9, 12, 8, 7, 10, 9, 6, 4, 6, 9, 3, 12, 11, 10, 8, 4, 12, 7, 8, 6, 7, 8, 6, 5, 9, 10, 7, 5, 4, 3, 10, 7

7, 6, 8, 6, 6, 6, 6, 7, 8, 8, 7, 8

 

Edit the data into your calculator, and compute the following statistics: sample size (n), sample mean (m) and sample standard deviation (sd).

Compute the intervals m ± 2sd and m ± 3sd.

Apply and discuss the Empirical Rule for these intervals. Interpret each interval, using the context of the data. Do not simply state the value of the interval, interpret it. Be specific and complete.

Apply and discuss Tchebysheff’s Theorem for these intervals. Interpret each interval, using the context of the data. Do not simply state the value of the interval, interpret it. Be specific and complete.

n       m              sd              Lower2SD       Upper2SD       Lower3SD        Upper3SD

48    6.97917    2.59697     1.78523[2]     12.1731[12]    -0.81173[0]     14.7701[14]

 

We’re working with counts….

 

Short Interval, Raw: [1.78523, 12.1731], restricted to [2, 12].

 

-1 --- 0 --- 1 - [-- ||2 --- 3 --- 4 --- 5 --- 6 --- 7 --- 8 --- 9 --- 10 --- 11 --- 12|| -] -- 13 --- 14 --- 15

 

Long Interval, Raw: [-0.81173, 14.7701], restricted to [0, 14] or to [1, 14].

 

-1 -- [- ||0 --- 1 --- 2 --- 3 --- 4 --- 5 --- 6 --- 7 --- 8 --- 9 --- 10 --- 11 --- 12 --- 13 --- 14|| -- ] - 15

 

Short Interval: m ± (2*sd)

 

Lower Bound = m – (2*sd) ≈ 6.97917 – (2*2.59697) ≈ 2

Upper Bound = m + (2*sd) ≈ 6.97917 + (2*2.59697) ≈ 12

 

Long Interval: m ± (3*sd)

 

Lower Bound = m – (3*sd) ≈ 6.97917 – (3*2.59697) ≈ 0 (or 1)

Upper Bound = m + (3*sd) ≈ 6.97917 + (3*2.59697) ≈ 14

 

At least 75% of the monkey chains in the sample had between 2 and 12 monkeys.

 

At least 89% of the monkey chains in the sample had between 0 (or 1) and 14 monkeys.

 

If the monkey chain counts cluster symmetrically around a central value, becoming rare with increasing distance from the central value, then:

 

approximately 95% of the monkey chains in the sample showed between 2 and 12 monkeys and

 

approximately 100% of the monkey chains in the sample showed between 0 (or 1) and 14 monkeys.