Summaries
Session 1.1
Sampling a Simple Population
We use random sampling to estimate
an empirical model of a population. We check the empirical model by direct
inspection of the population. We repeat sampling with replacement, obtaining
multiple random samples from the same population by the same process.
We combine (pool) compatible samples to form larger samples. Pooling samples of
size 50, we obtain samples of size 100, 150 and 300. In general, as sample size
increases, samples become more precise and reliable, provided that the sampling
process is reliable.
Random sampling is the basis for
obtaining information in statistical activities. Sampling is necessary, but it
is also tedious, time-consuming and expensive. Random sampling incorporates
reliability, precision and uncertainty.
In this session, we begin the study
of probability. We begin with a very basic example of a population, and explore
the process of sampling a population.
In our first case, we begin with a
fair, six-sided die. We track predicted and observed face values in six random
samples of 50 tosses of the die. We then compare our samples to what is
expected under the fair model.
We examine two modes of sampling a
population: census (total enumeration), in which every member of the population
is examined; and random sampling with replacement (SRS/WR), in which
single members are repeatedly selected from the population. One practical
reason why we would want a sampling process is that we wish to estimate some
property of the population. Total enumeration allows a definitive settling of
the question, and random sampling allows an approximate answer. In most
practical settings, the populations of interest are too difficult to totally
enumerate – the population is too large, or too complex, or cannot be accessed
in total. In practical applications, it is sufficient (and usually necessary)
to use a suitable random sample in lieu of the total population.
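The contrast between the two modes can be sketched in a few lines of Python. The population below (1,000 units, 300 of them carrying attribute "A") is hypothetical and chosen only for illustration; `random.choices` draws with replacement, which matches SRS/WR.

```python
import random

# Hypothetical population (an assumption for illustration):
# 300 units with attribute "A" and 700 units with attribute "B".
population = ["A"] * 300 + ["B"] * 700

# Census (total enumeration): examine every member, so the answer is exact.
census_proportion = population.count("A") / len(population)

# Random sampling with replacement (SRS/WR): 50 single draws.
rng = random.Random(2024)  # fixed seed so the run is repeatable
sample = rng.choices(population, k=50)
sample_proportion = sample.count("A") / len(sample)

print(f"census proportion of A: {census_proportion:.3f}")
print(f"sample estimate (n=50): {sample_proportion:.3f}")
```

The census value is definitive; the sample value is an approximation that changes from one random sample to the next.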
In our second case, we begin with a
color bowl whose true color frequencies are known. We compute a population
frequency model and then compute the expected structure for random samples from
that bowl. We obtain six (6) random samples, each consisting of 50 draws with
replacement (SRS/WR). We then compute sample color frequencies and compare them
to the true structure of the bowl.
We then explore a bit of decision
theory by playing with Ellsberg’s Urns.
Prediction and Probabilistic Randomness: Predicting the Behavior of a Six-sided Die
Process
We have a fair,
six-sided die, with face values 1, 2, 3, 4, 5 and 6. Prior to each toss of the
die, a member of the group predicts the face value that will be observed on
that toss. Upon tossing the die, the group notes the observed face value, as
well as the correctness (or lack thereof) of the prediction. Each group
produces a sample of 50 tosses.
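The tossing-and-predicting process can be mimicked in Python. This is only a sketch: it assumes the predictor guesses uniformly at random (a human predictor may not), and the function name `toss_with_predictions` and the seed are illustrative choices, not part of the class activity.

```python
import random

def toss_with_predictions(n_tosses=50, seed=None):
    """Simulate n_tosses of a fair die, with a prediction made before each toss."""
    rng = random.Random(seed)
    face_counts = {face: 0 for face in range(1, 7)}
    hits = 0
    for _ in range(n_tosses):
        prediction = rng.randint(1, 6)  # assumption: uniformly random guess
        observed = rng.randint(1, 6)    # fair die: faces 1..6 equally likely
        face_counts[observed] += 1
        if prediction == observed:
            hits += 1
    misses = n_tosses - hits
    return face_counts, hits, misses

face_counts, hits, misses = toss_with_predictions(50, seed=1)
print("face counts:", face_counts)
print("hits:", hits, "misses:", misses)
```

Under these assumptions, each prediction has a 1-in-6 chance of being a hit, whatever guessing rule is used.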
Sample Worksheet
Prediction and the Fair Die
Sample Grid n=50
Each cell corresponds to a single
toss of the die.
X | | | | | | | | | |
X | | | | | | | | | |
X | | | | | | | | | |
X | | | | | | | | | |
X | X | X | X | X | X | X | X | X | X |
Sample results are tabulated in the
form below.
Face Value | Count | Prediction | Count |
1 | | Hit (Correct) | |
2 | | Miss (Incorrect) | |
3 | | Total | |
4 | | | |
5 | | | |
6 | | | |
Total | | | |
Samples – Face Values and Predictions
Here are the results for
our six samples. You should be able to begin with the counts in the table and work
out the proportions and percentages.
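As a check on the arithmetic, here is a short Python sketch that starts from the face-value counts of the 6:30 Sample #1 (11, 8, 10, 7, 9, 5, as tabulated in this section) and works out the proportions and percentages. The variable names are illustrative.

```python
# Face-value counts for 6:30 Sample #1 (n = 50), taken from the tables in this section.
counts = {1: 11, 2: 8, 3: 10, 4: 7, 5: 9, 6: 5}

n_total = sum(counts.values())                                 # total tosses
proportions = {f: c / n_total for f, c in counts.items()}      # count / total
percents = {f: 100 * p for f, p in proportions.items()}        # 100 * proportion

for face in counts:
    print(f"{face}: count={counts[face]:2d}  "
          f"proportion={proportions[face]:.2f}  percent={percents[face]:.0f}")
```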
6:30 Samples
In the fair die model for
this case, in long runs of tosses of the die: approximately 16⅔% of
tosses show “1”, approximately 16⅔% of tosses show “2”, approximately 16⅔%
of tosses show “3”, approximately 16⅔% of tosses show “4”, approximately
16⅔% of tosses show “5”, and approximately 16⅔% of tosses show “6.”
The sample data are generally compatible with a fair die assumption
(equally-likely face values) and with a baseline expected prediction success
rate of (1/6), or 16⅔%. Sample performance seems to improve with
increasing sample size – but the samples do not exactly fit the fair
assumption.
Sample versus Fair Model
6:30
#1 | #2 | Pooled 12 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 11 | 0.22 | 22 | 1 | 8 | 0.16 | 16 | 1 | 19 | 0.19 | 19 |
2 | 8 | 0.16 | 16 | 2 | 8 | 0.16 | 16 | 2 | 16 | 0.16 | 16 |
3 | 10 | 0.2 | 20 | 3 | 6 | 0.12 | 12 | 3 | 16 | 0.16 | 16 |
4 | 7 | 0.14 | 14 | 4 | 5 | 0.1 | 10 | 4 | 12 | 0.12 | 12 |
5 | 9 | 0.18 | 18 | 5 | 11 | 0.22 | 22 | 5 | 20 | 0.2 | 20 |
6 | 5 | 0.1 | 10 | 6 | 12 | 0.24 | 24 | 6 | 17 | 0.17 | 17 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 10 | 0.2 | 20 | Hit | 7 | 0.14 | 14 | Hit | 17 | 0.17 | 17 |
Miss | 40 | 0.8 | 80 | Miss | 43 | 0.86 | 86 | Miss | 83 | 0.83 | 83 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
#3 | #4 | Pooled 34 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 7 | 0.14 | 14 | 1 | 9 | 0.18 | 18 | 1 | 16 | 0.16 | 16 |
2 | 7 | 0.14 | 14 | 2 | 10 | 0.2 | 20 | 2 | 17 | 0.17 | 17 |
3 | 11 | 0.22 | 22 | 3 | 9 | 0.18 | 18 | 3 | 20 | 0.2 | 20 |
4 | 9 | 0.18 | 18 | 4 | 7 | 0.14 | 14 | 4 | 16 | 0.16 | 16 |
5 | 10 | 0.2 | 20 | 5 | 7 | 0.14 | 14 | 5 | 17 | 0.17 | 17 |
6 | 6 | 0.12 | 12 | 6 | 8 | 0.16 | 16 | 6 | 14 | 0.14 | 14 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 5 | 0.1 | 10 | Hit | 7 | 0.14 | 14 | Hit | 12 | 0.12 | 12 |
Miss | 45 | 0.9 | 90 | Miss | 43 | 0.86 | 86 | Miss | 88 | 0.88 | 88 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
#5 | #6 | Pooled 56 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 15 | 0.3 | 30 | 1 | 11 | 0.22 | 22 | 1 | 26 | 0.26 | 26 |
2 | 6 | 0.12 | 12 | 2 | 5 | 0.1 | 10 | 2 | 11 | 0.11 | 11 |
3 | 8 | 0.16 | 16 | 3 | 9 | 0.18 | 18 | 3 | 17 | 0.17 | 17 |
4 | 7 | 0.14 | 14 | 4 | 11 | 0.22 | 22 | 4 | 18 | 0.18 | 18 |
5 | 8 | 0.16 | 16 | 5 | 6 | 0.12 | 12 | 5 | 14 | 0.14 | 14 |
6 | 6 | 0.12 | 12 | 6 | 8 | 0.16 | 16 | 6 | 14 | 0.14 | 14 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 6 | 0.12 | 12 | Hit | 7 | 0.14 | 14 | Hit | 13 | 0.13 | 13 |
Miss | 44 | 0.88 | 88 | Miss | 43 | 0.86 | 86 | Miss | 87 | 0.87 | 87 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Pooled 135 | Pooled 246 | Pooled 123456 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 33 | 0.22 | 22 | 1 | 28 | 0.186666667 | 18.6667 | 1 | 61 | 0.203333333 | 20.3333 |
2 | 21 | 0.14 | 14 | 2 | 23 | 0.153333333 | 15.3333 | 2 | 44 | 0.146666667 | 14.6667 |
3 | 29 | 0.1933333 | 19.3333 | 3 | 24 | 0.16 | 16 | 3 | 53 | 0.176666667 | 17.6667 |
4 | 23 | 0.1533333 | 15.3333 | 4 | 23 | 0.153333333 | 15.3333 | 4 | 46 | 0.153333333 | 15.3333 |
5 | 27 | 0.18 | 18 | 5 | 24 | 0.16 | 16 | 5 | 51 | 0.17 | 17 |
6 | 17 | 0.1133333 | 11.3333 | 6 | 28 | 0.186666667 | 18.6667 | 6 | 45 | 0.15 | 15 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 21 | 0.14 | 14 | Hit | 21 | 0.14 | 14 | Hit | 42 | 0.14 | 14 |
Miss | 129 | 0.86 | 86 | Miss | 129 | 0.86 | 86 | Miss | 258 | 0.86 | 86 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 |
Face Value 1: 20.3% (Sample) versus 16.67% (Fair Model)
Face Value 2: 14.7% (Sample) versus 16.67% (Fair Model)
Face Value 3: 17.7% (Sample) versus 16.67% (Fair Model)
Face Value 4: 15.3% (Sample) versus 16.67% (Fair Model)
Face Value 5: 17% (Sample) versus 16.67% (Fair Model)
Face Value 6: 15% (Sample) versus 16.67% (Fair Model)
Prediction “Hit”: 14% (Sample) versus 16.67% (Fair Model)
Prediction “Miss”: 86% (Sample) versus 83.33% (Fair Model)
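Pooling simply adds the counts of compatible samples, face by face. The Python sketch below pools the six 6:30 samples (counts copied from the tables in this section) and compares the pooled percentages with the fair model value of 100/6, about 16.67%. The variable names are illustrative.

```python
# Face-value counts (faces 1..6) for the six 6:30 samples, n = 50 each.
samples = [
    [11, 8, 10, 7, 9, 5],   # Sample #1
    [8, 8, 6, 5, 11, 12],   # Sample #2
    [7, 7, 11, 9, 10, 6],   # Sample #3
    [9, 10, 9, 7, 7, 8],    # Sample #4
    [15, 6, 8, 7, 8, 6],    # Sample #5
    [11, 5, 9, 11, 6, 8],   # Sample #6
]

# Pool: add counts across samples for each face value.
pooled = [sum(face_counts) for face_counts in zip(*samples)]
n_pooled = sum(pooled)
pooled_percent = [100 * c / n_pooled for c in pooled]
fair_percent = 100 / 6

for face, (count, pct) in enumerate(zip(pooled, pooled_percent), start=1):
    print(f"Face {face}: count={count:3d}  sample={pct:5.2f}%  "
          f"fair={fair_percent:.2f}%  gap={pct - fair_percent:+.2f}")
```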
8:00 Samples
#1 | #2 | Pooled 12 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 8 | 0.16 | 16 | 1 | 11 | 0.22 | 22 | 1 | 19 | 0.19 | 19 |
2 | 8 | 0.16 | 16 | 2 | 6 | 0.12 | 12 | 2 | 14 | 0.14 | 14 |
3 | 7 | 0.14 | 14 | 3 | 6 | 0.12 | 12 | 3 | 13 | 0.13 | 13 |
4 | 9 | 0.18 | 18 | 4 | 5 | 0.1 | 10 | 4 | 14 | 0.14 | 14 |
5 | 12 | 0.24 | 24 | 5 | 15 | 0.3 | 30 | 5 | 27 | 0.27 | 27 |
6 | 6 | 0.12 | 12 | 6 | 7 | 0.14 | 14 | 6 | 13 | 0.13 | 13 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 5 | 0.1 | 10 | Hit | 6 | 0.12 | 12 | Hit | 11 | 0.11 | 11 |
Miss | 45 | 0.9 | 90 | Miss | 44 | 0.88 | 88 | Miss | 89 | 0.89 | 89 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
#3 | #4 | Pooled 34 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 6 | 0.12 | 12 | 1 | 9 | 0.18 | 18 | 1 | 15 | 0.15 | 15 |
2 | 6 | 0.12 | 12 | 2 | 7 | 0.14 | 14 | 2 | 13 | 0.13 | 13 |
3 | 10 | 0.2 | 20 | 3 | 8 | 0.16 | 16 | 3 | 18 | 0.18 | 18 |
4 | 11 | 0.22 | 22 | 4 | 9 | 0.18 | 18 | 4 | 20 | 0.2 | 20 |
5 | 6 | 0.12 | 12 | 5 | 9 | 0.18 | 18 | 5 | 15 | 0.15 | 15 |
6 | 11 | 0.22 | 22 | 6 | 8 | 0.16 | 16 | 6 | 19 | 0.19 | 19 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 7 | 0.14 | 14 | Hit | 5 | 0.1 | 10 | Hit | 12 | 0.12 | 12 |
Miss | 43 | 0.86 | 86 | Miss | 45 | 0.9 | 90 | Miss | 88 | 0.88 | 88 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
#5 | #6 | Pooled 56 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 3 | 0.06 | 6 | 1 | 7 | 0.14 | 14 | 1 | 10 | 0.1 | 10 |
2 | 13 | 0.26 | 26 | 2 | 8 | 0.16 | 16 | 2 | 21 | 0.21 | 21 |
3 | 6 | 0.12 | 12 | 3 | 3 | 0.06 | 6 | 3 | 9 | 0.09 | 9 |
4 | 10 | 0.2 | 20 | 4 | 10 | 0.2 | 20 | 4 | 20 | 0.2 | 20 |
5 | 10 | 0.2 | 20 | 5 | 13 | 0.26 | 26 | 5 | 23 | 0.23 | 23 |
6 | 8 | 0.16 | 16 | 6 | 9 | 0.18 | 18 | 6 | 17 | 0.17 | 17 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 9 | 0.18 | 18 | Hit | 6 | 0.12 | 12 | Hit | 15 | 0.15 | 15 |
Miss | 41 | 0.82 | 82 | Miss | 44 | 0.88 | 88 | Miss | 85 | 0.85 | 85 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 |
Pooled 135 | Pooled 246 | Pooled 123456 |
Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent | Face Value | Count | Proportion | Percent |
1 | 17 | 0.113333 | 11.333 | 1 | 27 | 0.18 | 18 | 1 | 44 | 0.146667 | 14.667 |
2 | 27 | 0.18 | 18 | 2 | 21 | 0.14 | 14 | 2 | 48 | 0.16 | 16 |
3 | 23 | 0.153333 | 15.333 | 3 | 17 | 0.113333 | 11.333 | 3 | 40 | 0.133333 | 13.333 |
4 | 30 | 0.2 | 20 | 4 | 24 | 0.16 | 16 | 4 | 54 | 0.18 | 18 |
5 | 28 | 0.186667 | 18.667 | 5 | 37 | 0.246667 | 24.667 | 5 | 65 | 0.216667 | 21.667 |
6 | 25 | 0.166667 | 16.667 | 6 | 24 | 0.16 | 16 | 6 | 49 | 0.163333 | 16.333 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 |
Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent | Prediction | Count | Proportion | Percent |
Hit | 21 | 0.14 | 14 | Hit | 17 | 0.113333 | 11.333 | Hit | 38 | 0.126667 | 12.667 |
Miss | 129 | 0.86 | 86 | Miss | 133 | 0.886667 | 88.667 | Miss | 262 | 0.873333 | 87.333 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 |
In the fair die model
for this case, in long runs of tosses of the die: approximately 16⅔% of
tosses show “1”, approximately 16⅔% of tosses show “2”, approximately 16⅔%
of tosses show “3”, approximately 16⅔% of tosses show “4”, approximately
16⅔% of tosses show “5”, and approximately 16⅔% of tosses show “6.”
The sample data are generally compatible with a fair die assumption
(equally-likely face values) and with a baseline expected prediction success
rate of (1/6), or 16⅔%. Sample performance seems to improve with
increasing sample size – but the samples do not exactly fit the fair
assumption.
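The tendency of larger samples to sit closer to the fair model can be explored by simulation. The Python sketch below is an illustration, not the class procedure: it repeats each sample size 500 times (an arbitrary choice) and reports the average of the largest gap, in percentage points, between any face's sample percentage and 16⅔%. The function name `largest_gap` and the seed are assumptions.

```python
import random

FAIR_PERCENT = 100 / 6  # 16 2/3 percent per face under the fair model

def largest_gap(n_tosses, rng):
    """Toss a fair die n_tosses times; return the largest absolute gap
    (in percentage points) between a face's sample percent and FAIR_PERCENT."""
    counts = [0] * 6
    for _ in range(n_tosses):
        counts[rng.randrange(6)] += 1
    return max(abs(100 * c / n_tosses - FAIR_PERCENT) for c in counts)

rng = random.Random(7)   # fixed seed so the run is repeatable
repeats = 500            # illustrative number of repeated samples per size
mean_gap = {}
for n in (50, 100, 150, 300):
    gaps = [largest_gap(n, rng) for _ in range(repeats)]
    mean_gap[n] = sum(gaps) / repeats

for n, gap in mean_gap.items():
    print(f"n={n:3d}: mean largest gap = {gap:.2f} percentage points")
```

Any single sample can still miss badly, as the class data show; the improvement is a tendency across repeated samples, not a guarantee for each one.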
Sample versus Fair Model
8:00
Face Value 1: 14.7% (Sample) versus 16.67% (Fair Model)
Face Value 2: 16% (Sample) versus 16.67% (Fair Model)
Face Value 3: 13.3% (Sample) versus 16.67% (Fair Model)
Face Value 4: 18% (Sample) versus 16.67% (Fair Model)
Face Value 5: 21.7% (Sample) versus 16.67% (Fair Model)
Face Value 6: 16.3% (Sample) versus 16.67% (Fair Model)
Prediction “Hit”: 12.7% (Sample) versus 16.67% (Fair Model)
Prediction “Miss”: 87.3% (Sample) versus 83.33% (Fair Model)
Case Study 1.1: A Color Bowl
In random sampling, we
might get a complete list of colors, but only a total sample (census)
guarantees that kind of listing. The sample proportions of each listed color
approximate the corresponding model proportion in the bowl itself. In census
sampling, every object in the bowl is counted. The listing is complete, and the
model proportions may be calculated directly.
The basic idea in case study 1.1 is
that random samples give imperfect pictures of what is being sampled. However,
with sufficiently large samples, these samples can reliably yield good pictures
of the processes or populations being sampled. And the essence of many
statistical applications is the study of selected processes or populations. For
a sense of the efficiency of the samples, compare sample and true percentages.
Process
We have a four-color
bowl, with blue, green, red
and yellow marbles. Prior to each draw
from the bowl, the bowl is thoroughly mixed, giving each resident marble an
approximately equal chance of selection. After mixing, a blind (made without
looking into the bowl) draw of a single marble is made. The group notes the
color of the marble, and the marble is returned to the bowl – this is sampling
with replacement. The mixing makes the sampling random. Each group produces a
sample of 50 draws.
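The draw-and-return process can be sketched in Python. The bowl composition below (3 blue, 2 green, 5 red, 7 yellow) is the one listed in the 6:30 Bowl/Model table; `random.choices` stands in for the mix-then-draw-blind step, and the seed is an arbitrary choice.

```python
import random
from collections import Counter

# Bowl composition from the 6:30 Bowl/Model table: 17 marbles in all.
bowl = ["Blue"] * 3 + ["Green"] * 2 + ["Red"] * 5 + ["Yellow"] * 7

rng = random.Random(11)            # fixed seed so the run is repeatable
draws = rng.choices(bowl, k=50)    # 50 draws, each marble returned before the next draw
sample_counts = Counter(draws)

for color in ("Blue", "Green", "Red", "Yellow"):
    n = sample_counts[color]
    print(f"{color:6s} count={n:2d}  percent={100 * n / len(draws):.0f}")
```

Because every marble goes back into the bowl, each draw sees the same 17 marbles, which is what makes the draws independent of one another.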
Each cell corresponds to a single
draw with replacement from the bowl.
Sample Grid (n=50)
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
Sample results are tabulated in the
form below.
Table – Draws with Replacement
Color | Count |
Blue | |
Green | |
Red | |
Yellow | |
Total | |
Samples from the Color Bowl
Here are the six samples
from our groups. You should be able to begin with the counts in the table and
work out the proportions and percentages.
6:30 Samples
#1 | #2 | Pooled 12 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 9 | 0.18 | 18 | Blue | 13 | 0.26 | 26 | Blue | 22 | 0.22 | 22 | 17.647 |
Green | 9 | 0.18 | 18 | Green | 4 | 0.08 | 8 | Green | 13 | 0.13 | 13 | 11.765 |
Red | 13 | 0.26 | 26 | Red | 18 | 0.36 | 36 | Red | 31 | 0.31 | 31 | 29.412 |
Yellow | 19 | 0.38 | 38 | Yellow | 15 | 0.3 | 30 | Yellow | 34 | 0.34 | 34 | 41.176 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
#3 | #4 | Pooled 34 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 8 | 0.16 | 16 | Blue | 7 | 0.14 | 14 | Blue | 15 | 0.15 | 15 | 17.647 |
Green | 5 | 0.1 | 10 | Green | 6 | 0.12 | 12 | Green | 11 | 0.11 | 11 | 11.765 |
Red | 17 | 0.34 | 34 | Red | 23 | 0.46 | 46 | Red | 40 | 0.4 | 40 | 29.412 |
Yellow | 20 | 0.4 | 40 | Yellow | 14 | 0.28 | 28 | Yellow | 34 | 0.34 | 34 | 41.176 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
#5 | #6 | Pooled 56 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 9 | 0.18 | 18 | Blue | 7 | 0.14 | 14 | Blue | 16 | 0.16 | 16 | 17.647 |
Green | 1 | 0.02 | 2 | Green | 6 | 0.12 | 12 | Green | 7 | 0.07 | 7 | 11.765 |
Red | 22 | 0.44 | 44 | Red | 14 | 0.28 | 28 | Red | 36 | 0.36 | 36 | 29.412 |
Yellow | 18 | 0.36 | 36 | Yellow | 23 | 0.46 | 46 | Yellow | 41 | 0.41 | 41 | 41.176 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
Pooled 135 | Pooled 246 | Pooled All |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 26 | 0.173333 | 17.333 | Blue | 27 | 0.18 | 18 | Blue | 53 | 0.176667 | 17.667 | 17.647 |
Green | 15 | 0.1 | 10 | Green | 16 | 0.106667 | 10.67 | Green | 31 | 0.103333 | 10.333 | 11.765 |
Red | 52 | 0.346667 | 34.667 | Red | 55 | 0.366667 | 36.67 | Red | 107 | 0.356667 | 35.667 | 29.412 |
Yellow | 57 | 0.38 | 38 | Yellow | 52 | 0.346667 | 34.67 | Yellow | 109 | 0.363333 | 36.333 | 41.176 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 | 100 |
Bowl/Model |
Color | Count | Proportion | Percent | E50 | E100 | E150 | E200 | E250 | E300 |
Blue | 3 | 0.176471 | 17.647 | 8.82 | 17.6 | 26.47059 | 35.29 | 44.11765 | 52.9 |
Green | 2 | 0.117647 | 11.765 | 5.88 | 11.8 | 17.64706 | 23.53 | 29.41176 | 35.3 |
Red | 5 | 0.294118 | 29.412 | 14.7 | 29.4 | 44.11765 | 58.82 | 73.52941 | 88.2 |
Yellow | 7 | 0.411765 | 41.176 | 20.6 | 41.2 | 61.76471 | 82.35 | 102.9412 | 124 |
Total | 17 | 1 | 100 | 50 | 100 | 150 | 200 | 250 | 300 |
Blue: 17.7% (Sample) versus 17.6% (Model)
Green: 10.3% (Sample) versus 11.8% (Model)
Red: 35.7% (Sample) versus 29.4% (Model)
Yellow: 36.3% (Sample) versus 41.2% (Model)
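The E50 through E300 columns of the Bowl/Model table are expected counts: the number of draws multiplied by the model proportion for each color. The Python sketch below reproduces that arithmetic for the 6:30 bowl (3 blue, 2 green, 5 red, 7 yellow); the function name `expected_counts` is illustrative.

```python
# 6:30 bowl composition, from the Bowl/Model table.
bowl_counts = {"Blue": 3, "Green": 2, "Red": 5, "Yellow": 7}
n_bowl = sum(bowl_counts.values())

# Model proportion for each color: marbles of that color / marbles in bowl.
model = {color: k / n_bowl for color, k in bowl_counts.items()}

def expected_counts(n_draws):
    """Expected number of draws of each color in n_draws draws with replacement."""
    return {color: n_draws * p for color, p in model.items()}

for n in (50, 100, 150, 200, 250, 300):
    row = expected_counts(n)
    print(n, {color: round(value, 2) for color, value in row.items()})
```

Expected counts are usually not whole numbers; they describe the long-run average over many samples of that size, not any single sample.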
8:00 Samples
Color Bowl I - Sampling with Replacement (8:00) |
#1 | #2 | Pooled 12 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 10 | 0.2 | 20 | Blue | 6 | 0.12 | 12 | Blue | 16 | 0.16 | 16 | 11.1111 |
Green | 12 | 0.24 | 24 | Green | 8 | 0.16 | 16 | Green | 20 | 0.2 | 20 | 27.7778 |
Red | 25 | 0.5 | 50 | Red | 31 | 0.62 | 62 | Red | 56 | 0.56 | 56 | 50 |
Yellow | 3 | 0.06 | 6 | Yellow | 5 | 0.1 | 10 | Yellow | 8 | 0.08 | 8 | 11.1111 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
#3 | #4 | Pooled 34 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 6 | 0.12 | 12 | Blue | 5 | 0.1 | 10 | Blue | 11 | 0.11 | 11 | 11.1111 |
Green | 9 | 0.18 | 18 | Green | 14 | 0.28 | 28 | Green | 23 | 0.23 | 23 | 27.7778 |
Red | 31 | 0.62 | 62 | Red | 26 | 0.52 | 52 | Red | 57 | 0.57 | 57 | 50 |
Yellow | 4 | 0.08 | 8 | Yellow | 5 | 0.1 | 10 | Yellow | 9 | 0.09 | 9 | 11.1111 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
#5 | #6 | Pooled 56 |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 7 | 0.14 | 14 | Blue | 4 | 0.08 | 8 | Blue | 11 | 0.11 | 11 | 11.1111 |
Green | 16 | 0.32 | 32 | Green | 13 | 0.26 | 26 | Green | 29 | 0.29 | 29 | 27.7778 |
Red | 21 | 0.42 | 42 | Red | 29 | 0.58 | 58 | Red | 50 | 0.5 | 50 | 50 |
Yellow | 6 | 0.12 | 12 | Yellow | 4 | 0.08 | 8 | Yellow | 10 | 0.1 | 10 | 11.1111 |
Total | 50 | 1 | 100 | Total | 50 | 1 | 100 | Total | 100 | 1 | 100 | 100 |
Pooled 135 | Pooled 246 | Pooled All |
Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Color | Count | Proportion | Percent | Truth |
Blue | 23 | 0.1533333 | 15.333 | Blue | 15 | 0.1 | 10 | Blue | 38 | 0.1266667 | 12.667 | 11.1111 |
Green | 37 | 0.2466667 | 24.667 | Green | 35 | 0.2333333 | 23.333 | Green | 72 | 0.24 | 24 | 27.7778 |
Red | 77 | 0.5133333 | 51.333 | Red | 86 | 0.5733333 | 57.333 | Red | 163 | 0.5433333 | 54.333 | 50 |
Yellow | 13 | 0.0866667 | 8.6667 | Yellow | 14 | 0.0933333 | 9.3333 | Yellow | 27 | 0.09 | 9 | 11.1111 |
Total | 150 | 1 | 100 | Total | 150 | 1 | 100 | Total | 300 | 1 | 100 | 100 |
Bowl/Model |
Color | Count | Proportion | Percent | E50 | E100 | E150 | E200 | E250 | E300 |
Blue | 2 | 0.1111111 | 11.111 | 5.5555556 | 11.11 | 16.666667 | 22.222 | 27.777778 | 33.333 |
Green | 5 | 0.2777778 | 27.778 | 13.888889 | 27.78 | 41.666667 | 55.556 | 69.444444 | 83.333 |
Red | 9 | 0.5 | 50 | 25 | 50 | 75 | 100 | 125 | 150 |
Yellow | 2 | 0.1111111 | 11.111 | 5.5555556 | 11.11 | 16.666667 | 22.222 | 27.777778 | 33.333 |
Total | 18 | 1 | 100 | 50 | 100 | 150 | 200 | 250 | 300 |
Blue: 12.7% (Sample) versus 11.1% (Model)
Green: 24% (Sample) versus 27.8% (Model)
Red: 54.3% (Sample) versus 50% (Model)
Yellow: 9% (Sample) versus 11.1% (Model)
Some Formulas – Proportions, Percentages, Counts
The class represents
some property or attribute, for example, blue, or red.
Each member, or unit, of a sample can be classified – the result of the
classification of the unit is the unit’s class.
Sample Proportion (p)
nclass ~ number of units of sample in class
ntotal ~ total number of units in sample
pclass = nclass / ntotal
pclass ~ proportion of sample in class
Sample Percent (pct)
nclass ~ number of units of sample in class
ntotal ~ total number of units in sample
pclass = nclass / ntotal
pctclass = 100*(nclass / ntotal)
pctclass = 100* pclass
pctclass ~ percent of sample in class
Population Proportion (P)
Nclass ~ number of units of population in class
Ntotal ~ total number of units in population
Pclass = Nclass / Ntotal
Pclass ~ proportion of population in class
Population Percent (PCT)
Nclass ~ number of units of population in class
Ntotal ~ total number of units in population
Pclass = Nclass / Ntotal
PCTclass = 100*(Nclass / Ntotal)
PCTclass = 100* Pclass
PCTclass ~ percent of population in class
In this setting,
Blue
nblue ~ number of blue draws in sample
ntotal ~ total number of draws per sample
pblue = nblue / ntotal
pblue ~ proportion of sample draws showing blue
pctblue = 100*pblue
pctblue ~ percent of sample draws showing blue
Nblue ~ number of blue marbles in bowl
Ntotal ~ total number of marbles in bowl
Pblue = Nblue / Ntotal
Pblue ~ proportion of marbles in bowl that are blue
Green
ngreen ~ number of green draws in sample
ntotal ~ total number of draws per sample
pgreen = ngreen / ntotal
pgreen ~ proportion of sample draws showing green
pctgreen = 100*pgreen
pctgreen ~ percent of sample draws showing green
Ngreen ~ number of green marbles in bowl
Ntotal ~ total number of marbles in bowl
Pgreen = Ngreen / Ntotal
Pgreen ~ proportion of marbles in bowl that are green
Red
nred ~ number of red draws in sample
ntotal ~ total number of draws per sample
pred = nred / ntotal
pred ~ proportion of sample draws showing red
pctred = 100*pred
pctred ~ percent of sample draws showing red
Nred ~ number of red marbles in bowl
Ntotal ~ total number of marbles in bowl
Pred = Nred / Ntotal
Pred ~ proportion of marbles in bowl that are red
Yellow
nyellow ~ number of yellow draws in sample
ntotal ~ total number of draws per sample
pyellow = nyellow / ntotal
pyellow ~ proportion of sample draws showing yellow
pctyellow = 100*pyellow
pctyellow ~ percent of sample draws showing yellow
Nyellow ~ number of yellow marbles in bowl
Ntotal ~ total number of marbles in bowl
Pyellow = Nyellow / Ntotal
Pyellow ~ proportion of marbles in bowl that are yellow
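The same formulas can be written as two small Python functions and applied to both the bowl (population) and a sample. The function names `proportion` and `percent` are illustrative; the numbers come from the 8:00 tables (9 red marbles of 18 in the bowl, 163 red draws of 300 in the pooled sample).

```python
def proportion(n_class, n_total):
    """Proportion of units in a class: n_class / n_total."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_class / n_total

def percent(n_class, n_total):
    """Percent of units in a class: 100 * (n_class / n_total)."""
    return 100 * proportion(n_class, n_total)

# Population (8:00 bowl): Nred = 9, Ntotal = 18.
P_red = proportion(9, 18)
PCT_red = percent(9, 18)

# Sample (8:00 pooled, all six samples): nred = 163, ntotal = 300.
p_red = proportion(163, 300)
pct_red = percent(163, 300)

print(f"bowl:   P_red={P_red:.4f}  PCT_red={PCT_red:.1f}")
print(f"sample: p_red={p_red:.4f}  pct_red={pct_red:.1f}")
```

The formula is the same in both cases; only the source of the counts differs (every marble in the bowl versus the draws in a sample).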
We didn’t get to these, so read up on
Ellsberg I and Ellsberg II.
Regarding Ellsberg I
The 1st Game: The first bowl is 50%/50% split between blue and green. The best we can do is break even, regardless of strategy.
The simplest strategy involves picking one of the colors and always betting on
that color.
The 2nd Game: The second bowl is an unknown composite of red and yellow.
We might be able to win this game if 1) there is a
dominant color and 2) we can determine that dominant color. A simple strategy here
is to pick one color and ride it for a while. Then stop betting and check the
number of winning bets. If the color being bet on is losing on a regular basis,
switch colors.
The 3rd Game: This game only makes sense if the second bowl is dominant
in red. Bet on red: if red consistently shows, stay on the second
bowl. Otherwise, either stop playing, or stick with the first bowl.
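The "ride one color, check, then switch if it is losing" strategy described for the 2nd game can be sketched in Python. Everything specific here is an assumption for illustration: the function name `trial_then_commit`, the hidden red fraction `p_red`, a trial of 20 bets out of 100, and a switching rule of "fewer than half the trial bets won". The notes do not give the payoff rules, so the sketch only counts winning bets.

```python
import random

def trial_then_commit(p_red, start="yellow", trial_bets=20, total_bets=100, seed=None):
    """Bet on `start` for trial_bets draws; switch colors once if fewer than
    half of the trial bets won; keep betting until total_bets are made.
    Returns (final betting color, number of winning bets)."""
    rng = random.Random(seed)
    color = start
    wins = 0
    trial_wins = 0
    for bet in range(total_bets):
        drawn = "red" if rng.random() < p_red else "yellow"  # draw with replacement
        won = drawn == color
        wins += won
        if bet < trial_bets:
            trial_wins += won
        # At the end of the trial, check the record and switch if it is losing.
        if bet == trial_bets - 1 and trial_wins < trial_bets / 2:
            color = "red" if color == "yellow" else "yellow"
    return color, wins

# Hypothetical bowl that is 70% red; the bettor does not know this in advance.
final_color, wins = trial_then_commit(p_red=0.7, seed=3)
print("final betting color:", final_color, "| winning bets:", wins)
```

The trial phase is the price of information: some bets are spent on learning which color, if any, dominates the bowl.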
Regarding Ellsberg II
The 1st Game: The first bowl is 20% red /
40% black / 40% white. The simplest strategy
involves picking one of the colors and always betting on that color. Regardless
of betting choice, there is a 40% chance of losing the single bet, and a 20%
chance of getting kicked off the game.
The 2nd Game: The second bowl is 20% red /
80% black or white. The simplest strategy involves
picking one of the colors and always betting on that color. If either white or
black is sufficiently dominant, this game might be worth playing. The problem
is that regardless of the possible advantage in the white/black part of the
bowl, there is still a 20% chance of getting killed (permanently losing). But
to detect this advantage, one is forced to pick a betting color (white or
black) and spend some money.
The idea underlying the Ellsberg
games is to illustrate the concept of making decisions about selected processes
or populations using random samples.