Summaries

Session 0.1

18th August 2010

Session 1.1

Sampling a Simple Population

We use random sampling to estimate an empirical model of a population, and we check that model by direct inspection of the population. We sample repeatedly with replacement, obtaining multiple random samples from the same population by the same process. We combine (pool) compatible samples to form larger samples: pooling samples of size 50, we obtain samples of size 100, 150 and 300. In general, as sample size increases, sample estimates become more precise and reliable, provided that the sampling process itself is reliable.

Random sampling is the basis for obtaining information in statistical activities. Sampling is necessary, tedious, time-consuming and expensive. Random sampling incorporates reliability, precision and uncertainty.

In this session, we begin the study of probability. We begin with a very basic example of a population, and explore the process of sampling a population.

In our first case, we begin with a fair, six-sided die. We track predicted and observed face values in six random samples of 50 tosses of the die. We then compare our samples to what is expected under the fair model.

We examine two modes of sampling a population: census (total enumeration), in which every member of the population is examined; and simple random sampling with replacement (SRS/WR), in which single members are repeatedly selected from the population. One practical reason for sampling is that we wish to estimate some property of the population. Total enumeration settles the question definitively, whereas random sampling gives an approximate answer. In most practical settings, the populations of interest are too difficult to enumerate totally: the population is too large, or too complex, or cannot be accessed in total. In practical applications, it is sufficient (and usually necessary) to use a suitable random sample in lieu of the total population.
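The contrast between the two modes can be sketched in a few lines of Python. This is only an illustration: the population below (1,000 units, 30% "A" and 70% "B"), the function names and the seed are hypothetical choices, not part of the session.

```python
import random

def census_proportions(population):
    """Total enumeration: examine every member and return exact proportions."""
    total = len(population)
    counts = {}
    for member in population:
        counts[member] = counts.get(member, 0) + 1
    return {value: count / total for value, count in counts.items()}

def srs_wr_proportions(population, n, rng):
    """SRS/WR: select one member at a time, n times, replacing it after each selection."""
    sample = [rng.choice(population) for _ in range(n)]
    return {value: sample.count(value) / n for value in set(population)}

# Hypothetical population: 1,000 units, 30% labeled "A" and 70% labeled "B".
population = ["A"] * 300 + ["B"] * 700
rng = random.Random(2010)  # fixed seed so the run is repeatable

exact = census_proportions(population)              # definitive answer
estimate = srs_wr_proportions(population, 50, rng)  # approximate answer
print("census:", exact)
print("sample of 50:", estimate)
```

The census result is exact by construction, while the sample result changes from run to run (or seed to seed) and only approximates it.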

In our second case, we begin with a color bowl whose true color frequencies are known. We compute a population frequency model and then compute the expected structure for random samples from that bowl. We obtain six (6) random samples, each consisting of 50 draws with replacement (SRS/WR). We then compute sample color frequencies and compare them to the true structure of the bowl.

We then explore a bit of decision theory by playing with Ellsberg’s Urns.

Prediction and Probabilistic Randomness: Predicting the Behavior of a Six-sided Die

Process

We have a fair, six-sided die, with face values 1, 2, 3, 4, 5 and 6. Prior to each toss of the die, a member of the group predicts the face value that will be observed on that toss. Upon tossing the die, the group notes the observed face value, as well as the correctness (or lack thereof) of the prediction. Each group produces a sample of 50 tosses.
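A rough simulation of this process is sketched below. It assumes the person predicting picks a face at random before each toss; that is an assumption made for the sketch, but against a fair die any prediction strategy has the same 1/6 chance of a hit on each toss. The function name and seed are hypothetical.

```python
import random

def play_prediction_game(n_tosses, rng):
    """Return (face_counts, hits) for n_tosses of a fair die with a guess before each toss."""
    face_counts = {face: 0 for face in range(1, 7)}
    hits = 0
    for _ in range(n_tosses):
        prediction = rng.randint(1, 6)  # assumption: the guesser picks a face at random
        observed = rng.randint(1, 6)    # fair die: each face equally likely
        face_counts[observed] += 1
        if prediction == observed:
            hits += 1
    return face_counts, hits

rng = random.Random(1)  # arbitrary fixed seed
face_counts, hits = play_prediction_game(50, rng)
print(face_counts, "hits:", hits, "misses:", 50 - hits)
```

Running the function with a much larger number of tosses should give a hit rate close to 1/6.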

Sample Worksheet

Prediction and the Fair Die

Sample Grid n=50

Each cell corresponds to a single toss of the die.

[Blank 50-cell worksheet grid, one cell per toss]

Sample results are tabulated in the form below.

Face Value | Count
1 | ____
2 | ____
3 | ____
4 | ____
5 | ____
6 | ____
Total | ____

Prediction | Count
Hit (Correct) | ____
Miss (Incorrect) | ____
Total | ____

Samples – Face Values and Predictions

Here are the results for our six samples. You should be able to begin with the counts in the table and work out the proportions and percentages.
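As a check on the arithmetic, the pooling and the proportion and percent calculations can be sketched in Python. The counts are the face-value counts for 6:30 samples #1 and #2 from these notes; the function names `pool` and `summarize` are hypothetical helpers introduced only for this sketch.

```python
def pool(*samples):
    """Pool compatible samples by adding their counts class by class."""
    pooled = {}
    for sample in samples:
        for value, count in sample.items():
            pooled[value] = pooled.get(value, 0) + count
    return pooled

def summarize(counts):
    """Return {value: (count, proportion, percent)} for one table of counts."""
    total = sum(counts.values())
    return {v: (c, c / total, 100 * c / total) for v, c in counts.items()}

# Face-value counts for 6:30 samples #1 and #2.
sample_1 = {1: 14, 2: 6, 3: 9, 4: 6, 5: 9, 6: 6}
sample_2 = {1: 7, 2: 10, 3: 8, 4: 10, 5: 5, 6: 10}

pooled_12 = pool(sample_1, sample_2)
for face, (count, proportion, percent) in summarize(pooled_12).items():
    print(face, count, round(proportion, 2), round(percent, 1))
```

Pooling adds counts first and divides by the pooled total afterwards, which is why the pooled sample of 100 uses 100, not 50, as its denominator.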

6:30 Samples

Prediction and the Fair Die

Samples #1 and #2, and Pooled 12

Face Value | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent
1 | 14 | 14/50 = 0.28 | 100*0.28 = 28 | 7 | 0.14 | 14 | 14+7 = 21 | 21/100 = 0.21 | 21
2 | 6 | 6/50 = 0.12 | 100*0.12 = 12 | 10 | 0.2 | 20 | 6+10 = 16 | 16/100 = 0.16 | 16
3 | 9 | 9/50 = 0.18 | 100*0.18 = 18 | 8 | 0.16 | 16 | 9+8 = 17 | 17/100 = 0.17 | 17
4 | 6 | 6/50 = 0.12 | 100*0.12 = 12 | 10 | 0.2 | 20 | 6+10 = 16 | 16/100 = 0.16 | 16
5 | 9 | 9/50 = 0.18 | 100*0.18 = 18 | 5 | 0.1 | 10 | 9+5 = 14 | 14/100 = 0.14 | 14
6 | 6 | 6/50 = 0.12 | 100*0.12 = 12 | 10 | 0.2 | 20 | 6+10 = 16 | 16/100 = 0.16 | 16
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent
Hit | 9 | 0.18 | 18 | 13 | 0.26 | 26 | 22 | 0.22 | 22
Miss | 41 | 0.82 | 82 | 37 | 0.74 | 74 | 78 | 0.78 | 78
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Samples #3 and #4, and Pooled 34

Face Value | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent
1 | 4 | 0.08 | 8 | 10 | 0.2 | 20 | 14 | 0.14 | 14
2 | 12 | 0.24 | 24 | 6 | 0.12 | 12 | 18 | 0.18 | 18
3 | 7 | 0.14 | 14 | 8 | 0.16 | 16 | 15 | 0.15 | 15
4 | 6 | 0.12 | 12 | 12 | 0.24 | 24 | 18 | 0.18 | 18
5 | 14 | 0.28 | 28 | 9 | 0.18 | 18 | 23 | 0.23 | 23
6 | 7 | 0.14 | 14 | 5 | 0.1 | 10 | 12 | 0.12 | 12
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent
Hit | 8 | 0.16 | 16 | 6 | 0.12 | 12 | 14 | 0.14 | 14
Miss | 42 | 0.84 | 84 | 44 | 0.88 | 88 | 86 | 0.86 | 86
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Samples #5 and #6, and Pooled 56

Face Value | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent
1 | 8 | 0.16 | 16 | 8 | 0.16 | 16 | 16 | 0.16 | 16
2 | 10 | 0.2 | 20 | 6 | 0.12 | 12 | 16 | 0.16 | 16
3 | 10 | 0.2 | 20 | 5 | 0.1 | 10 | 15 | 0.15 | 15
4 | 6 | 0.12 | 12 | 9 | 0.18 | 18 | 15 | 0.15 | 15
5 | 9 | 0.18 | 18 | 8 | 0.16 | 16 | 17 | 0.17 | 17
6 | 7 | 0.14 | 14 | 14 | 0.28 | 28 | 21 | 0.21 | 21
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent
Hit | 13 | 0.26 | 26 | 8 | 0.16 | 16 | 21 | 0.21 | 21
Miss | 37 | 0.74 | 74 | 42 | 0.84 | 84 | 79 | 0.79 | 79
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Pooled 135, Pooled 246 and Pooled 123456

Face Value | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled 123456 Count | Pooled 123456 Proportion | Pooled 123456 Percent
1 | 26 | 0.1733 | 17.333 | 25 | 0.1667 | 16.667 | 51 | 51/300 = 0.17 | 17
2 | 28 | 0.1867 | 18.667 | 22 | 0.1467 | 14.667 | 50 | 50/300 = 0.1667 | 16.667
3 | 26 | 0.1733 | 17.333 | 21 | 0.14 | 14 | 47 | 47/300 = 0.1567 | 15.667
4 | 18 | 0.12 | 12 | 31 | 0.2067 | 20.667 | 49 | 49/300 = 0.1633 | 16.333
5 | 32 | 0.2133 | 21.333 | 22 | 0.1467 | 14.667 | 54 | 54/300 = 0.18 | 18
6 | 20 | 0.1333 | 13.333 | 29 | 0.1933 | 19.333 | 49 | 49/300 = 0.1633 | 16.333
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100

Prediction | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled 123456 Count | Pooled 123456 Proportion | Pooled 123456 Percent
Hit | 30 | 0.2 | 20 | 27 | 0.18 | 18 | 57 | 0.19 | 19
Miss | 120 | 0.8 | 80 | 123 | 0.82 | 82 | 243 | 0.81 | 81
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100

In the fair die model for this case, in long runs of tosses of the die, each of the six face values (“1” through “6”) shows in approximately 16⅔% of tosses. The sample data are generally compatible with a fair die assumption (equally likely face values) and with a baseline expected prediction success rate of 1/6, or 16⅔%. Sample performance seems to improve with increasing sample size, but the samples do not exactly fit the fair assumption.

Sample versus Fair Model

6:30

Face Value 1: 17% (Sample) versus 16.67% (Fair Model)

Face Value 2: 16.67% (Sample) versus 16.67% (Fair Model)

Face Value 3: 15.67% (Sample) versus 16.67% (Fair Model)

Face Value 4: 16.33% (Sample) versus 16.67% (Fair Model)

Face Value 5: 18% (Sample) versus 16.67% (Fair Model)

Face Value 6: 16.33% (Sample) versus 16.67% (Fair Model)

 

Prediction “Hit”: 19% (Sample) versus 16.67% (Fair Model)

Prediction “Miss”: 81% (Sample) versus 83.33% (Fair Model)
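The comparison above can also be read in counts rather than percentages. The sketch below takes the pooled 6:30 face-value counts (n = 300) from these notes and sets them against the count expected under the fair model, n/6 = 50 per face. The variable names are illustrative only.

```python
# Pooled face-value counts for all six 6:30 samples (n = 300).
observed = {1: 51, 2: 50, 3: 47, 4: 49, 5: 54, 6: 49}
n = sum(observed.values())

# Fair model: each face has probability 1/6, so the expected count is n/6.
expected_count = n / 6
model_pct = 100 / 6

for face, count in observed.items():
    sample_pct = 100 * count / n
    print(f"Face {face}: observed {count}, expected {expected_count:.0f}, "
          f"sample {sample_pct:.2f}% vs model {model_pct:.2f}%, "
          f"difference {sample_pct - model_pct:+.2f} points")
```

In this pooled sample no face is more than 4 tosses away from the expected 50.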

 

8:00 Samples

Prediction and the Fair Die

Samples #1 and #2, and Pooled 12

Face Value | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent
1 | 7 | 0.14 | 14 | 6 | 0.12 | 12 | 13 | 0.13 | 13
2 | 6 | 0.12 | 12 | 5 | 0.1 | 10 | 11 | 0.11 | 11
3 | 14 | 0.28 | 28 | 10 | 0.2 | 20 | 24 | 0.24 | 24
4 | 9 | 0.18 | 18 | 7 | 0.14 | 14 | 16 | 0.16 | 16
5 | 6 | 0.12 | 12 | 16 | 0.32 | 32 | 22 | 0.22 | 22
6 | 8 | 0.16 | 16 | 6 | 0.12 | 12 | 14 | 0.14 | 14
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent
Hit | 10 | 0.2 | 20 | 8 | 0.16 | 16 | 18 | 0.18 | 18
Miss | 40 | 0.8 | 80 | 42 | 0.84 | 84 | 82 | 0.82 | 82
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Samples #3 and #4, and Pooled 34

Face Value | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent
1 | 6 | 0.12 | 12 | 7 | 0.14 | 14 | 13 | 0.13 | 13
2 | 5 | 0.1 | 10 | 7 | 0.14 | 14 | 12 | 0.12 | 12
3 | 12 | 0.24 | 24 | 9 | 0.18 | 18 | 21 | 0.21 | 21
4 | 8 | 0.16 | 16 | 3 | 0.06 | 6 | 11 | 0.11 | 11
5 | 5 | 0.1 | 10 | 13 | 0.26 | 26 | 18 | 0.18 | 18
6 | 14 | 0.28 | 28 | 11 | 0.22 | 22 | 25 | 0.25 | 25
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent
Hit | 8 | 0.16 | 16 | 6 | 0.12 | 12 | 14 | 0.14 | 14
Miss | 42 | 0.84 | 84 | 44 | 0.88 | 88 | 86 | 0.86 | 86
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Samples #5 and #6, and Pooled 56

Face Value | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent
1 | 6 | 0.12 | 12 | 12 | 0.24 | 24 | 18 | 0.18 | 18
2 | 8 | 0.16 | 16 | 8 | 0.16 | 16 | 16 | 0.16 | 16
3 | 7 | 0.14 | 14 | 4 | 0.08 | 8 | 11 | 0.11 | 11
4 | 11 | 0.22 | 22 | 6 | 0.12 | 12 | 17 | 0.17 | 17
5 | 12 | 0.24 | 24 | 9 | 0.18 | 18 | 21 | 0.21 | 21
6 | 6 | 0.12 | 12 | 11 | 0.22 | 22 | 17 | 0.17 | 17
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Prediction | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent
Hit | 7 | 0.14 | 14 | 9 | 0.18 | 18 | 16 | 0.16 | 16
Miss | 43 | 0.86 | 86 | 41 | 0.82 | 82 | 84 | 0.84 | 84
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100

Pooled 135, Pooled 246 and Pooled 123456

Face Value | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled 123456 Count | Pooled 123456 Proportion | Pooled 123456 Percent
1 | 19 | 0.1267 | 12.667 | 25 | 0.1667 | 16.667 | 44 | 0.1467 | 14.667
2 | 19 | 0.1267 | 12.667 | 20 | 0.1333 | 13.333 | 39 | 0.13 | 13
3 | 33 | 0.22 | 22 | 23 | 0.1533 | 15.333 | 56 | 0.1867 | 18.667
4 | 28 | 0.1867 | 18.667 | 16 | 0.1067 | 10.667 | 44 | 0.1467 | 14.667
5 | 23 | 0.1533 | 15.333 | 38 | 0.2533 | 25.333 | 61 | 0.2033 | 20.333
6 | 28 | 0.1867 | 18.667 | 28 | 0.1867 | 18.667 | 56 | 0.1867 | 18.667
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100

Prediction | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled 123456 Count | Pooled 123456 Proportion | Pooled 123456 Percent
Hit | 25 | 0.1667 | 16.667 | 23 | 0.1533 | 15.333 | 48 | 0.16 | 16
Miss | 125 | 0.8333 | 83.333 | 127 | 0.8467 | 84.667 | 252 | 0.84 | 84
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100

 

 

In the fair die model for this case, in long runs of tosses of the die, each of the six face values (“1” through “6”) shows in approximately 16⅔% of tosses. The sample data are generally compatible with a fair die assumption (equally likely face values) and with a baseline expected prediction success rate of 1/6, or 16⅔%. Sample performance seems to improve with increasing sample size, but the samples do not exactly fit the fair assumption.

Sample versus Fair Model

8:00

Face Value 1: 14.67% (Sample) versus 16.67% (Fair Model)

Face Value 2: 13% (Sample) versus 16.67% (Fair Model)

Face Value 3: 18.67% (Sample) versus 16.67% (Fair Model)

Face Value 4: 14.67% (Sample) versus 16.67% (Fair Model)

Face Value 5: 20.33% (Sample) versus 16.67% (Fair Model)

Face Value 6: 18.67% (Sample) versus 16.67% (Fair Model)

 

Prediction “Hit”: 16% (Sample) versus 16.67% (Fair Model)

Prediction “Miss”: 84% (Sample) versus 83.33% (Fair Model)

 

Case Study 1.1: A Color Bowl

In random sampling, we might not get a complete list of colors; we would need a total sample (census) for that kind of listing. The sample proportion of each listed color approximates the corresponding model proportion in the bowl itself. In census sampling, every object in the bowl is counted, so the listing is complete and the model proportions may be calculated directly.

The basic idea in case study 1.1 is that random samples give imperfect pictures of what is being sampled. However, with sufficiently large samples, these samples can reliably yield good pictures of the processes or populations being sampled. And the essence of many statistical applications is the study of selected processes or populations. For a sense of the efficiency of the samples, compare sample and true percentages.

Process

We have a four-color bowl, with blue, green, red and yellow marbles. Prior to each draw from the bowl, the bowl is thoroughly mixed, giving each resident marble an approximately equal chance of selection. After mixing, a blind draw (made without looking into the bowl) of a single marble is made. The group notes the color of the marble, and the marble is returned to the bowl; this is sampling with replacement. The mixing makes the sampling random. Each group produces a sample of 50 draws.

Each cell corresponds to a single draw with replacement from the bowl.

Sample Grid (n=50)

[Blank 50-cell worksheet grid, one cell per draw]

Sample results are tabulated in the form below.

Table – Draws with Replacement

Color | Count
Blue | ____
Green | ____
Red | ____
Yellow | ____
Total | ____

 

Samples from the Color Bowl

Here are the six samples from our groups. You should be able to begin with the counts in the table and work out the proportions and percentages.

6:30 Samples

Color Bowl I - Sampling with Replacement

Samples #1 and #2, and Pooled 12

Color | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent | Truth (Percent)
Blue | 20 | 20/50 = 0.4 | 100*0.4 = 40 | 25 | 0.5 | 50 | 20+25 = 45 | 45/100 = 0.45 | 45 | 42.9
Green | 12 | 12/50 = 0.24 | 100*0.24 = 24 | 11 | 0.22 | 22 | 12+11 = 23 | 23/100 = 0.23 | 23 | 21.4
Red | 15 | 15/50 = 0.3 | 100*0.3 = 30 | 5 | 0.1 | 10 | 15+5 = 20 | 20/100 = 0.2 | 20 | 21.4
Yellow | 3 | 3/50 = 0.06 | 100*0.06 = 6 | 9 | 0.18 | 18 | 3+9 = 12 | 12/100 = 0.12 | 12 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Samples #3 and #4, and Pooled 34

Color | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent | Truth (Percent)
Blue | 27 | 0.54 | 54 | 21 | 0.42 | 42 | 48 | 0.48 | 48 | 42.9
Green | 7 | 0.14 | 14 | 13 | 0.26 | 26 | 20 | 0.2 | 20 | 21.4
Red | 11 | 0.22 | 22 | 9 | 0.18 | 18 | 20 | 0.2 | 20 | 21.4
Yellow | 5 | 0.1 | 10 | 7 | 0.14 | 14 | 12 | 0.12 | 12 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Samples #5 and #6, and Pooled 56

Color | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent | Truth (Percent)
Blue | 22 | 0.44 | 44 | 26 | 0.52 | 52 | 48 | 0.48 | 48 | 42.9
Green | 9 | 0.18 | 18 | 11 | 0.22 | 22 | 20 | 0.2 | 20 | 21.4
Red | 12 | 0.24 | 24 | 11 | 0.22 | 22 | 23 | 0.23 | 23 | 21.4
Yellow | 7 | 0.14 | 14 | 2 | 0.04 | 4 | 9 | 0.09 | 9 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Pooled 135, Pooled 246 and Pooled All

Color | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled All Count | Pooled All Proportion | Pooled All Percent | Truth (Percent)
Blue | 69 | 0.46 | 46 | 72 | 0.48 | 48 | 141 | 0.47 | 47 | 42.9
Green | 28 | 0.1867 | 18.667 | 35 | 0.2333 | 23.333 | 63 | 0.21 | 21 | 21.4
Red | 38 | 0.2533 | 25.333 | 25 | 0.1667 | 16.667 | 63 | 0.21 | 21 | 21.4
Yellow | 15 | 0.1 | 10 | 18 | 0.12 | 12 | 33 | 0.11 | 11 | 14.3
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100 | 100

 

Blue: 47% (Sample) versus 42.9% (Model)

Green: 21% (Sample) versus 21.4% (Model)

Red: 21% (Sample) versus 21.4% (Model)

Yellow: 11% (Sample) versus 14.3% (Model)

8:00 Samples

Color Bowl I - Sampling with Replacement

Samples #1 and #2, and Pooled 12

Color | #1 Count | #1 Proportion | #1 Percent | #2 Count | #2 Proportion | #2 Percent | Pooled 12 Count | Pooled 12 Proportion | Pooled 12 Percent | Truth (Percent)
Blue | 15 | 0.3 | 30 | 7 | 0.14 | 14 | 22 | 0.22 | 22 | 25
Green | 15 | 0.3 | 30 | 23 | 0.46 | 46 | 38 | 0.38 | 38 | 35.7
Red | 12 | 0.24 | 24 | 12 | 0.24 | 24 | 24 | 0.24 | 24 | 25
Yellow | 8 | 0.16 | 16 | 8 | 0.16 | 16 | 16 | 0.16 | 16 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Samples #3 and #4, and Pooled 34

Color | #3 Count | #3 Proportion | #3 Percent | #4 Count | #4 Proportion | #4 Percent | Pooled 34 Count | Pooled 34 Proportion | Pooled 34 Percent | Truth (Percent)
Blue | 16 | 0.32 | 32 | 9 | 0.18 | 18 | 25 | 0.25 | 25 | 25
Green | 16 | 0.32 | 32 | 20 | 0.4 | 40 | 36 | 0.36 | 36 | 35.7
Red | 10 | 0.2 | 20 | 11 | 0.22 | 22 | 21 | 0.21 | 21 | 25
Yellow | 8 | 0.16 | 16 | 10 | 0.2 | 20 | 18 | 0.18 | 18 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Samples #5 and #6, and Pooled 56

Color | #5 Count | #5 Proportion | #5 Percent | #6 Count | #6 Proportion | #6 Percent | Pooled 56 Count | Pooled 56 Proportion | Pooled 56 Percent | Truth (Percent)
Blue | 14 | 0.28 | 28 | 12 | 0.24 | 24 | 26 | 0.26 | 26 | 25
Green | 14 | 0.28 | 28 | 20 | 0.4 | 40 | 34 | 0.34 | 34 | 35.7
Red | 13 | 0.26 | 26 | 11 | 0.22 | 22 | 24 | 0.24 | 24 | 25
Yellow | 9 | 0.18 | 18 | 7 | 0.14 | 14 | 16 | 0.16 | 16 | 14.3
Total | 50 | 1 | 100 | 50 | 1 | 100 | 100 | 1 | 100 | 100

Pooled 135, Pooled 246 and Pooled All

Color | Pooled 135 Count | Pooled 135 Proportion | Pooled 135 Percent | Pooled 246 Count | Pooled 246 Proportion | Pooled 246 Percent | Pooled All Count | Pooled All Proportion | Pooled All Percent | Truth (Percent)
Blue | 45 | 0.3 | 30 | 28 | 0.1867 | 18.667 | 73 | 0.2433 | 24.333 | 25
Green | 45 | 0.3 | 30 | 63 | 0.42 | 42 | 108 | 0.36 | 36 | 35.7
Red | 35 | 0.2333 | 23.333 | 34 | 0.2267 | 22.667 | 69 | 0.23 | 23 | 25
Yellow | 25 | 0.1667 | 16.667 | 25 | 0.1667 | 16.667 | 50 | 0.1667 | 16.667 | 14.3
Total | 150 | 1 | 100 | 150 | 1 | 100 | 300 | 1 | 100 | 100

 

Blue: 24.3% (Sample) versus 25% (Model)

Green: 36% (Sample) versus 35.7% (Model)

Red: 23% (Sample) versus 25% (Model)

Yellow: 16.7% (Sample) versus 14.3% (Model)

 

Some Formulas – Proportions, Percentages, Counts

The class represents some property or attribute, for example, blue, or red. Each member, or unit, of a sample can be classified – the result of the classification of the unit is the unit’s class.

Sample Proportion (p)

nclass ~ number of units of sample in class

ntotal ~ total number of units in sample

pclass = nclass / ntotal

pclass ~ proportion of sample in class

 

Sample Percent (pct)

nclass ~ number of units of sample in class

ntotal ~ total number of units in sample

pclass = nclass / ntotal

pctclass = 100*(nclass / ntotal)

pctclass = 100* pclass

pctclass ~ percent of sample in class

 

Population Proportion (P)

Nclass ~ number of units of population in class

Ntotal ~ total number of units in population

Pclass = Nclass / Ntotal

Pclass ~ proportion of population in class

 

Population Percent (PCT)

Nclass ~ number of units of population in class

Ntotal ~ total number of units in population

Pclass = Nclass / Ntotal

PCTclass = 100*(Nclass / Ntotal)

PCTclass = 100* Pclass

PCTclass ~ percent of population in class

 

In this setting,

 

nblue ~ number of blue draws in sample

ntotal ~ total number of draws per sample

pblue = nblue / ntotal

pblue ~ proportion of sample draws showing blue

pctblue = 100*pblue

pctblue ~ percent of sample draws showing blue

 

Nblue ~ number of blue marbles in bowl

Ntotal ~ total number of marbles in bowl

Pblue = Nblue / Ntotal

Pblue ~ proportion of marbles in bowl that are blue

 

ngreen ~ number of green draws in sample

ntotal ~ total number of draws per sample

pgreen = ngreen / ntotal

pgreen ~ proportion of sample draws showing green

pctgreen = 100*pgreen

pctgreen ~ percent of sample draws showing green

 

Ngreen ~ number of green marbles in bowl

Ntotal ~ total number of marbles in bowl

Pgreen = Ngreen / Ntotal

Pgreen ~ proportion of marbles in bowl that are green

 

nred ~ number of red draws in sample

ntotal ~ total number of draws per sample

pred = nred / ntotal

pred ~ proportion of sample draws showing red

pctred = 100*pred

pctred ~ percent of sample draws showing red

 

Nred ~ number of red marbles in bowl

Ntotal ~ total number of marbles in bowl

Pred = Nred / Ntotal

Pred ~ proportion of marbles in bowl that are red

 

nyellow ~ number of yellow draws in sample

ntotal ~ total number of draws per sample

pyellow = nyellow / ntotal

pyellow ~ proportion of sample draws showing yellow

pctyellow = 100*pyellow

pctyellow ~ percent of sample draws showing yellow

 

Nyellow ~ number of yellow marbles in bowl

Ntotal ~ total number of marbles in bowl

Pyellow = Nyellow / Ntotal

Pyellow ~ proportion of marbles in bowl that are yellow

 

The True State of the Bowl

The 6:30 Bowl

Color | Count | Proportion | Percent
Blue | 12 | 12/28 = 0.4285714 | 42.8571
Green | 6 | 6/28 = 0.2142857 | 21.4286
Red | 6 | 6/28 = 0.2142857 | 21.4286
Yellow | 4 | 4/28 = 0.1428571 | 14.2857
Total | 28 | 1 | 100

The true proportions are probabilities:

Pr{Blue Shows} = 12/28 ≈ 0.4286

In long runs of draws with replacement from the bowl, approximately 42.9 percent of draws show blue.

Pr{Green Shows} = 6/28 ≈ 0.2143

In long runs of draws with replacement from the bowl, approximately 21.4 percent of draws show green.

Pr{Red Shows} = 6/28 ≈ 0.2143

In long runs of draws with replacement from the bowl, approximately 21.4 percent of draws show red.

Pr{Yellow Shows} = 4/28 ≈ 0.1429

In long runs of draws with replacement from the bowl, approximately 14.3 percent of draws show yellow.
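The "long runs" reading of these probabilities can be illustrated with a small simulation of the 6:30 bowl (12 blue, 6 green, 6 red, 4 yellow). This is a sketch only: the function name, the run lengths and the seed are arbitrary choices, and the simulated draws stand in for the physical mixing and blind drawing.

```python
import random

# The 6:30 bowl: 12 blue, 6 green, 6 red, 4 yellow (28 marbles).
bowl = ["blue"] * 12 + ["green"] * 6 + ["red"] * 6 + ["yellow"] * 4

def percent_blue(n_draws, rng):
    """Percent of n_draws draws with replacement that show blue."""
    blue = sum(1 for _ in range(n_draws) if rng.choice(bowl) == "blue")
    return 100 * blue / n_draws

rng = random.Random(630)  # arbitrary fixed seed
for n_draws in (50, 300, 3000, 30000):
    print(n_draws, round(percent_blue(n_draws, rng), 1), "vs model", round(100 * 12 / 28, 1))
```

Short runs can land several points away from 42.9 percent; longer runs tend to settle closer to it.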

The 8:00 Bowl

Color | Count | Proportion | Percent
Blue | 7 | 7/28 = 0.25 | 25
Green | 10 | 10/28 = 0.3571429 | 35.714
Red | 7 | 7/28 = 0.25 | 25
Yellow | 4 | 4/28 = 0.1428571 | 14.286
Total | 28 | 1 | 100

 

 

The true proportions are probabilities:

Pr{Blue Shows} = 7/28 = 0.25

In long runs of draws with replacement from the bowl, approximately 25 percent of draws show blue.

Pr{Green Shows} = 10/28 ≈ 0.3571

In long runs of draws with replacement from the bowl, approximately 35.7 percent of draws show green.

Pr{Red Shows} = 7/28 = 0.25

In long runs of draws with replacement from the bowl, approximately 25 percent of draws show red.

Pr{Yellow Shows} = 4/28 ≈ 0.1429

In long runs of draws with replacement from the bowl, approximately 14.3 percent of draws show yellow.
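One way to read these probabilities is as expected counts: in n draws with replacement, the expected number of draws showing a color is n times its probability. The sketch below applies this to the 8:00 bowl and the pooled 8:00 sample of 300 draws from these notes; the variable names are illustrative.

```python
from fractions import Fraction

# The 8:00 bowl: marble counts by color (28 marbles in all).
bowl = {"Blue": 7, "Green": 10, "Red": 7, "Yellow": 4}
n_marbles = sum(bowl.values())

# Pooled counts from all six 8:00 samples (300 draws with replacement).
pooled = {"Blue": 73, "Green": 108, "Red": 69, "Yellow": 50}
n_draws = sum(pooled.values())

for color, marbles in bowl.items():
    probability = Fraction(marbles, n_marbles)  # Pclass = Nclass / Ntotal
    expected = float(probability * n_draws)     # expected count = n * P
    print(f"{color}: Pr = {probability} = {float(probability):.4f}, "
          f"expected {expected:.1f} of {n_draws}, observed {pooled[color]}")
```

Using `Fraction` keeps the probabilities exact (for example 7/28 reduces to 1/4) until they are converted for display.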

We see reasonable, but not exact matches between the sample proportions and the probabilities (P).

6:30

Blue: 47% (Sample) versus 42.9% (Model)

Green: 21% (Sample) versus 21.4% (Model)

Red: 21% (Sample) versus 21.4% (Model)

Yellow: 11% (Sample) versus 14.3% (Model)

8:00

Blue: 24.3% (Sample) versus 25% (Model)

Green: 36% (Sample) versus 35.7% (Model)

Red: 23% (Sample) versus 25% (Model)

Yellow: 16.7% (Sample) versus 14.3% (Model)

 

We didn’t get to these, so read up on Ellsberg I and Ellsberg II.

Regarding Ellsberg I 

The 1st Game: The first bowl is 50%/50% split between blue and green. The best we can do is break even, regardless of strategy. The simplest strategy involves picking one of the colors and always betting on that color.

The 2nd Game: The second bowl is an unknown composite of red and yellow. We might be able to win this game if 1) there is a dominant color and 2) we can determine that dominant color. A simple strategy here is to pick one color and ride it for a while. Then stop betting and check the number of winning bets. If the color being bet on is losing on a regular basis, switch colors.

The 3rd Game: This game only makes sense if the second bowl is dominant in red. Bet on red; if red consistently shows, stay on the second bowl. Otherwise, either stop playing or stick with the first bowl.
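The "ride one color, then check and switch" strategy for the second game can be sketched as follows. Everything specific here is an assumption made for illustration: draws are with replacement, a bet simply wins when the drawn color matches, and the bowl composition (30% red), trial length and seed are hypothetical values that the player would not know in advance.

```python
import random

def explore_then_commit(p_red, n_trial, n_total, rng):
    """Bet red for n_trial draws, then keep red or switch to yellow for the rest.

    Returns the number of winning bets out of n_total.
    """
    wins = 0
    color = "red"
    for bet in range(n_total):
        # After the trial period, switch if red won fewer than half of the trial bets.
        if bet == n_trial and wins < n_trial / 2:
            color = "yellow"
        drawn = "red" if rng.random() < p_red else "yellow"
        if drawn == color:
            wins += 1
    return wins

rng = random.Random(3)  # arbitrary fixed seed
# Hypothetical bowl: 30% red, 70% yellow (unknown to the player).
print(explore_then_commit(p_red=0.3, n_trial=20, n_total=200, rng=rng))
```

The trial bets are the price of the information: they are placed before the player knows which color, if any, is dominant.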

Regarding Ellsberg II

The 1st Game: The first bowl is 20% red / 40% black / 40% white. The simplest strategy involves picking one of the colors and always betting on that color. Regardless of betting choice, there is a 40% chance of losing the single bet, and a 20% chance of getting kicked out of the game.

The 2nd Game: The second bowl is 20% red / 80% black or white. The simplest strategy involves picking one of the colors and always betting on that color. If either white or black is sufficiently dominant, this game might be worth playing. The problem is that regardless of the possible advantage in the white/black part of the bowl, there is still a 20% chance of getting killed (permanently losing). But to detect this advantage, one is forced to pick a betting color (white or black) and spend some money.
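The cost of the 20% red risk grows quickly with the number of bets. Assuming independent draws with replacement and that a single red draw ends the game, the chance of still being in the game after n bets is 0.8 raised to the power n. The short sketch below tabulates this; the function name is hypothetical.

```python
def survival_probability(n_bets, p_red=0.2):
    """Probability that red never shows in n_bets independent draws."""
    return (1 - p_red) ** n_bets

for n_bets in (1, 5, 10, 20):
    print(n_bets, round(survival_probability(n_bets), 3))
```

Under these assumptions, a player who needs ten bets to detect a black/white advantage has roughly an 11% chance of surviving long enough to use it.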

The idea underlying the Ellsberg games is to illustrate how decisions about selected processes or populations can be made using random samples.