Coding the Monty Hall Problem

In [1]:
import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import scipy
from IPython.display import IFrame

On the drive to work this morning, I had 'shower thought'. "how tricky would it be to code up a demonstration of the Monty Hall problem?". It'd had been floating around my head since I re-watched a film called "21", which follows the exploits of the a group of gifted mathematics students, led by their professor, as they try to bleed Vegas Casino's dry, by counting cards. One scene used to display the main characters mathematical prowess features the Monty Hall problem. See below for details.

In [2]:
IFrame(width="560", height="315",
       src="https://www.youtube.com/embed/iBdjqtR2iK4?start=71")
Out[2]:

In summary, imagine you're on a game-show and there are 3 closed doors. Behind one door is a car (or other desirable item) and behind the other two doors are goats (or other not desirable item. You're asked to choose a door, but before you open it, the game show host opens another door which contains a goat, and then asks do you want to switch doors. The Monty Hall problem asks whether it's in your interest to switch doors after the location of one goat is revealed.

Indeed, if you never switch your chance of success is 33.33%, however, initially counter-intuitively, if you change your position, your chance of success is 50%. Let's do some simulations.

P.S. There's a lot of history surrounding this problem, and trying to explain it in a notebook wouldn't do it justice, Vox has a good video on it, if your interested.

For the trivial case where a player picks a door at random and never changes their choice after being shown a goat, we expect a success probability of 0.33.

We'll do 100,000 simulations per case.

In [3]:
# we will treat 1 as the car and 0 as a goat
# these are the only 3 arrangements of the goats and cars
arr1 = [0, 0, 1]
arr2 = [1, 0, 0]
arr3 = [0, 1, 0]
In [4]:
check = 0
results_player1 = []
while check <= 100000:

    # this is the arrangement of items behind doors for this iteration of the sim
    this_arr = np.array(random.choice([arr1, arr2, arr3]))

    # this is the choice the player has made
    this_choice = random.choice([0, 1, 2])

    # determine if the player wins or not
    if this_arr[this_choice] == 1:
        results_player1.append(True)
    else:
        results_player1.append(False)

    # increment the counter by 1
    check += 1
In [5]:
# what is the percentage of wins for the player who randomly chooses and never changes?
player1_percen = np.sum(results_player1) / 100000*100
player1_percen
Out[5]:
33.373000000000005

Looks good. What about the non-trivial case where the player always changes their choice after being shown a goat?

In [6]:
check = 0
results_player2 = []
while check <= 100000:

    # this is the arrangement of items behind doors for this iteration of the sim
    this_arr = np.array(random.choice([arr1, arr2, arr3]))

    # this is the choice the player has made
    this_choice = random.choice([0, 1, 2])

    # this is the index of the winning door
    winning_ix = np.where(this_arr == 1)

    # these are the indices of the goats
    remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))
    
    # this is the index of the revealed dud
    revealed_dud = np.random.choice(remaining_duds)

    # the final choice for this player becomes the door that wasn't picked
    # initially and wasn't opened by the host
    final_choice = list(set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

    # determine if the player wins or not
    if this_arr[final_choice] == 1:
        results_player2.append(True)
    else:
        results_player2.append(False)

    # increment the counter by 1
    check += 1
In [7]:
# what is the percentage of wins for the player who always changes their mind?
player2_percen = np.sum(results_player2) / 100000*100
player2_percen
Out[7]:
49.945

Woop! Final case, what happens if after the reveal of a goat, our player decides to change door only 50% of the time. A priori, what do I expect the result to be? Somewhere between this first two results? So...

In [8]:
def midpoint(p1, p2):
    return p1 + (p2-p1)/2     # or *0.5


midpoint(player1_percen, player2_percen)
Out[8]:
41.659000000000006

But we know those values should be 33.33 and 50...

In [9]:
midpoint(33.33, 50)
Out[9]:
41.665

So, my guess is 41.665

Player 3

Some thoughts about player 3

This player is interesting and there's a couple ways to code them. We can have in the simplest case the player flipping a coin (i.e. choosing at random 0 or 1) and using that to push them down one of two paths. Or we could implement a parameter which is the frequency at which they change their mind (could also be stick to their choice, but I'm going with an affirmative action as changing doors.

This gives us a question to solve, how do we want to implement the variability of the choice. This feel like a dice-based problem. Flipping a coin is just like rolling a 'two-sided dice'. If we want the player to change only 25% of the time, we assign them a four sided dice, 16.6% would be a six-sided die, etc. So, we're looking for a way of implementing an arbitrary system for scaling the 'die'.

We could just say that if we're doing 100000 simulations and we want to see the result for 25% chance of changing door, then we can implement method 2 for the first 25000 simulations, and method 1 for the rest. But that doesn't feel like a good way to implement it.

First on the list of things to do then is create a method of selecting one of two variables at an uneven distribution. Numpy has us covered here, but lets just run some simulations to make sure it's doing what we want.

In [10]:
change_mind_freq = 0.5
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)
Out[10]:
0.49735026497350265
In [11]:
change_mind_freq = 0.3
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)
Out[11]:
0.2947705229477052
In [12]:
change_mind_freq = 0.1
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)
Out[12]:
0.1014898510148985

Perfect! Now lets implement this into the game.

In [13]:
change_mind_freq = 0.5
check = 0
results_player3 = []
while check <= 100000:

    # if change_mind is one our player will change their mind
    # if it's 0 they wont
    change_mind = np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0]

    if change_mind == 0:

        this_arr = np.array(random.choice([arr1, arr2, arr3]))

        this_choice = random.choice([0, 1, 2])

        if this_arr[this_choice] == 1:
            results_player3.append(True)
        else:
            results_player3.append(False)

        check += 1

    if change_mind == 1:

        this_arr = np.array(random.choice([arr1, arr2, arr3]))

        this_choice = random.choice([0, 1, 2])

        winning_ix = np.where(this_arr == 1)

        remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))

        revealed_dud = np.random.choice(remaining_duds)

        final_choice = list(set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

        if this_arr[final_choice] == 1:
            results_player3.append(True)
        else:
            results_player3.append(False)

        check += 1
In [14]:
# what is the percentage of wins for the player who always changes their mind 50% of
# the time?
player3_percen = np.sum(results_player3) / 100000 * 100
player3_percen
Out[14]:
41.699999999999996

Call it a lucky guess I suppose? What's interesting about this loop is that it makes the first two functions redundant, because we can set change_mind_freq to 0 in the first case where our player never changes their mind and 1 in the case where they always change their mind. Let's write it into a function then.

In [15]:
def montyhall(nsim, change_mind_freq):
    """Run a simulation of the Monty Hall Problem with varying probabilities that
    player will change their mind.

    Parameters
    ----------
    nsim : int
        The number of simulations to run.
    change_mind_freq: int
        The probability that the player will change their mind and choose the 
        alternate door. 

    Returns
    ----------
    percen_win : The percentage of games won by the player (wins/nsim)

    Notes
    ----------
    If you want the player to always change their mind then set change_mind_freq to
    1, the converse is true, if you want the player to never change their mind, set
    change_mind_freq to 0."""

    # check the inputs
    assert isinstance(nsim, int), "nsim should be a positive integer > 0"
    assert nsim > 0, "nsim needs to be a postive integer > 0"
    assert change_mind_freq >= 0 and change_mind_freq <= 1, "change_mind_freq should be a value between >=0 and <=1"

    check = 0
    results = []

    arr1 = [0, 0, 1]
    arr2 = [1, 0, 0]
    arr3 = [0, 1, 0]

    while check <= nsim:

        change_mind = np.random.choice(
            [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0]

        if change_mind == 0:

            this_arr = np.array(random.choice([arr1, arr2, arr3]))

            this_choice = random.choice([0, 1, 2])

            if this_arr[this_choice] == 1:
                results.append(True)
            else:
                results.append(False)

            check += 1

        if change_mind == 1:

            this_arr = np.array(random.choice([arr1, arr2, arr3]))

            this_choice = random.choice([0, 1, 2])

            winning_ix = np.where(this_arr == 1)

            remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))

            revealed_dud = np.random.choice(remaining_duds)

            final_choice = list(
                set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

            if this_arr[final_choice] == 1:
                results.append(True)
            else:
                results.append(False)

            check += 1

    percen_win = np.sum(results) / nsim*100

    return percen_win
In [16]:
montyhall(100000, 1)
Out[16]:
50.07899999999999
In [17]:
montyhall(100000, 0)
Out[17]:
33.355000000000004
In [18]:
montyhall(100000, 0.5)
Out[18]:
41.802

Looking great. Now that we have it generalised it's fairly easy to plot the success over the changing probabilities.

In [19]:
# define a range of probabilities for the varying change_mind_freq arguement
probs = np.linspace(0, 1, 50)
In [20]:
# run 100,000 simulations for each probability value
success = []
for prob in probs:
    success.append(montyhall(100000, prob))
In [21]:
# helper function
def abline(slope, intercept):
    """Plot a line from slope and intercept"""
    axes = plt.gca()
    x_vals = np.array(axes.get_xlim())
    y_vals = intercept + slope * x_vals
    plt.plot(x_vals, y_vals, '--', color='orange')
In [22]:
# plot the results over changing change_mind_frequencies
plt.figure(figsize=(12, 8))
plt.plot(success, probs)
plt.ylabel('Probability')
plt.xlabel('Success Rate (%)')
plt.show()

...and finally, seeing as how we have a nice line, let's see what the function of that line is.

In [23]:
slope, intercept, rvalue, pvalue, stderr = scipy.stats.linregress(
    success, probs)

'The model fitted with a slope of {}, an intercept of {}, with an R value of (p={})'.format(
    slope, intercept, rvalue, pvalue)
Out[23]:
'The model fitted with a slope of 0.05950140578677803, an intercept of -1.9793783580666426, with an R value of (p=0.9994339194610821)'

Let's update that graph

In [24]:
plt.figure(figsize=(12, 8))
plt.plot(success, probs)
plt.ylabel('Probability')
plt.xlabel('Success Rate (%)')
abline(slope, intercept)
plt.title('Y = {}x + {}   R = {}'.format(slope, intercept, rvalue))
plt.show()

To estimate success rate of a given probability we can use x = (y - c) / m

In [25]:
predicted = []
for prob in probs:
    predicted.append(((prob - intercept) / slope))
In [26]:
master_results = pd.DataFrame(
    {'success': success, 'predicted': predicted, 'probability': probs},
    columns=['probability', 'predicted', 'success'])
master_results['difference'] = master_results['predicted'] - \
    master_results['success']
In [27]:
master_results
Out[27]:
probability predicted success difference
0 0.000000 33.266077 33.046 0.220077
1 0.020408 33.609063 33.957 -0.347937
2 0.040816 33.952050 33.827 0.125050
3 0.061224 34.295036 34.324 -0.028964
4 0.081633 34.638022 34.423 0.215022
5 0.102041 34.981008 35.048 -0.066992
6 0.122449 35.323995 35.420 -0.096005
7 0.142857 35.666981 35.750 -0.083019
8 0.163265 36.009967 35.867 0.142967
9 0.183673 36.352953 36.391 -0.038047
10 0.204082 36.695940 36.803 -0.107060
11 0.224490 37.038926 36.887 0.151926
12 0.244898 37.381912 37.407 -0.025088
13 0.265306 37.724898 37.813 -0.088102
14 0.285714 38.067885 38.169 -0.101115
15 0.306122 38.410871 38.387 0.023871
16 0.326531 38.753857 38.998 -0.244143
17 0.346939 39.096843 39.010 0.086843
18 0.367347 39.439829 39.511 -0.071171
19 0.387755 39.782816 39.542 0.240816
20 0.408163 40.125802 40.172 -0.046198
21 0.428571 40.468788 40.687 -0.218212
22 0.448980 40.811774 40.950 -0.138226
23 0.469388 41.154761 41.096 0.058761
24 0.489796 41.497747 41.209 0.288747
25 0.510204 41.840733 42.158 -0.317267
26 0.530612 42.183719 42.469 -0.285281
27 0.551020 42.526706 42.441 0.085706
28 0.571429 42.869692 42.630 0.239692
29 0.591837 43.212678 42.971 0.241678
30 0.612245 43.555664 43.698 -0.142336
31 0.632653 43.898651 43.934 -0.035349
32 0.653061 44.241637 44.159 0.082637
33 0.673469 44.584623 44.382 0.202623
34 0.693878 44.927609 44.959 -0.031391
35 0.714286 45.270595 44.997 0.273595
36 0.734694 45.613582 45.482 0.131582
37 0.755102 45.956568 45.955 0.001568
38 0.775510 46.299554 46.296 0.003554
39 0.795918 46.642540 46.545 0.097540
40 0.816327 46.985527 46.868 0.117527
41 0.836735 47.328513 47.628 -0.299487
42 0.857143 47.671499 47.737 -0.065501
43 0.877551 48.014485 47.961 0.053485
44 0.897959 48.357472 48.762 -0.404528
45 0.918367 48.700458 48.686 0.014458
46 0.938776 49.043444 48.972 0.071444
47 0.959184 49.386430 49.346 0.040430
48 0.979592 49.729417 49.769 -0.039583
49 1.000000 50.072403 49.963 0.109403

An interesting coding problem to be sure, not as tricky as I initially imagined it would be. Hopefully this is interesting to someone. As always, criticism and discussion of the methods I've used to answer this problem are entirely welcome. Also, if you know of any other cool coding problems, feel free to send them my way.

Happy Coding,

Sean