Coding the Monty Hall Problem¶

import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
import scipy
from IPython.display import IFrame

On the drive to work this morning, I had 'shower thought'. "how tricky would it be to code up a demonstration of the Monty Hall problem?". It'd had been floating around my head since I re-watched a film called "21", which follows the exploits of the a group of gifted mathematics students, led by their professor, as they try to bleed Vegas Casino's dry, by counting cards. One scene used to display the main characters mathematical prowess features the Monty Hall problem. See below for details.

IFrame(width="560", height="315",
       src="https://www.youtube.com/embed/iBdjqtR2iK4?start=71")

In summary, imagine you're on a game-show and there are 3 closed doors. Behind one door is a car (or other desirable item) and behind the other two doors are goats (or other not desirable item. You're asked to choose a door, but before you open it, the game show host opens another door which contains a goat, and then asks do you want to switch doors. The Monty Hall problem asks whether it's in your interest to switch doors after the location of one goat is revealed.

Indeed, if you never switch your chance of success is 33.33%, however, initially counter-intuitively, if you change your position, your chance of success is 50%. Let's do some simulations.

P.S. There's a lot of history surrounding this problem, and trying to explain it in a notebook wouldn't do it justice, Vox has a good video on it, if your interested.

For the trivial case where a player picks a door at random and never changes their choice after being shown a goat, we expect a success probability of 0.33.

We'll do 100,000 simulations per case.

# we will treat 1 as the car and 0 as a goat
# these are the only 3 arrangements of the goats and cars
arr1 = [0, 0, 1]
arr2 = [1, 0, 0]
arr3 = [0, 1, 0]

check = 0
results_player1 = []
while check <= 100000:

    # this is the arrangement of items behind doors for this iteration of the sim
    this_arr = np.array(random.choice([arr1, arr2, arr3]))

    # this is the choice the player has made
    this_choice = random.choice([0, 1, 2])

    # determine if the player wins or not
    if this_arr[this_choice] == 1:
        results_player1.append(True)
    else:
        results_player1.append(False)

    # increment the counter by 1
    check += 1

# what is the percentage of wins for the player who randomly chooses and never changes?
player1_percen = np.sum(results_player1) / 100000*100
player1_percen

33.373000000000005

Looks good. What about the non-trivial case where the player always changes their choice after being shown a goat?

check = 0
results_player2 = []
while check <= 100000:

    # this is the arrangement of items behind doors for this iteration of the sim
    this_arr = np.array(random.choice([arr1, arr2, arr3]))

    # this is the choice the player has made
    this_choice = random.choice([0, 1, 2])

    # this is the index of the winning door
    winning_ix = np.where(this_arr == 1)

    # these are the indices of the goats
    remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))
    
    # this is the index of the revealed dud
    revealed_dud = np.random.choice(remaining_duds)

    # the final choice for this player becomes the door that wasn't picked
    # initially and wasn't opened by the host
    final_choice = list(set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

    # determine if the player wins or not
    if this_arr[final_choice] == 1:
        results_player2.append(True)
    else:
        results_player2.append(False)

    # increment the counter by 1
    check += 1

# what is the percentage of wins for the player who always changes their mind?
player2_percen = np.sum(results_player2) / 100000*100
player2_percen

49.945

Woop! Final case, what happens if after the reveal of a goat, our player decides to change door only 50% of the time. A priori, what do I expect the result to be? Somewhere between this first two results? So...

def midpoint(p1, p2):
    return p1 + (p2-p1)/2     # or *0.5


midpoint(player1_percen, player2_percen)

41.659000000000006

But we know those values should be 33.33 and 50...

midpoint(33.33, 50)

41.665

So, my guess is 41.665

Player 3¶

Some thoughts about player 3

This player is interesting and there's a couple ways to code them. We can have in the simplest case the player flipping a coin (i.e. choosing at random 0 or 1) and using that to push them down one of two paths. Or we could implement a parameter which is the frequency at which they change their mind (could also be stick to their choice, but I'm going with an affirmative action as changing doors.

This gives us a question to solve, how do we want to implement the variability of the choice. This feel like a dice-based problem. Flipping a coin is just like rolling a 'two-sided dice'. If we want the player to change only 25% of the time, we assign them a four sided dice, 16.6% would be a six-sided die, etc. So, we're looking for a way of implementing an arbitrary system for scaling the 'die'.

We could just say that if we're doing 100000 simulations and we want to see the result for 25% chance of changing door, then we can implement method 2 for the first 25000 simulations, and method 1 for the rest. But that doesn't feel like a good way to implement it.

First on the list of things to do then is create a method of selecting one of two variables at an uneven distribution. Numpy has us covered here, but lets just run some simulations to make sure it's doing what we want.

change_mind_freq = 0.5
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)

0.49735026497350265

change_mind_freq = 0.3
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)

0.2947705229477052

change_mind_freq = 0.1
check = 0
counts = []
while check <= 10000:
    counts.append(np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0])
    check += 1

np.sum(counts) / len(counts)

0.1014898510148985

Perfect! Now lets implement this into the game.

change_mind_freq = 0.5
check = 0
results_player3 = []
while check <= 100000:

    # if change_mind is one our player will change their mind
    # if it's 0 they wont
    change_mind = np.random.choice(
        [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0]

    if change_mind == 0:

        this_arr = np.array(random.choice([arr1, arr2, arr3]))

        this_choice = random.choice([0, 1, 2])

        if this_arr[this_choice] == 1:
            results_player3.append(True)
        else:
            results_player3.append(False)

        check += 1

    if change_mind == 1:

        this_arr = np.array(random.choice([arr1, arr2, arr3]))

        this_choice = random.choice([0, 1, 2])

        winning_ix = np.where(this_arr == 1)

        remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))

        revealed_dud = np.random.choice(remaining_duds)

        final_choice = list(set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

        if this_arr[final_choice] == 1:
            results_player3.append(True)
        else:
            results_player3.append(False)

        check += 1

# what is the percentage of wins for the player who always changes their mind 50% of
# the time?
player3_percen = np.sum(results_player3) / 100000 * 100
player3_percen

41.699999999999996

Call it a lucky guess I suppose? What's interesting about this loop is that it makes the first two functions redundant, because we can set change_mind_freq to 0 in the first case where our player never changes their mind and 1 in the case where they always change their mind. Let's write it into a function then.

def montyhall(nsim, change_mind_freq):
    """Run a simulation of the Monty Hall Problem with varying probabilities that
    player will change their mind.

    Parameters
    ----------
    nsim : int
        The number of simulations to run.
    change_mind_freq: int
        The probability that the player will change their mind and choose the 
        alternate door. 

    Returns
    ----------
    percen_win : The percentage of games won by the player (wins/nsim)

    Notes
    ----------
    If you want the player to always change their mind then set change_mind_freq to
    1, the converse is true, if you want the player to never change their mind, set
    change_mind_freq to 0."""

    # check the inputs
    assert isinstance(nsim, int), "nsim should be a positive integer > 0"
    assert nsim > 0, "nsim needs to be a postive integer > 0"
    assert change_mind_freq >= 0 and change_mind_freq <= 1, "change_mind_freq should be a value between >=0 and <=1"

    check = 0
    results = []

    arr1 = [0, 0, 1]
    arr2 = [1, 0, 0]
    arr3 = [0, 1, 0]

    while check <= nsim:

        change_mind = np.random.choice(
            [0, 1], 1, p=[1-change_mind_freq, change_mind_freq])[0]

        if change_mind == 0:

            this_arr = np.array(random.choice([arr1, arr2, arr3]))

            this_choice = random.choice([0, 1, 2])

            if this_arr[this_choice] == 1:
                results.append(True)
            else:
                results.append(False)

            check += 1

        if change_mind == 1:

            this_arr = np.array(random.choice([arr1, arr2, arr3]))

            this_choice = random.choice([0, 1, 2])

            winning_ix = np.where(this_arr == 1)

            remaining_duds = list((set({0, 1, 2}) - {winning_ix[0][0]}))

            revealed_dud = np.random.choice(remaining_duds)

            final_choice = list(
                set({0, 1, 2}) - {this_choice} - {revealed_dud})[0]

            if this_arr[final_choice] == 1:
                results.append(True)
            else:
                results.append(False)

            check += 1

    percen_win = np.sum(results) / nsim*100

    return percen_win

montyhall(100000, 1)

50.07899999999999

montyhall(100000, 0)

33.355000000000004

montyhall(100000, 0.5)

41.802

Looking great. Now that we have it generalised it's fairly easy to plot the success over the changing probabilities.

# define a range of probabilities for the varying change_mind_freq arguement
probs = np.linspace(0, 1, 50)

# run 100,000 simulations for each probability value
success = []
for prob in probs:
    success.append(montyhall(100000, prob))

# helper function
def abline(slope, intercept):
    """Plot a line from slope and intercept"""
    axes = plt.gca()
    x_vals = np.array(axes.get_xlim())
    y_vals = intercept + slope * x_vals
    plt.plot(x_vals, y_vals, '--', color='orange')

# plot the results over changing change_mind_frequencies
plt.figure(figsize=(12, 8))
plt.plot(success, probs)
plt.ylabel('Probability')
plt.xlabel('Success Rate (%)')
plt.show()

...and finally, seeing as how we have a nice line, let's see what the function of that line is.

slope, intercept, rvalue, pvalue, stderr = scipy.stats.linregress(
    success, probs)

'The model fitted with a slope of {}, an intercept of {}, with an R value of (p={})'.format(
    slope, intercept, rvalue, pvalue)

'The model fitted with a slope of 0.05950140578677803, an intercept of -1.9793783580666426, with an R value of (p=0.9994339194610821)'

Let's update that graph

plt.figure(figsize=(12, 8))
plt.plot(success, probs)
plt.ylabel('Probability')
plt.xlabel('Success Rate (%)')
abline(slope, intercept)
plt.title('Y = {}x + {}   R = {}'.format(slope, intercept, rvalue))
plt.show()

To estimate success rate of a given probability we can use x = (y - c) / m

predicted = []
for prob in probs:
    predicted.append(((prob - intercept) / slope))

master_results = pd.DataFrame(
    {'success': success, 'predicted': predicted, 'probability': probs},
    columns=['probability', 'predicted', 'success'])
master_results['difference'] = master_results['predicted'] - \
    master_results['success']

master_results

An interesting coding problem to be sure, not as tricky as I initially imagined it would be. Hopefully this is interesting to someone. As always, criticism and discussion of the methods I've used to answer this problem are entirely welcome. Also, if you know of any other cool coding problems, feel free to send them my way.

Happy Coding,

Sean

	probability	predicted	success	difference
0	0.000000	33.266077	33.046	0.220077
1	0.020408	33.609063	33.957	-0.347937
2	0.040816	33.952050	33.827	0.125050
3	0.061224	34.295036	34.324	-0.028964
4	0.081633	34.638022	34.423	0.215022
5	0.102041	34.981008	35.048	-0.066992
6	0.122449	35.323995	35.420	-0.096005
7	0.142857	35.666981	35.750	-0.083019
8	0.163265	36.009967	35.867	0.142967
9	0.183673	36.352953	36.391	-0.038047
10	0.204082	36.695940	36.803	-0.107060
11	0.224490	37.038926	36.887	0.151926
12	0.244898	37.381912	37.407	-0.025088
13	0.265306	37.724898	37.813	-0.088102
14	0.285714	38.067885	38.169	-0.101115
15	0.306122	38.410871	38.387	0.023871
16	0.326531	38.753857	38.998	-0.244143
17	0.346939	39.096843	39.010	0.086843
18	0.367347	39.439829	39.511	-0.071171
19	0.387755	39.782816	39.542	0.240816
20	0.408163	40.125802	40.172	-0.046198
21	0.428571	40.468788	40.687	-0.218212
22	0.448980	40.811774	40.950	-0.138226
23	0.469388	41.154761	41.096	0.058761
24	0.489796	41.497747	41.209	0.288747
25	0.510204	41.840733	42.158	-0.317267
26	0.530612	42.183719	42.469	-0.285281
27	0.551020	42.526706	42.441	0.085706
28	0.571429	42.869692	42.630	0.239692
29	0.591837	43.212678	42.971	0.241678
30	0.612245	43.555664	43.698	-0.142336
31	0.632653	43.898651	43.934	-0.035349
32	0.653061	44.241637	44.159	0.082637
33	0.673469	44.584623	44.382	0.202623
34	0.693878	44.927609	44.959	-0.031391
35	0.714286	45.270595	44.997	0.273595
36	0.734694	45.613582	45.482	0.131582
37	0.755102	45.956568	45.955	0.001568
38	0.775510	46.299554	46.296	0.003554
39	0.795918	46.642540	46.545	0.097540
40	0.816327	46.985527	46.868	0.117527
41	0.836735	47.328513	47.628	-0.299487
42	0.857143	47.671499	47.737	-0.065501
43	0.877551	48.014485	47.961	0.053485
44	0.897959	48.357472	48.762	-0.404528
45	0.918367	48.700458	48.686	0.014458
46	0.938776	49.043444	48.972	0.071444
47	0.959184	49.386430	49.346	0.040430
48	0.979592	49.729417	49.769	-0.039583
49	1.000000	50.072403	49.963	0.109403