Monty Hall Problem

The Monty Hall problem is one of those brain teasers which has confused me on the three previous occasions that I read about it.  The (many) articles on line explaining it didn’t resonate with me, so I decided to work it through again and write some code to prove it.

Once I’d written the code I had an epiphany and realised how incredibly simple it was, so this blog entry is me explaining the Monty Hall problem and why probability works out how it does.  Hopefully, if you’re like me and you’ve struggled with this before, this blog will make it simple and clear.

I’m not going to explain the Monty Hall problem, because Wikipedia does an excellent job. If you’re not familiar, see you in a few minutes!

A brief summary of the incorrect thought process is as follows:

  1. There are three doors: X, Y and Z. 
  2. The contestant picks door X.  There was a 1/3 chance of the contestant picking this door.  There was a 2/3 chance of then NOT picking door X.
  3. Door Y is revealed to be a goat, therefore the Cadillac is behind door X or door Z.
  4. The Cadillac is behind door X or door Z, i.e. there are only two permutations of goat, Cadillac and doors there it’s a 50/50 chance of finding the Cadillac.

So, why given a seemingly 50/50 chance of either having a goat or a Cadillac behind the two remaining doors is the probably of finding the Cadillac on a switch 2/3?

Let’s start with some simple code that emulates the problem.  This code is deliberately naive to make it easy to follow.

using System;
using System.Security.Cryptography;

namespace MontyHall
{
    internal class Program
    {
        private static Random _randomGenerator = new Random();

        private static void Main(string[] args)
        {
            Console.WriteLine("{0,-12}{1,12}{2,12}", "Iterations", "Stick Wins", "Switch Wins");

            for (int i = 0; i < 10; i++)
            {
                int iterations = 1000000;
                int winsIfStickCount = EmulateGame(iterations);
                int winsIfSwitchCount = iterations - winsIfStickCount;
                Console.WriteLine("{0,-12}{1,12}{2,12}", iterations, winsIfStickCount, winsIfSwitchCount);
            }
        }


        private static int EmulateGame(int iterations)
        {
            int winIfStickCount = 0;
            for (int i = 0; i < iterations; i++)
            {
                if (EmulateGame())
                {
                    winIfStickCount++;
                }
            }
            return winIfStickCount;
        }


        private static bool EmulateGame()
        {
            // Create 3 boxes, each with a goat
            string[] boxes = new[] { "Goat", "Goat", "Goat" };

            // Replace a goat with a Cadillac, randomly
            
            boxes[_randomGenerator.Next(3)] = "Cadillac";

            // Now pick a random door
            int randomDoor = _randomGenerator.Next(3);

            // Now Monty opens a door containing a goat
            int knownGoat = -1;
            while (knownGoat == -1)
            {
                /* Keep picking a door at random until Monty finds one that the contenstant didn't pick
                 * and contains a goat.  This isn't efficient, but it keeps the code very simple */
                int montysDoor = _randomGenerator.Next(3);
                if (montysDoor == randomDoor || boxes[montysDoor] == "Cadillac") continue;
                knownGoat = montysDoor;
            }

            /* If we originally picked the Cadillac, then we won.
             * If we didn't, then as Monty has revealed the other goat the
             * other door MUST contain the Cadillac
             */
            return (boxes[randomDoor] == "Cadillac");
        }
    }
}

Run the code, and ten million iterations of the Monty Hall problem will be run in batches of one million.  Here’s some sample output

Iterations    Stick Wins Switch Wins
1000000           333788      666212
1000000           333780      666220
1000000           332643      667357
1000000           333629      666371
1000000           333797      666203
1000000           333219      666781
1000000           333363      666637
1000000           332921      667079
1000000           333100      666900
1000000           333255      666745

Pretty convincing.  After 10,000,000 iterations we’re seeing a very definite pattern that switching doubles your chances of finding the Cadillac.

Why?

The reason why became obvious once I had realised that this chunk of code was totally superfluous to whether I won or not.

            // Now Monty opens a door containing a goat
            int knownGoat = -1;
            while (knownGoat == -1)
            {
                /* Keep picking a door at random until Monty finds one that the contenstant didn't pick
                 * and contains a goat.  This isn't efficient, but it keeps the code very simple */
                int montysDoor = _randomGenerator.Next(3);
                if (montysDoor == randomDoor || boxes[montysDoor] == "Cadillac") continue;
                knownGoat = montysDoor;
            }

The line of code that determines whether you win by sticking or switching is totally independent:

return (boxes[randomDoor] == "Cadillac");

Once you understand that this single, simple line is responsible for the probability of winning or losing, I think it becomes really easy to understand.  I.e.

  1. There is a 1/3 chance the contestant picks the Cadillac.
  2. There is a 2/3 chance that the contestant hasn’t picked the Cadillac.
  3. Therefore by switching there is a 2/3 chance of finding the Cadillac because if one of the two unpicked doors contains the Cadillac you are guaranteed to find it.

One of the difficulties with understanding this problem is the scale of the numbers involved.  It becomes much easier with bigger numbers:

Imagining trying to win the UK National Lottery draw jackpot.  The jackpot is won by picking 6 distinct numbers between 1 and 49 and matching them all.  There is a 1/13,983,816 chance of matching all 6 numbers with a single ticket (see http://lottery.merseyworld.com/Info/Chances.html).

  1. Imagine that you have 13,983,816 lottery tickets, each with a different set of numbers on.  That means that the winning lottery ticket is in there… somewhere.
  2. You pick a ticket at random.  There’s a 1/13,983,816 chance of finding the winning ticket.
  3. A magician arrives, waves his magic wand and all of the remaining tickets disappear EXCEPT for one.  The magician promises that this remaining ticket is the winning ticket unless you’ve already picked it up (remember there’s a 1/13,983,816 chance of that, so pretty small).
  4. The magician then offers you a swap: do you want to keep your ticket (which has a 1/13,983,816 chance of being the winning ticket), or take his ticket?

Still reckon it’s a 50/50 chance of winning the National Lottery?  Me neither.

This entry was posted in C#. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.