Introduction
There are many ways to compute $\pi$, and people have been finding even more for thousands of years. There are also many ways to do coding interviews, but there hasn't been enough emphasis on improving them, especially when hiring for roles outside of software engineering. So you can imagine my surprise a few years ago when I had a pleasant experience in a coding interview where I was asked to compute $\pi$ using only coin flips.
I saved my code at the end of the interview and I'll go through it below. It is based on a nice geometric intuition and provides a nice introduction to the idea of using simulations, even though, at its core, it is pretty simple.
High-level Explanation
Being asked to use coin flips to compute (or, more accurately, approximate) $\pi$ was a bit surprising, but it was a fun challenge. There are two obvious issues that need to be solved, but if you stare at them for 10 seconds, the solution is easy to see. First, coin flips generate discrete numbers. Second, they generate one-dimensional numbers. However, most of the intuitive ideas about computing $\pi$ (at least for me) involve some circle that lives in a two-dimensional "continuous" (uncountable) space. I'll first describe how I would solve this problem if I could generate two-dimensional real vectors instead of outcomes of coin flips. Then, I will go back and generate those vectors from the "coins".
From 2-D Random Vectors to $\pi$
Imagine drawing a square with sides of length 1 (in whatever unit you prefer) on the ground. Within the square, we also draw a circle with a diameter of 1.
The area of the square is 1 since each side is of length 1: $1 \times 1 = 1$.
The area of the circle is $\frac{\pi}{4}$ since the radius is $\frac{1}{2}$: $\pi \times \left(\frac{1}{2}\right)^2 = \frac{\pi}{4}$.
So, the ratio of the areas of the circle and the square is $\frac{\pi}{4}$.

The largest circle in a square
Suppose we drop a pebble on the square uniformly at random; it can stop at any point in the square with equal probability. If we drop enough pebbles on the square (and if we can drop them randomly enough), we can expect the ratio of the number of pebbles in the circle to the number in the square to match the ratio of their areas. To see this, ignore the circle for a second and just imagine comparing the number of pebbles in the two halves of the square. There is no reason for one half to have more or fewer pebbles than the other. The same is true for any two subsets of the square with the same surface area. So if we imagine subsets small enough that gazillions of them fit in the square, the ratio of the number of subsets in the circle to the total number of subsets should be the same as the ratio of the areas of the circle and the square, which we know to be $\pi/4$. So, we could multiply the computed ratio by $4$ to find an approximation of $\pi$.

$n = 16$:
$\hat{\pi} = \frac{4 \cdot 4}{16} = 1$

$n = 36$:
$\hat{\pi} = \frac{4 \cdot 32}{36} = 3.56$

$n = 64$:
$\hat{\pi} = \frac{4 \cdot 52}{64} = 3.25$
As we can see from the pictures and calculations above, even by dividing our box into just 64 little boxes, we get surprisingly close to $\pi$. In the actual simulation, we will use millions of them to improve the precision.
Code for generating these images (Click to Expand)
Here’s the code for generating these graphs. I totally asked GPT to write this part.
```python
import numpy as np
import matplotlib.pyplot as plt

# Solarized palette entries used below (hex values)
solarized_colors = {"base03": "#002b36", "red": "#dc322f"}

def create_numbered_subsquares_visualization(n=3):
    # Calculate the width of each subsquare
    square_width = 1 / n

    inside_color = solarized_colors["base03"]
    outside_color = solarized_colors["red"]

    # Create a grid of center points for each subsquare
    center_x = -0.5 + (np.arange(n) + 0.5) * square_width
    center_y = -0.5 + (np.arange(n) + 0.5) * square_width
    centers_xx, centers_yy = np.meshgrid(center_x, center_y)

    # Calculate the distance from the origin for each center
    centers_dist = np.sqrt(centers_xx ** 2 + centers_yy ** 2)
    inside_circle = centers_dist <= 0.5

    # Create the plot
    fig, ax = plt.subplots(figsize=(10, 10))

    # Color each subsquare by whether its center is inside the circle
    for i in range(n):
        for j in range(n):
            # Bottom-left corner of the subsquare
            x0 = -0.5 + j * square_width
            y0 = -0.5 + i * square_width
            color = inside_color if inside_circle[i, j] else outside_color
            rect = plt.Rectangle((x0, y0), square_width, square_width, color=color)
            ax.add_patch(rect)

    # Number each subsquare, counting inside and outside separately
    inside_counter = 1
    outside_counter = 1
    inside_max_coords = None
    outside_max_coords = None
    for i in range(n - 1, -1, -1):  # reverse the iteration to count downwards
        for j in range(n):
            cx = centers_xx[i, j]
            cy = centers_yy[i, j]
            if inside_circle[i, j]:
                plt.text(cx, cy, str(inside_counter),
                         ha='center', va='center', color='white', fontsize=12)
                inside_max_coords = (cx, cy)
                inside_counter += 1
            else:
                plt.text(cx, cy, str(outside_counter),
                         ha='center', va='center', color='white', fontsize=12)
                outside_max_coords = (cx, cy)
                outside_counter += 1

    # Box the highest number for inside and outside
    box = dict(boxstyle='round,pad=0.3', edgecolor='white', facecolor='none')
    if inside_max_coords:
        plt.text(inside_max_coords[0], inside_max_coords[1], str(inside_counter - 1),
                 ha='center', va='center', color='white', fontsize=12,
                 fontweight='bold', bbox=box)
    if outside_max_coords:
        plt.text(outside_max_coords[0], outside_max_coords[1], str(outside_counter - 1),
                 ha='center', va='center', color='white', fontsize=12,
                 fontweight='bold', bbox=box)

    # Add the circle outline for reference
    theta = np.linspace(0, 2 * np.pi, 100)
    plt.plot(0.5 * np.cos(theta), 0.5 * np.sin(theta), 'w-', linewidth=1, alpha=0.5)

    # Adjust plot properties
    ax.set_aspect('equal')
    ax.set_xlim(-0.5, 0.5)
    ax.set_ylim(-0.5, 0.5)
    ax.set_xticks([])
    ax.set_yticks([])
    plt.show()

create_numbered_subsquares_visualization(n=24)
```
Now, since there are many such subsets, let's ignore the possibility of two pebbles falling into the same subset, assuming each subset is large enough to hold one pebble. Then, the ratio of randomly thrown pebbles in the circle to the total number of pebbles should also be $\pi/4$, right? Here is a visualization of these pebbles, colored based on whether they are in the circle or not.

Digital pebbles colored to highlight the circle
Of course, the coordinates of the pebbles are our 2-D random vectors. Specifically, we generated them so that they are equally likely to fall anywhere on the square. Hence, we draw random vectors from the uniform distribution over the unit square.
This method is a baby Monte Carlo simulation, a computational technique that utilizes randomness to solve problems that may even be deterministic in reality. Of course, this is just a very simple demonstration of a powerful method.1
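Assuming for the moment that we can draw uniform 2-D vectors directly (the coin-flip construction comes next), a minimal sketch of this baby Monte Carlo estimate could look like the following; the function name is just illustrative:

```python
import numpy as np

def approx_pi_uniform(n, seed=None):
    """Approximate pi from n uniform points in the unit square
    centered at the origin. Illustrative sketch only; not the
    coin-flip version developed in the post."""
    rng = np.random.default_rng(seed)
    # draw n points uniformly in [-0.5, 0.5]^2
    pts = rng.uniform(-0.5, 0.5, size=(n, 2))
    # a point is in the circle if its squared distance to the origin is <= 0.25
    in_circle = np.sum(pts ** 2, axis=1) <= 0.25
    # the area ratio is pi/4, so multiply the hit ratio by 4
    return 4 * np.mean(in_circle)

print(approx_pi_uniform(1_000_000))  # typically close to 3.14
```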
From Coin Flips to 2-D Vectors
The geometric explanation above lives in two dimensions, but we need to start smaller. Let's first try to get a random number between 0 and 1 using coin flips.
Each flip can be thought of as generating one bit of a binary number. By flipping a coin multiple times, we can simulate the process of generating random numbers. These numbers, in turn, can be used to determine the coordinates of points within our square. The more coin flips we perform (or the more bits we generate), the more accurate our approximation of $\pi$ will become.
For instance, if we flip a coin only once, we either get 0 or 1, which wouldn't be granular enough. With 10 flips of the coin, there are $2^{10}$ possible outcomes. By assigning some meaning to each coin flip, we can slice the interval $[0, 1]$ into more than a thousand (exactly $2^{10} - 1$) tiny intervals.
Here is what I mean by assigning some meaning to coin flips. Let's say "heads" means "above" and "tails" means "below". So, the first flip can determine whether our number is above or below $\frac{1}{2}$. If the first flip was heads, we know that we are generating a number in the interval $[\frac{1}{2}, 1]$. The second flip would then determine whether our number is above or below the midpoint of that interval, $\frac{3}{4}$. We can keep doing this to get shorter and shorter intervals. After the last flip, we can simply choose the midpoint of the final interval, or follow some other convention (like taking its lower or upper bound). We just obtained something that looks far less discrete than coin flips, using nothing but coin flips!
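That interval-halving scheme can be sketched directly; this is a hypothetical helper (not the implementation used later in the post), using the midpoint-of-the-final-interval convention:

```python
import numpy as np

def rand_num_bisection(k, seed=None):
    """Generate a number in [0, 1] from k coin flips by repeated
    interval halving: heads (1) picks the upper half, tails (0)
    the lower half; return the midpoint of the final interval."""
    rng = np.random.default_rng(seed)
    lo, hi = 0.0, 1.0
    for flip in rng.integers(2, size=k):
        mid = (lo + hi) / 2
        if flip:       # heads: the number is above the midpoint
            lo = mid
        else:          # tails: the number is below the midpoint
            hi = mid
    return (lo + hi) / 2
```

After $k$ flips the interval has width $2^{-k}$, so the midpoint is within $2^{-(k+1)}$ of any number the interval could represent.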
If we imagine repeating this process twice, once for the vertical and once for the horizontal dimension of the square, we would be dividing the square into more than a million subsets!2 Now we are ready to see how we can accomplish this in Python.
Implementation in Python
The implementation involves generating random pebbles within the square and determining whether they fall within the circle. We'll use Python's `numpy` library for efficient numerical computations and `matplotlib` for visualization. Before we go any further, I want to emphasize that this is in no way an optimized solution, but I think there is some value in seeing a less polished version first. I have a faster version at the end if you want to actually run it yourself.
Here’s a step-by-step breakdown:
Generating a Random Number with Coin Flips
```python
import numpy as np

# generating a random number between 0 and 1 using coin tosses
def rand_num(k):
    coin_flips = np.random.randint(2, size=k)
    num = coin_flips.dot(2 ** np.arange(k)[::-1]) / (2 ** k - 1)
    return num

# generating a random sequence of n numbers between 0 and 1
def rand_seq(n, k):
    seq = [rand_num(k) for i in range(n)]
    return seq

# generating a random sequence of n points in the unit square
def points(n, k):
    x = rand_seq(n, k)
    y = rand_seq(n, k)
    points = np.column_stack((x, y))
    center = np.array([0.5, 0.5])
    points = points - center
    return points
```
`rand_num(k)` generates a random number between 0 and 1 using $k$ coin flips.3

`rand_seq(n, k)` uses `rand_num(k)` to generate a sequence of $n$ random numbers between 0 and 1, using $k$ coin flips each. This function will be crucial in determining the coordinates of our random points within the square.

`points(n, k)` uses `rand_seq(n, k)` twice (once for each dimension) to generate $n$ points in the unit square and then recenters them so that the center is at the origin. Now we have the functions to generate our pebbles, and all we need to do is count them!
Approximating $\pi$ Using “Points”
```python
# estimating pi from n points in the unit square
def pi_from_coin_flips(n, k):
    pts = points(n, k)
    dists = np.sqrt(np.sum(pts ** 2, axis=1))
    in_circle = np.sum(dists <= 0.5)
    ratio = in_circle / n
    approx_pi = 4 * ratio
    return approx_pi

# running the experiment m times and averaging the results
def average_pi(n, k, m):
    return np.mean([pi_from_coin_flips(n, k) for i in range(m)])
```
Here, `pi_from_coin_flips(n, k)` calls `points(n, k)` to create $n$ points as before. It then computes the distance from each point to the origin and counts the points that fall in the circle. Since we know the total number of points, we can easily compute the ratio of points in the circle and, from there, get an approximation of $\pi$.

Finally, `average_pi(n, k, m)` calls `pi_from_coin_flips(n, k)` multiple times to get multiple approximations of $\pi$, then averages them for an even more accurate estimate.
Results
To get some results with different parameters, I ran `average_pi` with the values below. Note that I used relatively small numbers each time.
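The exact driver script is not shown here; a self-contained sketch of what such a parameter sweep could look like is below (the helper functions are compact re-statements of the ones defined above, and the sweep values are kept small for illustration):

```python
import numpy as np

# compact re-statements of the functions defined above,
# so this snippet runs on its own
def rand_seq(n, k):
    flips = np.random.randint(2, size=(n, k))
    return flips @ (2 ** np.arange(k)[::-1]) / (2 ** k - 1)

def pi_from_coin_flips(n, k):
    pts = np.column_stack((rand_seq(n, k), rand_seq(n, k))) - 0.5
    return 4 * np.mean(np.sum(pts ** 2, axis=1) <= 0.25)

def average_pi(n, k, m):
    return np.mean([pi_from_coin_flips(n, k) for _ in range(m)])

# sweep a few (points, runs) pairs with k = 10 flips per coordinate
for n in (10, 100):
    for m in (10, 100):
        print(f"n={n:>4}  m={m:>4}  pi ~ {average_pi(n, 10, m):.2f}")
```

Since no seed is fixed, each run of this sweep produces slightly different numbers than the table below.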
Here are the approximations I got, rounded to 2 digits.4 As you can see, even with reasonably small numbers, we get surprisingly accurate approximations. I wouldn't trust these numbers to plan a trip to Erid or Mars, or even… the grocery store, but it is still cool to see how easy and fast it is to do something like this yourself on an unremarkable computer these days.
| Points | Runs | Approximation |
|---|---|---|
| 10 | 10 | 3.36 |
| 10 | 100 | 3.22 |
| 10 | 1000 | 3.11 |
| 10 | 10000 | 3.13 |
| 100 | 10 | 3.16 |
| 100 | 100 | 3.14 |
| 100 | 1000 | 3.13 |
| 100 | 10000 | 3.13 |
| 1000 | 10 | 3.11 |
| 1000 | 100 | 3.13 |
| 1000 | 1000 | 3.14 |
| 1000 | 10000 | 3.14 |
| 10000 | 10 | 3.13 |
| 10000 | 100 | 3.14 |
| 10000 | 1000 | 3.14 |
| 10000 | 10000 | 3.14 |
Improving the Speed
If you are thinking there has to be a faster way to generate these numbers, you wouldn't be alone. The following function could replace the `rand_seq` function above. And, needless to say, even this is far from optimal. However, in my casual testing, it was an order of magnitude faster than the previous version.5
```python
# generating a random sequence of n numbers between 0 and 1
def faster_rand_seq(n, k):
    # get random 0s and 1s of shape (n, k)
    coin_flips = np.random.randint(2, size=(n, k))
    # bit shifting for speed
    powers = 1 << np.arange(k - 1, -1, -1)
    # efficient product and normalization
    nums_norm = coin_flips @ powers / ((1 << k) - 1)
    return nums_norm
```
In the collapsed section below, I have another function that does the entire thing both a bit faster and in fewer lines, but it is less readable so I won't go into its details.
Faster code in fewer lines (Click to Expand)
```python
import numpy as np

# Combined function for estimating pi
def faster_pi(n, k, m):
    # Generate random 0s and 1s of shape (2, m, n, k) at once; a bit memory intensive
    coin_flips = np.random.randint(2, size=(2, m, n, k), dtype=np.uint8)
    powers = 1 << np.arange(k - 1, -1, -1, dtype=np.uint64)
    coords = np.dot(coin_flips, powers) / ((1 << k) - 1)
    points = coords.transpose(1, 2, 0) - 0.5
    dists_squared = np.sum(points ** 2, axis=-1)
    in_circle = np.count_nonzero(dists_squared <= 0.25, axis=-1)
    ratios = in_circle / n
    approx_pis = 4 * ratios
    return np.mean(approx_pis)
```
Conclusion
Approximating $\pi$ using coin flips is a fascinating example of how probability and computational techniques can provide insights into a problem that is fundamentally geometric. This approach not only offers a unique perspective on calculating $\pi$ but also demonstrates the power and versatility of Monte Carlo simulations in solving complex problems. More importantly though, it is fun.
One note about random seeds. Outside of simulations, I almost always set the seed to 42 because everyone knows that it is the best random seed and also the answer to the most important questions in life and the universe. However, in this script, if I used a fixed seed, I would keep pulling the same numbers over and over again, so it is actually a good thing that I didn't fix the seed. A way to deal with this while still providing reproducible code could be to first get $(2 \cdot n \cdot k \cdot m)$ random bits at once from a fixed seed and then divide them into appropriate lengths to generate each number, but as you can see, I did not do that.
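For what it's worth, a sketch of that reproducible variant could look like this (an illustrative function, not code from the original interview): draw all $2 \cdot n \cdot k \cdot m$ bits from one seeded generator up front, then reshape and proceed as in `faster_pi`.

```python
import numpy as np

def reproducible_pi(n, k, m, seed=42):
    """Illustrative reproducible variant: draw all 2*n*k*m bits at
    once from a fixed seed, reshape into (2, m, n, k), and estimate
    pi the same way as before."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(2, size=2 * n * k * m, dtype=np.uint8)
    coin_flips = bits.reshape(2, m, n, k)
    powers = 1 << np.arange(k - 1, -1, -1, dtype=np.uint64)
    coords = coin_flips @ powers / ((1 << k) - 1)
    points = coords.transpose(1, 2, 0) - 0.5
    dists_squared = np.sum(points ** 2, axis=-1)
    in_circle = np.count_nonzero(dists_squared <= 0.25, axis=-1)
    return np.mean(4 * in_circle / n)
```

With a fixed seed, every run of `reproducible_pi(n, k, m)` returns exactly the same estimate, while still using fresh bits for every single coordinate within a run.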
-
One of the earliest applications of this method was the development of nuclear weapons. Turing’s Cathedral by George Dyson is a great book on the history of computation and computers. It has several chapters on how the work on both conventional and nuclear bombs contributed to the development of computers, and vice versa. ↩︎
-
In fact, $(2^{10}-1)^2 = 1046529$ such subsets. ↩︎
-
In the interview, the second line actually looked like `num = int("".join(str(flip) for flip in coin_flips), 2)/(2**k - 1)`. You might be wondering why I converted integers to a string and then back to a number. My only response would be that I had about 15 minutes to solve this problem in the interview. Yes, I did not get the job. ↩︎
-
Note that if you run the same code, you will get slightly different numbers. This is because I didn't use a particular random seed, and numpy pulls a fresh seed from your computer every time it generates random numbers in this case. See the last paragraph of the conclusion for more commentary on this. ↩︎
-
In fact, it was about 200 times faster, which shows how efficient some numpy operations are. ↩︎