Low Discrepancy Disk Sampling

Quasi-Monte Carlo integration has been used in graphics for many decades. To perform such integration we need to generate sample points across the domain of integration. It turns out that we can improve the efficiency of the integration process by choosing these sample points in clever ways. Often we want to generate sample points that have low discrepancy (few lumps and few gaps) without being too regular.

Probability theory provides us with methods to sample pseudo-randomly and uniformly from a given domain. However, the inherent randomness of such techniques often leads to undesirable "lumps" or "holes" in the set of sample points. We can use low discrepancy sequences (LDS) to solve this issue. But exactly how do we combine probability theory and low discrepancy sequences in practice? In this post I will explain how to devise an algorithm that generates low discrepancy irregular sample sets on the unit disk. The approach presented should generalize to more complex sets such as the unit sphere in 3D. The target audience is programmers with a basic understanding of probability theory and an interest in low discrepancy sampling.

This document is split into two parts. In section 1 I will use the inversion method to derive a formula for doing uniform random sampling on the unit disk. If you already know how to use this method, you may skip to section 2 where I show how to sample quasi-randomly with low discrepancy using the golden ratio.

1. Sampling Uniformly On The Unit Disk

Before we get to low discrepancy sequences, we need a way to sample positions uniformly at random on the unit disk. You may guess that we can do that by simply choosing the radius $r$ and angle $\theta$ uniformly, i.e. $$r(u_1) = u_1, \quad \theta(u_2) = 2\pi u_2 \tag{1}$$ where $u_1, u_2$ are independent realizations of a uniform random variable on $[0, 1]$. However, this turns out to be wrong as shown in figure 1.

Fig 1: Naive guess vs. proper uniform sampling.

The left side of figure 1 shows the result of sampling each polar component uniformly and independently. Compare this to the right side of figure 1, which shows proper uniform sampling of the disk. From the figure, it is clear that the naive guess generates points that tend to lump together near $(0, 0)$.
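
To make the naive approach concrete, here is a minimal numpy sketch of it (it mirrors the first half of the gen_naive function in Listing 1 at the end of this post):

import numpy as np

np.random.seed(0)
n = 400

# Naive guess (equation 1): sample each polar component uniformly.
r = np.random.uniform(0, 1, n)
theta = np.random.uniform(0, 2 * np.pi, n)

# Convert to Cartesian coordinates for plotting.
x = r * np.cos(theta)
y = r * np.sin(theta)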

To do it correctly we can use the inversion method. In the following I will use this method but I will not explain the general idea behind it. If you don't know this method, I recommend that you read about it in the PBRT book. The high-level idea is that we want to derive a pair of cumulative distribution functions (CDFs) and then use canonical uniform random variables with these CDFs to enable uniform sampling over the unit disk. This process involves several steps which I will now go through.

Since we want to sample uniformly with respect to area, we know that the PDF $p(x, y)$ must be constant, i.e. $p(x, y) = c$ for some $c \in \mathbb{R}$. We also know that the area of the unit disk $D$ is $\pi$ and therefore $$\iint\limits_{D} p(x, y) \, \dif{x} \dif{y} = c \iint\limits_{D} \dif{x} \dif{y} = c \pi.$$

Since $p(x, y)$ is a probability distribution it has to integrate to 1, and therefore $$1 = \iint\limits_{D} p(x, y) \, \dif{x} \dif{y} = c \pi \iff c = \frac{1}{\pi}.$$

Thus $p(x, y) = c = 1/\pi$.

The inversion method involves integration over our domain $D$. This is generally difficult to do in Cartesian coordinates, so we will convert $p(x, y)$ to polar coordinates. Converting polar coordinates to Cartesian coordinates is done via the mapping $$T\left(\begin{bmatrix} r \\ \theta \end{bmatrix}\right) = \begin{bmatrix} x(r, \theta) \\ y(r, \theta) \end{bmatrix} = \begin{bmatrix} r \cos \theta \\ r \sin \theta \end{bmatrix}. \tag{2}$$

Probability theory tells us that $$p(r, \theta) = |J_T(r, \theta)| \, p(x, y) \tag{3}$$ where $$|J_T(r, \theta)| = \left| \begin{bmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} \end{bmatrix} \right| \tag{4}$$ is the determinant of the Jacobian matrix of $T$. The determinant of the Jacobian encodes how much tiny areas in $(r, \theta)$ space are "stretched" compared to equivalent tiny areas in Cartesian $(x, y)$ space. To understand why determinants quantify stretching like this, I recommend this video by 3Blue1Brown. This stretching property is the reason we use $|J_T|$ in equation 3 as a "conversion factor" between the distributions $p(r, \theta)$ and $p(x, y)$.

Using equation 2 and equation 4 we can calculate that $$|J_T(r, \theta)| = \left| \begin{matrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{matrix} \right| = r \cos^2 \theta + r \sin^2 \theta = r (\cos^2 \theta + \sin^2 \theta) = r,$$ which we can plug into equation 3 to see that $$p(r, \theta) = r \, p(x, y) = \frac{r}{\pi}.$$
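
If you want to double-check this computation programmatically, here is a small sketch that derives the same determinant symbolically (this assumes you have sympy installed; it is not part of the plotting scripts below):

import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
T = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])  # the mapping from equation 2

# Jacobian of T with respect to (r, theta) and its determinant.
J = T.jacobian([r, theta])
print(sp.simplify(J.det()))  # prints: r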

The next step of the inversion method is to compute the marginal distribution $p_r(r)$ and the conditional distribution $p_\theta(\theta | r)$. If you are unfamiliar with these concepts, I recommend that you read up on them before continuing. Note that we could have chosen to go for $p_\theta(\theta)$ and $p_r(r | \theta)$ instead; the final result should be the same regardless. We can calculate $p_r(r)$ by integrating out $\theta$, i.e. $$p_r(r) = \int_0^{2\pi} p(r, \theta) \, \dif{\theta} = \frac{r}{\pi} \int_0^{2\pi} \dif{\theta} = 2r.$$

By definition of conditional probability we get that $$p_\theta(\theta | r) = \frac{p(r, \theta)}{p_r(r)} = \frac{1}{2\pi}.$$

Notice that neither distribution depends on $\theta$. This makes sense if you consider the symmetry of the disk: since we are working with a uniform distribution, a probability density should not depend on the angle $\theta$.

We are now ready to calculate the CDFs $P_r(r)$ and $P_\theta(\theta | r)$. By the definition of the CDF, we find these by integrating the PDFs, i.e. $$P_r(r) = \int_0^r p_r(r') \, \dif{r'} = \int_0^r 2r' \, \dif{r'} = \left. r'^2 \right\rvert_0^r = r^2, \tag{5}$$ $$P_\theta(\theta | r) = \int_0^\theta p_\theta(\theta' | r) \, \dif{\theta'} = \int_0^\theta \frac{1}{2\pi} \, \dif{\theta'} = \left. \frac{\theta'}{2\pi} \right\rvert_0^\theta = \frac{\theta}{2\pi}. \tag{6}$$
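
These integrals are simple enough that a symbolic check is easy too. A small sketch (again assuming sympy) that reproduces the marginal and both CDFs:

import sympy as sp

r, rp, theta, tp = sp.symbols('r rp theta tp', positive=True)

p_r = sp.integrate(r / sp.pi, (theta, 0, 2 * sp.pi))     # marginal: 2*r
P_r = sp.integrate(2 * rp, (rp, 0, r))                   # equation 5: r**2
P_theta = sp.integrate(1 / (2 * sp.pi), (tp, 0, theta))  # equation 6: theta/(2*pi)

print(p_r, P_r, P_theta)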

The final step of the inversion method is to find the inverses $P_r^{-1}$ and $P_\theta^{-1}$ of the CDFs. Inverting a function requires that the function is strictly monotonically increasing. This is the case for our CDFs since the corresponding PDFs are positive on the interior of the domain. We can find the inverse functions $P_r^{-1}$ and $P_\theta^{-1}$ by applying them to equation 5 and equation 6 respectively: $$P_r^{-1}(P_r(r)) = P_r^{-1}(r^2) \iff r = P_r^{-1}(r^2) \iff P_r^{-1}(r) = \sqrt{r},$$ $$P_\theta^{-1}(P_\theta(\theta)) = P_\theta^{-1}\left( \frac{\theta}{2\pi} \right) \iff \theta = P_\theta^{-1}\left( \frac{\theta}{2\pi} \right) \iff P_\theta^{-1}(\theta) = 2\pi\theta.$$

By the inversion method this means that we can now sample the unit disk via $$r(u_1) = P_r^{-1}(u_1) = \sqrt{u_1}, \quad \theta(u_2) = P_\theta^{-1}(u_2) = 2\pi u_2 \tag{7}$$ where $u_1, u_2$ are realizations of independent uniform random variables on $[0, 1]$. It is interesting to compare these new expressions with the naive guess from equation 1. The only difference is that the $r$ component is now $\sqrt{u_1}$ rather than just $u_1$. If you look at the plot of $\sqrt{x}$ on $[0, 1]$ you can see that the square root in some sense "pushes" its input values upwards toward 1, e.g. $\sqrt{0.5} \approx 0.71$. Figuratively this means that our sample points on the unit disk will be pushed away from the origin. This sounds reasonable if you consider how the sample points lumped together near the origin in figure 1.
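
In code, equation 7 is only a few lines. A sketch of what such a sampler could look like (the function name sample_disk is just my choice):

import numpy as np

def sample_disk(u1, u2):
  # Equation 7: map canonical uniform samples to points on the unit disk.
  r = np.sqrt(u1)
  theta = 2 * np.pi * u2
  return r * np.cos(theta), r * np.sin(theta)

# Example usage with 400 pseudo-random input pairs.
np.random.seed(0)
x, y = sample_disk(np.random.uniform(0, 1, 400), np.random.uniform(0, 1, 400))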

Let's test it out! The left plot of figure 2 shows some realizations of $(u_1, u_2)$. The right plot shows how these realizations map to the disk using the adjusted $r(u_1)$ and $\theta(u_2)$. The tendency to lump together near the origin from figure 1 is gone, as we hoped.

Fig 2: Uniform distribution (left) mapped to the disk (right).

We still have some lumping but at least the lumps themselves are now somewhat distributed over the domain. The lumping occurs only because there are lumps in the input, i.e. the random variable realizations shown on the left of figure 2. We can solve this problem by replacing uniform sampling with low discrepancy sequences.

2. Applying Low Discrepancy Sequences

In the previous section we used probability theory to derive a sampling scheme that is mathematically correct. In this section we will see that the usage of low discrepancy sequences can be a bit more of an art than a rigorous mathematical procedure (although mathematics does play an important role).

The inversion method yielded two functions $r(u_1), \theta(u_2)$ that take as input pairs of uniform random variables $(u_1, u_2)$ and return sample points that are uniformly distributed on the disk (see equation 7). But the inherent "lumping" of uniformly sampled random variables carried over to the points on the disk as shown in figure 2. To fix this, how about we just deterministically choose some input points $(x_i, y_i)$ that are spread out and have no lumps in them? An obvious input to try is a regular grid of points (since these are spread out and have no lumps). I show this experiment in figure 3. On the left you see a plot of a grid of $12 \times 12$ points $(x_i, y_i)$ and on the right we have the corresponding points $(r(x_i), \theta(y_i))$ mapped to the disk. By comparing this with figure 2, we see that the random lumping is gone and instead replaced with regular spokes. Note how each spoke corresponds to a row of points in the plot on the left.

Fig 3: A regular grid of points (left) mapped to the unit disk (right).

How can we avoid these spokes? The problem is of course that all $12 \cdot 12 = 144$ input points $(x_i, y_i)$ only have 12 distinct $y_i$ values and thus also 12 distinct $\theta(y_i)$ values. We can solve this with irrational numbers. In the regular grid in figure 3 I chose $y_i = (i / 12) \mod 1$. If we instead use $$y_i = (i \cdot k) \mod 1 \tag{8}$$ where $k$ is an irrational number, we get something like figure 4, where I used $k = \pi$. Since $\pi$ is irrational, all $y_i$ values will be distinct. In fact the sequence $y_i$ is an actual low discrepancy sequence if $k$ is chosen wisely. In the plot of $(x_i, y_i)$ on the left side of figure 4 we see that the sample points are now slightly skewed in the vertical direction. These slight differences in $y_i$ cause slight differences in $\theta(y_i)$. The result is that the spokes become "curled" as shown on the right of figure 4.

Fig 4: Inputs created via irrational numbers cause the spokes to "curl".
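
As a quick sanity check of the distinctness claim, we can count unique values for the grid choice versus equation 8 with $k = \pi$ (a throwaway sketch):

import numpy as np

i = np.arange(144)
y_grid = np.mod(i, 12) / 12            # the regular grid choice from figure 3
y_irrational = np.mod(i * np.pi, 1.0)  # equation 8 with k = pi

print(len(np.unique(y_grid)))        # 12 distinct values
print(len(np.unique(y_irrational)))  # 144 distinct values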

The next step is to choose a "better" value for $k$ so that we can get rid of the spokes. The following may sound weird if you hear it for the first time, but mathematicians have found that some irrational numbers are more irrational than others. Numbers that are only slightly irrational (like $\pi$) will yield only a minor curl (see figure 4). As the "degree of irrationality" of $k$ increases, the curling effect gets stronger until the point where the spokes mix into each other and thus seem to vanish (at least to the human eye). For a visualization of this mathematical phenomenon and its relation to "degrees of irrationality", I recommend this video by Numberphile.

The most irrational number is the golden ratio, i.e. $$\phi = \frac{1+\sqrt{5}}{2} = 1.618\ldots$$

In figure 5 you can see what happens if we use $k = \phi$ in equation 8. With $k = \phi$ the curling effect is so strong that the spokes have blended together and thereby disappeared (technically they are still there, we just don't perceive them).

Fig 5: Using the golden ratio to generate input points.
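
One way to get a feel for how much more evenly $\phi$ fills the unit interval than $\pi$ is to compare the largest and smallest gaps between neighboring sequence values. A throwaway sketch (the helper name gap_range is mine):

import numpy as np

def gap_range(k, n=144):
  # Sorted fractional parts of i*k, with the wraparound gap included.
  f = np.sort(np.mod(np.arange(1, n + 1) * k, 1.0))
  gaps = np.diff(np.concatenate([f, [f[0] + 1.0]]))
  return gaps.min(), gaps.max()

print(gap_range(np.pi))               # gap sizes vary wildly
print(gap_range((1 + 5**0.5) / 2.0))  # gap sizes are much more even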

The polar plot is starting to look pretty good now. The only remaining problem is that the input points $(x_i, y_i)$ still only have 12 distinct $x_i$ values. For example, this is the reason we see a circle of points near the origin on the right side of figure 5. The resolution is fairly simple: we simply set $x_i = i / N$ where $N$ is the number of sample points. This causes $x_i$, and therefore also $r(x_i) = \sqrt{x_i}$, to increase gradually across all points. I show the final result in figure 6. I hope you agree that this is what we want: the points on the disk are now evenly spread out, there are no gaps, and the pattern is still irregular.

Fig 6: The final version has no lumps or gaps, yet it is still irregular.
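
Putting everything together, the final sampler fits in a handful of lines. A sketch using the same half-sample offset as gen_final in Listing 1 (the function name golden_ratio_disk is mine):

import numpy as np

golden_ratio = (1 + 5**0.5) / 2.0

def golden_ratio_disk(n):
  # x_i = i/N (offset by 0.5 to "center" the samples), y_i = i*phi mod 1.
  i = np.arange(0, n, dtype=float)
  r = np.sqrt((i + 0.5) / n)
  theta = 2 * np.pi * np.mod(i * golden_ratio, 1.0)
  return r * np.cos(theta), r * np.sin(theta)

x, y = golden_ratio_disk(144)  # the points shown in figure 6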

3. Conclusion

We used the inversion method to devise a scheme to sample uniformly from the unit disk. We then modified this scheme by replacing the uniformly sampled input $(u_1, u_2)$ with explicit low discrepancy sequences $(x_i, y_i)$. The final sampling scheme can be formulated as $$r(x_i) = \sqrt{x_i} = \sqrt{i/N}, \quad \theta(y_i) = 2\pi y_i = 2\pi (i \phi \mod 1).$$

Consider that the output of the inversion method was a pair of expressions $r(u_1), \theta(u_2)$ that were fundamentally random, since they depended on realizations of random variables $u_1, u_2$. It is interesting to consider that we sacrificed this randomness when we replaced the input with explicit sequences $(x_i, y_i)$. In other words, we made the input deterministic and therefore the output became completely deterministic as well. Since the output looks random but really isn't, we call such sets "quasi-random".

Another thing to note is that the method explained here generalizes to other domains as well. For example, we could use the inversion method to derive a scheme to sample uniformly across the 3D sphere and then feed the resulting functions $P_\theta^{-1}$ and $P_\phi^{-1}$ with low discrepancy inputs based on the golden ratio. I show the result of doing this in figure 7.

Fig 7: Low discrepancy sampled 3D sphere.
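
For the sphere, the inversion method yields the inverse CDFs $\phi(u_1) = \arccos(1 - 2u_1)$ and $\theta(u_2) = 2\pi u_2$ (this is what Listing 2 below implements). Feeding them the same golden ratio inputs could look like this (the function name golden_ratio_sphere is mine):

import numpy as np

golden_ratio = (1 + 5**0.5) / 2.0

def golden_ratio_sphere(n):
  # Same low discrepancy inputs as the disk, mapped through the
  # sphere's inverse CDFs (see Listing 2).
  i = np.arange(0, n, dtype=float)
  phi = np.arccos(1 - 2 * (i + 0.5) / n)             # polar angle
  theta = 2 * np.pi * np.mod(i * golden_ratio, 1.0)  # azimuth
  return np.sin(phi) * np.cos(theta), np.sin(phi) * np.sin(theta), np.cos(phi)

x, y, z = golden_ratio_sphere(800)  # the points shown in figure 7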

4. Code

Below I include the Python scripts used for the generation of the plots in this document. I used Python 3.7.0 and matplotlib 3.0.2.

import numpy as np
from numpy import pi, cos, sin, sqrt, arange
import matplotlib.pyplot as plt

# For determinism.
plt.rcParams['svg.hashsalt'] = '42'  # svg.hashsalt expects a string
np.random.seed(0)

dot_size = 20
N_sqrt = 12
N = 12*12
golden_ratio = (1 + 5**0.5) / 2.0

# Convert polar coordinates to Cartesian coordinates (equation 2).
def xs(rs, thetas):
  return rs*cos(thetas)

def ys(rs, thetas):
  return rs*sin(thetas)

# Scatter-plot one panel; optionally overlay the unit circle and origin.
def draw_subplot(idx, xs, ys, ylim, xlim, viz):
  plt.subplot(1, 2, idx, aspect = 'equal')
  plt.ylim(ylim)
  plt.xlim(xlim)
  plt.scatter(xs, ys, s = dot_size)

  if viz:
    theta = np.linspace(-np.pi, np.pi, 100)
    plt.plot(np.sin(theta), np.cos(theta), color = 'red')
    plt.scatter([0], [0], color = 'red', s = dot_size)

# Create two plots side by side.
def gen2(
  xs1, ys1, circle_viz1, ylim1, xlim1,
  xs2, ys2, circle_viz2, ylim2, xlim2,
  filename
):
  print("Creating " + filename)
  draw_subplot(1, xs1, ys1, ylim1, xlim1, circle_viz1)
  draw_subplot(2, xs2, ys2, ylim2, xlim2, circle_viz2)
  plt.tight_layout()
  plt.savefig(filename, bbox_inches = 'tight')
  plt.clf()

# Map the inputs us to the disk via equation 7 and plot them
# side by side with the raw inputs.
def gen2_cart_polar(us, filename):
  rs = np.sqrt(us[0])
  thetas = 2 * pi * us[1]
  gen2(
    us[0], us[1], False, (0, 1), (0, 1),
    xs(rs, thetas), ys(rs, thetas), True, (-1, 1), (-1, 1),
    filename
  )

# Figure 1: naive uniform (r, theta) vs. the corrected sqrt(r) version.
def gen_naive():
  n = 400
  rs_uniform = np.random.uniform(0, 1, n)
  thetas_uniform = np.random.uniform(-np.pi, np.pi, n)
  rs_sqrooted = np.sqrt(rs_uniform)
  gen2(
    xs(rs_uniform, thetas_uniform), ys(rs_uniform, thetas_uniform), True, (-1, 1), (-1, 1),
    xs(rs_sqrooted, thetas_uniform), ys(rs_sqrooted, thetas_uniform), True, (-1, 1), (-1, 1),
    'naive.svg'
  )

# Figure 2: pseudo-random uniform inputs mapped to the disk.
def gen_uniform():
  # Two rows of N canonical uniform samples: us[0] and us[1].
  us = np.random.uniform(0, 1, (2, N))
  gen2_cart_polar(us, 'uniform.svg')

# Figure 3: a regular 12x12 grid of cell-centered inputs.
def gen_regular():
  indices = arange(0, N, dtype=float)
  us = [
    np.floor(indices / N_sqrt) / N_sqrt + 0.5/N_sqrt,
    np.mod(indices, N_sqrt) / N_sqrt + 0.5/N_sqrt
  ]
  gen2_cart_polar(us, 'regular.svg')

# Figure 4: y_i from equation 8 with k = pi.
def gen_irrational1():
  indices = arange(0, N, dtype=float)
  us = [
    np.floor(indices / N_sqrt) / N_sqrt + 0.5/N_sqrt,
    np.mod(indices * np.pi, 1.0)
  ]
  gen2_cart_polar(us, 'irrational1.svg')

# Figure 5: y_i from equation 8 with k = the golden ratio.
def gen_golden_ratio1():
  indices = arange(0, N, dtype=float)
  us = [
    np.floor(indices / N_sqrt) / N_sqrt + 0.5/N_sqrt,
    np.mod(indices * golden_ratio, 1.0)
  ]
  gen2_cart_polar(us, 'golden_ratio1.svg')

# Figure 6: the final scheme with gradually increasing x_i.
def gen_final():
  indices = arange(0, N, dtype=float)
  us = [
    (indices + 0.5) / N, # push by 0.5 to "center" sample point
    np.mod(indices * golden_ratio, 1.0)
  ]
  gen2_cart_polar(us, 'final.svg')

gen_naive()
gen_uniform()
gen_regular()
gen_irrational1()
gen_golden_ratio1()
gen_final()
Listing 1: Generation of 2D disk plots.

import numpy as np
from numpy import pi, cos, sin, sqrt, arange
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def gen_spherical():
  # Low discrepancy sample points on the unit sphere (figure 7).
  r = 1
  N = 800
  golden_ratio = (1 + 5**0.5) / 2.0
  indices = arange(0, N, dtype=float)
  us = [
    (indices + 0.5) / N, # push by 0.5 to "center" sample point
    np.mod(indices * golden_ratio, 1.0)
  ]

  # Inverse CDFs for the sphere: polar angle phi and azimuth theta.
  phi = np.arccos(1 - 2 * us[0])
  theta = 2 * np.pi * us[1]
  x = r*sin(phi)*cos(theta)
  y = r*sin(phi)*sin(theta)
  z = r*cos(phi)

  fig = plt.figure()
  ax = fig.add_subplot(111, projection='3d')
  ax.scatter(x, y, z, s=40)

  ax.set_xlim([-1,1])
  ax.set_ylim([-1,1])
  ax.set_zlim([-1,1])
  ax.set_aspect("equal")
  plt.tight_layout()
  plt.savefig("spherical.svg", bbox_inches = 'tight')

gen_spherical()
Listing 2: Generation of 3D spherical plot.