Wolff sampling and critical slowdown

One of the main reasons we study the Ising model is the existence of a critical point in the phase diagram where spontaneous magnetization occurs. The critical temperature for this phase transition is given by \[k T_c = \frac{2 \lvert J\rvert}{\ln(1+\sqrt{2})} \approx 2.27 \lvert J\rvert\,.\]

Near the critical point, algorithms like Metropolis that flip only single spins at a time begin to suffer from a phenomenon called critical slowing down: the time needed to generate statistically independent samples by repeating the update rule grows much faster than the size of the lattice.

Recall that the average correlation function is defined as \[C(x,y) = \frac{1}{N^2}\sum_{i,j}s_{i,j} s_{i+x,j+y} - \avg{M}^2\,,\] where \(\avg{M}\) is the average magnetization density. If we change variables to radial coordinates and average over the angular variable, we can express this in terms of a single distance \(r\). This radial correlation function \(C(r)\) can be parameterized in the following manner in the limit of a large lattice (and \(r \gg 1\)): \[C(r) \sim \frac{\e^{-r/\xi}}{r^{d-2+\eta}}\,.\] Here \(\xi\) is called the correlation length, \(d\) is the dimension of the model (2 for the 2D Ising model), and \(\eta\) is a critical exponent (\(\eta = 1/4\) for the 2D Ising model).
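As a concrete illustration, here is one way these averages might be computed from a single \(N \times N\) configuration of \(\pm 1\) spins with periodic boundaries. This is a minimal NumPy sketch; the FFT trick and the function names are illustrative choices, not code from ising.zip:

```python
import numpy as np

def correlation(spins):
    """C(x, y) for one configuration: the site-averaged product
    s_{i,j} s_{i+x,j+y} minus the squared magnetization density."""
    N = spins.shape[0]
    m = spins.mean()
    f = np.fft.fft2(spins)
    # FFT autocorrelation computes sum_{i,j} s_{i,j} s_{i+x,j+y}
    # for all lags (x, y) at once, with periodic wraparound
    corr = np.real(np.fft.ifft2(f * np.conj(f))) / N**2
    return corr - m**2

def radial_average(corr):
    """Bin C(x, y) by integer distance r (periodic) to get C(r)."""
    N = corr.shape[0]
    d = np.minimum(np.arange(N), N - np.arange(N))  # periodic 1D distances
    r = np.rint(np.hypot(d[:, None], d[None, :])).astype(int)
    counts = np.bincount(r.ravel())
    return np.bincount(r.ravel(), weights=corr.ravel()) / counts
```

In practice one would also average \(C(x,y)\) over many sampled configurations before taking the radial average.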

Near the critical point, the correlation length diverges, and the exponential decay of correlations becomes merely an algebraic, long-range decay. A specific prediction from renormalization group theory is that \[\xi \sim \frac{1}{\lvert T-T_c \rvert^\nu}\,,\] where \(\nu = 1\) for the 2D Ising model. This divergence and the long-range correlations that result make it difficult to simulate physics near the critical point.
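Given the radially averaged \(C(r)\) from the sketch above, one could estimate \(\xi\) by fitting the parameterized form with \(d = 2\) and \(\eta = 1/4\). A hedged sketch follows; the fit window and the initial guesses are illustrative, and in practice both strongly affect the result:

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_xi(r, C):
    """Fit C(r) ~ a * exp(-r / xi) / r**(1/4) over a window of r >> 1."""
    model = lambda r, a, xi: a * np.exp(-r / xi) / r**0.25
    (a, xi), _ = curve_fit(model, r, C, p0=(1.0, 5.0))
    return xi
```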

The integrated autocorrelation time \(\tau\) is a measure of this difficulty. Suppose we have an order parameter, i.e. a quantity like the average magnetization density in the 2D Ising model, which quantifies the phase transition. Over the course of a sufficiently long simulation, this quantity relaxes to its equilibrium value, and the characteristic time scale for successive measurements to decorrelate can be expressed as \[ \tau = \sum_{t=0}^\infty\frac{\bigl(\avg{M(t)M(0)} - \avg{M(0)}^2\bigr) }{\bigl(\avg{M(0)^2} - \avg{M(0)}^2\bigr)}\,.\] Near the critical point, and for an infinite system, this quantity too diverges: \[ \tau \sim \xi^z \sim \frac{1}{\lvert T-T_c \rvert^{z\nu}}\,. \] The quantity \(z\) is called the dynamical critical exponent associated to the observable \(M\), and \(z \approx 2\) for the magnetization under local dynamics in the 2D Ising model.
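As a sketch of how \(\tau\) might be estimated from a recorded time series of magnetization measurements: the infinite sum must be truncated in practice, since the estimator is pure noise at large lags, and the cutoff choice below is illustrative:

```python
import numpy as np

def integrated_autocorrelation_time(M, cutoff=None):
    """Estimate tau = sum_t rho(t), where rho(t) is the normalized
    autocovariance of the time series M(0), M(1), ..."""
    M = np.asarray(M, dtype=float)
    dM = M - M.mean()
    var = np.mean(dM**2)
    n = len(dM)
    cutoff = cutoff if cutoff is not None else n // 10
    rho = [np.mean(dM[: n - t] * dM[t:]) / var for t in range(cutoff)]
    return float(np.sum(rho))  # the t = 0 term contributes rho(0) = 1
```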

Now consider a simulation of the Ising model on a finite \(N \times N\) grid. The correlation length cannot be greater than \(N\) for this system. This means that the characteristic timescale for equilibration is about \[\tau \sim N^{z} \approx N^2\,.\]

For an algorithm like Metropolis sampling that only flips single spins at a time, this is a disaster. Because Metropolis is a local algorithm, each sweep of \(N^2\) single-spin updates can only move the spin configuration by about a constant number of lattice sites. But because of the long-range correlations, changes must propagate over about \(N\) sites before we reach a new, uncorrelated configuration. This propagation is roughly a random walk, so it takes about \(N^2\) sweeps to cover that distance. Thus, sampling independent spin configurations takes time roughly \(N^4\) for a local algorithm like Metropolis.
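For contrast with the cluster moves below, here is a minimal sketch of one Metropolis sweep, assuming \(J = 1\), \(\beta = J/kT\), and an \(N \times N\) array of \(\pm 1\) spins with periodic boundaries. Note that each move touches a single site, which is exactly the locality at issue:

```python
import numpy as np

def metropolis_sweep(spins, beta, rng):
    """N^2 single-spin-flip attempts; each changes the configuration
    at only one lattice site."""
    N = spins.shape[0]
    for _ in range(N * N):
        i, j = rng.integers(N, size=2)
        # Energy cost of flipping spin (i, j), periodic boundaries
        nn = (spins[(i + 1) % N, j] + spins[(i - 1) % N, j]
              + spins[i, (j + 1) % N] + spins[i, (j - 1) % N])
        dE = 2 * spins[i, j] * nn
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] = -spins[i, j]
```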

Cluster flip algorithms

An obvious idea is to try to flip many spins at once. However, it isn't clear that this can be done while still satisfying the key requirements of a successful Monte Carlo algorithm: ergodicity, the Markov property, and detailed balance. It turns out that such a method does exist, and it is due to Wolff, who built on earlier ideas of Swendsen and Wang.

The Wolff cluster flip algorithm is the following:

  1. Choose a seed spin at random and remember its direction.
  2. For each neighboring spin, if it is parallel to the seed spin, add it to the cluster with probability \(p\).
  3. For each newly added spin, repeat step 2 recursively until the cluster stops growing.
  4. Flip all the spins in the cluster.
  5. Repeat.

For the very special value of \(p = 1-\e^{-2J/kT}\), the Wolff algorithm satisfies the detailed balance condition with Boltzmann weights. At the level of single clusters, it is Markovian. And any spin can flip, so it is ergodic. Therefore, it has the Boltzmann probability distribution as its unique stationary distribution.
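Here is a minimal sketch of one Wolff step under the same assumptions as before (\(J = 1\), \(\beta = J/kT\), \(\pm 1\) spins with periodic boundaries); growing the cluster with an explicit stack is one standard way to realize the recursion in step 3:

```python
import numpy as np

def wolff_step(spins, beta, rng):
    """Grow one cluster from a random seed and flip it."""
    N = spins.shape[0]
    p = 1.0 - np.exp(-2.0 * beta)         # the special value of p
    i0, j0 = int(rng.integers(N)), int(rng.integers(N))
    sign = spins[i0, j0]                  # step 1: remember the direction
    cluster = {(i0, j0)}
    frontier = [(i0, j0)]
    while frontier:                       # steps 2-3: grow until it stops
        i, j = frontier.pop()
        for ni, nj in (((i + 1) % N, j), ((i - 1) % N, j),
                       (i, (j + 1) % N), (i, (j - 1) % N)):
            if (spins[ni, nj] == sign and (ni, nj) not in cluster
                    and rng.random() < p):
                cluster.add((ni, nj))
                frontier.append((ni, nj))
    for i, j in cluster:                  # step 4: flip the whole cluster
        spins[i, j] = -sign
    return len(cluster)
```

Using a stack rather than literal recursion avoids recursion-depth limits when clusters become large near \(T_c\).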

The advantage of the Wolff algorithm is that the correlation time is much shorter near the critical point. Because the Wolff algorithm makes nonlocal moves, it can change to a statistically independent configuration after very few moves.

More fun with simulations

Here are some more exercises for you to explore, including some that we did with Metropolis sampling. Download the latest version of ising.zip and try to answer the questions below.

Perfect sampling and coupling from the past

There is another method to deal with the long autocorrelation times for certain families of critical models, including the ferromagnetic Ising model. Ideally, we would like to obtain a "perfect" sample from the stationary distribution, but this would seem to involve running our Markov chain for an infinite amount of time. A beautiful idea due to Propp and Wilson showed that we can in fact sample perfectly if we start our Markov chain in the distant past and run it back to the future!

This sounds fantastical at first, and maybe even nonsensical. But we can break it down into a few simple ideas, each of which is well motivated, and together the Propp-Wilson algorithm for perfect sampling emerges naturally.