# Wolff sampling and critical slowdown

One of the main reasons we study the Ising model is the existence of a *critical point* in the phase diagram where spontaneous magnetization occurs. The critical temperature for this phase transition is given by
\[k T_c = \frac{2 \lvert J\rvert}{\ln(1+\sqrt{2})} \approx 2.27 \lvert J\rvert\,.\]
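As a quick sanity check, this value is easy to evaluate numerically (here in units where \(k = 1\) and \(\lvert J\rvert = 1\)):

```python
import math

# Exact Onsager critical temperature, in units where k = 1 and |J| = 1
T_c = 2 / math.log(1 + math.sqrt(2))
print(T_c)  # roughly 2.269
```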

Near the critical point, algorithms like Metropolis that flip only a single spin at a time begin to suffer from a phenomenon called *critical slowing down*: the time needed to generate statistically independent samples by repeating the update rule starts to grow much faster than the size of the lattice.

Recall that the average correlation function is defined as
\[C(x,y) = \frac{1}{N^2}\sum_{i,j}s_{i,j} s_{i+x,j+y} - \avg{M}^2\,,\]
where \(\avg{M}\) is the average magnetization density. If we change variables to radial coordinates, and further average over the angular variable, then we can put this in terms of a distance \(r\). This radial correlation function \(C(r)\) can be parameterized in the following manner in the limit of a large lattice (and \(r \gg 1\)):
\[C(r) \sim \frac{\e^{-r/\xi}}{r^{d-2+\eta}}\,.\]
Here \(\xi\) is called the *correlation length*, \(d\) is the dimension of the model (2 for the 2D Ising model), and \(\eta\) is a *critical exponent* (\(\eta = 1/4\) for the 2D Ising model).
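To make these definitions concrete, here is one way the averages above might be estimated from a single spin configuration, stored as a NumPy array of \(\pm 1\) values with periodic boundaries. This is a minimal sketch; the function names are our own, not part of any particular library:

```python
import numpy as np

def correlation(s):
    """C(x, y) for an N x N array s of +/-1 spins, periodic boundaries."""
    N = s.shape[0]
    m = s.mean()                         # magnetization density <M>
    C = np.empty((N, N))
    for x in range(N):
        for y in range(N):
            # average of s_{i,j} * s_{i+x,j+y} over all sites (i, j)
            shifted = np.roll(np.roll(s, -x, axis=0), -y, axis=1)
            C[x, y] = (s * shifted).mean() - m**2
    return C

def radial_profile(C):
    """Angular average of C(x, y) as a function of integer distance r."""
    N = C.shape[0]
    d = np.minimum(np.arange(N), N - np.arange(N))   # periodic displacement
    r = np.rint(np.hypot(d[:, None], d[None, :])).astype(int)
    return np.array([C[r == k].mean() for k in range(N // 2 + 1)])
```

The double loop is \(O(N^4)\); for large lattices one would compute the same quantity with FFTs, but the direct sums match the defining formula line by line.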

Near the critical point, the correlation length starts to diverge, and the exponential decay of correlations starts to become merely an *algebraic*, long-range decay. A specific prediction from renormalization theory is that
\[\xi \sim \frac{1}{\lvert T-T_c \rvert^\nu}\,\]
where \(\nu = 1\) for the 2D Ising model. This divergence and the long-range correlations that result cause difficulty in simulating physics near the critical point.

The *integrated autocorrelation time* \(\tau\) is a measure of this difficulty. Suppose we have an *order parameter*, i.e. a quantity like average magnetization density in the 2D Ising model, which quantifies the phase transition. After a suitably long simulation time, this quantity eventually reaches its equilibrium value, and the characteristic time scale for this relaxation can be expressed as
\[ \tau = \sum_{t=0}^\infty\frac{\bigl(\avg{M(t)M(0)} - \avg{M(0)}^2\bigr) }{\bigl(\avg{M(0)^2} - \avg{M(0)}^2\bigr)}\,.\]
Near the critical point, and for an infinite system, this quantity too diverges:
\[ \tau \sim \xi^z \sim \frac{1}{\lvert T-T_c \rvert^{z\nu}}\,. \]
The quantity \(z\) is called the *dynamical critical exponent*; unlike \(\nu\) and \(\eta\), it depends on the update dynamics as well as on the observable. For local single-spin dynamics and the magnetization of the 2D Ising model, \(z \approx 2\).
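In practice \(\tau\) is estimated from a finite time series of measurements by truncating the infinite sum. A minimal sketch of such an estimator (the truncation rule here is a simple heuristic, not a standard prescription):

```python
import numpy as np

def integrated_autocorrelation(m, tmax=None):
    """Estimate tau from a time series m of order-parameter measurements."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    if tmax is None:
        tmax = n // 2          # don't trust lags beyond half the series
    mu, var = m.mean(), m.var()
    # normalized autocorrelation rho(t) for lags t = 0, ..., tmax - 1
    rho = [((m[: n - t] - mu) * (m[t:] - mu)).mean() / var
           for t in range(tmax)]
    return sum(rho)            # the t = 0 term contributes exactly 1
```

For an uncorrelated series this returns a value near 1; for a series with long-range correlations the sum grows with the correlation time.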

Now consider a simulation of the Ising model on a finite \(N \times N\) grid. The correlation length cannot be greater than \(N\) for this system, so near the critical point \(\xi \sim N\). This means that the characteristic timescale for equilibration is about \[\tau \sim N^{z} \sim N^2\,.\]

For an algorithm like Metropolis sampling that only flips single spins at a time, this is a disaster. Because Metropolis is a local algorithm, it can only move a spin configuration by a constant number of lattice sites for every \(N^2\) iterations. But because of the long-range correlations, we need to move about \(N\) sites before reaching a new, uncorrelated configuration. This is roughly a random walk, so it should take an *additional* \(N^2\) time steps to travel this distance. Thus, sampling independent spin configurations takes time roughly \(N^4\) for a local algorithm like Metropolis.

## Cluster flip algorithms

An obvious idea is to try to flip many spins at once. However, it isn't clear that this can be done while still fulfilling the key ingredients of a successful Monte Carlo algorithm: it should satisfy the conditions of ergodicity, the Markov property, and detailed balance. It turns out that such a method does exist, and it is due to Wolff, who built on earlier ideas of Swendsen and Wang.

The Wolff cluster flip algorithm is the following:

1. Choose a seed spin at random and remember its direction.
2. For each neighboring spin, if it is parallel to the seed spin, add it to the cluster with probability \(p\).
3. For each newly added spin, repeat step 2 recursively on its neighbors until the cluster stops growing.
4. Flip all the spins in the cluster.
5. Repeat from step 1.

For the very special value of \(p = 1-\e^{-2J/kT}\), the Wolff algorithm satisfies the detailed balance condition with Boltzmann weights. At the level of single clusters, it is Markovian. And any spin can flip, so it is ergodic. Therefore, it has the Boltzmann probability distribution as its unique stationary distribution.
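A minimal sketch of one Wolff update, assuming the lattice is an \(N \times N\) NumPy array of \(\pm 1\) spins with periodic boundaries and units where \(k = 1\) (the function name and interface are our own, not those of `ising.zip`):

```python
import numpy as np

def wolff_step(s, T, J=1.0, rng=None):
    """Grow and flip one Wolff cluster in place; s holds +/-1 spins."""
    if rng is None:
        rng = np.random.default_rng()
    N = s.shape[0]
    p = 1 - np.exp(-2 * J / T)          # bond-activation probability
    i, j = rng.integers(N, size=2)      # random seed spin
    seed = s[i, j]
    s[i, j] = -seed                     # flipping as we go marks spins visited
    stack = [(i, j)]
    while stack:
        x, y = stack.pop()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            u, v = (x + dx) % N, (y + dy) % N
            if s[u, v] == seed and rng.random() < p:
                s[u, v] = -seed
                stack.append((u, v))
    return s
```

Flipping each spin the moment it joins the cluster doubles as the "already visited" bookkeeping, since a flipped spin can no longer be parallel to the seed.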

The advantage of the Wolff algorithm is that the correlation time is *much* shorter near the critical point. Because the Wolff algorithm makes nonlocal moves, it can change to a statistically independent configuration after very few moves.

## More fun with simulations

Here are some more exercises for you to explore, including some that we did with Metropolis sampling. Download the latest version of `ising.zip` and try to answer the questions below.

- Run the Wolff algorithm on a 50-by-50 lattice near the critical temperature. Notice how quickly the nonlocal moves mix the lattice configurations.
- How many moves does it take to reach a typical critical configuration using Wolff as you approach the phase transition from below? Using Metropolis?
- What do you expect if you run the Wolff algorithm *far* from the critical point? Will it still perform well? Try it on a *small* lattice size and see what happens.
- Do you see any *metastable* configurations? For example, states with low energy, but also low magnetization because they break up into magnetic domains? Why might it take a long time for such configurations to eventually reach a state of complete magnetization?
- What happens if you change the sign of \(J\)? What do the ground states look like? What do the domains look like?
- Suppose you initialize the spins to be 45% up and 55% down. How often do you end up in the spin-down magnetized state when you start the simulation below the critical temperature? What thermodynamic quantity would control the probability of eventually landing in the up or down magnetized state?
- Plot the correlation function for a given spin configuration. What do the correlations look like in a (mostly) magnetized state? What about in a metastable state? In the high-temperature phase?
- Plot the radial correlation function for various temperatures.

## Perfect sampling and coupling from the past

There is another method to deal with the long autocorrelation times for certain families of critical models, including the ferromagnetic Ising model. Ideally, we would like to obtain a "perfect" sample from the stationary distribution, but this would seem to involve running our Markov chain for an infinite amount of time. A beautiful idea due to Propp and Wilson showed that we *can* in fact sample perfectly if we run our Markov chain starting in the distant past and go *back to the future*!

This sounds fantastical at first, and maybe even nonsensical. But we can break it down into a few simple ideas, each of which is well motivated, and together the Propp-Wilson algorithm for perfect sampling emerges naturally.