Markov Chain Monte Carlo

Tags :: Bayesian Statistics

Computationally expensive, nondeterministic iterative technique for sampling from a probability distribution that cannot easily be sampled from directly. Instead, a Markov Chain is constructed whose stationary distribution equals the desired distribution. Samples are then taken from the Markov Chain and treated as samples from the desired distribution.

High level conceptual summary

At each iteration, a transition is proposed to move the Markov Chain from its current state \(x\) to some state \(y\), normally by making small alterations to \(x\). The probability of accepting the proposed move is given by a transition kernel constructed in such a way that the stationary distribution of the Markov Chain is the desired distribution.

Constructing a suitable kernel is often surprisingly easy, and is frequently done by applying Bayesian Inference. Such kernels give the probability of advancing the chain from \(x\) to state \(y\) based on how well \(y\) fits the prior knowledge (what properties the target configuration is expected to have) and the likelihood of \(y\) given the actual data available.
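
Spelled out, this is just the usual unnormalised posterior (the symbols here are assumed for illustration: \(\pi\) is the target density, \(p(y)\) the prior, and \(p(D \mid y)\) the likelihood of the data \(D\)):

\[ \pi(y) \propto p(y)\, p(D \mid y) \]

The normalising constant cancels in the acceptance ratio below, which is precisely why MCMC only needs the target density up to proportionality.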

Transitions that appear favorable compared to the current state of the chain have acceptance ratios \(\geq 1\) and so are accepted unconditionally, whilst moves to apparently worse states will be accepted with some reduced probability.
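
As a concrete instance (one common choice of kernel, not the only one), the Metropolis-Hastings acceptance probability for a proposal \(y\) drawn from a proposal distribution \(q(y \mid x)\), with target density \(\pi\), is:

\[ \alpha(x, y) = \min\!\left(1,\ \frac{\pi(y)\, q(x \mid y)}{\pi(x)\, q(y \mid x)}\right) \]

When the ratio inside the \(\min\) exceeds 1 the move is accepted unconditionally, matching the behaviour described above.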

Once the move/transition has been either accepted - causing a state change - or rejected, the next iteration begins.
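
A minimal sketch of one such iteration, assuming a symmetric Gaussian random-walk proposal (so the \(q\) terms in the ratio cancel) and a user-supplied log_target function returning the unnormalised log density; both names are illustrative, not from this note:

```python
import math
import random

def metropolis_step(x, log_target, step_size=0.5):
    """One Metropolis iteration: propose a move, then accept or reject it."""
    # Propose y by a small symmetric perturbation of the current state x.
    y = x + random.gauss(0.0, step_size)
    # Work in log space: log of the ratio pi(y) / pi(x).
    log_ratio = log_target(y) - log_target(x)
    # Acceptance probability min(1, pi(y)/pi(x)); the min avoids overflow in exp.
    accept_prob = math.exp(min(0.0, log_ratio))
    if random.random() < accept_prob:
        return y  # move accepted: the chain changes state
    return x      # move rejected: the chain stays where it is
```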

MCMC can be run for as many iterations as are required; the conventional use is to keep taking samples of the chain's state at regular intervals after an initial burn-in period that allows the chain to reach equilibrium.
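
Continuing the sketch above, a conventional sampling loop with burn-in and a thinning interval might look like the following (the default counts are arbitrary placeholders, not recommendations):

```python
def sample(log_target, x0=0.0, n_samples=1000, burn_in=500, thin=10):
    """Run the chain, discard the burn-in, then record every `thin`-th state."""
    x = x0
    # Burn-in: advance the chain so it can approach its stationary distribution.
    for _ in range(burn_in):
        x = metropolis_step(x, log_target)
    samples = []
    # Sample at regular intervals to reduce autocorrelation between draws.
    while len(samples) < n_samples:
        for _ in range(thin):
            x = metropolis_step(x, log_target)
        samples.append(x)
    return samples

# Example: draw from a standard normal, whose log density is -x^2/2
# up to an additive constant that MCMC never needs.
draws = sample(lambda x: -0.5 * x * x)
```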

In some applications (typically those dealing with high-dimensional states, such as image processing problems) a single sample from a chain that has reached equilibrium (converged) may be enough.

Algorithms

