Nuances of stochastic sampling in statistical ensembles#
Additional Readings for the Enthusiast#
Frenkel and Smit [2], Ch. 3
Goals for Today’s Lecture#
Understand the need for non-uniform sampling in statistical ensembles
Describe Markov chains, importance sampling, detailed balance
Derive the acceptance criterion for the Metropolis-Hastings algorithm
Importance sampling#
One major hitch up to this point has been our probability distribution: in the area example, we sampled it directly by selecting points uniformly at random, and in our deterministic examples it was specified in advance.
To evaluate ensemble averages from the canonical ensemble, we can instead choose our probability density deliberately, a strategy known as importance sampling.
- importance sampling#
sampling from a distribution of states in order to maximize the information gained from each observation
First, let’s recall that the probability of finding the system in a given microstate $j$ of the canonical ensemble is

$$ p_j = \frac{e^{-\beta E_j}}{Z}, \qquad Z = \sum_j e^{-\beta E_j} $$

The ensemble average of an observable $A$ then has the form

$$ \langle A \rangle = \sum_j p_j A_j $$

From this expression, we see that if we select trials according to the probability distribution $p_j$, the states with the largest Boltzmann weights, which contribute the most to the average, are sampled the most often. Thus, we choose configurations in our ensemble according to the canonical ensemble probability distribution, in which case the ensemble-average value of $A$ reduces to a simple arithmetic mean over the $n$ sampled states:

$$ \langle A \rangle \approx \frac{1}{n} \sum_{i=1}^{n} A_i $$
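To see this estimator in action, here is a minimal sketch (an assumed two-state example, not from the lecture). Because the exact $p_j$ are known for a two-level system, we can verify that averaging $A$ over states drawn with probability $p_j$ reproduces $\langle A \rangle = \sum_j p_j A_j$:

```python
# Minimal sketch (assumed two-level example): verify that an average over
# states drawn with probability p_j reduces to a simple arithmetic mean.
import numpy as np

beta = 1.0                      # 1/(kB*T) in reduced units
E = np.array([0.0, 1.0])        # energies of the two microstates
A = np.array([-1.0, 1.0])       # observable value in each microstate

p = np.exp(-beta * E)
p /= p.sum()                    # canonical probabilities p_j = exp(-beta*E_j)/Z

rng = np.random.default_rng(0)
samples = rng.choice(len(E), size=100_000, p=p)  # importance-sampled states

print("exact   <A> =", np.dot(p, A))
print("sampled <A> =", A[samples].mean())        # plain mean over samples
```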
Our problem then boils down to: how do we select states according to the correct probability distribution without knowing the value of the partition function?
Markov chains#
To do so, we will generate a Markov chain of states.
- Markov chain of states#
a sequence of states that is determined one step at a time based upon the probability of moving from one state to another
A Markov chain satisfies the following two conditions:
1. Each state generated belongs to a finite set of possible outcomes called the state space. The statistical mechanical analogue to this statement is to say that each microstate generated belongs to a finite ensemble. We can denote each possible state by $\Gamma_m$ for the enormous set of possible microstates within the canonical ensemble that we are sampling. For the canonical ensemble, this state space is the accessible phase space.
2. The probability of sampling state $\Gamma_n$ in the sequence of states depends only on the current state $\Gamma_m$, and not on previous states in the chain.
Since the likelihood of sampling a new state is only related to the current state we are in, we can define a transition probability, $\Pi(m \to n)$, for moving from state $\Gamma_m$ to state $\Gamma_n$. If we apply these transitions an infinite number of times, then the states visited converge to a stationary probability distribution over the state space.
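As an illustration, the following sketch (an assumed three-state example, not from the lecture) shows that repeatedly applying a fixed transition-probability matrix drives any starting distribution toward the same limiting distribution:

```python
# Minimal sketch (assumed three-state chain): iterating the transition
# matrix converges any initial distribution to the stationary one.
import numpy as np

# Pi[m, n] = probability of moving from state m to state n; rows sum to 1.
Pi = np.array([[0.5, 0.4, 0.1],
               [0.2, 0.6, 0.2],
               [0.1, 0.4, 0.5]])

rho = np.array([1.0, 0.0, 0.0])   # start with all probability in state 0
for _ in range(50):
    rho = rho @ Pi                # one Markov-chain step for the distribution

print(rho)   # approaches the stationary distribution regardless of the start
```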
Detailed balance#
At equilibrium, the probability of finding the system in a configuration $\Gamma_m$ must be constant in time.
Think of this like a mass balance in transport - the flux into a particular state is equal to the flux out of it, otherwise we would “accumulate” probability for that state. Therefore, it must be the case that in any given Markov chain the number of trials that generate new states from a particular state must be equal to the total number of trials from all other states that generate the original configuration.
If we write the probability of transitioning from some configuration $\Gamma_m$ to another configuration $\Gamma_n$ as $\Pi(m \to n)$, this balance of fluxes is the condition of detailed balance:

$$ p_m \, \Pi(m \to n) = p_n \, \Pi(n \to m) $$

- detailed balance#
the condition that, at equilibrium, the probability flux from state $\Gamma_m$ to state $\Gamma_n$ equals the probability flux from $\Gamma_n$ to $\Gamma_m$

We can rearrange this expression to write:

$$ \frac{\Pi(m \to n)}{\Pi(n \to m)} = \frac{p_n}{p_m} $$
Writing detailed balance in this form more clearly shows the consequence of this definition - if the equilibrium probability of state $\Gamma_n$ is larger than that of state $\Gamma_m$, then transitions from $\Gamma_m$ to $\Gamma_n$ must occur proportionally more often than the reverse transitions.
If a system obeys detailed balance, then the Markov chain must reach a stationary probability distribution, which is exactly the behavior required of any system at equilibrium.
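As a quick numerical check, the sketch below reuses the three-state transition matrix from above and verifies that the probability flux between every pair of states balances at its stationary distribution:

```python
# Minimal sketch (assumed three-state example from above): check that
# p_m * Pi[m, n] == p_n * Pi[n, m] for every pair of states.
import numpy as np

Pi = np.array([[0.5, 0.4, 0.1],
               [0.2, 0.6, 0.2],
               [0.1, 0.4, 0.5]])
p = np.array([0.25, 0.5, 0.25])   # stationary distribution of Pi

flux = p[:, None] * Pi            # flux[m, n] = p_m * Pi[m, n]
print(np.allclose(flux, flux.T))  # True: detailed balance holds
```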
The Metropolis-Hastings algorithm#
Having defined the principle of detailed balance, we can now apply this to a particular Markov chain that samples states within the canonical ensemble by following the Metropolis-Hastings algorithm.
- Metropolis-Hastings algorithm#
Also known as the “Metropolis algorithm”. An algorithm that samples states using acceptance criteria based upon transition probabilities.
The key aspect of the Metropolis algorithm is to further divide the transition probability into the product of a probability of proposing a trial move from state $\Gamma_m$ to state $\Gamma_n$, $\alpha(m \to n)$, and a probability of accepting that move, $\text{acc}(m \to n)$:

$$ \Pi(m \to n) = \alpha(m \to n)\,\text{acc}(m \to n) $$
Show that, in the canonical ensemble, the acceptance probabilities must satisfy $\dfrac{\text{acc}(m \to n)}{\text{acc}(n \to m)} = e^{-\beta(E_n - E_m)}$.
Show derivation
Given the preceding definition, the equation for detailed balance can then be rewritten as:

$$ p_m\,\alpha(m \to n)\,\text{acc}(m \to n) = p_n\,\alpha(n \to m)\,\text{acc}(n \to m) $$
Now, we can connect this expression explicitly to the canonical ensemble. In the canonical ensemble, the probability of finding a particular state $\Gamma_m$ is $p_m = e^{-\beta E_m}/Z$, so the unknown partition function $Z$ cancels in the ratio $p_n / p_m$. Next, we assume that the probability of proposing a trial move is symmetric, $\alpha(m \to n) = \alpha(n \to m)$, so that the proposal probabilities also cancel, leaving

$$ \frac{\text{acc}(m \to n)}{\text{acc}(n \to m)} = \frac{p_n}{p_m} = e^{-\beta(E_n - E_m)} $$
This equation stipulates the condition on the acceptance probabilities but does not uniquely determine them. The Metropolis choice of acceptance condition is

$$ \text{acc}(m \to n) = \min\left[1,\ e^{-\beta(E_n - E_m)}\right] $$

which fulfills our earlier condition. This is apparent if we consider the case where $E_n > E_m$: then $\text{acc}(m \to n) = e^{-\beta(E_n - E_m)}$ while the reverse move is always accepted, $\text{acc}(n \to m) = 1$, so the ratio of acceptance probabilities is exactly $e^{-\beta(E_n - E_m)}$.
If the proposed state has a lower energy than our current one, what is the probability we will accept the transition move?
Click for answer
The probability is 1. All transitions in which the system energy is lowered are automatically accepted.
It is possible for the system to stay in the same state; that is, $\Pi(m \to m)$ is in general nonzero, because a rejected trial move leaves the configuration unchanged.
Any given Markov chain generated using the Metropolis Monte Carlo algorithm can thus have many consecutive identical states, particularly if the system reaches a local energy minimum.
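In code, the Metropolis acceptance criterion is a one-line rule. Here is a minimal sketch (the function name and signature are illustrative, not from the lecture):

```python
# Minimal sketch of the acceptance rule above: acc = min(1, exp(-beta*dE)).
import math
import random

def accept_move(delta_E: float, beta: float) -> bool:
    """Metropolis criterion: accept downhill moves always, uphill moves
    with probability exp(-beta * delta_E)."""
    if delta_E <= 0.0:
        return True                      # lower energy: always accept
    return random.random() < math.exp(-beta * delta_E)
```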
Based on this derivation, the Metropolis Monte Carlo algorithm for correctly sampling states from the canonical ensemble consists of the following steps (given in Problem Set 2):
1. Assume the system is in a configuration, $\Gamma_m$, which is state $i$ in a Markov chain.
2. Generate a trial configuration $\Gamma_n$ according to the probability $\alpha(m \to n)$. There can be many possible ways of generating trial configurations (or many “moves”), as noted below.
3. Calculate the energy difference between the new and old configurations and calculate the probability of transitioning to the new state, $\text{acc}(m \to n)$, according to Eq. (23).
4. Generate a uniform random number in the range [0, 1]. If the random number is less than the transition probability, then state $i+1$ in the Markov chain has the new configuration and the system configuration is updated; that is, the new trial move is accepted. If the random number is greater than the transition probability, then state $i+1$ in the Markov chain has the old configuration ($\Gamma_{i+1} = \Gamma_m$), the system configuration remains the same, and the move is rejected. This step is the stochastic implementation of Eq. (23).
5. Repeat steps 2-4 until a sufficient number of states in the Markov chain are generated (a full loop is sketched after this list).
6. Approximate the ensemble-average value of some observable $A$ by averaging over all states in the Markov chain. Note that many states will be repeated, and we include consecutive identical states (which appear when moves are rejected). Averaging over all states in the Markov chain then calculates $\langle A \rangle \approx \frac{1}{n}\sum_{i=1}^{n} A_i$.
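Putting the steps together, here is a minimal sketch of the full loop for an assumed toy system (a single particle in a 1D harmonic well, not from the lecture). Trial moves are symmetric random displacements, so $\alpha(m \to n) = \alpha(n \to m)$ as the derivation requires:

```python
# Minimal sketch of the algorithm above for U(x) = 0.5*k*x**2 (assumed toy
# system). Exact result for comparison: <x^2> = 1/(beta*k).
import numpy as np

rng = np.random.default_rng(0)
beta, k = 1.0, 1.0                  # reduced units
n_steps, max_disp = 100_000, 1.0

def energy(x):
    return 0.5 * k * x**2

x = 0.0                             # step 1: initial configuration
chain = np.empty(n_steps)
for i in range(n_steps):
    x_trial = x + rng.uniform(-max_disp, max_disp)        # step 2: trial move
    dE = energy(x_trial) - energy(x)                      # step 3: energy change
    if dE <= 0.0 or rng.random() < np.exp(-beta * dE):    # step 4: accept/reject
        x = x_trial                                       # accept: update config
    chain[i] = x                    # rejected moves repeat the old state

# step 6: ensemble averages are plain means over the chain
print("<x^2> sampled:", (chain**2).mean())                # exact: 1/(beta*k) = 1
```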
Metropolis Monte Carlo has several advantages. First, the acceptance ratio only relies on the energies of the two states; the partition function $Z$ never needs to be computed because it cancels in the ratio $p_n / p_m$. As a result, the rule for generating trial moves with probability $\alpha(m \to n)$ can be essentially arbitrary, as long as it is symmetric.
This feature also means that many-body potentials can be easily used to calculate system energies, which is more difficult to do in molecular dynamics simulations, where forces (derivatives of the potential) must also be evaluated. In addition, because all configurations are sampled according to their equilibrium probability distribution, averaging observables over simulation configurations generates ensemble averages. The downside, however, is that the algorithm is only useful for calculating equilibrium thermodynamic properties, not for observing system kinetics.
Some care has to be taken in computing ensemble-average quantities. One choice that can affect values is the starting configuration. If an initial configuration is generated that is a high-energy (and thus unlikely) state, then it may contribute disproportionately to a resulting ensemble average even though its actual Boltzmann weight is low. In other words, imagine a chain with only 10 states; the initial state will contribute 1/10 of the ensemble average even if it should actually contribute to a much smaller degree. As a result, it is typical to run an MC algorithm for an initial period of steps that are treated as an “equilibration” period, so that any unphysical, high-energy starting configuration relaxes to a set of low-energy configurations. Inspection of the MC equations indeed shows that there should generally be a decrease in the system energy until a set of configurations with similar low-energy values is obtained.
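In practice, the equilibration period is implemented by simply discarding an initial block of the chain. A minimal sketch, assuming `chain` is the array of states from the sampler sketch above and an illustrative burn-in length:

```python
# Minimal sketch of the equilibration practice above; `chain` is assumed to
# be the array of sampled states from the earlier Metropolis sketch, and
# n_equil is an assumed burn-in length (chosen by watching the energy plateau).
n_equil = 10_000
production = chain[n_equil:]            # discard the first n_equil states
print("<x^2> after equilibration:", (production**2).mean())
```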