In this lecture, you will implement Monte Carlo simulations in Python.
# Initialisation code for the notebook
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append('resource/asnlib/public/')
sys.path.append('../answer_checking/')
from validate_answers import *
In the previous lecture we saw that Monte Carlo techniques can be used to determine the value of mathematical integrals; however, they are also widely used to simulate physical processes. In this exercise we will investigate Brownian motion.
When a macroscopic, but small, particle is suspended in a fluid it shows motion due to the collisions of molecules with the particle. Each colliding molecule transfers momentum to the particle. Since many molecules collide with the particle at arbitrary times and from arbitrary directions, the particle experiences a random net momentum transfer. This random force causes the particle to follow a so-called random walk ('dronkemanswandeling', drunkard's walk) and the resulting movement is called Brownian motion.
Brownian motion of a particle with mass $m$ and radius $r$ can be modeled by a differential equation
\begin{equation} m \frac{\text{d}^2 x}{\text{d} t^2}=-\gamma \frac{\text{d} x}{\text{d} t} + \sqrt{2 k_B T \gamma} F(t) \, , \end{equation}with $\gamma=6 \pi \eta r$ the friction coefficient and $\eta$ the viscosity of the fluid. We consider the case where $m/\gamma \ll 1$, hence the second derivative of the position (the inertial term) can be neglected. As the random force, a discrete random variable $F_i=X_i/\sqrt{\Delta t}$ is used. With the finite difference methods we learned in lecture 3 we obtain the recursive equation
\begin{equation} x_i = x_{i-1}+\sqrt{2 D \Delta t} \, X_i \, , \end{equation}with the diffusion coefficient given by $D=k_B T/6 \pi \eta r$ and $\Delta t$ the discrete time step in seconds. The variable $X_i$ is a discrete random variable that is assigned the value $\pm 1$, with equal probability for up and down, i.e., with probability P$(X_i=+1)$=P$(X_i=-1)$=0.5. The expectation value of $X_i$ is calculated by averaging the outcome over all possible realizations, i.e., $\langle X_i\rangle =1/2 (1) + 1/2 (-1)=0$. The expectation value of the square of the displacement for one step is $\langle X_i^2\rangle =1/2 (1)^2 + 1/2 (-1)^2=1$. In general, we are more interested in the total displacement $X$ after $N$ steps, which is found by applying the recursive equation $N$ times. This is a random variable as well:
\begin{equation} X = \sqrt{2 D \Delta t} \sum_{i=1}^{N} X_i \, . \end{equation}The mean total displacement is $\langle X \rangle =\sqrt{2 D \Delta t} \sum_{i=1}^{N} \langle X_i \rangle =0$. So after $N$ steps we expect that on average the particle has traveled equal amounts in the positive and negative directions. Typically, however, the particle will have moved some distance, in either direction, from its initial position. This we quantify with $\langle X^2 \rangle = \langle 2 D \Delta t \sum_{i=1}^{N} X_i \sum_{j=1}^{N} X_j \rangle $. Since $X_i$ and $X_j$ are independent, the only terms that remain are those with $i=j$, and we obtain
\begin{equation} \langle X^2 \rangle =2 D N \Delta t \, . \end{equation}The mean square displacement increases linearly with time. The description above models 1D Brownian motion; the formalism can easily be extended to 3D. Since the movement in each direction is independent and described by the 1D model, the total mean square displacement in 3D is $\langle r^2\rangle = \langle X^2 + Y^2 + Z^2 \rangle=6D N \Delta t$.
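As a quick sanity check of this result, the short sketch below (purely illustrative; the values of D, the time step and the ensemble size are arbitrary choices, not part of the exercises, and it relies on the numpy import at the top of the notebook) simulates the recursion for many independent walkers and compares the sample mean of $X^2$ to $2DN\Delta t$.
# Illustrative check of <X^2> = 2*D*N*dt for the 1D random walk
D, dt = 0.25, 0.01           # arbitrary diffusion coefficient and time step
N, walkers = 1000, 5000      # number of steps and of independent walkers
Xi = np.random.choice([-1, 1], size=(walkers, N))  # X_i = +/-1 with equal probability
X = np.sqrt(2*D*dt)*np.sum(Xi, axis=1)             # total displacement of each walker
print("simulated <X^2>:", np.mean(X**2))
print("theory 2*D*N*dt:", 2*D*N*dt)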
Exercise 1
Calculate the 3D Brownian motion for a glass particle with a radius of one micrometer and a density of 2200 kg/m$^3$. Use the viscosity of water at room temperature (295 K) of 0.001 Pa s. The time step is 0.01 seconds. Perform the following steps.
###
### Your code here
###
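A minimal sketch of one possible approach (the variable names and the ensemble of 100 particles and 1000 steps are assumptions for illustration): compute the diffusion coefficient from the given parameters and draw all random steps at once.
# Sketch (assumed names and sizes): diffusion coefficient and random steps
kB = 1.380649e-23                  # Boltzmann constant (J/K)
T = 295.0                          # room temperature (K)
eta = 0.001                        # viscosity of water (Pa s)
r = 1e-6                           # particle radius (m)
dt = 0.01                          # time step (s)
D = kB*T/(6*np.pi*eta*r)           # diffusion coefficient D = kB*T/(6*pi*eta*r)
Npart, Nsteps = 100, 1000          # assumed numbers of particles and steps
Xi = np.random.choice([-1.0, 1.0], size=(Npart, Nsteps, 3))  # random signs
steps = np.sqrt(2*D*dt)*Xi         # steps of +/- sqrt(2*D*dt) in x, y and z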
Exercise 2
Calculate the cumulative 1D displacement of every particle by summing over all values of steps in the array (the first element is step 1, the second element is the sum of steps 1 and 2, the third element is the sum of steps 1-3, etc.). Use numpy's cumsum function to perform the cumulative summation in the time direction for all particles.
Plot all the 1D particle trajectories using a for loop. Use the correct time axis (starting at $t= \Delta t$ and ending at $t=N \Delta t$). The shape of the ensemble of trajectories somewhat resembles the left side of a horizontally oriented ellipse.
###
### Your code here
###
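One possible implementation, continuing the hypothetical steps, Npart, Nsteps and dt from the sketch of Exercise 1:
# Cumulative displacement along the time axis (axis=1) for all particles
traj = np.cumsum(steps, axis=1)
t = np.arange(1, Nsteps + 1)*dt    # time axis from dt to N*dt
for n in range(Npart):             # plot the x-direction trajectory of each particle
    plt.plot(t, traj[n, :, 0])
plt.xlabel('t (s)')
plt.ylabel('x (m)')
plt.show()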
Exercise 3
Calculate the 3D mean square displacement $\langle r^2 \rangle$, i.e., the average over all particles of the squared total displacement.
Plot it against time and compare it to the theoretical prediction $\langle r^2 \rangle =6Dt$. If necessary, increase the number of particles to improve your statistics.
###
### Your code here
###
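A sketch along these lines, again continuing the hypothetical traj, t and D from the earlier sketches:
# Mean square displacement: average of |r|^2 over the particle ensemble
msd = np.mean(np.sum(traj**2, axis=2), axis=0)
plt.plot(t, msd, label='simulation')
plt.plot(t, 6*D*t, '--', label='theory: 6Dt')
plt.xlabel('t (s)')
plt.ylabel(r'$\langle r^2 \rangle$ (m$^2$)')
plt.legend()
plt.show()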
Exercise 4
Plot the particle trajectories of the first 5 particles in 3D using a for loop and the given code. Here we use the mplot3d toolkit from the mpl_toolkits library.
from mpl_toolkits import mplot3d
ax = plt.axes(projection='3d')
line=['-r', '-b', '-k', '-g', '-y']
# make a loop over the first 5 trajectories and plot them
###
### Your code here
###
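A sketch completing the given snippet (it assumes the hypothetical traj array from the earlier sketches):
from mpl_toolkits import mplot3d
ax = plt.axes(projection='3d')
line = ['-r', '-b', '-k', '-g', '-y']
for n in range(5):
    # x, y and z coordinates of particle n over time
    ax.plot3D(traj[n, :, 0], traj[n, :, 1], traj[n, :, 2], line[n])
ax.set_xlabel('x (m)')
ax.set_ylabel('y (m)')
ax.set_zlabel('z (m)')
plt.show()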
For a system of non-interacting particles the average properties (e.g., the diffusion coefficient above) are already well determined by averaging over the realizations of only a few particles. For more complicated systems, for example systems of interacting particles, simulating their properties is not that easy.
For example, in statistical mechanics one is interested in the properties of a system at a specific temperature $T$. The state of the system is in general unknown, but it is known that the states follow a Boltzmann distribution. Hence, the calculation of some system property is performed by calculating
\begin{equation} \langle X \rangle = \sum_i X_i P(E_i) \, \, \text{with} \, \, P(E_i)=\frac{\text{e}^{-\beta E_i}}{Z} \, , \end{equation}with $Z$ the partition function. In principle this equation could be tackled with Monte Carlo integration methods (the sum is quite similar to the integrals previously encountered): select some states at random and calculate the mean value based on this sample of states. However, for these physical systems this wouldn't work! Many of the states have an energy far above the typical thermal energy and therefore have a very low probability. Hence, we would be sampling a lot of states that contribute nothing to the sum, and only very few that do contribute. Moreover, the number of states is far too big to sample a significant fraction (e.g., a system of $10^{23}$ particles with two states each already has $2^{10^{23}}$ states).
However, there are methods that make these simulations feasible even for complicated and large systems. One such method is the Markov chain Monte Carlo method. In this case we do not choose the states at random, but instead pick only states that properly sample the Boltzmann distribution. But how to do this without any a priori knowledge of what these states are?
The trick is to start with a random state and then move to new states with a transition probability that reflects the Boltzmann distribution, i.e., we require that
\begin{equation} \frac{T_{ij}}{T_{ji}} = \frac{P(E_j)}{P(E_i)} = \text{e}^{-\beta(E_j-E_i)} \end{equation}with $T_{ij}$ the transition probability of going from state $i$ to $j$. So if $E_j>E_i$, the probability decreases as the energy difference becomes bigger. A way to implement this is through the Metropolis algorithm, where the transition from $i$ to $j$ is called a move. This works in the following way: if the energy of the new state is lower than that of the previous state, we accept the move. If the energy is higher, we accept it with a probability given by the Boltzmann factor. The Metropolis acceptance probability is
\begin{equation} P_a=\left\{ \begin{array}{ll}1 & \text{if } E_j \leq E_i \\\text{e}^{-\beta(E_j-E_i)} & \text{if } E_j > E_i\end{array}\right. \end{equation}When we take this acceptance probability for a move, it can be shown that the transition probability obeys the requirement for $T_{ij}$. Since we only make moves that obey Boltzmann statistics, the samples that we acquire in the Markov chain will be distributed according to the same probability function (and we do not sample many unoccupied states).
In summary, the recipe for the Markov chain Monte Carlo method is as follows (a minimal code sketch of the acceptance step follows the list)
Start from a random state and calculate its energy
Propose a move to a new state and calculate the new energy
Calculate the value of the acceptance probability $P_a$
Accept the move with probability $P_a$; otherwise keep the old state
Determine the statistical quantity of interest
Repeat the above steps and average the quantity over the chain
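As an illustration of the acceptance step, a minimal, system-independent sketch (the function name and arguments are placeholders, not part of the exercises):
# Generic Metropolis acceptance rule (illustrative)
def accept_move(E_old, E_new, beta):
    # Always accept moves that lower the energy
    if E_new <= E_old:
        return True
    # Otherwise accept with the Boltzmann probability exp(-beta*(E_new - E_old))
    return np.random.random() < np.exp(-beta*(E_new - E_old))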
The exercises below show the implementation of the Markov chain Monte Carlo method for the Ising model of magnetism.
The Ising model is a theoretical model of a magnet. The magnetization of a magnetic material is made up of the combination of many small magnetic dipoles spread throughout the material. If these dipoles point in random directions then the overall magnetization of the system will be close to zero, but if they line up so that all or most of them point in the same direction then the system can acquire a macroscopic magnetic moment, i.e., it becomes magnetized. The Ising model is a model of this process in which the individual moments are represented by dipoles or spins arranged on a grid or lattice, illustrated here:
In this case we are using a square lattice in two dimensions, although the model can be defined in principle for any lattice in any number of dimensions.
The spins themselves, in this simple model, are restricted to point in only two directions, up and down. Mathematically the spins are represented by variables $s_i=\pm1$ on the points of the lattice, $+1$ for up-pointing spins and $-1$ for down-pointing ones. Dipoles in real magnets can typically point in any spatial direction, not just up or down, but the Ising model, with its restriction to just the two directions, captures a lot of the important physics while being significantly simpler to understand.
Another important feature of many magnetic materials is that the individual dipoles in the material may interact magnetically in such a way that it is energetically favorable for them to line up in the same direction. The magnetic potential energy due to the interaction of two dipoles is proportional to their dot product, but in the Ising model this simplifies to just the product $s_is_j$ for spins on sites $i$ and $j$ of the lattice, since the spins are one-dimensional scalars, not vectors. Then the actual energy of interaction is $-J_{ij} s_i s_j$, where $J_{ij}$ is a positive interaction constant. The minus sign ensures that the interactions are ferromagnetic, meaning the energy is lower when dipoles are lined up. A ferromagnetic interaction implies that the material will magnetize if given the chance. (In some materials the interaction has the opposite sign so that the dipoles prefer to be antialigned. Such a material is said to be antiferromagnetic, but we will not look at the antiferromagnetic case here.)
Normally it is assumed that spins interact only with those that are immediately adjacent to them on the lattice, which gives a total energy for the entire system equal to \begin{equation} E = -J_{ij} \sum_{<ij>} s_i s_j \,, \end{equation} where the notation $<ij>$ indicates a sum over pairs $i,j$ that are adjacent on the lattice. On the square lattice we use here each spin has four adjacent neighbors with which it interacts.
Here we write a program to perform a Markov chain Monte Carlo simulation of the Ising model on the square lattice for a system of $20\times20$ spins. This is an adaptation of exercise 10.9 of Newman, Computational Physics.
Exercise 5
A naive approach to solve this problem is to calculate all possible realizations of the array of spins and the mean energy of the system by taking the average. Calculate the number of possible realizations for the $20 \times 20$ Ising model. If a computer could calculate the energy of every realization in 1 microsecond, how long would the entire calculation take?
###
### Your code here
###
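For reference, a minimal sketch of the arithmetic (Python integers handle $2^{400}$ exactly; the conversion to years is a rough estimate, and the result vastly exceeds the age of the universe):
# Number of realizations of a 20x20 lattice of two-valued spins
realizations = 2**(20*20)
seconds = realizations*1e-6              # 1 microsecond per realization
years = seconds/(3600*24*365)
print("realizations: %e" % realizations)
print("time needed: %e years" % years)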
Exercise 6
Set up a randomly distributed spin lattice in the following way:
Set up an array of size 20x20 with randomly distributed values +1 and -1. You can loop over all points and use random()<0.5 with if/else statements to create the spin lattice. A simpler way is to use the randint function and transform the 0 and 1 integer values to -1 and +1.
Calculate the total magnetization by taking the sum over all $s_{ij}$. If you rerun your program the average magnetization should be around zero.
Make a plot of your random array with imshow. Is it random? Are the values only +1 and -1?
For answer checking, use a seed of 10. Save your result to the variable s if you are using for loops; save your answer to svec if you use a single line of code.
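One possible sketch of the vectorized variant (the loop-based variant applies the same mapping element by element; shown here only as an illustration of the randint approach):
# Vectorized sketch: map randint's {0, 1} values to {-1, +1}
np.random.seed(10)
svec = 2*np.random.randint(0, 2, size=(20, 20)) - 1
print("total magnetization:", np.sum(svec))
plt.imshow(svec)
plt.colorbar()
plt.show()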
np.random.seed(10)
###
### Your code here
###
answer_10_06_1 = np.copy(s) # answer using for loops
answer_10_06_2 = np.copy(svec) # answer with vector implementation
# only one of the solutions has to be right
question = "answer_10_06"
num = 2
to_check = [question + "_%d" % (n+1) for n in range(num)]
feedback = ""
passed = True
for var in to_check:
    res, msg = check_answer(eval(var), var)
    passed = passed and res
    print(msg)
assert passed == True, feedback
For calculating the spin-spin interaction we take into account only neighbors in the horizontal and vertical direction, not in the diagonal direction. So every spin in the bulk forms 4 pairs (spins at the edges form fewer). In calculating the interaction energy it is important not to count pairs twice.
To make your final program run in a reasonable amount of time, you will find it helpful to work out a way to calculate the energy using Python's ability to do arithmetic with entire arrays at once. If you do the calculation element by element, your program will be significantly slower. Hence, follow the procedure below using the technique of array slicing.
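To illustrate the technique: products of vertically adjacent spins are obtained by multiplying the array with itself shifted by one row, and similarly for horizontal pairs. One possible slicing-based implementation (a sketch, assuming open boundary conditions so that no pair is counted twice):
# One possible slicing-based energy function (open boundary conditions)
def energy(s, Jij=1.0):
    vertical = np.sum(s[:-1, :]*s[1:, :])      # pairs of vertically adjacent spins
    horizontal = np.sum(s[:, :-1]*s[:, 1:])    # pairs of horizontally adjacent spins
    return -Jij*(vertical + horizontal)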
Exercise 7
Make a function that calculates the energy over all horizontal and vertical pairs of spins with $J_{ij}=1$.
Note that answer checking only gives the correct result for s, the array calculated without vectorization.
# def energy(s)
###
### Your code here
###
Jij=1.0
E=energy(s)
answer_10_07_1 = np.copy(E)
question = "answer_10_07"
num = 1
to_check = [question + "_%d" % (n+1) for n in range(num)]
feedback = ""
passed = True
for var in to_check:
    res, msg = check_answer(eval(var), var)
    passed = passed and res
    print(msg)
assert passed == True, feedback
Exercise 8
Now use your function as the basis for a Metropolis-style simulation of the Ising model with $J=1$ and temperature $T=2$, in units where the Boltzmann constant $k_B$ is also $1$. We can now do the Markov chain Monte Carlo calculation. Take $N=1000$ points while testing and debugging; later increase this to $N=1 \cdot 10^{6}$.
Hint: While you are working on your program, do shorter runs, of maybe ten thousand steps at a time. Once you have it working properly, do a longer run of a million steps to get the final results.
The plotting step is done separately below.
# A handy import for printing the progress
import sys

N = 100000
T = 2.0
mpoints = []
epoints = []
for k in range(N):
    ###
    ### Your code here
    ###
    M = np.sum(s)
    E = energy(s)
    mpoints.append(M)
    epoints.append(E)
    if (k+1) % 1000 == 0:
        sys.stdout.write("\rPercentage completed: " + str(100*(k+1)/N))
        sys.stdout.flush()
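For the body of the loop, one possible Metropolis move is sketched below (the function name and the local energy update are illustrative choices, not the prescribed solution; the neighbour bookkeeping assumes the same open boundaries as the slicing-based energy function above):
# Sketch of a single Metropolis move on the lattice s (open boundaries, kB = 1)
def metropolis_move(s, T, Jij=1.0):
    n, m = s.shape
    i = np.random.randint(n)                   # pick a random spin
    j = np.random.randint(m)
    nn = 0                                     # sum of the (up to four) neighbours
    if i > 0:
        nn += s[i-1, j]
    if i < n - 1:
        nn += s[i+1, j]
    if j > 0:
        nn += s[i, j-1]
    if j < m - 1:
        nn += s[i, j+1]
    dE = 2.0*Jij*s[i, j]*nn                    # energy change if s[i, j] flips
    if dE <= 0 or np.random.random() < np.exp(-dE/T):
        s[i, j] = -s[i, j]                     # accept the flip
        return dE
    return 0.0                                 # reject: the state is unchanged
Inside the loop one would then call metropolis_move(s, T) before recording M and E; its return value also allows updating E incrementally instead of recomputing energy(s) at every step.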
Plot the magnetization and energy as a function of the iteration in two separate subplots. Plot the spin distribution at the final stage of the Markov chain in a third subplot. Is the spin distribution in agreement with what you expect?
plt.figure(figsize=(12,4), dpi=100)
plt.subplot(131)
plt.plot(mpoints)
plt.ylabel('Magnetization')
plt.grid()
plt.subplot(132)
plt.plot(epoints)
plt.ylabel('Energy')
plt.grid()
plt.subplot(133)
plt.imshow(s)
plt.xlabel('X-coordinate')
plt.ylabel('Y-coordinate')
plt.colorbar()
plt.tight_layout()
Self check
Run your program several times and observe the sign of the magnetization that develops, positive or negative. Describe what you find and give a brief explanation of what is happening.
Run your program for different temperatures $T=1,2,4$ and look at the maximum/minimum value of the final magnetization.