4.7.2.2. Mean Squared Displacement — MDAnalysis.analysis.msd
Authors:  Hugo MacDermott-Opeskin

Year:  2020 
Copyright:  GNU Public License v2 
This module implements the calculation of Mean Squared Displacements (MSDs) by the Einstein relation. MSDs can be used to characterize the speed at which particles move and have their roots in the study of Brownian motion. For a full explanation of the theory behind MSDs and the subsequent calculation of self-diffusivities the reader is directed to [Maginn2019]. MSDs can be computed from the following expression, known as the Einstein formula:
\[MSD(r_{d}) = \bigg{\langle} \frac{1}{N} \sum_{i=1}^{N} |r_{d} - r_{d}(t_0)|^2 \bigg{\rangle}_{t_{0}}\]
where \(N\) is the number of equivalent particles the MSD is calculated over, \(r\) are their coordinates and \(d\) the desired dimensionality of the MSD. Note that while the definition of the MSD is universal, there are many practical considerations to computing the MSD that vary between implementations. In this module, we compute a “windowed” MSD, where the MSD is averaged over all possible lagtimes \(\tau \le \tau_{max}\), where \(\tau_{max}\) is the length of the trajectory, thereby maximizing the number of samples.
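The windowed averaging described above can be sketched in plain NumPy. This is an illustration only, not the module's implementation; the function name `windowed_msd` and the array layout `(n_frames, n_particles, d)` are assumptions made for the example.

```python
import numpy as np

def windowed_msd(positions):
    """MSD for every lag, averaged over all time origins and particles.

    positions: array of shape (n_frames, n_particles, d) holding
    unwrapped coordinates (layout assumed for this sketch).
    Returns an array of length n_frames indexed by lag in frames.
    """
    n_frames = positions.shape[0]
    msd = np.zeros(n_frames)
    for lag in range(1, n_frames):
        # displacements for every time origin t0 that allows this lag
        disp = positions[lag:] - positions[:-lag]
        sq_disp = np.sum(disp**2, axis=-1)  # (n_frames - lag, n_particles)
        msd[lag] = sq_disp.mean()           # average over t0 and particles
    return msd

# toy trajectory: 2 particles moving 1 length unit per frame along x
frames = np.arange(5, dtype=float)
positions = np.zeros((5, 2, 3))
positions[:, :, 0] = frames[:, None]
print(windowed_msd(positions))  # ballistic motion: MSD grows as lag**2
```

The loop over lags, each touching up to `n_frames` origins, is where the quadratic cost with respect to trajectory length comes from.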
The computation of the MSD in this way can be computationally intensive due to its \(N^2\) scaling with respect to \(\tau_{max}\). An algorithm to compute the MSD with \(N \log(N)\) scaling based on a Fast Fourier Transform is known and can be accessed by setting fft=True [Calandri2011] [Buyl2018]. The FFT-based approach requires that the tidynamics package is installed; otherwise the code will raise an ImportError.
Please cite [Calandri2011] [Buyl2018] if you use this module in addition to the normal MDAnalysis citations.
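To make the FFT idea concrete, here is a minimal NumPy sketch for a single particle. It uses the common decomposition of the MSD into a sum-of-squares term and a position autocorrelation, with the autocorrelation evaluated by zero-padded FFT. It is not the tidynamics code; the helper names `autocorr_fft`, `msd_fft` and `msd_direct` are hypothetical.

```python
import numpy as np

def autocorr_fft(x):
    """Autocorrelation of a 1D series, normalised by the number of origins."""
    n = len(x)
    f = np.fft.rfft(x, n=2 * n)  # zero-pad to avoid circular wrap-around
    acf = np.fft.irfft(f * f.conjugate(), n=2 * n)[:n]
    return acf / (n - np.arange(n))

def msd_fft(r):
    """Windowed MSD of one particle; r has shape (n_frames, d)."""
    n = len(r)
    sq = np.append(np.sum(r**2, axis=1), 0.0)  # trailing 0 handles lag 0
    s2 = sum(autocorr_fft(r[:, k]) for k in range(r.shape[1]))
    q = 2.0 * sq.sum()
    s1 = np.zeros(n)
    for m in range(n):
        # running sum of r(t)^2 + r(t+m)^2 over the origins valid for lag m
        q -= sq[m - 1] + sq[n - m]
        s1[m] = q / (n - m)
    return s1 - 2.0 * s2

def msd_direct(r):
    """Reference implementation with quadratic cost, for comparison."""
    n = len(r)
    out = np.zeros(n)
    for m in range(1, n):
        out[m] = np.mean(np.sum((r[m:] - r[:-m])**2, axis=1))
    return out

rng = np.random.default_rng(0)
r = np.cumsum(rng.normal(size=(50, 3)), axis=0)  # one 3D random walk
print(np.allclose(msd_fft(r), msd_direct(r)))
```

Both functions return the same windowed MSD; only the cost of evaluating the cross term differs.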
4.7.2.2.1. Computing an MSD
This example computes a 3D MSD for the movement of 100 particles undergoing a random walk. Files provided as part of the MDAnalysis test suite are used (in the variables RANDOM_WALK and RANDOM_WALK_TOPO).
First load all modules and test data:
import MDAnalysis as mda
import MDAnalysis.analysis.msd as msd
from MDAnalysis.tests.datafiles import RANDOM_WALK, RANDOM_WALK_TOPO
Given a universe containing trajectory data we can run the MSD analysis by using the class EinsteinMSD:
u = mda.Universe(RANDOM_WALK, RANDOM_WALK_TOPO)
MSD = msd.EinsteinMSD(u, select='all', msd_type='xyz', fft=True)
MSD.run()
The MSD can then be accessed as
msd = MSD.timeseries
Visual inspection of the MSD is important, so let’s take a look at it with a simple plot.
import numpy as np
import matplotlib.pyplot as plt
nframes = MSD.n_frames
timestep = 1 # this needs to be the actual time between frames
lagtimes = np.arange(nframes)*timestep # make the lagtime axis
fig = plt.figure()
ax = plt.axes()
# plot the actual MSD as a solid line
ax.plot(lagtimes, msd, color="black", ls="-", label=r'3D random walk')
exact = lagtimes*6
# plot the exact result as a dashed line
ax.plot(lagtimes, exact, color="black", ls="--", label=r'$y=2 d\tau$')
ax.legend()
plt.show()
This gives us the plot of the MSD with respect to lagtime (\(\tau\)). We can see that the MSD is approximately linear with respect to \(\tau\). This is a numerical example of a known theoretical result that the MSD of a random walk is linear with respect to lagtime, with a slope of \(2d\). In this expression \(d\) is the dimensionality of the MSD. For our 3D MSD, this is 3. For comparison we have plotted the line \(y=6\tau\) to which an ensemble of 3D random walks should converge.
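The \(2d\) slope can be checked numerically with a synthetic walk. The sketch below assumes Gaussian steps with variance 2 per dimension per frame (a unit diffusion coefficient); this is a choice made for the example and says nothing about how the test-suite trajectory was generated. For brevity it averages over particles from a single time origin rather than over all origins.

```python
import numpy as np

rng = np.random.default_rng(42)
n_steps, n_particles, d = 50, 20000, 3

# Gaussian steps with variance 2 per dimension (assumed for this example)
steps = rng.normal(scale=np.sqrt(2.0), size=(n_steps, n_particles, d))
positions = np.concatenate([np.zeros((1, n_particles, d)),
                            np.cumsum(steps, axis=0)])

# squared displacement from the first frame, averaged over particles
lagtimes = np.arange(n_steps + 1)
msd = np.mean(np.sum((positions - positions[0])**2, axis=-1), axis=1)

slope = np.polyfit(lagtimes, msd, 1)[0]
print(f"fitted slope: {slope:.2f}, expected 2*d = {2 * d}")
```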
Note that a segment of the MSD is required to be linear to accurately determine self-diffusivity. This linear segment represents the so-called “middle” of the MSD plot, where ballistic trajectories at short time-lags are excluded along with poorly averaged data at long time-lags. We can select the “middle” of the MSD by indexing the MSD and the time-lags. Appropriately linear segments of the MSD can be confirmed with a log-log plot, as is often recommended [Maginn2019], where the “middle” segment can be identified as having a slope of 1.
plt.loglog(lagtimes, msd)
plt.show()
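The slope on the log-log plot can also be estimated numerically rather than read off by eye. The following sketch uses a synthetic MSD that is ballistic at short lagtimes and diffusive at long ones; the functional form, the helper name `loglog_slope` and the tolerance of 0.1 are all assumptions chosen for illustration.

```python
import numpy as np

def loglog_slope(lagtimes, msd):
    """Local slope d log(MSD) / d log(tau), skipping the tau = 0 point."""
    log_t = np.log(lagtimes[1:])
    log_m = np.log(msd[1:])
    return np.gradient(log_m, log_t)

# synthetic MSD: ~tau**2 at short lagtimes, ~6*tau at long lagtimes
lagtimes = np.arange(1000, dtype=float)
msd = 6.0 * lagtimes**2 / (lagtimes + 10.0)

slopes = loglog_slope(lagtimes, msd)
# lagtimes whose local slope is within 0.1 of the diffusive value of 1
is_linear = np.abs(slopes - 1.0) < 0.1
first_linear_lag = lagtimes[1:][is_linear][0]
print("linear regime starts near lagtime", first_linear_lag)
```

On real data the local slope is noisy at long lagtimes, so smoothing or a coarser lagtime grid may be needed before thresholding.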
Now that we have identified what segment of our MSD to analyse, let’s compute a self-diffusivity.
4.7.2.2.2. Computing Self-Diffusivity
Self-diffusivity is closely related to the MSD:
\[D_d = \frac{1}{2d} \lim_{t \to \infty} \frac{d}{dt} MSD(r_{d})\]
From the MSD, self-diffusivities \(D\) with the desired dimensionality \(d\) can be computed by fitting the MSD with respect to the lagtime to a linear model. An example of this is shown below, using the MSD computed in the example above. The segment between \(\tau = 20\) and \(\tau = 60\) is used to demonstrate selection of an MSD segment.
from scipy.stats import linregress
start_time = 20
start_index = int(start_time/timestep)
end_time = 60
end_index = int(end_time/timestep)
linear_model = linregress(lagtimes[start_index:end_index],
                          msd[start_index:end_index])
slope = linear_model.slope
# standard error of the fitted slope
error = linear_model.stderr
# dim_fac is 3 as we computed a 3D msd with 'xyz'
D = slope * 1/(2*MSD.dim_fac)
We have now computed a self-diffusivity!
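As a sanity check of the slope-to-diffusivity conversion, the same kind of fit can be run on synthetic data with a known answer. The sketch below builds a noisy linear MSD with \(D = 0.5\) in 3D and recovers it. `np.polyfit` is used in place of `linregress` purely to keep the example dependency-free, and the helper name `fit_diffusivity` is hypothetical.

```python
import numpy as np

def fit_diffusivity(lagtimes, msd, start_time, end_time, dim_fac):
    """Fit MSD = slope * tau + c on [start_time, end_time) and convert to D."""
    timestep = lagtimes[1] - lagtimes[0]
    start_index = int(start_time / timestep)
    end_index = int(end_time / timestep)
    coeffs, cov = np.polyfit(lagtimes[start_index:end_index],
                             msd[start_index:end_index], 1, cov=True)
    slope, slope_err = coeffs[0], np.sqrt(cov[0, 0])
    # D = slope / (2 d); the uncertainty scales by the same factor
    return slope / (2 * dim_fac), slope_err / (2 * dim_fac)

rng = np.random.default_rng(1)
true_D, dim_fac = 0.5, 3
lagtimes = np.arange(100) * 2.0  # 2 time units between frames
msd = 2 * dim_fac * true_D * lagtimes + rng.normal(size=lagtimes.size)

D, D_err = fit_diffusivity(lagtimes, msd, 20, 60, dim_fac)
print(f"D = {D:.3f} +/- {D_err:.3f} (true value {true_D})")
```

The reported uncertainty covers only the scatter about the fitted line; it does not account for correlation between lagtimes or for the choice of fitting window.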
Notes
There are several factors that must be taken into account when setting up and processing trajectories for computation of self-diffusivities. These include specific instructions around simulation settings, using unwrapped trajectories and maintaining a relatively small elapsed time between saved frames. Additionally, corrections for finite size effects are sometimes employed along with various means of estimating errors [Yeh2004] [Bulow2020]. The reader is directed to the following review, which describes many of the common pitfalls [Maginn2019]. There are other ways to compute self-diffusivity, such as from a Green-Kubo integral. At this point in time, these methods are beyond the scope of this module.
Note also that computation of MSDs is highly memory intensive. If this is proving a problem, judicious use of the start, stop, step keywords to control which frames are incorporated may be required.
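A rough feel for the effect of these keywords can be had from a back-of-the-envelope estimate. The sketch below assumes that one float64 array of shape (frames, particles, dimensions) is held in memory; that is an assumption about storage made for illustration, and actual usage is likely higher. The helper name `position_array_bytes` is hypothetical.

```python
def position_array_bytes(n_frames_total, n_particles, dim_fac,
                         start=None, stop=None, step=None, itemsize=8):
    """Bytes needed for one (frames, particles, dims) array of floats."""
    # resolve start/stop/step the same way Python slicing would
    start_, stop_, step_ = slice(start, stop, step).indices(n_frames_total)
    n_frames = len(range(start_, stop_, step_))
    return n_frames * n_particles * dim_fac * itemsize

full = position_array_bytes(10_000, 100_000, 3)
strided = position_array_bytes(10_000, 100_000, 3, step=10)
print(f"all frames: {full / 1e9:.1f} GB, every 10th frame: {strided / 1e9:.1f} GB")
# prints "all frames: 24.0 GB, every 10th frame: 2.4 GB"
```

Striding also increases the elapsed time between analysed frames, which has to be weighed against the advice above to keep that interval relatively small.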
References
[Maginn2019]  Maginn, E. J.; Messerly, R. A.; Carlson, D. J.; Roe, D. R.; Elliott, J. R. Best Practices for Computing Transport Properties 1. Self-Diffusivity and Viscosity from Equilibrium Molecular Dynamics [Article v1.0]. Living J. Comput. Mol. Sci. 2019, 1 (1).
[Yeh2004]  Yeh, I. C.; Hummer, G. System-Size Dependence of Diffusion Coefficients and Viscosities from Molecular Dynamics Simulations with Periodic Boundary Conditions. J. Phys. Chem. B 2004, 108 (40), 15873–15879.
[Bulow2020]  von Bülow, S.; Bullerjahn, J. T.; Hummer, G. Systematic Errors in Diffusion Coefficients from Long-Time Molecular Dynamics Simulations at Constant Pressure. 2020. arXiv:2003.09205 [Cond-Mat, Physics:Physics].
4.7.2.2.3. Classes

class MDAnalysis.analysis.msd.EinsteinMSD(u, select='all', msd_type='xyz', fft=True, **kwargs)
Class to calculate Mean Squared Displacement by the Einstein relation.
Parameters:
 u (Universe or AtomGroup) – An MDAnalysis Universe or AtomGroup. Note that UpdatingAtomGroup instances are not accepted.
 select (str) – A selection string. Defaults to “all” in which case all atoms are selected.
 msd_type ({'xyz', 'xy', 'yz', 'xz', 'x', 'y', 'z'}) – Desired dimensions to be included in the MSD. Defaults to ‘xyz’.
 fft (bool) – If True, uses a fast FFT based algorithm for computation of the MSD. Otherwise, use the simple “windowed” algorithm. The tidynamics package is required for fft=True. Defaults to True.

timeseries
The averaged MSD over all the particles with respect to lagtime.
Type: numpy.ndarray

msds_by_particle
The MSD of each individual particle with respect to lagtime.
Type: numpy.ndarray

ag
The AtomGroup resulting from your selection.
Type: AtomGroup
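The relationship between msd_type, the per-particle MSDs and the averaged timeseries can be illustrated with a small NumPy sketch. The lookup table, the helper `select_dims` and the array shapes are assumptions made for this example, not the class internals. For brevity a single time origin is used instead of the windowed average.

```python
import numpy as np

# hypothetical mapping from msd_type to coordinate columns (x=0, y=1, z=2)
MSD_TYPE_TO_DIMS = {
    'xyz': [0, 1, 2], 'xy': [0, 1], 'yz': [1, 2], 'xz': [0, 2],
    'x': [0], 'y': [1], 'z': [2],
}

def select_dims(positions, msd_type):
    """Keep only the requested coordinate columns; also return their count."""
    dims = MSD_TYPE_TO_DIMS[msd_type]
    return positions[:, :, dims], len(dims)

rng = np.random.default_rng(0)
positions = np.cumsum(rng.normal(size=(10, 4, 3)), axis=0)  # 10 frames, 4 particles

xz, dim_fac = select_dims(positions, 'xz')
# per-particle squared displacement from the first frame: (n_frames, n_particles)
msds_by_particle = np.sum((xz - xz[0])**2, axis=-1)
# averaging over the particle axis gives one value per lagtime
timeseries = msds_by_particle.mean(axis=1)
print(dim_fac, msds_by_particle.shape, timeseries.shape)
```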

run(start=None, stop=None, step=None, verbose=None)
Perform the calculation
Parameters:
 u (Universe or AtomGroup) – An MDAnalysis