4. Analysis modules

The MDAnalysis.analysis module provides a wide collection of analysis tools for molecular dynamics trajectories. These modules build upon MDAnalysis core functionality (trajectory I/O, selections, etc.) and are designed both for reuse in research workflows and as examples of using the MDAnalysis API. Each module typically defines an analysis class that follows a standard interface.

See the User Guide Analysis section for interactive examples and additional context.

4.1. Getting started with analysis

Most analysis tools are implemented as single classes and follow this usage pattern:

  1. Import the module (e.g., MDAnalysis.analysis.rms).

  2. Initialize the analysis class with the required arguments.

  3. Run the analysis with run().

  4. Access results via the results attribute.

from MDAnalysis.analysis import ExampleAnalysisModule  # placeholder name (e.g. rms)
analysis_obj = ExampleAnalysisModule.AnalysisClass(universe, ...)
analysis_obj.run(start=start_frame, stop=stop_frame, step=step)
print(analysis_obj.results)

Please see the individual module documentation for any specific caveats and also read and cite the reference papers associated with these algorithms.

4.1.1. Using parallelization for built-in analysis runs

Added in version 2.8.0.

AnalysisBase subclasses can run on a backend that supports parallelization (see MDAnalysis.analysis.backends). All analysis runs use backend='serial' by default, i.e., they do not use parallelization unless you request it. This matches the behavior before release 2.8.0 of MDAnalysis.

Without additional dependencies, only one parallel backend is available: the built-in multiprocessing backend, which splits a trajectory into parts and processes them in separate processes, so that the analysis can make use of multi-core processors.

Note

For now, parallelization has only been added to MDAnalysis.analysis.rms.RMSD, but by release 3.0 it will be introduced to all subclasses that can support it.

To use this feature, add backend='multiprocessing' to your run() call and supply an appropriate n_workers (use multiprocessing.cpu_count() for the maximum available on your machine):

import multiprocessing
import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD
from MDAnalysis.analysis.rms import RMSD
from MDAnalysis.analysis.align import AverageStructure

# initialize the universe
u = mda.Universe(PSF, DCD)

# calculate average structure for reference
avg = AverageStructure(mobile=u).run()
ref = avg.results.universe

# initialize RMSD run
rmsd = RMSD(u, ref, select='backbone')
rmsd.run(backend='multiprocessing', n_workers=multiprocessing.cpu_count())

Be explicit and specify both backend and n_workers. Choosing too many workers or using large trajectory frames may lead to an out-of-memory error.
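
One way to avoid requesting too many workers is to cap the number before calling run(). The sketch below uses a hypothetical helper, choose_n_workers, which is not part of MDAnalysis; the capping rule (no more workers than CPU cores, frames, or a user-supplied limit) is an assumption and does not account for memory use per frame.

```python
import multiprocessing


def choose_n_workers(n_frames, max_workers=None):
    """Hypothetical helper (not an MDAnalysis function): pick a conservative worker count."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    available = multiprocessing.cpu_count()
    # respect an optional user-supplied upper limit
    limit = available if max_workers is None else min(available, max_workers)
    # never use more workers than frames, and always use at least one
    return max(1, min(limit, n_frames))


if __name__ == "__main__":
    # e.g. a 98-frame trajectory on a shared machine, limited to 4 workers
    print(choose_n_workers(n_frames=98, max_workers=4))
```

The returned value could then be passed as n_workers together with backend='multiprocessing'. The main guard is included because Python's multiprocessing generally requires it in scripts on platforms that start worker processes with the spawn method.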

You can also implement your own backends – see MDAnalysis.analysis.backends.

4.1.2. Additional dependencies

Some of the modules in MDAnalysis.analysis require additional Python packages to enable full functionality. For example, MDAnalysis.analysis.encore provides more options if scikit-learn is installed. If you installed MDAnalysis with pip (see Installing and using MDAnalysis), these packages are not installed automatically, although you can add the [analysis] tag to the pip command to install them. If you installed MDAnalysis with conda, a full set of dependencies is installed automatically.

Other modules require external programs. For instance, the MDAnalysis.analysis.hole2 module requires an installation of the HOLE suite of programs. You will need to install these external dependencies by following their installation instructions before you can use the corresponding MDAnalysis module.
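
Before relying on an optional feature, it can help to check whether its dependency is present. The following sketch uses only the Python standard library; the helper names are hypothetical, and the executable name hole is an assumption about how the HOLE program is installed on your system.

```python
import importlib.util
import shutil


def has_module(name):
    """Return True if the top-level Python package `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def has_executable(name):
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


# scikit-learn is imported under the name "sklearn"
print("scikit-learn available:", has_module("sklearn"))
# assumed executable name for the HOLE suite
print("hole executable found:", has_executable("hole"))
```

A False result only means that the package or program was not found in the current environment; the installation instructions of the respective project remain the authoritative guide.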

4.2. Building blocks for Analysis

The building block for the analysis modules is MDAnalysis.analysis.base.AnalysisBase. To build your own analysis class, start by reading its documentation.

4.3. Distances and contacts

4.4. Hydrogen bonding

Deprecated modules:

4.5. Membranes and membrane proteins

4.6. Nucleic acids

4.7. Polymers

4.8. Structure

4.8.1. Macromolecules

4.8.2. Liquids

4.9. Volumetric analysis

4.10. Dimensionality Reduction

4.11. Legacy analysis modules

The MDAnalysis.analysis.legacy module contains code that, for a range of reasons, is not as well maintained and tested as the other analysis modules. Use it with care.

4.12. Data