4.3.8. Bond-Angle-Torsion coordinates analysis — MDAnalysis.analysis.bat

Author:

Soohaeng Yoo Willow and David Minh

Year:

2020

Copyright:

Lesser GNU Public License, v2.1 or any higher version

Added in version 2.0.0.

This module contains classes for interconverting between Cartesian and an internal coordinate system, Bond-Angle-Torsion (BAT) coordinates [1], for a given set of atoms or residues. This coordinate system is designed to be complete, non-redundant, and minimize correlations between degrees of freedom. Complete and non-redundant means that for N atoms there will be 3N Cartesian coordinates and 3N BAT coordinates. Correlations are minimized by using improper torsions, as described in [2].

More specifically, bond refers to the bond length, or distance between a pair of bonded atoms. Angle refers to the bond angle, the angle between a pair of bonds to a central atom. Torsion refers to the torsion angle. For a set of four atoms a, b, c, and d, a torsion requires bonds between a and b, b and c, and c and d. The torsion is the angle between a plane containing atoms a, b, and c and another plane containing b, c, and d. For a set of torsions that share atoms b and c, one torsion is defined as the primary torsion. The others are defined as improper torsions, differences between the raw torsion angle and the primary torsion. This definition reduces the correlation between the torsion angles.

Each molecule also has six external coordinates that define its translation and rotation in space. The three Cartesian coordinates of the first atom are the molecule’s translational degrees of freedom. Rotational degrees of freedom are specified by the axis-angle convention. The rotation axis is a normalized vector pointing from the first to second atom. It is described by the polar angle, \(\phi\), and azimuthal angle, \(\theta\). \(\omega\) is a third angle that describes the rotation of the third atom about the axis.

This module was adapted from AlGDock [3].

See also

Dihedral

class to calculate dihedral angles for a given set of atoms or residues

MDAnalysis.lib.distances.calc_dihedrals()

function to calculate dihedral angles from atom positions

4.3.8.1. Example applications

The BAT class defines bond-angle-torsion coordinates based on the topology of an atom group and interconverts between Cartesian and BAT coordinate systems.

For example, we can determine internal coordinates for residues 5-10 of adenylate kinase (AdK). The trajectory is included within the test data files:

import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD
import numpy as np

u = mda.Universe(PSF, DCD)

# selection of atomgroups
selected_residues = u.select_atoms("resid 5-10")

from MDAnalysis.analysis.bat import BAT
R = BAT(selected_residues)

# Calculate BAT coordinates for a trajectory
R.run()

After R.run(), the coordinates can be accessed with R.results.bat. The following code snippets assume that the previous snippet has been executed.

Reconstruct Cartesian coordinates for the first frame:

# Reconstruct Cartesian coordinates from BAT coordinates
# of the first frame
XYZ = R.Cartesian(R.results.bat[0,:])

# The original and reconstructed Cartesian coordinates should all be close
print(np.allclose(XYZ, selected_residues.positions, atol=1e-6))

Change a single torsion angle by \(\pi\):

bat = R.results.bat[0,:]
bat[bat.shape[0]-12] += np.pi
XYZ = R.Cartesian(bat)

# A good number of Cartesian coordinates should have been modified
np.sum((XYZ - selected_residues.positions)>1E-5)

Store data to the disk and load it again:

# BAT coordinates can be saved to disk in the numpy binary format
R.save('test.npy')

# The BAT coordinates in a new BAT instance can be loaded from disk
# instead of using the run() method.
Rnew = BAT(selected_residues, filename='test.npy')

# The BAT coordinates before and after disk I/O should be close
print(np.allclose(Rnew.results.bat, R.results.bat))

4.3.8.2. Analysis classes

class MDAnalysis.analysis.bat.BAT(ag, initial_atom=None, filename=None, **kwargs)[source]

Calculate BAT coordinates for the specified AtomGroup.

Bond-Angle-Torsions (BAT) internal coordinates will be computed for the group of atoms and all frame in the trajectory belonging to ag.

Changed in version 2.8.0: Enabled parallel execution with the multiprocessing and dask backends; use the new method get_supported_backends() to see all supported backends.

Parameters:
  • ag (AtomGroup or Universe) – Group of atoms for which the BAT coordinates are calculated. ag must have a bonds attribute. If unavailable, bonds may be guessed using AtomGroup.guess_bonds. ag must only include one molecule. If a trajectory is associated with the atoms, then the computation iterates over the trajectory.

  • initial_atom (Atom) – The atom whose Cartesian coordinates define the translation of the molecule. If not specified, the heaviest terminal atom will be selected.

  • filename (str) – Name of a numpy binary file containing a saved bat array. If filename is not None, the data will be loaded from this file instead of being recalculated using the run() method.

Raises:
results.bat

Contains the time series of the Bond-Angle-Torsion coordinates as a (nframes, 3N) numpy.ndarray array. Each row corresponds to a frame in the trajectory. In each column, the first six elements describe external degrees of freedom. The first three are the center of mass of the initial atom. The next three specify the external angles according to the axis-angle convention: \(\phi\), the polar angle, \(\theta\), the azimuthal angle, and \(\omega\), a third angle that describes the rotation of the third atom about the axis. The next three degrees of freedom are internal degrees of freedom for the root atoms: \(r_{01}\), the distance between atoms 0 and 1, \(r_{12}\), the distance between atoms 1 and 2, and \(a_{012}\), the angle between the three atoms. The rest of the array consists of all the other bond distances, all the other bond angles, and then all the other torsion angles.

Cartesian(bat_frame)[source]

Conversion of a single frame from BAT to Cartesian coordinates

One application of this function is to determine the new Cartesian coordinates after modifying a specific torsion angle.

Parameters:

bat_frame (numpy.ndarray) – an array with dimensions (3N,) with external then internal degrees of freedom based on the root atoms, followed by the bond, angle, and (proper and improper) torsion coordinates.

Returns:

XYZ – an array with dimensions (N,3) with Cartesian coordinates. The first dimension has the same ordering as the AtomGroup used to initialize the class. The molecule will be whole opposed to wrapped around a periodic boundary.

Return type:

numpy.ndarray

property atoms

The atomgroup for which BAT are computed (read-only property)

classmethod get_supported_backends()[source]

Tuple with backends supported by the core library for a given class. User can pass either one of these values as backend=... to run() method, or a custom object that has apply method (see documentation for run()):

  • ‘serial’: no parallelization

  • ‘multiprocessing’: parallelization using multiprocessing.Pool

  • ‘dask’: parallelization using dask.delayed.compute(). Requires installation of mdanalysis[dask]

If you want to add your own backend to an existing class, pass a backends.BackendBase subclass (see its documentation to learn how to implement it properly), and specify unsupported_backend=True.

Returns:

names of built-in backends that can be used in run(backend=...)()

Return type:

tuple

Added in version 2.8.0.

load(filename, start=None, stop=None, step=None)[source]

Loads the bat trajectory from a file in numpy binary format

Parameters:
  • filename (str) – name of numpy binary file

  • start (int, optional) – start frame of analysis

  • stop (int, optional) – stop frame of analysis

  • step (int, optional) – number of frames to skip between each analysed frame

See also

save

Saves the bat trajectory in a file in numpy binary format

property parallelizable

Boolean mark showing that a given class can be parallelizable with split-apply-combine procedure. Namely, if we can safely distribute _single_frame() to multiple workers and then combine them with a proper _conclude() call. If set to False, no backends except for serial are supported.

Note

If you want to check parallelizability of the whole class, without explicitly creating an instance of the class, see _analysis_algorithm_is_parallelizable. Note that you setting it to other value will break things if the algorithm behind the analysis is not trivially parallelizable.

Returns:

if a given AnalysisBase subclass instance is parallelizable with split-apply-combine, or not

Return type:

bool

Added in version 2.8.0.

run(start: int | None = None, stop: int | None = None, step: int | None = None, frames: Iterable | None = None, verbose: bool | None = None, n_workers: int | None = None, n_parts: int | None = None, backend: str | BackendBase | None = None, *, unsupported_backend: bool = False, progressbar_kwargs=None)

Perform the calculation

Parameters:
  • start (int, optional) – start frame of analysis

  • stop (int, optional) – stop frame of analysis

  • step (int, optional) – number of frames to skip between each analysed frame

  • frames (array_like, optional) –

    array of integers or booleans to slice trajectory; frames can only be used instead of start, stop, and step. Setting both frames and at least one of start, stop, step to a non-default value will raise a ValueError.

    Added in version 2.2.0.

  • verbose (bool, optional) – Turn on verbosity

  • progressbar_kwargs (dict, optional) – ProgressBar keywords with custom parameters regarding progress bar position, etc; see MDAnalysis.lib.log.ProgressBar for full list. Available only for backend='serial'

  • backend (Union[str, BackendBase], optional) –

    By default, performs calculations in a serial fashion. Otherwise, user can choose a backend: str is matched to a builtin backend (one of serial, multiprocessing and dask), or a MDAnalysis.analysis.results.BackendBase subclass.

    Added in version 2.8.0.

  • n_workers (int) –

    positive integer with number of workers (processes, in case of built-in backends) to split the work between

    Added in version 2.8.0.

  • n_parts (int, optional) –

    number of parts to split computations across. Can be more than number of workers.

    Added in version 2.8.0.

  • unsupported_backend (bool, optional) –

    if you want to run your custom backend on a parallelizable class that has not been tested by developers, by default False

    Added in version 2.8.0.

Changed in version 2.2.0: Added ability to analyze arbitrary frames by passing a list of frame indices in the frames keyword argument.

Changed in version 2.5.0: Add progressbar_kwargs parameter, allowing to modify description, position etc of tqdm progressbars

Changed in version 2.8.0: Introduced backend, n_workers, n_parts and unsupported_backend keywords, and refactored the method logic to support parallelizable execution.

save(filename)[source]

Saves the bat trajectory in a file in numpy binary format

See also

load

Loads the bat trajectory from a file in numpy binary format

References