4.3.8. Bond-Angle-Torsion coordinates analysis — MDAnalysis.analysis.bat
- Author:
Soohaeng Yoo Willow and David Minh
- Year:
2020
- Copyright:
GNU Public License, v2 or any higher version
Added in version 2.0.0.
This module contains classes for interconverting between Cartesian and an internal coordinate system, Bond-Angle-Torsion (BAT) coordinates [1], for a given set of atoms or residues. This coordinate system is designed to be complete, non-redundant, and minimize correlations between degrees of freedom. Complete and non-redundant means that for N atoms there will be 3N Cartesian coordinates and 3N BAT coordinates. Correlations are minimized by using improper torsions, as described in [2].
More specifically, bond refers to the bond length, or distance between a pair of bonded atoms. Angle refers to the bond angle, the angle between a pair of bonds to a central atom. Torsion refers to the torsion angle. For a set of four atoms a, b, c, and d, a torsion requires bonds between a and b, b and c, and c and d. The torsion is the angle between a plane containing atoms a, b, and c and another plane containing b, c, and d. For a set of torsions that share atoms b and c, one torsion is defined as the primary torsion. The others are defined as improper torsions, differences between the raw torsion angle and the primary torsion. This definition reduces the correlation between the torsion angles.
Each molecule also has six external coordinates that define its translation and rotation in space. The three Cartesian coordinates of the first atom are the molecule’s translational degrees of freedom. Rotational degrees of freedom are specified by the axis-angle convention. The rotation axis is a normalized vector pointing from the first to second atom. It is described by the polar angle, \(\phi\), and azimuthal angle, \(\theta\). \(\omega\) is a third angle that describes the rotation of the third atom about the axis.
This module was adapted from AlGDock [3].
See also
Dihedral
class to calculate dihedral angles for a given set of atoms or residues
MDAnalysis.lib.distances.calc_dihedrals()
function to calculate dihedral angles from atom positions
4.3.8.1. Example applications
The BAT
class defines bond-angle-torsion
coordinates based on the topology of an atom group and interconverts between
Cartesian and BAT coordinate systems.
For example, we can determine internal coordinates for residues 5-10 of adenylate kinase (AdK). The trajectory is included within the test data files:
import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD
import numpy as np
u = mda.Universe(PSF, DCD)
# selection of atomgroups
selected_residues = u.select_atoms("resid 5-10")
from MDAnalysis.analysis.bat import BAT
R = BAT(selected_residues)
# Calculate BAT coordinates for a trajectory
R.run()
After R.run()
, the coordinates can be accessed with
R.results.bat
. The following code snippets assume that the
previous snippet has been executed.
Reconstruct Cartesian coordinates for the first frame:
# Reconstruct Cartesian coordinates from BAT coordinates
# of the first frame
XYZ = R.Cartesian(R.results.bat[0,:])
# The original and reconstructed Cartesian coordinates should all be close
print(np.allclose(XYZ, selected_residues.positions, atol=1e-6))
Change a single torsion angle by \(\pi\):
bat = R.results.bat[0,:]
bat[bat.shape[0]-12] += np.pi
XYZ = R.Cartesian(bat)
# A good number of Cartesian coordinates should have been modified
np.sum((XYZ - selected_residues.positions)>1E-5)
Store data to the disk and load it again:
# BAT coordinates can be saved to disk in the numpy binary format
R.save('test.npy')
# The BAT coordinates in a new BAT instance can be loaded from disk
# instead of using the run() method.
Rnew = BAT(selected_residues, filename='test.npy')
# The BAT coordinates before and after disk I/O should be close
print(np.allclose(Rnew.results.bat, R.results.bat))
4.3.8.2. Analysis classes
- class MDAnalysis.analysis.bat.BAT(ag, initial_atom=None, filename=None, **kwargs)[source]
Calculate BAT coordinates for the specified AtomGroup.
Bond-Angle-Torsions (BAT) internal coordinates will be computed for the group of atoms and all frame in the trajectory belonging to ag.
Changed in version 2.8.0: Enabled parallel execution with the
multiprocessing
anddask
backends; use the new methodget_supported_backends()
to see all supported backends.- Parameters:
ag (AtomGroup or Universe) – Group of atoms for which the BAT coordinates are calculated. ag must have a bonds attribute. If unavailable, bonds may be guessed using
AtomGroup.guess_bonds
. ag must only include one molecule. If a trajectory is associated with the atoms, then the computation iterates over the trajectory.initial_atom (
Atom
) – The atom whose Cartesian coordinates define the translation of the molecule. If not specified, the heaviest terminal atom will be selected.filename (str) – Name of a numpy binary file containing a saved bat array. If filename is not
None
, the data will be loaded from this file instead of being recalculated using the run() method.
- Raises:
AttributeError – If ag does not contain a bonds attribute
ValueError – If ag contains more than one molecule
- results.bat
Contains the time series of the Bond-Angle-Torsion coordinates as a (nframes, 3N)
numpy.ndarray
array. Each row corresponds to a frame in the trajectory. In each column, the first six elements describe external degrees of freedom. The first three are the center of mass of the initial atom. The next three specify the external angles according to the axis-angle convention: \(\phi\), the polar angle, \(\theta\), the azimuthal angle, and \(\omega\), a third angle that describes the rotation of the third atom about the axis. The next three degrees of freedom are internal degrees of freedom for the root atoms: \(r_{01}\), the distance between atoms 0 and 1, \(r_{12}\), the distance between atoms 1 and 2, and \(a_{012}\), the angle between the three atoms. The rest of the array consists of all the other bond distances, all the other bond angles, and then all the other torsion angles.
- Cartesian(bat_frame)[source]
Conversion of a single frame from BAT to Cartesian coordinates
One application of this function is to determine the new Cartesian coordinates after modifying a specific torsion angle.
- Parameters:
bat_frame (numpy.ndarray) – an array with dimensions (3N,) with external then internal degrees of freedom based on the root atoms, followed by the bond, angle, and (proper and improper) torsion coordinates.
- Returns:
XYZ – an array with dimensions (N,3) with Cartesian coordinates. The first dimension has the same ordering as the AtomGroup used to initialize the class. The molecule will be whole opposed to wrapped around a periodic boundary.
- Return type:
- property atoms
The atomgroup for which BAT are computed (read-only property)
- classmethod get_supported_backends()[source]
Tuple with backends supported by the core library for a given class. User can pass either one of these values as
backend=...
torun()
method, or a custom object that hasapply
method (see documentation forrun()
):‘serial’: no parallelization
‘multiprocessing’: parallelization using multiprocessing.Pool
‘dask’: parallelization using dask.delayed.compute(). Requires installation of mdanalysis[dask]
If you want to add your own backend to an existing class, pass a
backends.BackendBase
subclass (see its documentation to learn how to implement it properly), and specifyunsupported_backend=True
.- Returns:
names of built-in backends that can be used in
run(backend=...)()
- Return type:
Added in version 2.8.0.
- load(filename, start=None, stop=None, step=None)[source]
Loads the bat trajectory from a file in numpy binary format
- Parameters:
See also
save
Saves the bat trajectory in a file in numpy binary format
- property parallelizable
Boolean mark showing that a given class can be parallelizable with split-apply-combine procedure. Namely, if we can safely distribute
_single_frame()
to multiple workers and then combine them with a proper_conclude()
call. If set toFalse
, no backends except forserial
are supported.Note
If you want to check parallelizability of the whole class, without explicitly creating an instance of the class, see
_analysis_algorithm_is_parallelizable
. Note that you setting it to other value will break things if the algorithm behind the analysis is not trivially parallelizable.- Returns:
if a given
AnalysisBase
subclass instance is parallelizable with split-apply-combine, or not- Return type:
Added in version 2.8.0.
- run(start: int | None = None, stop: int | None = None, step: int | None = None, frames: Iterable | None = None, verbose: bool | None = None, n_workers: int | None = None, n_parts: int | None = None, backend: str | BackendBase | None = None, *, unsupported_backend: bool = False, progressbar_kwargs=None)
Perform the calculation
- Parameters:
start (int, optional) – start frame of analysis
stop (int, optional) – stop frame of analysis
step (int, optional) – number of frames to skip between each analysed frame
frames (array_like, optional) –
array of integers or booleans to slice trajectory;
frames
can only be used instead ofstart
,stop
, andstep
. Setting bothframes
and at least one ofstart
,stop
,step
to a non-default value will raise aValueError
.Added in version 2.2.0.
verbose (bool, optional) – Turn on verbosity
progressbar_kwargs (dict, optional) – ProgressBar keywords with custom parameters regarding progress bar position, etc; see
MDAnalysis.lib.log.ProgressBar
for full list. Available only forbackend='serial'
backend (Union[str, BackendBase], optional) –
By default, performs calculations in a serial fashion. Otherwise, user can choose a backend:
str
is matched to a builtin backend (one ofserial
,multiprocessing
anddask
), or aMDAnalysis.analysis.results.BackendBase
subclass.Added in version 2.8.0.
n_workers (int) –
positive integer with number of workers (processes, in case of built-in backends) to split the work between
Added in version 2.8.0.
n_parts (int, optional) –
number of parts to split computations across. Can be more than number of workers.
Added in version 2.8.0.
unsupported_backend (bool, optional) –
if you want to run your custom backend on a parallelizable class that has not been tested by developers, by default False
Added in version 2.8.0.
Changed in version 2.2.0: Added ability to analyze arbitrary frames by passing a list of frame indices in the frames keyword argument.
Changed in version 2.5.0: Add progressbar_kwargs parameter, allowing to modify description, position etc of tqdm progressbars
Changed in version 2.8.0: Introduced
backend
,n_workers
,n_parts
andunsupported_backend
keywords, and refactored the method logic to support parallelizable execution.
References