13.2.1. Fast distance array computation — MDAnalysis.lib.distances
¶
Fast C-routines to calculate arrays of distances or angles from coordinate arrays. Many of the functions also exist in parallel versions, which typically provide higher performance than the serial code. The boolean attribute MDAnalysis.lib.distances.USED_OPENMP can be checked to see if OpenMP was used in the compilation of MDAnalysis.
13.2.1.1. Selection of acceleration (“backend”)¶
All functions take the optional keyword backend, which determines the type of acceleration. Currently, the following choices are implemented (backend is case-insensitive):
backend | module | description |
---|---|---|
“serial” | c_distances |
serial implementation in C/Cython |
“OpenMP” | c_distances_openmp |
parallel implementation in C/Cython with OpenMP |
New in version 0.13.0.
13.2.1.2. Functions¶
-
MDAnalysis.lib.distances.
distance_array
(reference, configuration, box=None, result=None, backend='serial')[source]¶ Calculate all possible distances between a reference set and another configuration.
If there are
n
positions in reference andm
positions in configuration, a distance array of shape(n, m)
will be computed.If the optional argument box is supplied, the minimum image convention is applied when calculating distances. Either orthogonal or triclinic boxes are supported.
If a 2D numpy array of dtype
numpy.float64
with the shape(n, m)
is provided in result, then this preallocated array is filled. This can speed up calculations.Parameters: - reference (numpy.ndarray) – Reference coordinate array of shape
(3,)
or(n, 3)
(dtype is arbitrary, will be converted tonumpy.float32
internally). - configuration (numpy.ndarray) – Configuration coordinate array of shape
(3,)
or(m, 3)
(dtype is arbitrary, will be converted tonumpy.float32
internally). - box (array_like, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - result (numpy.ndarray, optional) – Preallocated result array which must have the shape
(n, m)
and dtypenumpy.float64
. Avoids creating the array which saves time when the function is called repeatedly. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: d – Array containing the distances
d[i,j]
between reference coordinatesi
and configuration coordinatesj
.Return type: numpy.ndarray (
dtype=numpy.float64
,shape=(n, m)
)Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts single coordinates as input.- reference (numpy.ndarray) – Reference coordinate array of shape
-
MDAnalysis.lib.distances.
self_distance_array
(reference, box=None, result=None, backend='serial')[source]¶ Calculate all possible distances within a configuration reference.
If the optional argument box is supplied, the minimum image convention is applied when calculating distances. Either orthogonal or triclinic boxes are supported.
If a 1D numpy array of dtype
numpy.float64
with the shape(n*(n-1)/2,)
is provided in result, then this preallocated array is filled. This can speed up calculations.Parameters: - reference (numpy.ndarray) – Reference coordinate array of shape
(3,)
or(n, 3)
(dtype is arbitrary, will be converted tonumpy.float32
internally). - box (array_like, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - result (numpy.ndarray, optional) – Preallocated result array which must have the shape
(n*(n-1)/2,)
and dtypenumpy.float64
. Avoids creating the array which saves time when the function is called repeatedly. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: d – Array containing the distances
dist[i,j]
between reference coordinatesi
andj
at positiond[k]
. Loop throughd
:for i in range(n): for j in range(i + 1, n): k += 1 dist[i, j] = d[k]
Return type: numpy.ndarray (
dtype=numpy.float64
,shape=(n*(n-1)/2,)
)Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
.- reference (numpy.ndarray) – Reference coordinate array of shape
-
MDAnalysis.lib.distances.
capped_distance
(reference, configuration, max_cutoff, min_cutoff=None, box=None, method=None, return_distances=True)[source]¶ Calculates pairs of indices corresponding to entries in the reference and configuration arrays which are separated by a distance lying within the specified cutoff(s). Optionally, these distances can be returned as well.
If the optional argument box is supplied, the minimum image convention is applied when calculating distances. Either orthogonal or triclinic boxes are supported.
An automatic guessing of the optimal method to calculate the distances is included in the function. An optional keyword for the method is also provided. Users can enforce a particular method with this functionality. Currently brute force and periodic KDtree methods are implemented.
Parameters: - reference (numpy.ndarray) – Reference coordinate array with shape
(3,)
or(n, 3)
. - configuration (numpy.ndarray) – Configuration coordinate array with shape
(3,)
or(m, 3)
. - max_cutoff (float) – Maximum cutoff distance between the reference and configuration.
- min_cutoff (float, optional) – Minimum cutoff distance between reference and configuration.
- box (array_like, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - method ({'bruteforce', 'pkdtree'}, optional) – Keyword to override the automatic guessing of the employed search method.
- return_distances (bool, optional) – If set to
True
, distances will also be returned.
Returns: pairs (numpy.ndarray (
dtype=numpy.int64
,shape=(n_pairs, 2)
)) – Pairs of indices, corresponding to coordinates in the reference and configuration arrays such that the distance between them lies within the interval (min_cutoff, max_cutoff]. Each row in pairs is an index pair[i, j]
corresponding to thei
-th coordinate in reference and thej
-th coordinate in configuration.distances (numpy.ndarray (
dtype=numpy.float64
,shape=(n_pairs,)
), optional) – Distances corresponding to each pair of indices. Only returned if return_distances isTrue
.distances[k]
corresponds to thek
-th pair returned in pairs and gives the distance between the coordinatesreference[pairs[k, 0]]
andconfiguration[pairs[k, 1]]
.pairs, distances = capped_distances(reference, configuration, max_cutoff, return_distances=True) for k, [i, j] in enumerate(pairs): coord1 = reference[i] coord2 = configuration[j] distance = distances[k]
Changed in version 1.0.1: nsgrid was temporarily removed and replaced with pkdtree due to issues relating to its reliability and accuracy (Issues #2919, #2229, #2345, #2670, #2930)
Changed in version 1.0.2: nsgrid enabled again
- reference (numpy.ndarray) – Reference coordinate array with shape
-
MDAnalysis.lib.distances.
self_capped_distance
(reference, max_cutoff, min_cutoff=None, box=None, method=None, return_distances=True)[source]¶ Calculates pairs of indices corresponding to entries in the reference array which are separated by a distance lying within the specified cutoff(s). Optionally, these distances can be returned as well.
If the optional argument box is supplied, the minimum image convention is applied when calculating distances. Either orthogonal or triclinic boxes are supported.
An automatic guessing of the optimal method to calculate the distances is included in the function. An optional keyword for the method is also provided. Users can enforce a particular method with this functionality. Currently brute force, and periodic KDtree methods are implemented.
Parameters: - reference (numpy.ndarray) – Reference coordinate array with shape
(3,)
or(n, 3)
. - max_cutoff (float) – Maximum cutoff distance between reference coordinates.
- min_cutoff (float, optional) – Minimum cutoff distance between reference coordinates.
- box (array_like, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - method ({'bruteforce', 'pkdtree'}, optional) – Keyword to override the automatic guessing of the employed search method.
- return_distances (bool, optional) – If set to
True
, distances will also be returned.
Returns: pairs (numpy.ndarray (
dtype=numpy.int64
,shape=(n_pairs, 2)
)) – Pairs of indices, corresponding to coordinates in the reference array such that the distance between them lies within the interval (min_cutoff, max_cutoff]. Each row in pairs is an index pair[i, j]
corresponding to thei
-th and thej
-th coordinate in reference.distances (numpy.ndarray (
dtype=numpy.float64
,shape=(n_pairs,)
)) – Distances corresponding to each pair of indices. Only returned if return_distances isTrue
.distances[k]
corresponds to thek
-th pair returned in pairs and gives the distance between the coordinatesreference[pairs[k, 0]]
andreference[pairs[k, 1]]
.pairs, distances = self_capped_distances(reference, max_cutoff, return_distances=True) for k, [i, j] in enumerate(pairs): coord1 = reference[i] coord2 = reference[j] distance = distances[k]
Note
Currently supports brute force, and periodic KDtree search methods.
Changed in version 0.20.0: Added return_distances keyword.
Changed in version 1.0.1: nsgrid was temporarily removed and replaced with pkdtree due to issues relating to its reliability and accuracy (Issues #2919, #2229, #2345, #2670, #2930)
Changed in version 1.0.2: enabled nsgrid again
- reference (numpy.ndarray) – Reference coordinate array with shape
-
MDAnalysis.lib.distances.
calc_bonds
(coords1, coords2, box=None, result=None, backend='serial')[source]¶ Calculates the bond lengths between pairs of atom positions from the two coordinate arrays coords1 and coords2, which must contain the same number of coordinates.
coords1[i]
andcoords2[i]
represent the positions of atoms connected by thei
-th bond. If single coordinates are supplied, a single distance will be returned.In comparison to
distance_array()
andself_distance_array()
, which calculate distances between all possible combinations of coordinates,calc_bonds()
only calculates distances between pairs of coordinates, similar to:numpy.linalg.norm(a - b) for a, b in zip(coords1, coords2)
If the optional argument box is supplied, the minimum image convention is applied when calculating distances. Either orthogonal or triclinic boxes are supported.
If a numpy array of dtype
numpy.float64
with shape(n,)
(forn
coordinate pairs) is provided in result, then this preallocated array is filled. This can speed up calculations.Parameters: - coords1 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
for one half of a single orn
bonds, respectively (dtype is arbitrary, will be converted tonumpy.float32
internally). - coords2 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
for the other half of a single orn
bonds, respectively (dtype is arbitrary, will be converted tonumpy.float32
internally). - box (numpy.ndarray, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - result (numpy.ndarray, optional) – Preallocated result array of dtype
numpy.float64
and shape(n,)
(forn
coordinate pairs). Avoids recreating the array in repeated function calls. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: bondlengths – Array containing the bond lengths between each pair of coordinates. If two single coordinates were supplied, their distance is returned as a single number instead of an array.
Return type: numpy.ndarray (
dtype=numpy.float64
,shape=(n,)
) or numpy.float64New in version 0.8.
Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts single coordinates as input.- coords1 (numpy.ndarray) – Coordinate array of shape
-
MDAnalysis.lib.distances.
calc_angles
(coords1, coords2, coords3, box=None, result=None, backend='serial')[source]¶ Calculates the angles formed between triplets of atom positions from the three coordinate arrays coords1, coords2, and coords3. All coordinate arrays must contain the same number of coordinates.
The coordinates in coords2 represent the apices of the angles:
2---3 / 1
Configurations where the angle is undefined (e.g., when coordinates 1 or 3 of a triplet coincide with coordinate 2) result in a value of zero for that angle.
If the optional argument box is supplied, periodic boundaries are taken into account when constructing the connecting vectors between coordinates, i.e., the minimum image convention is applied for the vectors forming the angles. Either orthogonal or triclinic boxes are supported.
If a numpy array of dtype
numpy.float64
with shape(n,)
(forn
coordinate triplets) is provided in result, then this preallocated array is filled. This can speed up calculations.Parameters: - coords1 (numpy.ndarray) – Array of shape
(3,)
or(n, 3)
containing the coordinates of one side of a single orn
angles, respectively (dtype is arbitrary, will be converted tonumpy.float32
internally) - coords2 (numpy.ndarray) – Array of shape
(3,)
or(n, 3)
containing the coordinates of the apices of a single orn
angles, respectively (dtype is arbitrary, will be converted tonumpy.float32
internally) - coords3 (numpy.ndarray) – Array of shape
(3,)
or(n, 3)
containing the coordinates of the other side of a single orn
angles, respectively (dtype is arbitrary, will be converted tonumpy.float32
internally) - box (numpy.ndarray, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - result (numpy.ndarray, optional) – Preallocated result array of dtype
numpy.float64
and shape(n,)
(forn
coordinate triplets). Avoids recreating the array in repeated function calls. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: angles – Array containing the angles between each triplet of coordinates. Values are returned in radians (rad). If three single coordinates were supplied, the angle is returned as a single number instead of an array.
Return type: numpy.ndarray (
dtype=numpy.float64
,shape=(n,)
) or numpy.float64New in version 0.8.
Changed in version 0.9.0: Added optional box argument to account for periodic boundaries in calculation
Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts single coordinates as input.- coords1 (numpy.ndarray) – Array of shape
-
MDAnalysis.lib.distances.
calc_dihedrals
(coords1, coords2, coords3, coords4, box=None, result=None, backend='serial')[source]¶ Calculates the dihedral angles formed between quadruplets of positions from the four coordinate arrays coords1, coords2, coords3, and coords4, which must contain the same number of coordinates.
The dihedral angle formed by a quadruplet of positions (1,2,3,4) is calculated around the axis connecting positions 2 and 3 (i.e., the angle between the planes spanned by positions (1,2,3) and (2,3,4)):
4 | 2-----3 / 1
If all coordinates lie in the same plane, the cis configuration corresponds to a dihedral angle of zero, and the trans configuration to \(\pi\) radians (180 degrees). Configurations where the dihedral angle is undefined (e.g., when all coordinates lie on the same straight line) result in a value of
nan
(not a number) for that dihedral.If the optional argument box is supplied, periodic boundaries are taken into account when constructing the connecting vectors between coordinates, i.e., the minimum image convention is applied for the vectors forming the dihedral angles. Either orthogonal or triclinic boxes are supported.
If a numpy array of dtype
numpy.float64
with shape(n,)
(forn
coordinate quadruplets) is provided in result then this preallocated array is filled. This can speed up calculations.Parameters: - coords1 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
containing the 1st positions in dihedrals (dtype is arbitrary, will be converted tonumpy.float32
internally) - coords2 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
containing the 2nd positions in dihedrals (dtype is arbitrary, will be converted tonumpy.float32
internally) - coords3 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
containing the 3rd positions in dihedrals (dtype is arbitrary, will be converted tonumpy.float32
internally) - coords4 (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
containing the 4th positions in dihedrals (dtype is arbitrary, will be converted tonumpy.float32
internally) - box (numpy.ndarray, optional) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - result (numpy.ndarray, optional) – Preallocated result array of dtype
numpy.float64
and shape(n,)
(forn
coordinate quadruplets). Avoids recreating the array in repeated function calls. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: dihedrals – Array containing the dihedral angles formed by each quadruplet of coordinates. Values are returned in radians (rad). If four single coordinates were supplied, the dihedral angle is returned as a single number instead of an array.
Return type: numpy.ndarray (
dtype=numpy.float64
,shape=(n,)
) or numpy.float64New in version 0.8.
Changed in version 0.9.0: Added optional box argument to account for periodic boundaries in calculation
Changed in version 0.11.0: Renamed from calc_torsions to calc_dihedrals
Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts single coordinates as input.- coords1 (numpy.ndarray) – Coordinate array of shape
-
MDAnalysis.lib.distances.
apply_PBC
(coords, box, backend='serial')[source]¶ Moves coordinates into the primary unit cell.
Parameters: - coords (numpy.ndarray) – Coordinate array of shape
(3,)
or(n, 3)
(dtype is arbitrary, will be converted tonumpy.float32
internally). - box (numpy.ndarray) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: newcoords – Array containing coordinates that all lie within the primary unit cell as defined by box.
Return type: numpy.ndarray (
dtype=numpy.float32
,shape=coords.shape
)New in version 0.8.
Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts (and, likewise, returns) single coordinates.- coords (numpy.ndarray) – Coordinate array of shape
-
MDAnalysis.lib.distances.
transform_RtoS
(coords, box, backend='serial')[source]¶ Transform an array of coordinates from real space to S space (a.k.a. lambda space)
S space represents fractional space within the unit cell for this system.
Reciprocal operation to
transform_StoR()
.Parameters: - coords (numpy.ndarray) – A
(3,)
or(n, 3)
array of coordinates (dtype is arbitrary, will be converted tonumpy.float32
internally). - box (numpy.ndarray) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: newcoords – An array containing fractional coordiantes.
Return type: numpy.ndarray (
dtype=numpy.float32
,shape=coords.shape
)Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts (and, likewise, returns) a single coordinate.- coords (numpy.ndarray) – A
-
MDAnalysis.lib.distances.
transform_StoR
(coords, box, backend='serial')[source]¶ Transform an array of coordinates from S space into real space.
S space represents fractional space within the unit cell for this system.
Reciprocal operation to
transform_RtoS()
Parameters: - coords (numpy.ndarray) – A
(3,)
or(n, 3)
array of coordinates (dtype is arbitrary, will be converted tonumpy.float32
internally). - box (numpy.ndarray) –
The unitcell dimensions of the system, which can be orthogonal or triclinic and must be provided in the same format as returned by
MDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
. - backend ({'serial', 'OpenMP'}, optional) – Keyword selecting the type of acceleration.
Returns: newcoords – An array containing real space coordiantes.
Return type: numpy.ndarray (
dtype=numpy.float32
,shape=coords.shape
)Changed in version 0.13.0: Added backend keyword.
Changed in version 0.19.0: Internal dtype conversion of input coordinates to
numpy.float32
. Now also accepts (and, likewise, returns) a single coordinate.- coords (numpy.ndarray) – A
-
MDAnalysis.lib.distances.
augment_coordinates
(coordinates, box, r)¶ Calculates the periodic images of particles which are within a distance r from the box walls.
The algorithm works by generating explicit periodic images of atoms residing close to any of the six box walls. The steps involved in generating images involves the evaluation of reciprocal box vectors followed by the calculation of distances of atoms from the walls by means of projection onto the reciprocal vectors. If the distance is less than a specified cutoff distance, relevant periodic images are generated using box translation vectors \(\vec{t}\) with
\[\vec{t}=l\cdot\vec{a}+m\cdot\vec{b}+n\cdot \vec{c}\,,\]where \(l,\,m,\,n \in \{-1,\,0,\,1\}\) are the neighboring cell indices in \(x\)-, \(y\)-, and \(z\)-direction relative to the central cell with box vectors \(\vec{a},\,\vec{b},\,\vec{c}\).
For instance, an atom close to the \(xy\)-plane containing the origin will generate a periodic image outside the central cell and close to the opposite \(xy\)-plane of the box, i.e., shifted by \(\vec{t} = 0\cdot\vec{a}+0\cdot\vec{b}+1\cdot\vec{c}=\vec{c}\).
Likewise, if the particle is close to more than one box walls, images along the diagonals are also generated:
x x +------------+ +------------+ | | augment | | | | -------> | | | o | x | o | +------------+ +------------+
Parameters: - coordinates (numpy.ndarray) – Input coordinate array of shape
(n, 3)
and dtypenumpy.float32
used to generate duplicate images in the vicinity of the central cell. All coordinates must be within the primary unit cell. - box (numpy.ndarray) – Box dimensions of shape
(6,)
and dtypenumpy.float32
. The dimensions must be provided in the same format as returned byMDAnalysis.coordinates.base.Timestep.dimensions
:[lx, ly, lz, alpha, beta, gamma]
- r (float) – Thickness of cutoff region for duplicate image generation.
Returns: - output (numpy.ndarray) – Coordinates of duplicate (augmented) particles (dtype
numpy.float32
). - indices (numpy.ndarray) – Original indices of the augmented coordinates (dtype
numpy.int64
). Maps the indices of augmented particles to their original particle index such thatindices[augmented_index] = original_index
.
Note
Output does not return coordinates from the initial array. To merge the particles with their respective images, the following operation is necessary when generating the images:
images, mapping = augment_coordinates(coordinates, box, max_cutoff) all_coords = numpy.concatenate([coordinates, images])
See also
New in version 0.19.0.
- coordinates (numpy.ndarray) – Input coordinate array of shape
-
MDAnalysis.lib.distances.
undo_augment
(results, translation, nreal)¶ Translate augmented indices back to original indices.
Parameters: - results (numpy.ndarray) – Array of dtype
numpy.int64
containing coordinate indices, including “augmented” indices. - translation (numpy.ndarray) – Index map of dtype
numpy.int64
linking the augmented indices to the original particle indices such thattranslation[augmented_index] = original_index
. - nreal (int) – Number of real coordinates, i.e., indices in results equal or larger than this need to be mapped to their real counterpart.
Returns: results – Modified input results with all the augmented indices translated to their corresponding initial original indices.
Return type: Note
Modifies the results array in place.
See also
New in version 0.19.0.
- results (numpy.ndarray) – Array of dtype
13.2.2. Low-level modules for MDAnalysis.lib.distances
¶
MDAnalysis.lib._distances
contains low level access to the
serial MDAnalysis Cython functions in distances. These have
little to no error checking done on inputs so should be used with
caution. Similarly, MDAnalysis.lib._distances_openmp
contains
the OpenMP-enable functions.