4.2.2. Analysis backends — MDAnalysis.analysis.backends

Added in version 2.8.0.

The backends module provides the BackendBase base class for implementing custom execution backends for MDAnalysis.analysis.base.AnalysisBase.run() and its subclasses.

4.2.2.1. Backends

Three built-in backend classes are provided:

  • BackendSerial – performs all computations sequentially in the main process

  • BackendMultiprocessing – distributes the computations between processes using multiprocessing.Pool

  • BackendDask – distributes the computations between processes using dask
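For analysis classes that support parallel execution, a built-in backend is usually selected by passing its name to run(), together with n_workers. A minimal usage sketch (whether a particular analysis class accepts a non-serial backend depends on that class):

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD
from MDAnalysis.analysis.rms import RMSD

u = mda.Universe(PSF, DCD)
ref = mda.Universe(PSF, DCD)

# select the multiprocessing backend by name and split the work over 2 processes
R = RMSD(u, ref).run(backend="multiprocessing", n_workers=2)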

4.2.2.2. Classes

class MDAnalysis.analysis.backends.BackendBase(n_workers: int)[source]

Base class for backend implementations.

Initializing an instance performs validity checks, such as on n_workers and possibly other attributes.

Parameters:

n_workers (int) – number of workers (usually, processes) over which the work is split

Examples

from MDAnalysis.analysis.backends import BackendBase

class ThreadsBackend(BackendBase):
    def apply(self, func, computations):
        from multiprocessing.dummy import Pool

        with Pool(processes=self.n_workers) as pool:
            results = pool.map(func, computations)
        return results

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import PSF, DCD
from MDAnalysis.analysis.rms import RMSD

u = mda.Universe(PSF, DCD)
ref = mda.Universe(PSF, DCD)

R = RMSD(u, ref)

n_workers = 2
backend = ThreadsBackend(n_workers=n_workers)
R.run(backend=backend, unsupported_backend=True)

Warning

The ThreadsBackend above is an educational example only and will lead to erroneous results; do not use it for real analysis.

Added in version 2.8.0.

_get_checks()[source]

Get a dictionary of condition: error_message pairs that ensure the validity of the backend instance.

Returns:

a dictionary of condition: error_message pairs that will be checked during the _validate() run

Return type:

dict

_get_warnings()[source]

Get a dictionary of condition: warning_message pairs that ensure good usage of the backend instance.

Returns:

a dictionary of condition: warning_message pairs that will be checked during the _validate() run

Return type:

dict
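As a hedged illustration of these two hooks (the class and messages below are hypothetical, not part of MDAnalysis), a custom backend can extend the base-class dictionaries; during _validate(), every entry whose condition evaluates to False raises its error message or triggers its warning message:

from MDAnalysis.analysis.backends import BackendBase

class CappedSerialBackend(BackendBase):
    """Hypothetical example backend: serial execution with a cap on n_workers."""

    def apply(self, func, computations):
        # serial execution; the backend contract is only "return the list of results"
        return [func(task) for task in computations]

    def _get_checks(self):
        # keep the base-class checks and add one: fail for more than 64 workers
        return {
            **super()._get_checks(),
            self.n_workers <= 64: "more than 64 workers is not supported here",
        }

    def _get_warnings(self):
        # warn (rather than fail) when n_workers is not 1, since execution is serial
        return {
            **super()._get_warnings(),
            self.n_workers == 1: "n_workers is ignored by this serial example backend",
        }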

_validate()[source]

Check correctness (e.g. dask is installed if using backend='dask') and good usage (e.g. n_workers=1 if backend is serial) of the backend

Raises:

ValueError – if one of the conditions in _get_checks() is not met (i.e. evaluates to False)

apply(func: Callable, computations: list) list[source]

Map function func over all tasks in the computations list.

Main method that will get called when using an instance of BackendBase. It is equivalent to running [func(item) for item in computations] while making use of the parallel capabilities of the backend.

Parameters:
  • func (Callable) – function to be called on each of the tasks in computations list

  • computations (list) – computation tasks to apply function to

Returns:

list of results of the function

Return type:

list
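For instance, with the built-in BackendSerial this equivalence is direct (a small usage sketch):

from MDAnalysis.analysis.backends import BackendSerial

backend = BackendSerial(n_workers=1)
# apply() returns the same results as the plain list comprehension;
# only the execution strategy differs between backends
results = backend.apply(lambda x: x * x, [1, 2, 3, 4])
assert results == [x * x for x in [1, 2, 3, 4]]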

class MDAnalysis.analysis.backends.BackendDask(n_workers: int)[source]

A built-in backend that executes a given function with dask.

Execution is performed with the dask.compute() function on dask.delayed.Delayed objects (created with dask.delayed.delayed()), using scheduler='processes' and chunksize=1 (this ensures a uniform distribution of tasks among processes). Requires the dask package to be installed.

Parameters:

n_workers (int) – number of processes to distribute the workload between. Must be a positive integer. Workers are actually multiprocessing.pool.Pool processes, but they use a different and more flexible serialization protocol.

Examples

from MDAnalysis.analysis.backends import BackendDask
import multiprocessing as mp

backend_obj = BackendDask(n_workers=mp.cpu_count())
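A rough sketch of the mechanism described above, assuming dask is installed (an illustration of dask.delayed() plus dask.compute() with the processes scheduler, not the actual implementation):

import dask

def apply_with_dask(func, computations, n_workers):
    # wrap each task in a Delayed object and compute them all at once,
    # handing tasks to worker processes one at a time (chunksize=1)
    tasks = [dask.delayed(func)(task) for task in computations]
    return list(
        dask.compute(*tasks, scheduler="processes",
                     chunksize=1, num_workers=n_workers)
    )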

Added in version 2.8.0.

_get_checks()[source]

Get a dictionary of condition: error_message pairs that ensure the validity of the backend instance. Here, it additionally checks that the dask module is installed in the environment.

Returns:

a dictionary of condition: error_message pairs that will be checked during the _validate() run

Return type:

dict

apply(func: Callable, computations: list) list[source]

Applies func to each object in computations.

Parameters:
  • func (Callable) – function to be called on each of the tasks in computations list

  • computations (list) – computation tasks to apply function to

Returns:

list of results of the function

Return type:

list

class MDAnalysis.analysis.backends.BackendMultiprocessing(n_workers: int)[source]

A built-in backend that executes a given function using the multiprocessing.Pool.map method.

Parameters:

n_workers (int) – number of processes in multiprocessing.Pool to distribute the workload between. Must be a positive integer.

Examples

from MDAnalysis.analysis.backends import BackendMultiprocessing
import multiprocessing as mp

backend_obj = BackendMultiprocessing(n_workers=mp.cpu_count())
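A small usage sketch (the double() function below is hypothetical; with multiprocessing the function must be picklable, i.e. defined at module level, and on spawn-based platforms the call should sit behind an if __name__ == "__main__": guard):

from MDAnalysis.analysis.backends import BackendMultiprocessing

def double(x):
    # module-level function so that worker processes can pickle it
    return 2 * x

if __name__ == "__main__":
    backend_obj = BackendMultiprocessing(n_workers=2)
    results = backend_obj.apply(double, [1, 2, 3, 4])  # [2, 4, 6, 8]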

Added in version 2.8.0.

apply(func: Callable, computations: list) list[source]

Applies func to each object in computations using multiprocessing’s Pool.map.

Parameters:
  • func (Callable) – function to be called on each of the tasks in computations list

  • computations (list) – computation tasks to apply function to

Returns:

list of results of the function

Return type:

list

class MDAnalysis.analysis.backends.BackendSerial(n_workers: int)[source]

A built-in backend that does serial execution of the function, without any parallelization.

Parameters:

n_workers (int) – ignored in this class; if n_workers > 1, a warning is issued.

Added in version 2.8.0.

_get_warnings()[source]

Get a dictionary of condition: warning_message pairs that ensure good usage of the backend instance. Here, it checks whether the number of workers is 1 and issues a warning otherwise.

Returns:

a dictionary of condition: warning_message pairs that will be checked during the _validate() run

Return type:

dict

apply(func: Callable, computations: list) list[source]

Serially applies func to each task object in computations.

Parameters:
  • func (Callable) – function to be called on each of the tasks in computations list

  • computations (list) – computation tasks to apply function to

Returns:

list of results of the function

Return type:

list