4.2.2. Analysis backends — MDAnalysis.analysis.backends
Added in version 2.8.0.
The backends
module provides BackendBase
base class to
implement custom execution backends for
MDAnalysis.analysis.base.AnalysisBase.run()
and its
subclasses.
See also
4.2.2.1. Backends
Three built-in backend classes are provided:
serial:
BackendSerial
, that is equivalent to using no parallelization and is the defaultmultiprocessing:
BackendMultiprocessing
that supports parallelization via standard Pythonmultiprocessing
module and uses defaultpickle
serializationdask:
BackendDask
, that uses the same process-based parallelization asBackendMultiprocessing
, but different serialization algorithm via dask (see dask serialization algorithms for details)
4.2.2.2. Classes
- class MDAnalysis.analysis.backends.BackendBase(n_workers: int)[source]
Base class for backend implementation.
Initializes an instance and performs checks for its validity, such as
n_workers
and possibly other ones.- Parameters:
n_workers (int) – number of workers (usually, processes) over which the work is split
Examples
from MDAnalysis.analysis.backends import BackendBase class ThreadsBackend(BackendBase): def apply(self, func, computations): from multiprocessing.dummy import Pool with Pool(processes=self.n_workers) as pool: results = pool.map(func, computations) return results import MDAnalysis as mda from MDAnalysis.tests.datafiles import PSF, DCD from MDAnalysis.analysis.rms import RMSD u = mda.Universe(PSF, DCD) ref = mda.Universe(PSF, DCD) R = RMSD(u, ref) n_workers = 2 backend = ThreadsBackend(n_workers=n_workers) R.run(backend=backend, unsupported_backend=True)
Warning
Using ThreadsBackend above will lead to erroneous results, since it is an educational example. Do not use it for real analysis.
Added in version 2.8.0.
- _get_checks()[source]
Get dictionary with
condition: error_message
pairs that ensure the validity of the backend instance- Returns:
dictionary with
condition: error_message
pairs that will get checked during_validate()
run- Return type:
- _get_warnings()[source]
Get dictionary with
condition: warning_message
pairs that ensure the good usage of the backend instance- Returns:
dictionary with
condition: warning_message
pairs that will get checked during_validate()
run- Return type:
- _validate()[source]
Check correctness (e.g.
dask
is installed if usingbackend='dask'
) and good usage (e.g.n_workers=1
if backend is serial) of the backend- Raises:
ValueError – if one of the conditions in
_get_checks()
isTrue
- class MDAnalysis.analysis.backends.BackendDask(n_workers: int)[source]
A built-in backend that executes a given function with dask.
Execution is performed with the
dask.compute()
function ofdask.delayed.Delayed
object (created withdask.delayed.delayed()
) withscheduler='processes'
andchunksize=1
(this ensures uniform distribution of tasks among processes). Requires the dask package to be installed.- Parameters:
n_workers (int) – number of processes in to distribute the workload between. Must be a positive integer. Workers are actually
multiprocessing.pool.Pool
processes, but they use a different and more flexible serialization protocol.
Examples
from MDAnalysis.analysis.backends import BackendDask import multiprocessing as mp backend_obj = BackendDask(n_workers=mp.cpu_count())
Added in version 2.8.0.
- _get_checks()[source]
Get dictionary with
condition: error_message
pairs that ensure the validity of the backend instance. Here checks ifdask
module is installed in the environment.- Returns:
dictionary with
condition: error_message
pairs that will get checked during_validate()
run- Return type:
- class MDAnalysis.analysis.backends.BackendMultiprocessing(n_workers: int)[source]
A built-in backend that executes a given function using the
multiprocessing.Pool.map
method.- Parameters:
n_workers (int) – number of processes in
multiprocessing.Pool
to distribute the workload between. Must be a positive integer.
Examples
from MDAnalysis.analysis.backends import BackendMultiprocessing import multiprocessing as mp backend_obj = BackendMultiprocessing(n_workers=mp.cpu_count())
Added in version 2.8.0.
- class MDAnalysis.analysis.backends.BackendSerial(n_workers: int)[source]
A built-in backend that does serial execution of the function, without any parallelization.
- Parameters:
n_workers (int) – Is ignored in this class, and if
n_workers
> 1, a warning will be given.
Added in version 2.8.0.
- _get_warnings()[source]
Get dictionary with
condition: warning_message
pairs that ensure the good usage of the backend instance. Here, it checks if the number of workers is not 1, otherwise gives warning.- Returns:
dictionary with
condition: warning_message
pairs that will get checked during_validate()
run- Return type: