ParallelAnalysisBase

class mdhelper.analysis.base.ParallelAnalysisBase(trajectory: ReaderBase, verbose: bool = False, **kwargs)[source]

Bases: SerialAnalysisBase

A parallel analysis base object.

Parameters:
trajectory : MDAnalysis.coordinates.base.ReaderBase

Simulation trajectory.

verbose : bool, default: False

Determines whether detailed progress is shown.

**kwargs

Additional keyword arguments to pass to MDAnalysis.analysis.base.AnalysisBase.
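
Example

A minimal sketch of a hypothetical subclass. It assumes the per-frame hooks and bookkeeping attributes (_prepare(), _single_frame(), self.results, self.n_frames, self._frame_index) mirror those of MDAnalysis.analysis.base.AnalysisBase, which this base class builds on; the subclass name, atom selection, and file names are placeholders.

import MDAnalysis as mda
import numpy as np
from mdhelper.analysis.base import ParallelAnalysisBase

class RadiusOfGyration(ParallelAnalysisBase):
    """Hypothetical per-frame radius-of-gyration analysis."""

    def __init__(self, group, **kwargs):
        self._group = group
        super().__init__(group.universe.trajectory, **kwargs)

    def _prepare(self):
        # Preallocate one value per analyzed frame (assumed attribute names).
        self.results.gyradii = np.empty(self.n_frames)

    def _single_frame(self):
        # AtomGroup.radius_of_gyration() is a standard MDAnalysis method.
        self.results.gyradii[self._frame_index] = self._group.radius_of_gyration()

universe = mda.Universe("topology.psf", "trajectory.dcd")  # placeholder files
analysis = RadiusOfGyration(universe.select_atoms("protein"), verbose=True)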

Methods

run

Performs the calculation in parallel.

save

Saves results to a binary or archive file in NumPy format.

run(start: int = None, stop: int = None, step: int = None, frames: slice | ndarray[int] = None, verbose: bool = None, n_jobs: int = None, module: str = 'multiprocessing', block: bool = True, method: str = None, **kwargs) → ParallelAnalysisBase[source]

Performs the calculation in parallel.

Parameters:
start : int, optional

Starting frame for analysis.

stop : int, optional

Ending frame for analysis.

step : int, optional

Number of frames to skip between each analyzed frame.

frames : slice or array-like, optional

Index or logical array of the desired trajectory frames.

verbose : bool, optional

Determines whether detailed progress is shown.

n_jobs : int, keyword-only, optional

Number of workers. If not specified, it is automatically set to either the number of workers needed to fully analyze the trajectory or the maximum number of CPU threads available, whichever is smaller.

module : str, keyword-only, default: "multiprocessing"

Parallelization module to use for analysis.

Valid values: "dask", "joblib", and "multiprocessing".

block : bool, keyword-only, default: True

Determines whether the trajectory is split into smaller blocks, each of which is processed serially by one worker while the blocks themselves are processed in parallel. This “split–apply–combine” approach is generally faster since the trajectory attributes do not have to be packaged for each analysis run. Has no effect if module="multiprocessing".

method : str, keyword-only, optional

Specifies which Dask scheduler, Joblib backend, or multiprocessing start method is used.

**kwargs

Additional keyword arguments to pass to dask.compute(), joblib.Parallel, or multiprocessing.pool.Pool, depending on the value of module.

Returns:
self : ParallelAnalysisBase

Parallel analysis base object.
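
A hypothetical call, continuing the subclass sketch above; the frame range, worker count, and scheduler/backend/start-method choices are illustrative only.

# Analyze every other frame of the first 1,000 frames on four workers,
# using the multiprocessing module with the "spawn" start method.
analysis.run(start=0, stop=1000, step=2, n_jobs=4,
             module="multiprocessing", method="spawn")

# Or dispatch blocks of frames through Joblib's "loky" backend.
analysis.run(n_jobs=4, module="joblib", method="loky", block=True)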

save(file: str | TextIO, archive: bool = True, compress: bool = True, **kwargs) → None

Saves results to a binary or archive file in NumPy format.

Parameters:
file : str or file

Filename or file-like object where the data will be saved. If file is a str, the .npy or .npz extension will be appended automatically if not already present.

archive : bool, default: True

Determines whether the results are saved to a single archive file. If True, the data is stored in a .npz file. Otherwise, the data is saved to multiple .npy files.

compress : bool, default: True

Determines whether the .npz file is compressed. Has no effect when archive=False.

**kwargs

Additional keyword arguments to pass to numpy.save(), numpy.savez(), or numpy.savez_compressed(), depending on the values of archive and compress.
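
A hypothetical follow-up to the run() example above; the output name is a placeholder.

analysis.save("gyradii")                  # compressed archive: gyradii.npz
analysis.save("gyradii", compress=False)  # uncompressed .npz archive
analysis.save("gyradii", archive=False)   # separate .npy file(s) instead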