correlation_shift

mdhelper.algorithm.correlation.correlation_shift(arr1: ndarray[float], arr2: ndarray[float] = None, axis: int = None, *, average: bool = False, double: bool = False, vector: bool = False) → ndarray[float]

Evaluates the autocorrelation function (ACF) or cross-correlation function (CCF) of a time series directly by using sliding windows along the time axis.

For scalars \(r\) or vectors \(\mathbf{r}\), the ACF is defined as

\[A(\tau)=\langle\textbf{r}(t_0+\tau)\cdot\textbf{r}(t_0)\rangle =\dfrac{1}{N}\sum_{\alpha=1}^N \textbf{r}_\alpha(t_0+\tau)\cdot\textbf{r}_\alpha(t_0)\]

while the CCF for species \(i\) and \(j\) is given by

\[C_{ij}(\tau)=\langle\textbf{r}_i(t_0+\tau)\cdot \textbf{r}_j(t_0)\rangle =\dfrac{1}{N}\sum_{\alpha=1}^N\textbf{r}_{i,\alpha}(t_0+\tau)\cdot \textbf{r}_{j,\alpha}(t_0)\]

where \(\tau\) is the time lag, \(t_0\) is an arbitrary reference time, and \(N\) is the number of entities. To reduce statistical noise, the ACF/CCF is calculated for and averaged over all possible reference times \(t_0\). As such, this algorithm has a time complexity of \(\mathcal{O}(N_t^2)\), where \(N_t\) is the number of frames.
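To make the sliding-window definition concrete, here is a minimal NumPy sketch for scalar data of shape \((N_t,\,N)\). The function name `acf_sliding` is hypothetical and this is not the library's implementation; it only illustrates averaging over all reference times \(t_0\) and all entities, and why the cost grows quadratically with the number of frames.

```python
import numpy as np

def acf_sliding(x):
    """Direct ACF of scalar data with shape (N_t, N), averaged over
    all reference times t_0 and all N entities (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n_t = x.shape[0]
    acf = np.empty(n_t)
    for lag in range(n_t):
        # Products r(t_0 + lag) * r(t_0) for every valid reference time t_0.
        acf[lag] = np.mean(x[lag:] * x[:n_t - lag])
    return acf

# One entity observed over three frames.
x = np.array([[1.0], [2.0], [3.0]])
result = acf_sliding(x)
```

Each lag touches up to \(N_t\) frames and there are \(N_t\) lags, which is where the quadratic scaling comes from.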

For large data sets, this approach can be prohibitively slow. If a fast Fourier transform (FFT) implementation is available on your machine, consider using the much more performant FFT-based algorithm implemented in mdhelper.algorithm.correlation.correlation_fft() instead.
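The sketch below compares a direct sliding-window ACF with an FFT-based one for a single 1D series. It uses only NumPy and the standard Wiener-Khinchin approach with zero padding; the helper names `acf_direct` and `acf_fft` are hypothetical, and nothing here is claimed to mirror how correlation_fft() is written internally.

```python
import numpy as np

def acf_direct(x):
    """Sliding-window ACF of a 1D series (cost quadratic in len(x))."""
    x = np.asarray(x, dtype=float)
    n_t = len(x)
    return np.array([np.mean(x[lag:] * x[:n_t - lag]) for lag in range(n_t)])

def acf_fft(x):
    """FFT-based ACF of a 1D series via the Wiener-Khinchin theorem."""
    x = np.asarray(x, dtype=float)
    n_t = len(x)
    # Zero-pad to 2 * n_t so the circular correlation does not wrap around.
    spectrum = np.fft.rfft(x, n=2 * n_t)
    sums = np.fft.irfft(spectrum * np.conj(spectrum), n=2 * n_t)[:n_t]
    # Each lag has (n_t - lag) valid reference times to average over.
    return sums / (n_t - np.arange(n_t))

rng = np.random.default_rng(0)
series = rng.standard_normal(64)
same = np.allclose(acf_direct(series), acf_fft(series))
```

Both routes give the same values up to floating-point error; the FFT route scales as \(\mathcal{O}(N_t\log N_t)\) rather than quadratically.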

Parameters:

arr1 : numpy.ndarray

Time evolution of \(N\) entities over \(N_\mathrm{b}\) blocks of \(N_t\) frames each.

Shape:

Scalar: \((N_t,)\), \((N_t,\,N)\), \((N_\mathrm{b},\,N_t)\), or \((N_\mathrm{b},\,N_t,\,N)\).
Vector: \((N_t,\,N_\mathrm{d})\), \((N_t,\,N,\,N_\mathrm{d})\), \((N_\mathrm{b},\,N_t,\,N_\mathrm{d})\), or \((N_\mathrm{b},\,N_t,\,N,\,N_\mathrm{d})\), where \(N_\mathrm{d}\) is the number of dimensions each vector has.

arr2 : numpy.ndarray, optional

Time evolution of another \(N\) entities. If provided, the CCF for arr1 and arr2 is calculated. Otherwise, the ACF for arr1 is calculated.

Shape: Same as arr1.

axis : int, optional

Axis along which to evaluate the ACF/CCF. If arr1 contains a full, unsplit trajectory, the ACF/CCF should be evaluated along the first axis (axis=0). If arr1 contains a trajectory split into multiple blocks, the ACF/CCF should be evaluated along the second axis (axis=1). If not specified, the axis is determined automatically using the shape of arr1.
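As an illustration of the two layouts, the following sketch builds an unsplit trajectory of shape \((N_t,\,N)\) and then splits it into blocks of shape \((N_\mathrm{b},\,N_t,\,N)\) with a plain NumPy reshape. The array sizes and variable names are made up for the example; the point is only to show which axis holds the time frames in each case.

```python
import numpy as np

n_frames, n_entities, n_blocks = 12, 5, 3

# Unsplit trajectory: time runs along axis 0, so axis=0 would be appropriate.
traj = np.arange(n_frames * n_entities, dtype=float).reshape(n_frames, n_entities)

# Split into consecutive blocks: time within each block now runs along
# axis 1, so axis=1 would be appropriate.
blocks = traj.reshape(n_blocks, n_frames // n_blocks, n_entities)
```

Here `blocks[1, 0]` is frame 4 of the original trajectory, i.e. the first frame of the second block.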

average : bool, keyword-only, default: False

Determines whether the ACF/CCF is averaged over all entities if the arrays contain information for multiple entities.

double : bool, keyword-only, default: False

If True, the ACF is doubled or the CCFs for the negative and positive time lags are combined. Useful for evaluating the mean squared or cross displacement. See mdhelper.algorithm.correlation.msd_shift() for more information.
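As a short aside on why doubling is useful (a standard identity, stated here for orientation and not taken from the library's source), expanding the squared displacement gives

\[\langle|\textbf{r}(t_0+\tau)-\textbf{r}(t_0)|^2\rangle =\langle|\textbf{r}(t_0+\tau)|^2\rangle+\langle|\textbf{r}(t_0)|^2\rangle-2A(\tau)\]

so the ACF enters the mean squared displacement with a factor of two. For the cross displacement of species \(i\) and \(j\), the corresponding cross terms are \(-C_{ij}(\tau)-C_{ji}(\tau)\), and for a stationary time series \(C_{ji}(\tau)=C_{ij}(-\tau)\), which is consistent with combining the negative and positive time lags.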

vector : bool, keyword-only, default: False

Specifies whether arr1 and arr2 contain vectors. If True, the ACF/CCF is summed over the last dimension.
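For vector data, summing over the last dimension amounts to taking a dot product at each pair of frames. A minimal sketch for input of shape \((N_t,\,N,\,N_\mathrm{d})\) follows; `acf_vector` is a hypothetical helper written for this example, not the library routine, and it averages over entities for brevity.

```python
import numpy as np

def acf_vector(r):
    """Direct ACF of vector data with shape (N_t, N, N_d), averaged over
    reference times and entities (illustrative only)."""
    r = np.asarray(r, dtype=float)
    n_t = r.shape[0]
    acf = np.empty(n_t)
    for lag in range(n_t):
        # Dot product r(t_0 + lag) . r(t_0): multiply, then sum over the
        # last (vector component) axis. Result shape: (n_t - lag, N).
        dots = np.sum(r[lag:] * r[:n_t - lag], axis=-1)
        acf[lag] = dots.mean()
    return acf

# Two entities carrying the same constant unit vector over four frames.
r = np.tile([0.6, 0.8, 0.0], (4, 2, 1))
result = acf_vector(r)
```

Because every vector is the same unit vector, the dot product is 1 at every lag.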

Returns:

corr : numpy.ndarray

Autocorrelation or cross-correlation function.

Shape:

For ACF, the shape is that of arr1 but with the following modifications:

If average=True, the axis containing the \(N\) entities is removed.
If vector=True, the last dimension is removed.

For CCF, the shape is that of arr1 but with the following modifications:

If average=True, the axis containing the \(N\) entities is removed.
If double=False, the axis containing the \(N_t\) times now has a length of \(2N_t-1\) to accommodate negative and positive time lags.
If vector=True, the last dimension is removed.
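The \(2N_t-1\) length for the CCF can be seen in a small sliding-window sketch for two 1D scalar series. The helper `ccf_sliding` is hypothetical, and the ordering of lags from \(-(N_t-1)\) to \(N_t-1\) is an assumption made for this example; the library's output ordering may differ.

```python
import numpy as np

def ccf_sliding(a, b):
    """Direct CCF <a(t_0 + lag) * b(t_0)> of two 1D series for all
    negative and positive lags (illustrative only)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_t = len(a)
    lags = np.arange(-(n_t - 1), n_t)  # assumed ordering: negative lags first
    ccf = np.empty(lags.size)
    for k, lag in enumerate(lags):
        if lag >= 0:
            ccf[k] = np.mean(a[lag:] * b[:n_t - lag])
        else:
            # Negative lag: a is sampled earlier than b.
            ccf[k] = np.mean(a[:n_t + lag] * b[-lag:])
    return lags, ccf

lags, ccf = ccf_sliding([1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
```

With \(N_t=3\) frames the result has \(2\cdot3-1=5\) entries, one per lag from \(-2\) to \(2\).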