correlation_shift
- mdhelper.algorithm.correlation.correlation_shift(arr1: ndarray[float], arr2: ndarray[float] = None, axis: int = None, *, average: bool = False, double: bool = False, vector: bool = False) → ndarray[float]
Evaluates the autocorrelation function (ACF) or cross-correlation function (CCF) of a time series directly by using sliding windows along the time axis.
For scalars \(r\) or vectors \(\mathbf{r}\), the ACF is defined as
\[A(\tau)=\langle\mathbf{r}(t_0+\tau)\cdot\mathbf{r}(t_0)\rangle =\dfrac{1}{N}\sum_{\alpha=1}^N \mathbf{r}_\alpha(t_0+\tau)\cdot\mathbf{r}_\alpha(t_0)\]
while the CCF for species \(i\) and \(j\) is given by
\[C_{ij}(\tau)=\langle\mathbf{r}_i(t_0+\tau)\cdot \mathbf{r}_j(t_0)\rangle =\dfrac{1}{N}\sum_{\alpha=1}^N\mathbf{r}_{i,\alpha}(t_0+\tau)\cdot \mathbf{r}_{j,\alpha}(t_0)\]
where \(\tau\) is the time lag, \(t_0\) is an arbitrary reference time, and \(N\) is the number of entities. To reduce statistical noise, the ACF/CCF is calculated for and averaged over all possible reference times \(t_0\). As such, this algorithm has a time complexity of \(\mathcal{O}(N_t^2)\), where \(N_t\) is the number of frames.
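The averaging over reference times in the definition above can be sketched directly in NumPy. This is a minimal illustration of the formula, not mdhelper's implementation; the function name `acf_direct` and the assumed input shape \((N_t,\,N,\,N_\mathrm{d})\) are hypothetical choices made for this example.

```python
import numpy as np

def acf_direct(r: np.ndarray) -> np.ndarray:
    """Direct (sliding-window) ACF of vectors with shape (N_t, N, N_d).

    Hypothetical helper for illustration only; returns an array of
    shape (N_t,), averaged over entities and all reference times t_0.
    """
    n_t = r.shape[0]
    acf = np.empty(n_t)
    for tau in range(n_t):
        # Dot product r(t_0 + tau) . r(t_0) for every valid t_0 and entity.
        dots = (r[tau:] * r[:n_t - tau]).sum(axis=-1)
        # Average over the N_t - tau reference times and the N entities.
        acf[tau] = dots.mean()
    return acf

rng = np.random.default_rng(0)
r = rng.standard_normal((50, 4, 3))  # 50 frames, 4 entities, 3 dimensions
acf = acf_direct(r)
```

Each lag \(\tau\) touches \(N_t-\tau\) frames, so the loop performs roughly \(N_t^2/2\) frame products in total, which is where the quadratic cost comes from.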
With large data sets, this approach is too slow to be useful. If a fast Fourier transform (FFT) implementation is available, consider using the much faster FFT-based algorithm implemented in mdhelper.algorithm.correlation.correlation_fft() instead.
- Parameters:
- arr1 : numpy.ndarray
Time evolution of \(N\) entities over \(N_\mathrm{b}\) blocks of \(N_t\) frames each.
Shape:
Scalar: \((N_t,)\), \((N_t,\,N)\), \((N_\mathrm{b},\,N_t)\), or \((N_\mathrm{b},\,N_t,\,N)\).
Vector: \((N_t,\,N_\mathrm{d})\), \((N_t,\,N,\,N_\mathrm{d})\), \((N_\mathrm{b},\,N_t,\,N_\mathrm{d})\), or \((N_\mathrm{b},\,N_t,\,N,\,N_\mathrm{d})\), where \(N_\mathrm{d}\) is the number of dimensions each vector has.
- arr2 : numpy.ndarray, optional
Time evolution of another \(N\) entities. If provided, the CCF for arr1 and arr2 is calculated. Otherwise, the ACF for arr1 is calculated.
Shape: Same as arr1.
- axis : int, optional
Axis along which to evaluate the ACF/CCF. If arr1 contains a full, unsplit trajectory, the ACF/CCF should be evaluated along the first axis (axis=0). If arr1 contains a trajectory split into multiple blocks, the ACF/CCF should be evaluated along the second axis (axis=1). If not specified, the axis is determined automatically using the shape of arr1.
- average : bool, keyword-only, default: False
Determines whether the ACF/CCF is averaged over all entities if the arrays contain information for multiple entities.
- double : bool, keyword-only, default: False
If True, the ACF is doubled or the CCFs for the negative and positive time lags are combined. Useful for evaluating the mean squared or cross displacement. See mdhelper.algorithm.correlation.msd_shift() for more information.
- vector : bool, keyword-only, default: False
Specifies whether arr1 and arr2 contain vectors. If True, the ACF/CCF is summed over the last dimension.
- Returns:
- corr : numpy.ndarray
Autocorrelation or cross-correlation function.
Shape:
For ACF, the shape is that of arr1 but with the following modifications:
- If average=True, the axis containing the \(N\) entities is removed.
- If vector=True, the last dimension is removed.
For CCF, the shape is that of arr1 but with the following modifications:
- If average=True, the axis containing the \(N\) entities is removed.
- If double=False, the axis containing the \(N_t\) times now has a length of \(2N_t-1\) to accommodate negative and positive time lags.
- If vector=True, the last dimension is removed.
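To make the \(2N_t-1\) length concrete, the following sketch computes a scalar CCF for negative and positive time lags directly with NumPy. It is an illustration under stated assumptions, not mdhelper's code: the name `ccf_direct`, the single-entity input shape \((N_t,)\), and the ordering of lags from \(-(N_t-1)\) to \(N_t-1\) are all choices made for this example and may differ from what correlation_shift returns.

```python
import numpy as np

def ccf_direct(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Direct CCF <a(t_0 + tau) b(t_0)> for scalar series of shape (N_t,).

    Hypothetical helper; lags run from -(N_t - 1) to N_t - 1, so the
    output has length 2 * N_t - 1.
    """
    n_t = a.shape[0]
    ccf = np.empty(2 * n_t - 1)
    for k, tau in enumerate(range(-(n_t - 1), n_t)):
        if tau >= 0:
            # Non-negative lag: a is sampled tau frames after b.
            prod = a[tau:] * b[:n_t - tau]
        else:
            # Negative lag: b is sampled |tau| frames after a.
            prod = a[:n_t + tau] * b[-tau:]
        # Average over the N_t - |tau| available reference times.
        ccf[k] = prod.mean()
    return ccf

rng = np.random.default_rng(1)
a = rng.standard_normal(20)
b = rng.standard_normal(20)
ccf = ccf_direct(a, b)
```

With this assumed convention, index \(N_t-1\) holds the zero-lag value, and swapping the two inputs reverses the lag axis.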