Binning Estimator

NB: for the formalism needed to understand what follows, please go to the MuTE page and read the SOME FORMALISM section.

Here we describe an estimator based on fixed partitioning of the state space. The approach performs uniform quantization of the time series and then estimates the entropy by approximating probabilities with the frequency of visitation of the quantized states. A time series y, a realization of the generic process Y, is first normalized to have zero mean and unit variance, and then coarse-grained by spreading its dynamics over \xi quantization levels of amplitude r=(y_{max} - y_{min})/\xi, where y_{max} and y_{min} are the maximum and minimum values of the normalized series. Quantization assigns to each sample the number of the level to which it belongs, so that the quantized time series y^{\xi} takes values within the alphabet \mathcal{A}=\{0,1,\ldots,\xi-1\}. Uniform quantization of embedding vectors of dimension d builds a uniform partition of the d-dimensional state space into \xi^d disjoint hypercubes of size r, such that all vectors V falling within the same hypercube are associated with the same quantized vector V_{\xi}, and are thus indistinguishable within the tolerance r. The entropy is then estimated as:

(1)   \begin{equation*} H(V_{\xi}) = -\sum_{V_{\xi} \in \mathcal{A}^{d}} p(V_{\xi}) \log p(V_{\xi}) \end{equation*}

where the sum is extended over all vectors found in the available realization of the quantized series, and the probability p(V_{\xi}) is estimated for each hypercube simply as the fraction of quantized vectors falling into that hypercube (i.e., the frequency of occurrence of V_{\xi} within \mathcal{A}^{d}). With this approach, the binning estimate of the transfer entropy (TE) results from applying (1) to the four embedding vectors defined in equation (2), determined either by uniform embedding (UE) or by non-uniform embedding (NUE).
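The quantization and the plug-in entropy of (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the toolbox's own code: the function names `quantize`, `embed`, and `binned_entropy` are hypothetical, the natural logarithm is assumed (the text does not fix the base), the series is assumed non-constant, and `embed` uses consecutive samples purely as an example of an embedding.

```python
import math
from collections import Counter
from statistics import fmean, pstdev


def quantize(y, n_levels):
    """Map a series to integer levels 0..n_levels-1 after z-scoring it."""
    mu, sigma = fmean(y), pstdev(y)      # assumes a non-constant series
    z = [(v - mu) / sigma for v in y]
    z_min = min(z)
    r = (max(z) - z_min) / n_levels      # amplitude r of each level
    # The maximum sample would land on level n_levels, so clamp it to the top level.
    return [min(int((v - z_min) / r), n_levels - 1) for v in z]


def embed(q, d):
    """Vectors of d consecutive quantized samples (illustrative embedding)."""
    return [tuple(q[i:i + d]) for i in range(len(q) - d + 1)]


def binned_entropy(vectors):
    """Plug-in entropy in nats from the relative frequency of each occupied hypercube."""
    counts = Counter(vectors)
    n = len(vectors)
    return -sum(c / n * math.log(c / n) for c in counts.values())


q = quantize(list(range(8)), 4)          # a ramp spread over 4 levels
h1 = binned_entropy(embed(q, 1))         # entropy of single quantized samples
h2 = binned_entropy(embed(q, 2))         # entropy of 2-dimensional quantized vectors
```

Only the hypercubes that are actually visited enter the sum, which matches the statement that the sum runs over the vectors found in the available realization.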

In the NUE implementation, maximization of the mutual information between the component \hat{W}_n selected at step k and the target variable Y_n (step (*) of the algorithm) is obtained by minimizing the conditional entropy (CE) H(Y_n|\hat{W}_n,V_n^{(k-1)})=H(Y_n,\hat{W}_n,V_n^{(k-1)})-H(\hat{W}_n,V_n^{(k-1)}), with the two entropy terms estimated through the application of (1). As for the LIN estimator, the randomization procedure applied to test candidate significance consists of time-shifting the points of \hat{W}_n by a randomly selected lag (Quiroga, 2002).
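As a rough illustration of this selection step, the conditional entropy can be computed as the difference of two binned entropies and minimized over the candidates. The sketch below assumes already-quantized integer series; the names `conditional_entropy` and `select_candidate`, and the representation of V_n^{(k-1)} as one tuple per time point, are hypothetical choices made for the example.

```python
import math
from collections import Counter


def binned_entropy(vectors):
    """Plug-in entropy in nats from the relative frequencies of the observed vectors."""
    counts = Counter(vectors)
    n = len(vectors)
    return -sum(c / n * math.log(c / n) for c in counts.values())


def conditional_entropy(y, w, v_rows):
    """H(Y | W, V) = H(Y, W, V) - H(W, V) for quantized samples.

    y, w   : sequences of integer levels (target and candidate component)
    v_rows : one tuple per time point holding the components selected so far
             (empty tuples at the first step)
    """
    joint = [(yi, wi) + tuple(vi) for yi, wi, vi in zip(y, w, v_rows)]
    cond = [(wi,) + tuple(vi) for wi, vi in zip(w, v_rows)]
    return binned_entropy(joint) - binned_entropy(cond)


def select_candidate(y, candidates, v_rows):
    """Return the name of the candidate with the smallest conditional entropy."""
    return min(candidates, key=lambda name: conditional_entropy(y, candidates[name], v_rows))


y = [0, 1, 0, 1, 0, 1, 0, 1]
candidates = {
    "informative": [0, 1, 0, 1, 0, 1, 0, 1],    # determines y completely
    "uninformative": [0, 0, 1, 1, 0, 0, 1, 1],  # independent of y in this sample
}
v_rows = [()] * len(y)                          # nothing selected yet
best = select_candidate(y, candidates, v_rows)
```

Minimizing the CE picks the same candidate as maximizing the mutual information because H(Y_n|V_n^{(k-1)}) is the same for every candidate at a given step.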

The statistical significance of the TE estimated through the UE BIN approach is assessed with the method of surrogate data, implemented by the time-shift procedure proposed in Vlachos (2010), Faes (2008), and Quiroga (2002). Specifically, the estimated TE is tested against its null distribution, formed by the TE values computed on replications of the original series in which the source series is time-shifted by a randomly selected lag (larger than 20, set to exclude autocorrelation effects).
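The time-shift test can be sketched as follows. Several details here are assumptions not stated in the text: the shift is taken to be circular (wrap-around), the lag is also kept more than 20 samples away from the end of the series, the number of surrogates defaults to 100, and the p-value is the one-sided rank of the observed value. The function `lagged_agreement` is only a toy stand-in for a TE estimator so that the example runs on its own.

```python
import random


def time_shift(x, lag):
    """Circularly shift a series by `lag` samples (wrap-around is an assumption)."""
    x = list(x)
    lag %= len(x)
    return x[-lag:] + x[:-lag] if lag else x


def lagged_agreement(source, target):
    """Toy stand-in for a TE estimator: fraction of samples with target[n] == source[n-1]."""
    hits = sum(s == t for s, t in zip(source[:-1], target[1:]))
    return hits / (len(source) - 1)


def time_shift_test(statistic, source, target, n_surrogates=100, min_lag=20, seed=0):
    """Return the observed statistic and a one-sided surrogate p-value."""
    rng = random.Random(seed)
    n = len(source)                      # requires n > 2 * min_lag + 1
    observed = statistic(source, target)
    null = []
    for _ in range(n_surrogates):
        # Lag strictly larger than min_lag; the upper bound keeps the wrapped
        # shift at least that far from zero lag as well (an assumption).
        lag = rng.randint(min_lag + 1, n - min_lag - 1)
        null.append(statistic(time_shift(source, lag), target))
    exceed = sum(s >= observed for s in null)
    return observed, (exceed + 1) / (n_surrogates + 1)
```

In practice `statistic` would be replaced by the binning TE estimate, called as `time_shift_test(te_estimator, source, target)`; the target series is left untouched so that only the source-target coupling is destroyed in the surrogates.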


  1. Vlachos, Ioannis, Kugiumtzis, Dimitris: Nonuniform state-space reconstruction and coupling detection. In: Phys Rev E, 82 (1), pp. 016207, 2010.
  2. Faes, Luca, Porta, Alberto, Nollo, Giandomenico: Mutual nonlinear prediction as a tool to evaluate coupling strength and directionality in bivariate time series: comparison among different strategies based on k nearest neighbors. In: Phys Rev E, 78 (2), pp. 026201, 2008.
  3. Quiroga, R Quian, Kraskov, A, Kreuz, T, Grassberger, Peter: Performance of different synchronization measures in real data: a case study on electroencephalographic signals. In: Phys Rev E, 65 (4), pp. 041903, 2002.
