A challenge for physiologists and neuroscientists is to map information transfer between components of the systems they study at different time scales, in order to derive knowledge about structure and function from the analysis of the recorded dynamics.
The components of physiological networks often interact in a nonlinear way and through mechanisms which are in general not completely known. It is therefore safer if the method of choice for analyzing these interactions does not rely on any model or assumption about the nature of the data and their interactions.
Transfer entropy and Granger causality have emerged as powerful tools to quantify directed dynamical interactions.
With MuTE we compare different approaches to evaluating transfer entropy, some of them already proposed and some novel, and implement them in a freeware MATLAB toolbox. Applications to simulated and real data are presented.
Before letting the reader go into the details of the embedding approaches and the entropy estimators, we introduce some formalism that will be used from now on.
Let us consider a composite physical system described by a set of interacting dynamical (sub)systems and suppose that, within the composite system, we are interested in evaluating the information flow from the source system X to the destination system Y, collecting the remaining systems in the vector Z. We develop our framework under the assumption of stationarity, which allows us to perform estimations replacing ensemble averages with time averages (for non-stationary formulations see, e.g., Ledberg (2012), and references therein). Accordingly, we denote X, Y and Z the stationary stochastic processes describing the state visited by the systems X, Y and Z over time, and X_n, Y_n and Z_n the stochastic variables obtained by sampling the processes at the present time n. Moreover, we denote X_n^- = [X_{n-1}, X_{n-2}, ...], Y_n^- and Z_n^- the infinite-dimensional vector variables representing the whole past of the processes X, Y and Z. Then, the multivariate transfer entropy (TE) from X to Y conditioned to Z is defined as:

TE_{X->Y|Z} = SUM p(Y_n, Y_n^-, X_n^-, Z_n^-) log [ p(Y_n | Y_n^-, X_n^-, Z_n^-) / p(Y_n | Y_n^-, Z_n^-) ]    (1)

where the sum extends over all the phase-space points forming the trajectory of the composite system, p(Y_n, Y_n^-, X_n^-, Z_n^-) is the probability associated with the vector variable (Y_n, Y_n^-, X_n^-, Z_n^-), and p(Y_n | Y_n^-, X_n^-, Z_n^-) is the probability of observing Y_n given that the variables forming the vector (Y_n^-, X_n^-, Z_n^-) are known. The conditional probabilities used in (1) can be interpreted as transition probabilities, in the sense that they describe the dynamics of the transition of the destination system from its past states to its present state, accounting for the past of the other systems. The use of transition probabilities makes the resulting measure able to quantify the extent to which the transition of the destination system Y into its present state is affected by the past states visited by the source system X. Specifically, the TE quantifies the information provided by the past of the process X about the present of the process Y that is not already provided by the past of Y or any other process included in Z.
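As a concrete illustration of definition (1), the following sketch (not MuTE code; all names are illustrative) computes a plug-in estimate of the conditional TE for binary time series, approximating probabilities by frequencies of visitation and truncating the past vectors to lag 1:

```python
import random
from collections import Counter
from math import log

random.seed(0)
N = 50000
x = [random.randint(0, 1) for _ in range(N)]
z = [random.randint(0, 1) for _ in range(N)]
# y_n copies x_{n-1} 90% of the time, so information flows X -> Y
y = [0] + [x[n - 1] if random.random() < 0.9 else 1 - x[n - 1]
           for n in range(1, N)]

# joint outcomes (Y_n, Y_{n-1}, X_{n-1}, Z_{n-1})
joint = Counter((y[n], y[n - 1], x[n - 1], z[n - 1]) for n in range(1, N))
tot = sum(joint.values())

# marginal counts needed for the two transition probabilities in Eq. (1)
c_past, c_yz, c_z = Counter(), Counter(), Counter()
for (yn, yp, xp, zp), c in joint.items():
    c_past[(yp, xp, zp)] += c   # (Y^-, X^-, Z^-)
    c_yz[(yn, yp, zp)] += c     # (Y_n, Y^-, Z^-)
    c_z[(yp, zp)] += c          # (Y^-, Z^-)

te = 0.0
for (yn, yp, xp, zp), c in joint.items():
    p_full = c / c_past[(yp, xp, zp)]             # p(Y_n | Y^-, X^-, Z^-)
    p_restr = c_yz[(yn, yp, zp)] / c_z[(yp, zp)]  # p(Y_n | Y^-, Z^-)
    te += (c / tot) * log(p_full / p_restr)

print(te)  # positive, since X drives Y while Z is irrelevant
```

Such lag-1 truncation and frequency counting are simplifications for illustration; the embedding schemes and estimators described below address exactly these two approximations.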
The formulation presented in (1) is an extension of the original TE measure proposed for pairwise systems, Schreiber (2000), to the case of multiple interacting processes. The conditional TE formulation, also denoted as partial TE, Vakorin (2010), Kugiumtzis (2013), rules out the information shared between X and Y that could possibly be triggered by their common interaction with Z. Note that the TE can be seen as a difference of two conditional entropies (CE), or equivalently as a sum of four Shannon entropies:

TE_{X->Y|Z} = H(Y_n | Y_n^-, Z_n^-) - H(Y_n | Y_n^-, X_n^-, Z_n^-)
            = H(Y_n^-, X_n^-, Z_n^-) - H(Y_n, Y_n^-, X_n^-, Z_n^-) + H(Y_n, Y_n^-, Z_n^-) - H(Y_n^-, Z_n^-)
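As a numerical check that the two forms of the TE (the difference of two conditional entropies and the sum of four Shannon entropies) coincide, the sketch below computes both from plug-in estimates on binary data, again truncating the pasts to lag 1 (illustrative code, not part of the toolbox):

```python
import random
from collections import Counter
from math import log

def H(samples):
    """Plug-in Shannon entropy (nats) of a list of hashable outcomes."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log(c / n) for c in counts.values())

def cond_H(target, given):
    """Plug-in conditional entropy H(target | given) in nats."""
    joint, marg, n = Counter(zip(target, given)), Counter(given), len(target)
    return -sum(c / n * log(c / marg[g]) for (_, g), c in joint.items())

random.seed(1)
N = 20000
x = [random.randint(0, 1) for _ in range(N)]
z = [random.randint(0, 1) for _ in range(N)]
# Y is a noisy copy of the past of X
y = [0] + [x[n - 1] ^ (random.random() < 0.2) for n in range(1, N)]

Y, Yp, Xp, Zp = y[1:], y[:-1], x[:-1], z[:-1]
full, past = list(zip(Y, Yp, Xp, Zp)), list(zip(Yp, Xp, Zp))
yyz, yz = list(zip(Y, Yp, Zp)), list(zip(Yp, Zp))

te_sum = H(past) - H(full) + H(yyz) - H(yz)   # four Shannon entropies
te_ce = cond_H(Y, yz) - cond_H(Y, past)       # difference of two CEs
print(te_sum, te_ce)  # the two forms coincide
```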
The TE has great potential in detecting information transfer because it does not assume any particular model describing the interactions governing the system dynamics, it is able to discover purely nonlinear interactions, and it can deal with a range of interaction delays, Vicente (2011). Recent research has proven that TE is equivalent to Granger causality (GC) for Gaussian-distributed data, Barnett (2009), Hlavackova (2011), which establishes a convenient joint framework for both measures. Here we evaluate GC in the TE framework and compare a classical VAR model, implemented in both the uniform embedding (UE) and non-uniform embedding (NUE) versions, with two model-free approaches.
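The Gaussian equivalence can be checked numerically: for jointly Gaussian data the TE equals half the GC index ln(var_restricted / var_full), where the two variances are the residual variances of linear regressions of the target on its own past, with and without the source past. A bivariate lag-1 sketch (illustrative, not toolbox code):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000
x = rng.standard_normal(N)
y = np.zeros(N)
for n in range(1, N):
    y[n] = 0.5 * y[n - 1] + 0.4 * x[n - 1] + 0.5 * rng.standard_normal()

def resid_var(target, regressors):
    """OLS residual variance of target regressed on columns of regressors."""
    beta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return (target - regressors @ beta).var()

Y = y[1:]
restricted = np.column_stack([y[:-1]])           # own past only
full = np.column_stack([y[:-1], x[:-1]])         # own past + source past
gc = np.log(resid_var(Y, restricted) / resid_var(Y, full))
te = 0.5 * gc  # Gaussian TE, cf. Barnett (2009)
print(round(te, 3))
```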
Take a look at the two embedding versions and their theoretical differences; a comparison between the two approaches is also provided.
In uniform conditioned embedding schemes, the components to be included in the embedding vectors are selected a priori and separately for each variable...go to uniform embedding
According to the non-uniform embedding framework, only the past states that actually help the prediction are entered into the model, improving the prediction and avoiding the risk of overfitting...go to non-uniform embedding
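The idea can be sketched with a toy greedy procedure in which a linear residual-variance criterion stands in for the conditional mutual information criterion used by actual NUE schemes (illustrative code, not the toolbox implementation): candidate past terms are ranked by how much they improve the prediction of the target, and selection stops when the relative gain is negligible.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10000
x = rng.standard_normal(N)
z = rng.standard_normal(N)
y = np.zeros(N)
for n in range(2, N):
    y[n] = 0.6 * y[n - 1] + 0.5 * x[n - 2] + 0.3 * rng.standard_normal()

maxlag = 3
target = y[maxlag:]
# candidate past terms: (variable name, lag) -> lagged series
candidates = {(name, lag): series[maxlag - lag:N - lag]
              for name, series in (("x", x), ("y", y), ("z", z))
              for lag in range(1, maxlag + 1)}

def resid_var(t, cols):
    """Residual variance of t after OLS regression on the given columns."""
    A = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(A, t, rcond=None)
    return (t - A @ beta).var()

selected, cols = [], []
best_var = target.var()
while candidates:
    key, gain = max(((k, best_var - resid_var(target, cols + [v]))
                     for k, v in candidates.items()), key=lambda kv: kv[1])
    if gain / best_var < 0.01:  # stopping criterion (arbitrary threshold)
        break
    selected.append(key)
    cols.append(candidates.pop(key))
    best_var -= gain
print(selected)  # only the truly contributing terms are picked, no z terms
```

With these coupling coefficients the procedure selects the source term at lag 2 and the self term at lag 1, and ignores all lags of the irrelevant process z.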
Select the estimator you are interested in, look at the theoretical contents, and see how each method performs.
The linear estimator method works under the assumption that the processes involved in the analysis have a joint Gaussian distribution. This assumption allows one to work with well-known expressions for the probability density functions...go to linear estimator
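Under the Gaussian assumption every Shannon entropy in the TE decomposition has the closed form H = (1/2) ln((2*pi*e)^d det(Sigma)), so the TE reduces to combinations of covariance determinants. A bivariate lag-1 sketch (illustrative):

```python
import numpy as np

def gauss_entropy(data):
    """Closed-form differential entropy (nats) of (N, d) Gaussian data."""
    d = data.shape[1]
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    return 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

rng = np.random.default_rng(3)
N = 50000
x = rng.standard_normal(N)
y = np.zeros(N)
for n in range(1, N):
    y[n] = 0.5 * y[n - 1] + 0.4 * x[n - 1] + 0.5 * rng.standard_normal()

Y, Yp, Xp = y[1:, None], y[:-1, None], x[:-1, None]
H = gauss_entropy
te = (H(np.hstack([Yp, Xp])) - H(np.hstack([Y, Yp, Xp]))
      + H(np.hstack([Y, Yp])) - H(Yp))  # sum of four Gaussian entropies
print(round(te, 3))
```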
This approach is based on performing uniform quantization of the time series and then estimating the entropy approximating probabilities with the frequency of visitation of the quantized states...go to binning estimator
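A minimal sketch of this plug-in binning scheme (illustrative names, not the toolbox API): quantize the series into Q equal-width bins, then use the frequency of visitation of each quantized state as its probability.

```python
import numpy as np

def quantize(series, q):
    """Map each sample to one of q equal-width bins over the data range."""
    edges = np.linspace(series.min(), series.max(), q + 1)
    return np.digitize(series, edges[1:-1])  # interior edges -> bins 0..q-1

def binned_entropy(symbols):
    """Plug-in Shannon entropy (nats) from visitation frequencies."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(4)
u = rng.uniform(size=100000)
ent = binned_entropy(quantize(u, 8))
print(ent)  # close to ln(8) ~ 2.079 for uniform data and 8 bins
```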
Nearest Neighbor Estimator
Since its first introduction in 1967, Cover (1967), the nearest neighbors method has been shown to be a powerful nonparametric technique for classification, density estimation, and regression estimation. This method can be used to estimate the entropy H(X) of a d-dimensional random variable X starting from a random sample...go to nearest neighbor estimator
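A minimal sketch of the nearest-neighbor (Kozachenko-Leonenko) entropy estimator for 1-D data, using brute-force neighbor search (practical implementations typically use tree-based searches and further refinements):

```python
import numpy as np
from math import log

def digamma(n):
    """Digamma at a positive integer: psi(n) = -gamma_E + sum_{i<n} 1/i."""
    return -0.5772156649015329 + sum(1.0 / i for i in range(1, n))

def kl_entropy(x, k=5):
    """Kozachenko-Leonenko entropy estimate (nats) of 1-D samples x."""
    n = len(x)
    log_eps = 0.0
    for i in range(n):
        # distance to the k-th nearest neighbor (index 0 is the point itself)
        eps = np.partition(np.abs(x - x[i]), k)[k]
        log_eps += log(2 * eps)  # c_1 = 2: length of the unit "ball" in 1-D
    return digamma(n) - digamma(k) + log_eps / n

rng = np.random.default_rng(5)
h_est = kl_entropy(rng.standard_normal(2000))
print(h_est)  # theory for N(0,1): 0.5*ln(2*pi*e) ~ 1.419
```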
Neural Network Estimator
Relying on neural networks, this approach to Granger causality is both non-parametric and based on regression, thus realizing the Granger paradigm in a model-free fashion...go to neural network estimator
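The regression-based Granger paradigm can be sketched with a small hand-rolled MLP: train one network to predict the destination from its own past, another from the joint past, and compare prediction errors. This is purely illustrative (the architecture and training scheme here are arbitrary, not those of the toolbox); the coupling is quadratic, so a linear regression could not exploit the source past.

```python
import numpy as np

rng = np.random.default_rng(6)

def fit_mlp_mse(X, y, hidden=8, lr=0.1, epochs=2000):
    """Train a one-hidden-layer tanh MLP with full-batch gradient descent;
    return its final mean squared prediction error on the training data."""
    n, d = X.shape
    W1 = 0.5 * rng.standard_normal((d, hidden))
    b1 = np.zeros(hidden)
    w2 = 0.5 * rng.standard_normal(hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                # hidden activations
        err = h @ w2 + b2 - y                   # prediction error
        gh = np.outer(err, w2) * (1 - h ** 2)   # backprop through tanh
        W1 -= lr * (X.T @ gh) / n
        b1 -= lr * gh.mean(axis=0)
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean()
    h = np.tanh(X @ W1 + b1)
    return np.mean((h @ w2 + b2 - y) ** 2)

# simulated data with a purely nonlinear coupling X -> Y
N = 3000
x = rng.standard_normal(N)
y = np.zeros(N)
for n in range(1, N):
    y[n] = (0.5 * y[n - 1] + 0.4 * (x[n - 1] ** 2 - 1)
            + 0.3 * rng.standard_normal())

mse_restricted = fit_mlp_mse(y[:-1, None], y[1:])                  # own past
mse_full = fit_mlp_mse(np.column_stack([y[:-1], x[:-1]]), y[1:])   # + source
gc = np.log(mse_restricted / mse_full)
print(gc)
```

A positive index indicates that the source past improves the nonlinear prediction of the destination, realizing Granger's criterion without a parametric model.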