NonLinear Correlation Measures

 

Two types of nonlinear correlation measures are included: the bicorrelation and the mutual information.

The bicorrelation, also called three-point autocorrelation or higher-order correlation, is the joint moment of three variables formed from the time series and two delays t and s. A simplified scenario for the delays is implemented, s = 2t, so the bicorrelation is E[x(i) x(i+t) x(i+2t)], where the mean value is estimated by the sample average. In this way, the bicorrelation is a function of a single delay t.

The mutual information is defined for two variables X and Y as the amount of information that is known about one variable when the other is given, and it is computed from the joint and marginal entropies of X and Y. For time series, X = x(i) and Y = x(i+t) for a delay t, so that the mutual information is a function of the delay t that measures both linear and nonlinear autocorrelation. There are a number of estimators of mutual information based on histograms, kernels, nearest neighbors and splines. Here, the histogram-based estimator is implemented using either equidistant or equiprobable binning.

Note that the nonlinear correlation (bicorrelation, equidistant and equiprobable mutual information) at each delay t is a separate measure, so a range of delay values generates as many measures as there are delays.

In addition to the nonlinear correlation measures of bicorrelation, equidistant and equiprobable mutual information, cumulative nonlinear correlation measures are implemented as separate measures that sum the magnitude of the nonlinear correlation over the delays up to a given delay.

The first minimum of the mutual information function is of special interest and the respective delay can be used either as a discriminating measure or as the optimal delay for state space reconstruction in time series analysis. This specific delay is computed for both the equidistant and the equiprobable estimate of mutual information.

The bicorrelation is not a widely discussed measure, but the cumulative bicorrelation has been used as a statistic for the test of linearity (or nonlinearity, as it is better known in the dynamical systems approach to time series analysis), the so-called Hinich test, see

Hinich M.J. (1996), Testing for Dependence in the Input to a Linear Time Series Model, Journal of Nonparametric Statistics, Vol 6, pp 205-221.

The bicorrelation, under the name three point autocorrelation, has been used as a simple nonlinear measure in the surrogate data test for nonlinearity, see

Schreiber T. and Schmitz A. (1997), Discrimination Power of Measures for Nonlinearity in a Time Series, Physical Review E, Vol 55, No 5, pp 5443-5447.

Kugiumtzis, D. (2001), On the Reliability of the Surrogate Data Test for Nonlinearity in the Analysis of Noisy Time Series, International Journal of Bifurcation and Chaos, Vol 11, No 7, pp 1881-1896.

The mutual information has gained much attention in time series analysis, and a number of papers have been written proposing or comparing different estimates. The first paper on mutual information estimation in time series is

Fraser A.M. and Swinney H. (1986), Independent Coordinates for Strange Attractors from Mutual Information, Physical Review A, Vol 33, pp 1134-1140.

All estimates have advantages and disadvantages. The histogram-based estimates with equidistant and equiprobable binning are implemented here because they are the most standard and widely used in the literature. For a discussion of the different estimates of mutual information on time series from nonlinear dynamical systems see

Papana A. and Kugiumtzis D. (2008), Evaluation of Mutual Information Estimators on Nonlinear Dynamic Systems, Complex Phenomena in Nonlinear Systems, submitted.

 

Bicorrelation (Bicorrelat)

Bicorrelation is the extension of the standard Pearson autocorrelation to three variables x(i), x(i+t), and x(i+2t), and it is computed for the given range of the delay t. The following parameter can be specified:

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10.

In addition, the Cumulative Bicorrelation is computed for the same range of delays. The cumulative bicorrelation can then be simply assigned to the respective measure, if it is selected with the same set of delay values.

Note that when the delay parameter is changed, the change is passed to the same parameter in the measure of Cumulative Bicorrelation.

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets delay (t) to '1:5 10 20', then Bicorrelation is computed for these delays and the following measure names appear in the measure list:

Bicorrelat1
Bicorrelat2
Bicorrelat3
Bicorrelat4
Bicorrelat5
Bicorrelat10
Bicorrelat20
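
The definition above can be illustrated with a short sketch. It is written in Python for illustration only and is not the toolkit's own code. The function name bicorrelation is hypothetical, and standardizing the series to zero mean and unit variance before forming the triple product is an assumption; the toolkit's normalization may differ.

```python
import numpy as np

def bicorrelation(x, delays):
    """Sample average of z(i) * z(i+t) * z(i+2t) for each delay t."""
    x = np.asarray(x, dtype=float)
    # Assumption: standardize to zero mean and unit variance first.
    z = (x - x.mean()) / x.std()
    n = len(z)
    values = {}
    for t in delays:
        m = n - 2 * t  # number of complete triples at this delay
        if m <= 0:
            values[t] = float("nan")  # series too short for this delay
        else:
            values[t] = float(np.mean(z[:m] * z[t:t + m] * z[2 * t:2 * t + m]))
    return values

# The MATLAB expression '1:5 10 20' expands to these seven delays.
delays = list(range(1, 6)) + [10, 20]
x = np.random.default_rng(0).standard_normal(500)
bic = bicorrelation(x, delays)
names = [f"Bicorrelat{t}" for t in delays]
```

Each delay yields one value, which corresponds to one entry in the measure list.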

 

Cumulative Bicorrelation (Bicorrela)

Cumulative Bicorrelation is the cumulative function of the bicorrelation for the given range of delays. The delay parameter is determined as for the Bicorrelation. The Cumulative Bicorrelation for each delay t is the sum of the absolute values of the Bicorrelation up to the delay t.

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets delay (t) to '10 20', then Cumulative Bicorrelation is computed for these delays and the following measure names appear in the measure list:

PearsCAutot10
PearsCAutoct20
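
The cumulative sum of absolute values can be sketched as follows, in Python for illustration only. The helper name cumulative_abs and the input values are hypothetical, and the sketch assumes that the sum runs over every integer delay from 1 up to t, not only over the delays selected for output.

```python
import numpy as np

def cumulative_abs(values_by_delay, delays):
    """values_by_delay[k-1] holds the measure at delay k, for k = 1..tmax."""
    cum = np.cumsum(np.abs(values_by_delay))
    # Report the running sum only at the requested delays.
    return {t: float(cum[t - 1]) for t in delays}

# Hypothetical bicorrelation values at delays 1, 2, 3 and 4.
bic = [0.5, -0.25, 0.125, -0.125]
cum = cumulative_abs(bic, [2, 4])
```

Because absolute values are summed, positive and negative bicorrelations do not cancel, and the cumulative measure never decreases with the delay.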

 

Mutual Information, equidistant bins (MutInfEqDi)

Mutual Information, equidistant bins, is the estimate of the mutual information based on histograms of equidistant bins. The estimate is computed for the given range of the delay t. The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10.
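
A minimal sketch of a histogram estimate with equidistant bins, including the default number of bins for b = 0, is given here in Python for illustration only; it is not the toolkit's code. The function names are hypothetical. The use of the natural logarithm and of one common set of bin edges spanning the range of the whole series are assumptions.

```python
import math
import numpy as np

def default_bins(n):
    """Number of bins used when b = 0: sqrt(N/5) rounded to an integer."""
    return int(round(math.sqrt(n / 5)))

def mutual_information_equidistant(x, t, b=0):
    x = np.asarray(x, dtype=float)
    if b == 0:
        b = default_bins(len(x))
    # Assumption: both axes share b equal-width bins over the range of x.
    lim = [x.min(), x.max()]
    joint, _, _ = np.histogram2d(x[:-t], x[t:], bins=b, range=[lim, lim])
    p = joint / joint.sum()             # joint probabilities
    px = p.sum(axis=1, keepdims=True)   # marginal of x(i)
    py = p.sum(axis=0, keepdims=True)   # marginal of x(i+t)
    nz = p > 0                          # skip empty cells, 0 * log(0) = 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

i = np.arange(500)
mi_sine = mutual_information_equidistant(np.sin(0.1 * i), t=1)
noise = np.random.default_rng(0).standard_normal(500)
mi_noise = mutual_information_equidistant(noise, t=1)
```

For a series of length 500 the default gives 10 bins. The deterministic sine series yields a much larger value than the white noise series, whose estimate is close to zero apart from the bias of the histogram estimator.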

In addition, the cumulative mutual information and the delay of the first minimum of mutual information (both for equidistant bins) are computed for the same range of delays and/or number of bins. These two other measures can then be assigned to the respective measures, if these are selected with the same set of parameters (b and t).

Note that when either the b or the t parameter is changed, the change is passed to the same parameter in the measures of Cumulative Mutual Information and the delay of the First Minimum of Mutual Information (both for equidistant bins).

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0 5' and delay (t) to '1:5:20', then Mutual Information is computed for all combinations of the 2 values of b and the 4 values of t, and the following measure names appear in the measure list (for b=0 the number of bins depends on the length of each time series in the current time series list):

MutInfEqDib0t1
MutInfEqDib0t6
MutInfEqDib0t11
MutInfEqDib0t16
MutInfEqDib5t1
MutInfEqDib5t6
MutInfEqDib5t11
MutInfEqDib5t16
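
The way the two parameter arrays combine into the names listed above can be reproduced with a few lines of Python, shown for illustration only. The MATLAB expression '1:5:20' (start 1, step 5, upper limit 20) expands to the delays 1, 6, 11 and 16.

```python
from itertools import product

bins = [0, 5]                    # MATLAB '0 5'
delays = list(range(1, 21, 5))   # MATLAB '1:5:20' -> 1, 6, 11, 16

# One measure per (b, t) combination, with b varying slowest.
names = [f"MutInfEqDib{b}t{t}" for b, t in product(bins, delays)]
```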

 

Cumulative Mutual Information, equidistant bins (MutInCEqDi)

Cumulative Mutual Information is the cumulative function of the mutual information using equidistant bins for the given range of delays. The number of bins b and the delay t are determined as for the Mutual Information, equidistant bins. The Cumulative Mutual Information for each delay t is the sum of the values of the Mutual Information up to the delay t. The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10.

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0' and delay (t) to '10 20', then Cumulative Mutual Information is computed for the 2 delays using a number of bins that depends on the length of each time series in the current time series list, and the following measure names appear in the measure list:

MutInCEqDib0t10
MutInCEqDib0t20

 

First Minimum of Mutual Information, equidistant bins (MutMinEqDi)

The First Minimum of Mutual Information, equidistant bins, is the delay at which the Mutual Information (computed using equidistant bins) reaches its first minimum. The mutual information is examined for delays up to the maximum of the given delays (if more than one delay value is given). If it reaches a minimum for the first time at a delay t, the measure takes the value t; otherwise it takes the value NaN (not a number). The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10, but only one measure is output, labeled with the maximum of the given delays (here 10).

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0 5' and delay (t) to '10 20', then First Minimum of Mutual Information, equidistant bins, is computed and the following measure names appear in the measure list (for the maximum t value), the first using a number of bins that depends on the length of each time series in the current time series list and the second using b=5:

MutMinEqDib0t20
MutMinEqDib5t20
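
The search for the first minimum can be sketched as follows, in Python for illustration only. The function name and the input values are hypothetical. The criterion used here (a value strictly smaller than both of its neighbours) is an assumption; the toolkit may define the minimum differently, for example at the ends of the delay range.

```python
import math

def first_minimum_delay(mi_values):
    """mi_values[k-1] holds the mutual information at delay k, k = 1..tmax.

    Returns the delay of the first local minimum, or NaN if none is found.
    """
    for k in range(1, len(mi_values) - 1):
        # Assumption: a minimum is strictly below both neighbours.
        if mi_values[k] < mi_values[k - 1] and mi_values[k] < mi_values[k + 1]:
            return k + 1  # convert the 0-based index to a delay
    return math.nan

# Hypothetical mutual information values at delays 1..6.
delay = first_minimum_delay([1.0, 0.6, 0.4, 0.5, 0.3, 0.7])
no_minimum = first_minimum_delay([1.0, 0.8, 0.6])
```

In the first call the mutual information first turns upward after delay 3, so 3 is returned even though a lower value occurs later at delay 5. In the second call the values decrease monotonically up to the maximum delay, so the result is NaN.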

 

Mutual Information, equiprobable bins (MutInfEqPr)

Mutual Information, equiprobable bins, is the estimate of the mutual information based on histograms of equiprobable bins, i.e. bins that all contain the same proportion of data points. The estimate is computed for the given range of the delay t. The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10.
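
Equiprobable binning can be sketched by replacing each sample with the bin given by its rank, so that every bin receives (nearly) the same number of points. The sketch is in Python for illustration only and is not the toolkit's code. The function names are hypothetical, and the rank-based bin assignment on the whole series, the tie handling and the natural logarithm are assumptions.

```python
import numpy as np

def equiprobable_labels(x, b):
    """Assign each sample to one of b bins holding (nearly) equal counts."""
    x = np.asarray(x, dtype=float)
    # Rank of each sample, 0..n-1 (ties are broken by position).
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    return (ranks * b) // len(x)

def mutual_information_equiprobable(x, t, b):
    labels = equiprobable_labels(x, b)
    joint = np.zeros((b, b))
    np.add.at(joint, (labels[:-t], labels[t:]), 1)  # count pairs of bin labels
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0                                      # skip empty cells
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

noise = np.random.default_rng(1).standard_normal(500)
labels = equiprobable_labels(noise, 5)
mi_noise = mutual_information_equiprobable(noise, t=1, b=5)
```

With 500 samples and b = 5, each bin holds exactly 100 points, regardless of how the values are distributed. This is the main difference from equidistant binning, where sparse tails can leave some bins almost empty.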

In addition, the cumulative mutual information and the delay of the first minimum of mutual information (both for equiprobable bins) are computed for the same range of delays and/or number of bins. These two other measures can then be assigned to the respective measures, if these are selected with the same set of parameters (b and t).

Note that when either the b or the t parameter is changed, the change is passed to the same parameter in the measures of Cumulative Mutual Information and the delay of the First Minimum of Mutual Information (both for equiprobable bins).

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0 5' and delay (t) to '1:5:20', then Mutual Information is computed for all combinations of the 2 values of b and the 4 values of t, and the following measure names appear in the measure list (for b=0 the number of bins depends on the length of each time series in the current time series list):

MutInfEqPrb0t1
MutInfEqPrb0t6
MutInfEqPrb0t11
MutInfEqPrb0t16
MutInfEqPrb5t1
MutInfEqPrb5t6
MutInfEqPrb5t11
MutInfEqPrb5t16

 

Cumulative Mutual Information, equiprobable bins (MutInCEqPr)

Cumulative Mutual Information is the cumulative function of the mutual information using equiprobable bins for the given range of delays. The number of bins b and the delay t are determined as for the Mutual Information, equiprobable bins. The Cumulative Mutual Information for each delay t is the sum of the values of the Mutual Information up to the delay t. The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10.

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0' and delay (t) to '10 20', then Cumulative Mutual Information is computed for the 2 delays using a number of bins that depends on the length of each time series in the current time series list, and the following measure names appear in the measure list:

MutInCEqPrb0t10
MutInCEqPrb0t20

 

First Minimum of Mutual Information, equiprobable bins (MutMinEqPr)

The First Minimum of Mutual Information, equiprobable bins, is the delay at which the Mutual Information (computed using equiprobable bins) reaches its first minimum. The mutual information is examined for delays up to the maximum of the given delays (if more than one delay value is given). If it reaches a minimum for the first time at a delay t, the measure takes the value t; otherwise it takes the value NaN (not a number). The following parameters can be specified:

- number of bins (b): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '0', which sets the number of bins to sqrt(N/5) rounded to the nearest integer, where N is the length of the time series. Note that b = 1 is meaningless, as at least two bins are needed to split the range of values.

- delay (t): any valid MATLAB expression denoting an array of positive integers or a single positive integer. The default is '1:10', meaning delays from 1 to 10, but only one measure is output, labeled with the maximum of the given delays (here 10).

Example: If the user selects this measure by activating the check box at the beginning of the measure line and sets number of bins (b) to '0 5' and delay (t) to '10 20', then First Minimum of Mutual Information, equiprobable bins, is computed and the following measure names appear in the measure list (for the maximum t value; for b=0 the number of bins depends on the length of each time series in the current time series list):

MutMinEqPrb0t20
MutMinEqPrb5t20

 

OK

Pressing this button closes the "NonLinear Correlation Measures" window and returns the user to the "Select / run measures" window. Any changes in the measures and parameter values are stored.
 

Cancel

Quit without doing anything and return to the "Select / run measures" window. Any changes in the measures and parameter values will be ignored.
 

Help

This file will be shown.