
Least Squares Mutual Information (LSMI)

Mutual information captures all kinds of statistical dependencies, including nonlinear ones. The mutual information of two random variables equals zero if and only if they are completely statistically independent.

LSMI uses density-ratio approximation to estimate an alternative version of the mutual information between two scalar series, and cross-validation to select the model parameters.


LSMI Algorithm

Mutual information of two continuous random variables X, Y can be defined as:

$$ I(X;Y) = \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy $$
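As a concrete worked instance of this definition (a standard closed-form result quoted here for illustration, not taken from this document): for jointly Gaussian X and Y with correlation coefficient rho,

```latex
I(X;Y) = -\tfrac{1}{2}\,\ln\left(1-\rho^{2}\right)
```

so the mutual information is zero exactly when rho = 0 (for Gaussian variables, uncorrelated means independent) and grows without bound as |rho| approaches 1.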

Computing the continuous MI from a finite data set directly by this definition is impossible, because it requires knowing the exact probability density functions, so we must use an approximation. The approximation of the mutual information between two data series is computed with the Least Squares Mutual Information method. LSMI receives two data series and approximates the mutual information between them. The method does not involve density estimation; instead it directly models the density ratio:

$$ w(x,y) = \frac{p(x,y)}{p(x)\,p(y)} $$

by the following linear model:

$$ w_{\alpha}(x,y) = \sum_{l=1}^{b} \alpha_l\,\varphi_l(x,y) = \alpha^{\top}\varphi(x,y) $$

where α = (α1, α2, . . . , αb) are parameters to be learned from samples, and φ(x, y) = (φ1(x, y), φ2(x, y), . . . , φb(x, y)) are basis functions such that φ(x, y) ≥ 0b for all (x, y); 0b denotes the b-dimensional vector of zeros, and b is the basis size of the model. The basis functions form the vector φ(x, y):

$$ \varphi(x,y) = \bigl(\varphi_1(x,y),\,\varphi_2(x,y),\,\ldots,\,\varphi_b(x,y)\bigr)^{\top} $$
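The linear density-ratio model above can be sketched numerically. This is a minimal illustration written for this article (the Gaussian basis functions are the ones introduced in step 2 below; all names and values here are the sketch's own, not the library's):

```python
import math

def phi(x, y, centers, sigma):
    """Basis vector phi(x, y): one Gaussian kernel per centre (u_l, v_l)."""
    return [math.exp(-((x - u) ** 2 + (y - v) ** 2) / (2.0 * sigma ** 2))
            for (u, v) in centers]

def w_alpha(x, y, alpha, centers, sigma):
    """Linear density-ratio model: w(x, y) ~ sum_l alpha_l * phi_l(x, y)."""
    return sum(a * p for a, p in zip(alpha, phi(x, y, centers, sigma)))

centers = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]  # b = 3 centres, illustrative
alpha = [0.5, 0.3, 0.2]                          # parameters learned in step 3
ratio = w_alpha(0.0, 0.0, alpha, centers, sigma=1.0)
```

Because every φl is nonnegative, nonnegative weights αl keep the modelled ratio nonnegative, which matches the constraint φ(x, y) ≥ 0b above.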

The steps of the algorithm are the following (the input parameters of constructors are marked with bold font):

  1. The algorithm randomly chooses basisSize centers {cl = (ul, vl)}, l = 1, . . . , b, from the input data.

  2. Depending on the input series, we select one of the following cases:

    • In case both series are continuous (the useDelta flag should be false), we propose a Gaussian kernel model as the model candidate:

      $$ \varphi_l(x,y) = \exp\left(-\frac{\|x-u_l\|^2 + \|y-v_l\|^2}{2\sigma^2}\right) $$

      where sigma will be chosen optimally from the candidates list sigmaList.

    • In case one of the series (we take it to be the second one) is discrete, φl should include the Kronecker delta δ(y, vl) (the useDelta flag should be true):

      $$ \varphi_l(x,y) = \exp\left(-\frac{\|x-u_l\|^2}{2\sigma^2}\right)\,\delta(y, v_l) $$

    • If both series are discrete, this approximation makes no sense, because in that case the mutual information can be computed directly from the formula:

      $$ I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} $$

  3. To estimate the model parameter vector α, we could maximize a likelihood under the constraint αi ≥ 0, but that requires constrained numerical optimization; instead, the model is fitted to the density ratio by regularized least squares, which yields the following optimization problem:

    $$ \hat{\alpha} = \arg\min_{\alpha} \left[ \frac{1}{2}\,\alpha^{\top}\hat{H}\alpha - \hat{h}^{\top}\alpha + \frac{\lambda}{2}\,\alpha^{\top}\alpha \right] $$

    where lambda is the regularization parameter and

    $$ \hat{H}_{l,l'} = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \varphi_l(x_i, y_j)\,\varphi_{l'}(x_i, y_j), \qquad \hat{h}_l = \frac{1}{n}\sum_{i=1}^{n} \varphi_l(x_i, y_i) $$

    Differentiating the objective function with respect to α and equating it to zero, we can obtain an analytic-form solution:

    $$ \hat{\alpha} = \left(\hat{H} + \lambda I_b\right)^{-1}\hat{h} $$

    where Ib is the b-dimensional identity matrix, and the value of lambda is chosen optimally from the candidates list lambdaList. This allows us to find the solution very quickly using least squares instead of complex constrained numerical optimization.

  4. The algorithm splits all data into basketCount baskets and performs the cross-validation procedure: it estimates the model parameters on the data set with one basket excluded and validates the solution's fitness on the excluded basket. The model with the best CV score is chosen to calculate the final MI value.
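Putting steps 1 to 4 together, the procedure for the continuous (Gaussian-kernel) case can be sketched as below. This is a compact numerical illustration written for this article, not the library's implementation; in particular, the final plug-in estimate 0.5 * h^T alpha - 0.5 is the standard squared-loss LSMI estimator and is an assumption, since the document does not show the final formula:

```python
import math
import random

def gauss_phi(x, y, centers, sigma):
    """Step 2 (continuous case): Gaussian-kernel basis vector phi(x, y)."""
    return [math.exp(-((x - u) ** 2 + (y - v) ** 2) / (2.0 * sigma ** 2))
            for (u, v) in centers]

def moments(xs, ys, centers, sigma):
    """Step 3: H over all pairs (x_i, y_j), h over the paired samples."""
    n, b = len(xs), len(centers)
    H = [[0.0] * b for _ in range(b)]
    h = [0.0] * b
    for xi in xs:
        for yj in ys:
            p = gauss_phi(xi, yj, centers, sigma)
            for l in range(b):
                for m in range(b):
                    H[l][m] += p[l] * p[m] / (n * n)
    for xi, yi in zip(xs, ys):
        p = gauss_phi(xi, yi, centers, sigma)
        for l in range(b):
            h[l] += p[l] / n
    return H, h

def ridge_solve(H, h, lam):
    """Analytic solution alpha = (H + lam * I_b)^-1 h via Gaussian elimination."""
    b = len(h)
    M = [[H[i][j] + (lam if i == j else 0.0) for j in range(b)] + [h[i]]
         for i in range(b)]
    for c in range(b):
        piv = max(range(c, b), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, b):
            f = M[r][c] / M[c][c]
            for k in range(c, b + 1):
                M[r][k] -= f * M[c][k]
    alpha = [0.0] * b
    for r in range(b - 1, -1, -1):
        alpha[r] = (M[r][b] - sum(M[r][k] * alpha[k]
                                  for k in range(r + 1, b))) / M[r][r]
    return alpha

def lsmi(xs, ys, sigma_list, lambda_list, basis_size=10, baskets=3, seed=0):
    n = len(xs)
    rng = random.Random(seed)
    # Step 1: pick basis centres randomly from the paired input data.
    centers = [(xs[i], ys[i]) for i in rng.sample(range(n), min(basis_size, n))]
    best = None  # (cv_score, sigma, lam); lower held-out J is better
    for sigma in sigma_list:
        for lam in lambda_list:
            score = 0.0
            for k in range(baskets):  # Step 4: cross-validation over baskets
                tr = [i for i in range(n) if i % baskets != k]
                va = [i for i in range(n) if i % baskets == k]
                Ht, ht = moments([xs[i] for i in tr], [ys[i] for i in tr],
                                 centers, sigma)
                alpha = ridge_solve(Ht, ht, lam)
                Hv, hv = moments([xs[i] for i in va], [ys[i] for i in va],
                                 centers, sigma)
                b = len(centers)
                # Held-out squared-error criterion J = 1/2 a^T H a - h^T a.
                score += 0.5 * sum(alpha[l] * Hv[l][m] * alpha[m]
                                   for l in range(b) for m in range(b))
                score -= sum(hv[l] * alpha[l] for l in range(b))
            if best is None or score < best[0]:
                best = (score, sigma, lam)
    _, sigma, lam = best
    # Refit on all data with the CV-selected (sigma, lambda).
    H, h = moments(xs, ys, centers, sigma)
    alpha = ridge_solve(H, h, lam)
    # Squared-loss MI plug-in estimate (an assumed final step, see lead-in).
    return 0.5 * sum(h[l] * alpha[l] for l in range(len(h))) - 0.5
```

Because the regularized objective is quadratic in α, each candidate (sigma, lambda) pair costs only one linear solve, which is what makes exhaustive cross-validation over both candidate lists affordable.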

Implementation

Least Squares Mutual Information method is implemented by the LSMI class.

The following constructors create an instance of the class and calculate the mutual information between series X and Y:

LSMI(Double, Double, Double, Double, Boolean, Int32, Int32) - set series and other parameters

Calculates the mutual information between series X and Y; sets the lists of possible values for lambda and for the Gaussian standard deviation, whether the Kronecker delta function should be used, the basket count, and the number of basis functions used to approximate the density ratio:

lambdaList - candidates list from which the value of lambda is chosen optimally;

sigmaList - candidates list from which the value of sigma is chosen optimally;

useDelta - flag which should be set to true if the second series is discrete;

basketCount - number of baskets for the cross-validation procedure;

basisSize - number of basis functions for the model.

LSMI(Double, Double) - set series and use default parameters

Calculates the mutual information between series X and Y using the default parameter values:

lambdaList = vector of 9 logarithmically spaced points between 10 and 10^9

sigmaList = vector of 9 logarithmically spaced points between 10^-2 and 10^2

useDelta = false

basketCount = 5

basisSize = min(200, input vector length)

The class provides two methods:

OutSampleValidate(Double, Double) - out-of-sample validation

Estimates the goodness of the model fit on new samples.

LogSpace(Double, Double, Int32) - generate a logarithmically spaced vector

Generates a vector of n logarithmically spaced points between the decades 10^a and 10^b; especially useful for creating frequency vectors.
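The LogSpace behaviour described above can be mimicked in a few lines (a sketch assuming the usual MATLAB-style convention of n points from 10^a to 10^b inclusive; the function name here is the sketch's own):

```python
def log_space(a, b, n):
    """n logarithmically spaced points from 10**a to 10**b, inclusive."""
    if n == 1:
        return [10.0 ** b]
    step = (b - a) / (n - 1)
    return [10.0 ** (a + i * step) for i in range(n)]

# For example, the default sigmaList: 9 points between 10^-2 and 10^2.
sigma_list = log_space(-2, 2, 9)
```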

The class provides one property, MutualInformation, which holds the approximation of the mutual information between the two series.

See Also