MIToolbox is aimed at people working with discrete datasets for classification; all functions expect discrete inputs. It provides implementations of Shannon's information theory functions, Renyi's entropy, and the alpha divergence. Version 2.0 adds weighted information theory functions based on the work of S. Guiasu in "Information Theory with Applications" (1977). The toolbox was developed to support our research into feature selection algorithms, and it includes some sample feature selection algorithms from the literature to illustrate its use. Updated versions of these demonstration algorithms are provided, along with many others, in the FEAST toolbox, which we also developed to support our research. A Java version of MIToolbox, ported from the C code, is also available.

Note: all functions use logarithms to base 2, so results are in units of "bits".

- Calculating Entropy, H(X)
- Calculating Conditional Entropy, H(X|Y)
- Calculating Mutual Information, I(X;Y)
- Calculating Conditional Mutual Information, I(X;Y|Z)
- Generating a joint random variable
- Calculating Renyi's Alpha Entropy, H_{\alpha}(X)
- Calculating Renyi's Alpha Mutual Information, I_{\alpha}(X;Y)
- Calculating the Weighted Entropy, H_w(X)
- Calculating the Weighted Conditional Entropy, H_w(X|Y)
- Calculating the Weighted Mutual Information, I_w(X;Y)
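For reference, the quantities listed above have the following standard textbook definitions for discrete variables, with all logarithms taken to base 2. This is a summary under stated assumptions rather than a transcription of the toolbox source: the probabilities are assumed to be estimated from sample frequencies, the weighted entropy is given in Guiasu's usual form without saying how the toolbox assigns the weights w(x), and reading I_{\alpha}(X;Y) as the alpha divergence between p(x,y) and p(x)p(y) is an assumption.

```latex
\begin{align}
H(X) &= -\sum_{x} p(x)\,\log_2 p(x) \\
H(X|Y) &= -\sum_{x,y} p(x,y)\,\log_2 p(x|y) = H(X,Y) - H(Y) \\
I(X;Y) &= \sum_{x,y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)} = H(X) - H(X|Y) \\
I(X;Y|Z) &= H(X|Z) - H(X|Y,Z) \\
H_{\alpha}(X) &= \frac{1}{1-\alpha}\,\log_2 \sum_{x} p(x)^{\alpha}, \qquad \alpha > 0,\ \alpha \neq 1 \\
D_{\alpha}(P\,\|\,Q) &= \frac{1}{\alpha-1}\,\log_2 \sum_{x} p(x)^{\alpha}\, q(x)^{1-\alpha} \\
H_w(X) &= -\sum_{x} w(x)\,p(x)\,\log_2 p(x)
\end{align}
```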

```
$ y = [1 1 1 0 0]';
$ x = [1 0 1 1 0]';

$ mi(x,y) %% mutual information I(X;Y)
ans =
    0.0200

$ h(x) %% entropy H(X)
ans =
    0.9710

$ condh(x,y) %% conditional entropy H(X|Y)
ans =
    0.9510

$ h( [x,y] ) %% joint entropy H(X,Y)
ans =
    1.9219

$ joint([x,y]) %% joint random variable XY
ans = [1,2,1,3,4]';
```
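The numbers in the session above can be cross-checked independently of the toolbox. The following is a minimal Python sketch, not MIToolbox code: it assumes plug-in (frequency count) probability estimates and assumes that the joint variable is labelled from 1 in order of first appearance, which is consistent with the output shown. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Plug-in Shannon entropy in bits of one or more discrete columns, taken jointly."""
    rows = list(zip(*columns))
    n = len(rows)
    if n == 0:
        raise ValueError("need at least one sample")
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def cond_entropy(x, y):
    """H(X|Y) = H(X,Y) - H(Y)."""
    return entropy(x, y) - entropy(y)


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def joint_labels(*columns):
    """Give each distinct row an integer label, numbered from 1 by first appearance (assumed convention)."""
    seen = {}
    return [seen.setdefault(row, len(seen) + 1) for row in zip(*columns)]


x = [1, 0, 1, 1, 0]
y = [1, 1, 1, 0, 0]

print(f"I(X;Y) = {mutual_information(x, y):.4f}")  # 0.0200
print(f"H(X)   = {entropy(x):.4f}")                # 0.9710
print(f"H(X|Y) = {cond_entropy(x, y):.4f}")        # 0.9510
print(f"H(X,Y) = {entropy(x, y):.4f}")             # 1.9219
print(joint_labels(x, y))                          # [1, 2, 1, 3, 4]
```

The identities used here also tie the four values together: 0.9710 + 0.9710 - 1.9219 gives the mutual information, and 1.9219 - 0.9710 gives the conditional entropy, up to rounding.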

Also provided are example implementations of three feature selection algorithms (CMIM, DISR, mRMR-D) that use the functions provided by MIToolbox. These example algorithms are provided in two forms: one coded in MATLAB, and one coded in C using the MATLAB mex interface. The library is written in ANSI C for compatibility with the MATLAB mex compiler.
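To give a feel for how such algorithms build on mutual information, here is a minimal Python sketch of greedy selection with the mRMR difference criterion as it is usually stated in the literature: at each step, pick the feature that maximises its relevance I(X_k;Y) minus its mean redundancy with the features already selected. This is an illustration only, not the toolbox's MATLAB or C implementation; the function names and the toy data are hypothetical, and the estimates are plug-in frequency counts.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Plug-in Shannon entropy in bits of one or more discrete columns, taken jointly."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mi(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def mrmr_d(features, labels, k):
    """Greedily select k feature indices using relevance minus mean redundancy."""
    relevance = [mi(f, labels) for f in features]
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]  # first pick: relevance only
            redundancy = sum(mi(features[j], features[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the label is determined by two independent bits a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
a_noisy = [0, 0, 0, 1, 1, 1, 1, 1]  # a with one value flipped
y = [2 * ai + bi for ai, bi in zip(a, b)]

print(mrmr_d([a, a_noisy, b], y, 2))  # [0, 2]
```

In this toy case the noisy copy of `a` is skipped in favour of `b`: it is still informative about the label, but most of that information is already carried by `a`, so the redundancy term cancels it out.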

All MIToolbox code is licensed under the LGPL v3, except the feature selection algorithms, which are provided as is, with no warranty, for demonstration purposes.

If you use this toolbox for academic research please cite as:

Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection

Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján

*Journal of Machine Learning Research (JMLR)*, Volume 13, Pages 27-66, 2012.

To compile: for MATLAB/Octave, run CompileMIToolbox.m; for a Linux C shared library, use the included makefile.

MLOSS Project Page

I've hosted the source on GitHub for anyone who wishes to browse through it. It will be updated at approximately the same time as this page.

- 30/08/2012 - v2.00 - Released the weighted information theory functions.
- 08/11/2011 - v1.03 - Minor documentation changes to accompany the JMLR publication.
- 15/10/2010 - v1.02 - Fixed a bug where MIToolbox would cause a segmentation fault if an x-by-0 empty matrix was passed in. It now prints an error message and returns gracefully.
- 02/09/2010 - v1.01 - Updated CMIM.m in demonstration_algorithms to fix a bug where the last feature would not be selected first if it had the highest MI.
- 07/07/2010 - v1.00 - Initial Release