This toolbox is aimed at people working on discrete datasets for classification. All functions expect discrete inputs. It provides implementations of Shannon's information theory functions, Renyi's entropy and the alpha divergence. Versions 2.0 and later include weighted information theory functions based upon the work of S. Guiasu in "Information Theory with Applications" (1977). The toolbox was developed to support our research into feature selection algorithms, and includes some sample feature selection algorithms from the literature to illustrate its use. Updated versions of the demonstration algorithms are provided (with many others) in the FEAST toolbox, which we also developed to support our research. A Java version of MIToolbox was ported from the C code, and is available here.

MIToolbox works on discrete inputs, and all continuous values *must* be
discretised before use with MIToolbox. Real-valued inputs will be discretised
with x = floor(x) to ensure compatibility. When given continuous inputs,
MIToolbox produces unreliable results, runs slowly and uses much more memory
than usual. The discrete inputs should also have small cardinality: MIToolbox
will treat values {1,10,100} the same way it treats {1,2,3}, but the latter
will be both faster and use less memory. The restriction to discrete inputs is
due to the difficulties in estimating information theoretic functions of
continuous variables.

Note: all functions are calculated in log base 2, so return units of "bits".

- Calculating Entropy, H(X)
- Calculating Conditional Entropy, H(X|Y)
- Calculating Mutual Information, I(X;Y)
- Calculating Conditional Mutual Information, I(X;Y|Z)
- Generating a joint random variable
- Calculating Renyi's Alpha Entropy, H_{\alpha}(X)
- Calculating Renyi's Alpha Mutual Information, I_{\alpha}(X;Y)
- Calculating the Weighted Entropy, H_w(X)
- Calculating the Weighted Conditional Entropy, H_w(X|Y)
- Calculating the Weighted Mutual Information, I_w(X;Y)
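
For reference, the quantities in this list are usually written as follows. These are the standard textbook forms, given here as an aid to reading the list rather than as a statement of how MIToolbox computes them; p denotes a probability distribution over discrete states (in practice estimated from the data), and w(x) is a non-negative weight attached to each state in Guiasu's formulation.

$$
\begin{aligned}
H(X) &= -\sum_{x} p(x) \log_2 p(x) \\
H(X|Y) &= -\sum_{x,y} p(x,y) \log_2 p(x|y) = H(X,Y) - H(Y) \\
I(X;Y) &= \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)} = H(X) - H(X|Y) \\
I(X;Y|Z) &= \sum_{x,y,z} p(x,y,z) \log_2 \frac{p(x,y|z)}{p(x|z)\,p(y|z)} \\
H_{\alpha}(X) &= \frac{1}{1-\alpha} \log_2 \sum_{x} p(x)^{\alpha}, \qquad \alpha \ge 0,\ \alpha \ne 1 \\
H_w(X) &= -\sum_{x} w(x)\, p(x) \log_2 p(x)
\end{aligned}
$$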

```
$ y = [1 1 1 0 0]';
$ x = [1 0 1 1 0]';

$ mi(x,y) %% mutual information I(X;Y)
ans =
    0.0200

$ h(x) %% entropy H(X)
ans =
    0.9710

$ condh(x,y) %% conditional entropy H(X|Y)
ans =
    0.9510

$ h( [x,y] ) %% joint entropy H(X,Y)
ans =
    1.9219

$ joint([x,y]) %% joint random variable XY
ans = [1,2,1,3,4]';
```

Also provided are example implementations of 3 feature selection algorithms (CMIM, DISR, mRMR-D) which use the functions provided by MIToolbox. These example algorithms are provided in two forms, one coded in MATLAB and one coded in C using the MATLAB mex interface. The library is written in ANSI C for compatibility with the MATLAB mex compiler.

All MIToolbox code is licensed under the 3-clause BSD license, except the feature selection algorithms which are provided as is, with no warranty, for demonstration purposes.

If you use this toolbox for academic research please cite as:

Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection

Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján

*Journal of Machine Learning Research (JMLR)*. Volume 13, Pages 27-66, 2012.

- MATLAB/OCTAVE - run matlab/CompileMIToolbox.m
- Linux/Mac OSX C shared library - use the included makefile

MLOSS Project Page

I've hosted the source on GitHub here for anyone who wishes to browse through it. It will be updated at approximately the same time as this page.

- 07/01/2017 - v3.0.0 - Refactored internals to expose integer information theoretic calculations.
- 10/01/2016 - v2.1.2 - Relicense from LGPL to BSD. Added checks to ensure input MATLAB types are doubles.
- 02/02/2015 - v2.1.1 - Fixed up the Makefile so it installs the headers too.
- 22/02/2014 - v2.1 - Fixed a couple of bugs related to memory handling. Added a make install for compatibility with PyFeast.
- 30/08/2012 - v2.00 - Released the weighted information theory functions.
- 08/11/2011 - v1.03 - Minor documentation changes to accompany the JMLR publication.
- 15/10/2010 - v1.02 - Fixed a bug where MIToolbox would cause a segmentation fault if an x-by-0 empty matrix was passed in. It now prints an error message and returns gracefully.
- 02/09/2010 - v1.01 - Updated CMIM.m in demonstration_algorithms, due to a bug where the last feature would not be selected first if it had the highest MI.
- 07/07/2010 - v1.00 - Initial Release