This toolbox is aimed at people working with discrete datasets for classification, and all functions expect discrete inputs. It provides implementations of Shannon's information theory functions, Rényi's entropy, and the alpha divergence. Version 2.0 adds weighted information theory functions based on the work of S. Guiasu in "Information Theory with Applications" (1977). The toolbox was developed to support our research into feature selection algorithms and includes some sample feature selection algorithms from the literature to illustrate its use. Updated versions of the demonstration algorithms are provided (with many others) in the FEAST toolbox, which we also developed to support our research. A Java version of MIToolbox was ported from the C code and is available here.
Note: all functions use base-2 logarithms, so results are returned in bits.
$ y = [1 1 1 0 0]';
$ x = [1 0 1 1 0]';
$ mi(x,y) %% mutual information I(X;Y)
$ h(x) %% entropy H(X)
$ condh(x,y) %% conditional entropy H(X|Y)
$ h( [x,y] ) %% joint entropy H(X,Y)
$ joint([x,y]) %% joint random variable XY
ans = [1,2,1,3,4]';
Also provided are example implementations of three feature selection algorithms (CMIM, DISR, mRMR-D) that use the functions provided by MIToolbox. Each example algorithm comes in two forms: one coded in MATLAB and one coded in C using the MATLAB mex interface. The library is written in ANSI C for compatibility with the MATLAB mex compiler.
All MIToolbox code is licensed under the 3-clause BSD license, except the feature selection algorithms, which are provided as is, with no warranty, for demonstration purposes.
If you use this toolbox for academic research please cite as:
Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján
Journal of Machine Learning Research (JMLR). Volume 13, Pages 27-66, 2012.
To compile for MATLAB/Octave, run CompileMIToolbox.m. To build the C shared library on Linux, use the included makefile.
I've hosted the source on GitHub here for anyone who wishes to browse through it. It will be updated at approximately the same time as this page.