KC.4 Boosting on Manifolds via Adaptive Regularization of Base Learners
As a generic ensemble learning framework, boosting works by sequentially constructing a linear combination of base learners that concentrate on difficult examples. As one of the most successful machine learning algorithms, adaptive boosting (AdaBoost) has revolutionized pattern recognition technology for over a decade. Although AdaBoost algorithms have been widely reported to be relatively immune to overfitting, it is now known that noisy data can cause them to overfit, particularly as the number of boosting steps grows.
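The sequential reweighting described above can be sketched as plain AdaBoost with decision stumps. This is a minimal generic illustration, not the project's marginal AdaBoost; all function names are assumptions made for this sketch.

```python
import numpy as np

def adaboost_train(X, y, n_rounds=20):
    """Generic AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)      # example weights, uniform at first
    ensemble = []                # list of (alpha, feature, threshold, polarity)
    for _ in range(n_rounds):
        # Exhaustively pick the stump with the lowest weighted error.
        best, best_err = None, np.inf
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.sign(X[:, j] - thr + 1e-12)
                    err = np.sum(w[pred != y])
                    if err < best_err:
                        best_err, best = err, (j, thr, pol)
        eps = max(best_err, 1e-12)
        alpha = 0.5 * np.log((1 - eps) / eps)   # base-learner coefficient
        j, thr, pol = best
        pred = pol * np.sign(X[:, j] - thr + 1e-12)
        # Reweight: misclassified examples gain weight, so the next stump
        # concentrates on the difficult examples.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def adaboost_predict(ensemble, X):
    """Sign of the weighted vote of all stumps in the ensemble."""
    score = np.zeros(len(X))
    for alpha, j, thr, pol in ensemble:
        score += alpha * pol * np.sign(X[:, j] - thr + 1e-12)
    return np.sign(score)
```

On an interval-shaped 1-D problem, which no single stump can represent, a few boosting rounds already combine stumps into a correct classifier.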
In order to remedy the overfitting problem, many efforts have been made to make boosting tolerant to outliers, using different strategies. Recently, Kegl and Wang (2004) proposed a new regularization method that explores and exploits the manifold structural information underlying the data from an alternative viewpoint. Through a specific form of AdaBoost, marginal AdaBoost, their algorithm adaptively selects base learners based on this manifold structural information. In this project, the student will investigate their algorithm through a Matlab implementation. The algorithm will be applied to synthetic and benchmark data sets to evaluate its effectiveness. The deliverable of this project will be a demo system with an appropriate interface.
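One way to get intuition for manifold-based regularization of base learners is to penalize stumps whose decision boundary cuts through dense regions of the data, measured on a k-nearest-neighbour graph. The sketch below is a rough illustration of that idea only, not the paper's actual criterion; the function names, the penalty definition, and the `lam` trade-off are all assumptions made for this sketch.

```python
import numpy as np

def knn_edges(X, k=3):
    """Edge set of a symmetric k-nearest-neighbour graph over the examples."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # no self-edges
    edges = set()
    for i in range(len(X)):
        for j in np.argsort(d[i])[:k]:
            edges.add((min(i, j), max(i, j)))
    return list(edges)

def smoothness_penalty(pred, edges):
    """Fraction of neighbour pairs on which the stump disagrees with itself,
    i.e. graph edges its decision boundary cuts."""
    return np.mean([pred[i] != pred[j] for i, j in edges])

def select_stump(X, y, w, edges, lam=0.5):
    """Pick the stump minimizing weighted error + lam * graph penalty
    (an illustrative regularized selection rule, not the paper's)."""
    best, best_cost = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = pol * np.sign(X[:, j] - thr + 1e-12)
                cost = np.sum(w[pred != y]) + lam * smoothness_penalty(pred, edges)
                if cost < best_cost:
                    best_cost, best = cost, (j, thr, pol)
    return best, best_cost
```

With two well-separated clusters, the selected stump is the one that splits between the clusters, since it incurs neither weighted error nor a graph penalty.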
References: B. Kegl and L. Wang, "Boosting on manifolds: adaptive regularization of base classifiers," Advances in Neural Information Processing Systems (NIPS) 17, 2004. [available online] http://books.nips.cc/
Prerequisites: A good mathematics background is essential. The project will be carried out in Matlab. Some knowledge of machine learning would be an advantage (e.g. COMP60431).