NOTE: The following materials are presented for timely dissemination of academic and technical work. Copyright and all other rights therein are reserved by the authors and/or other copyright holders. Personal use of the following materials is permitted; however, anyone using the materials or information is expected to adhere to the terms and constraints invoked by the relevant copyright.


Semi-supervised Learning via Regularized Boosting Working on Multiple Semi-supervised Assumptions


ABSTRACT

Semi-supervised learning concerns the problem of learning in the presence of labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes all three semi-supervised assumptions, i.e., the smoothness, cluster, and manifold assumptions, into account together during boosting learning. In this paper, we propose a novel cost functional consisting of the margin cost on labeled data and a regularization penalty on unlabeled data based on the three fundamental semi-supervised assumptions. Minimizing the proposed cost functional with a greedy yet stage-wise functional optimization procedure leads to a generic boosting framework for semi-supervised learning. Extensive experiments demonstrate that our algorithm yields favorable results on benchmark and real-world classification tasks in comparison to state-of-the-art semi-supervised learning algorithms, including recently developed boosting algorithms. Finally, we discuss relevant issues and relate our algorithm to previous work.
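To give a rough flavor of the stage-wise functional minimization described above, the sketch below implements a generic regularized boosting loop in Python. It is not the paper's exact formulation: the exponential margin cost on labeled data, the graph-Laplacian smoothness penalty, and the exponential confidence penalty on unlabeled data are stand-ins assumed here for the paper's three-assumption regularizer, and all names and parameter values (lam_graph, lam_cluster, n_rounds, shrinkage) are illustrative.

```python
# Hedged sketch of regularized semi-supervised boosting by functional
# gradient descent. The specific cost terms below are illustrative
# assumptions, not the paper's exact cost functional.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics.pairwise import rbf_kernel


def semi_supervised_boost(X_l, y_l, X_u, n_rounds=50,
                          lam_graph=0.1, lam_cluster=0.1, shrinkage=0.5):
    """Greedy stage-wise minimization of a cost made of a margin term on
    labeled data plus regularization terms on unlabeled data.
    Labels y_l are assumed to be in {-1, +1}."""
    X = np.vstack([X_l, X_u])
    n_l, n = len(X_l), len(X)

    # Similarity graph over all points (smoothness / manifold assumptions).
    W = rbf_kernel(X, X, gamma=1.0)
    np.fill_diagonal(W, 0.0)

    F = np.zeros(n)            # current ensemble output F(x) on all points
    learners, steps = [], []

    for _ in range(n_rounds):
        # Negative functional gradient of the total cost at each point.
        grad = np.zeros(n)

        # Margin cost on labeled data: exp(-y F(x)).
        grad[:n_l] += y_l * np.exp(-y_l * F[:n_l])

        # Graph smoothness penalty: sum_ij W_ij (F_i - F_j)^2.
        grad -= lam_graph * 4.0 * (W.sum(axis=1) * F - W @ F)

        # Cluster assumption: penalize low-confidence outputs on unlabeled
        # data via exp(-|F|), pushing |F(x)| away from the decision boundary.
        grad[n_l:] += lam_cluster * np.sign(F[n_l:]) * np.exp(-np.abs(F[n_l:]))

        # Greedy step: fit a base learner to the negative gradient and add
        # it to the ensemble with a fixed shrinkage step size.
        h = DecisionTreeRegressor(max_depth=2).fit(X, grad)
        F += shrinkage * h.predict(X)
        learners.append(h)
        steps.append(shrinkage)

    return learners, steps
```

Under these assumptions, a new point would be classified by the sign of the accumulated ensemble output, i.e., sign(sum_t step_t * h_t.predict(x)); a line search for each step size could replace the fixed shrinkage used here.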


Click tpami2011.pdf for full text.