Learning Deep Architectures and Applications

IEEE-INNS IJCNN 2011 Tutorial, San Jose, California, U.S.A., July 31 2011


Deep architectures (DAs) refer to a family of computational models composed of multiple levels of nonlinear operations, e.g., neural networks or Bayesian belief networks with many hidden layers. Research in neuroscience has uncovered that the brain extracts multiple levels of representation from sensory input to achieve its impressive performance in perceptual tasks ranging from visual to speech perception. Motivated by these neuroscience findings, computational learning theory shows that modeling complex behaviors, such as machine perception, requires highly varying nonlinear functions that express high-level abstractions of the sensory input. Furthermore, recent theoretical results also suggest that existing shallow architectures, e.g., SVMs, suffer from inherent limitations that make them ill-suited to efficiently learning the highly varying nonlinear functions demanded in modeling complex behaviors. As a result, DAs have become a natural choice for developing complex intelligent systems. Learning DAs, however, turns out to be a very difficult optimization task; in general, the traditional back-propagation algorithm based solely on supervised learning does not work well for training DAs. Recently, a breakthrough was made in learning DAs with a hybrid learning framework, and DAs trained by hybrid learning algorithms have achieved state-of-the-art performance on several challenging machine perception tasks.

This tutorial will cover the main topics in state-of-the-art DAs and their successful applications in a systematic way, organized into four parts. The first part introduces the background and core concepts of DAs and briefly discusses their theoretical justification. The second part overviews typical building blocks of DAs, e.g., the restricted Boltzmann machine (RBM) and the de-noising auto-associator (DNAA); basic learning algorithms for these building blocks, e.g., contrastive divergence and stochastic gradient descent; and the essential hybrid strategy for learning DAs, which combines local greedy layerwise unsupervised learning with global supervised learning via fine-tuning of parameters. The third part presents a number of successful applications of DAs in machine perception, e.g., handwritten character recognition, object recognition and learning intrinsic speaker-specific characteristics. The last part covers the latest progress in DA research and discusses relevant issues and open research topics.
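The hybrid strategy outlined above can be illustrated in a few lines of code. The sketch below, which assumes nothing beyond the generic recipe (the layer sizes, learning rate, number of epochs and toy data are all arbitrary illustrative choices, not values from the tutorial), trains a small stack of RBMs with one-step contrastive divergence (CD-1), where each RBM learns from the hidden representation produced by the layer beneath it:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """A minimal binary restricted Boltzmann machine (illustrative sketch)."""
    def __init__(self, n_visible, n_hidden):
        self.W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back down and up again.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # CD-1 approximation to the log-likelihood gradient.
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def pretrain(data, layer_sizes, epochs=200):
    """Greedy layerwise unsupervised pretraining of a stack of RBMs."""
    rbms, x = [], data
    for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
        rbm = RBM(n_vis, n_hid)
        for _ in range(epochs):
            rbm.cd1_update(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)   # propagate up to train the next layer
    return rbms

# Toy binary data: two repeated patterns.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)
stack = pretrain(data, layer_sizes=[4, 3, 2])

# After pretraining, the bottom RBM should reconstruct the data well.
recon = stack[0].visible_probs(stack[0].hidden_probs(data))
err = np.mean((data - recon) ** 2)
```

In the full hybrid scheme discussed in the tutorial, the pretrained weights would then initialize a feed-forward network whose parameters are fine-tuned globally with supervised back-propagation; that second stage is omitted here for brevity.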

In summary, this tutorial provides a thorough understanding of the problems in learning DAs, the currently proposed solutions and their applications to real-world problems. It will help postgraduate researchers and practitioners in neural computation, machine learning and pattern recognition gain a working knowledge of DAs and insight into the current state of DA research and applications.


Biosketch of the speaker

Ke Chen received his BSc and MSc degrees from Nanjing University in 1984 and 1987, respectively, and his PhD from HIT in 1990, all in Computer Science. He has been with The University of Manchester since 2003. During 1990-2003, he was with The University of Birmingham, Peking University, The Ohio State University, Kyushu Institute of Technology, and Tsinghua University. He was a visiting professor at Microsoft Research Asia in 2000 and Hong Kong Polytechnic University in 2001. He has been on the editorial boards of several international academic journals, including IEEE Transactions on Neural Networks (2005-2010), and serves as the category editor of Pattern Recognition in Scholarpedia (2006-present). He has served as a technical program co-chair of several international conferences, e.g., IJCNN (2012) and ICNC (2005), and as a member of numerous international conference program committees, such as IJCNN and CogSci. In 2008 and 2009, he chaired the Intelligent Systems Applications Technical Committee (ISATC) and the University Curricula Subcommittee of the IEEE Computational Intelligence Society (IEEE CIS). He has also served as a task force chair and a member of several technical committees in IEEE CIS, e.g., NNTC, ETTC and DMTC. He is a senior member of IEEE, a member of IEEE CIS and a member of INNS. He has received several academic awards, including the NSFC Distinguished Principal Young Investigator Award and the JSPS Research Award. He has published over 100 academic papers in refereed journals and conferences and has given more than 30 invited talks at universities and research institutes across the world. His current research interests include neural computation, pattern recognition, machine learning, machine perception and applications to real-world problems.

Updated: K. Chen 10-Jan-2011