L. Bo, X. Ren, and D. Fox.

Hierarchical Matching Pursuit for Recognition: Architecture and Fast Algorithms

Advances in Neural Information Processing Systems (NIPS) 2011



Extracting good representations from images is essential for many computer vision tasks. In this paper, we propose hierarchical matching pursuit (HMP), which builds a feature hierarchy layer-by-layer using an efficient matching pursuit encoder. It includes three modules: batch (tree) orthogonal matching pursuit, spatial pyramid max pooling, and contrast normalization. We investigate the architecture of HMP, and show that all three components are critical for good performance. To speed up the orthogonal matching pursuit, we propose a batch tree orthogonal matching pursuit that is particularly suited to encoding a large number of observations that share the same large dictionary. HMP is scalable and can efficiently handle full-size images. In addition, HMP enables linear support vector machines (SVMs) to match the performance of nonlinear SVMs while being scalable to large datasets. We compare HMP with many state-of-the-art algorithms including convolutional deep belief networks, SIFT-based single-layer sparse coding, and kernel-based feature learning. HMP consistently yields superior accuracy on three types of visual recognition problems: object recognition (Caltech-101), scene recognition (MIT-Scene), and static event recognition (UIUC-Sports).
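
To make the three modules named in the abstract concrete, here is a minimal single-layer sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the encoder is ordinary orthogonal matching pursuit rather than the batch tree variant the paper proposes, pooling takes the maximum absolute value per dictionary atom, and contrast normalization is assumed to be division by a smoothed L2 norm. All function names, the `eps` default, and the pyramid levels are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve a small square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def omp(atoms, y, sparsity):
    """Plain OMP (not the batch tree variant): greedily pick the atom most
    correlated with the residual, then refit all selected coefficients by
    least squares via the normal equations."""
    residual, chosen, coef = list(y), [], []
    for _ in range(sparsity):
        scores = [0.0 if j in chosen else abs(dot(a, residual))
                  for j, a in enumerate(atoms)]
        best = max(range(len(atoms)), key=scores.__getitem__)
        if scores[best] < 1e-12:  # residual already explained
            break
        chosen.append(best)
        G = [[dot(atoms[i], atoms[j]) for j in chosen] for i in chosen]
        coef = solve(G, [dot(atoms[i], y) for i in chosen])
        residual = [yi - sum(c * atoms[j][t] for c, j in zip(coef, chosen))
                    for t, yi in enumerate(y)]
    code = [0.0] * len(atoms)
    for c, j in zip(coef, chosen):
        code[j] = c
    return code

def max_pool(codes):
    """Assumed pooling rule: max absolute value per atom over a set of codes."""
    return [max(abs(c[k]) for c in codes) for k in range(len(codes[0]))]

def pyramid_pool(grid, levels=(1, 2)):
    """grid[r][c] is the sparse code at patch position (r, c). Each level n
    splits the grid into n x n cells and max-pools inside every cell.
    The grid must have at least max(levels) rows and columns."""
    H, W = len(grid), len(grid[0])
    feat = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                cell = [grid[r][c]
                        for r in range(i * H // n, (i + 1) * H // n)
                        for c in range(j * W // n, (j + 1) * W // n)]
                feat.extend(max_pool(cell))
    return feat

def contrast_normalize(f, eps=0.1):
    """Assumed normalization: divide by sqrt(||f||^2 + eps)."""
    norm = math.sqrt(dot(f, f) + eps)
    return [v / norm for v in f]

# Toy usage: encode a 2 x 2 grid of 3-dimensional "patches" with a 4-atom dictionary.
s = 1 / math.sqrt(2)
atoms = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [s, s, 0]]  # unit-norm atoms
patches = [[[3.0, 0.0, 1.0], [2.0, 2.0, 0.0]],
           [[0.0, 1.0, 0.0], [0.0, 0.0, 5.0]]]
codes = [[omp(atoms, p, sparsity=2) for p in row] for row in patches]
feature = contrast_normalize(pyramid_pool(codes))
```

In a full hierarchy, the pooled and normalized output of one layer would serve as the input observations for the next layer's encoder; the sketch stops after one layer to stay short.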


Full paper [pdf] (9 pages)

