Sharing Visual Features for Multiclass and Multiview Object Detection
- 19 March 2007
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence
- Vol. 29 (5), 854-869
- https://doi.org/10.1109/tpami.2007.1055
Abstract
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (runtime) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. We present a multitask learning procedure, based on boosted decision stumps, that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required and, therefore, the runtime cost of the classifier, is observed to scale approximately logarithmically with the number of classes. The features selected by joint training are generic edge-like features, whereas the features chosen by training each class separately tend to be more object-specific. The generic features generalize better and considerably reduce the computational cost of multiclass object detection.
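The idea of boosting decision stumps that are shared across classes can be illustrated with a small sketch. The abstract gives no algorithmic details, so everything below is an assumption made for illustration: the function names (fit_shared_stump, stump_output, joint_boost, score) are hypothetical, the weights follow a GentleBoost-style exponential loss, and the class subset is chosen by exhaustive enumeration, which is only feasible for a handful of classes. It is not the authors' implementation.

```python
import itertools
import numpy as np

def fit_shared_stump(X, Z, W, subset):
    """Fit one regression stump shared by the classes in `subset`.

    X: (n, d) feature responses; Z: (n, C) labels in {-1, +1};
    W: (n, C) per-sample, per-class boosting weights.
    Returns (weighted squared error, feature, threshold, a, b, k),
    or None if every feature is constant.
    """
    C = Z.shape[1]
    shared = np.zeros(C, dtype=bool)
    shared[list(subset)] = True
    # Classes outside the subset only receive a constant k_c.
    k = np.where(shared, 0.0, (W * Z).sum(0) / W.sum(0))
    err_out = (W * (Z - k) ** 2)[:, ~shared].sum()
    # Pool weights over the sharing classes: one stump must serve them all.
    w = W[:, shared].sum(1)
    wz = (W * Z)[:, shared].sum(1)
    best = None
    for f in range(X.shape[1]):
        for theta in np.unique(X[:, f])[:-1]:  # keeps both sides non-empty
            above = X[:, f] > theta
            a = wz[above].sum() / w[above].sum()
            b = wz[~above].sum() / w[~above].sum()
            pred = np.where(above, a, b)[:, None]
            err = err_out + (W * (Z - pred) ** 2)[:, shared].sum()
            if best is None or err < best[0]:
                best = (err, f, theta, a, b, k)
    return best

def stump_output(X, stump):
    """Evaluate one stump: the feature is thresholded once, then reused
    by every class in the stump's subset; other classes get k_c."""
    f, theta, a, b, k, subset = stump
    out = np.tile(k, (X.shape[0], 1))
    out[:, list(subset)] = np.where(X[:, f] > theta, a, b)[:, None]
    return out

def joint_boost(X, Z, rounds=10):
    """Train all class detectors jointly (assumes X has a non-constant feature)."""
    n, C = Z.shape
    H = np.zeros((n, C))
    # Assumption: enumerate all 2^C - 1 non-empty subsets (small C only).
    subsets = [s for r in range(1, C + 1)
               for s in itertools.combinations(range(C), r)]
    model = []
    for _ in range(rounds):
        W = np.exp(-Z * H)  # exponential-loss weights
        fits = [(fit_shared_stump(X, Z, W, s), s) for s in subsets]
        (err, f, theta, a, b, k), s = min(fits, key=lambda t: t[0][0])
        model.append((f, theta, a, b, k, s))
        H += stump_output(X, model[-1])
    return model

def score(X, model):
    """Additive score per class; a positive value means 'class present'."""
    return sum(stump_output(X, stump) for stump in model)
```

Calling joint_boost(X, Z, rounds=R) with X of shape (n, d) and Z of shape (n, C) returns a list of R stumps. Because each stump evaluates a single feature for all classes in its subset, the number of features computed at run time equals the number of rounds rather than the number of rounds times the number of classes, which is the cost saving the abstract refers to.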