Semi-supervised Node Splitting for Random Forest Construction

1 June 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 20 (ISSN 1063-6919), pp. 492-499
https://doi.org/10.1109/cvpr.2013.70

Abstract

Node splitting is an important issue in Random Forest, but robust splitting requires a large number of training samples. Existing solutions fail to properly partition the feature space when training data are insufficient. In this paper, we present semi-supervised splitting, which overcomes this limitation by splitting nodes under the guidance of both labeled and unlabeled data. In particular, we derive a nonparametric algorithm that obtains an accurate measure of splitting quality by incorporating abundant unlabeled data. To avoid the curse of dimensionality, we project the data points from the original high-dimensional feature space onto a low-dimensional subspace before estimation. A unified optimization framework is proposed to select a coupled pair of subspace and separating hyperplane such that the smoothness of the subspace and the quality of the splitting are guaranteed simultaneously. The proposed algorithm is compared with state-of-the-art supervised and semi-supervised algorithms on typical computer vision applications such as object categorization and image segmentation. Experimental results on publicly available datasets demonstrate the superiority of our method.
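
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper selects the subspace and hyperplane through a unified optimization, whereas the sketch below swaps in a random search over one-dimensional projections. It assumes a Gaussian kernel estimate in the projected space to give unlabeled points soft class weights, and uses information gain as the quality measure. All function names, the bandwidth, and the number of candidates are hypothetical choices made for illustration.

```python
import numpy as np

def soft_labels(z_l, y_l, z_u, n_classes, bandwidth):
    """Kernel-weighted class posteriors for unlabeled points (1-D projections)."""
    diff = z_u[:, None] - z_l[None, :]
    kernel = np.exp(-0.5 * (diff / bandwidth) ** 2)   # shape (n_u, n_l)
    onehot = np.eye(n_classes)[y_l]                   # shape (n_l, n_classes)
    post = kernel @ onehot + 1e-12                    # avoid division by zero
    return post / post.sum(axis=1, keepdims=True)

def entropy(counts):
    """Shannon entropy (bits) of a vector of non-negative class weights."""
    total = counts.sum()
    if total <= 0:
        return 0.0
    p = counts / total
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def split_gain(z, weights, threshold):
    """Information gain of splitting projected points z at the threshold."""
    left = z <= threshold
    c_all = weights.sum(axis=0)
    c_left = weights[left].sum(axis=0)
    c_right = weights[~left].sum(axis=0)
    n = c_all.sum()
    return (entropy(c_all)
            - c_left.sum() / n * entropy(c_left)
            - c_right.sum() / n * entropy(c_right))

def best_semi_supervised_split(X_l, y_l, X_u, n_classes,
                               n_candidates=50, bandwidth=0.5, seed=0):
    """Random search over (direction, threshold) pairs; an assumption,
    standing in for the paper's unified optimization."""
    rng = np.random.default_rng(seed)
    onehot = np.eye(n_classes)[y_l]
    best_gain, best_w, best_t = -np.inf, None, None
    for _ in range(n_candidates):
        w = rng.normal(size=X_l.shape[1])
        w /= np.linalg.norm(w)                        # unit-length direction
        z_l, z_u = X_l @ w, X_u @ w                   # project to 1-D
        weights = np.vstack(
            [onehot, soft_labels(z_l, y_l, z_u, n_classes, bandwidth)])
        z = np.concatenate([z_l, z_u])
        t = rng.uniform(z.min(), z.max())
        gain = split_gain(z, weights, t)
        if gain > best_gain:
            best_gain, best_w, best_t = gain, w, t
    return best_gain, best_w, best_t

# Toy usage: two well-separated classes, 3 labeled and 100 unlabeled points each.
data_rng = np.random.default_rng(1)
X_l = np.vstack([data_rng.normal(-3, 1, (3, 5)), data_rng.normal(3, 1, (3, 5))])
y_l = np.array([0, 0, 0, 1, 1, 1])
X_u = np.vstack([data_rng.normal(-3, 1, (100, 5)), data_rng.normal(3, 1, (100, 5))])
gain, w, t = best_semi_supervised_split(X_l, y_l, X_u, n_classes=2)
print(f"best gain {gain:.3f} at threshold {t:.3f}")
```

Labeled points contribute hard one-hot weights and unlabeled points contribute soft weights, so the split score reflects where the unlabeled mass lies even when only a handful of labels reach the node.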

Cited by 64 articles