Incorporating prior information in machine learning by creating virtual examples

Abstract
One of the key problems in supervised learning is the insufficient size of the training set. The natural way for an intelligent learner to counter this problem and generalize successfully is to exploit prior information that may be available about the domain or that can be learned from prototypical examples. We discuss the idea of incorporating prior knowledge by creating virtual examples, thereby expanding the effective size of the training set. We show that in some contexts this idea is mathematically equivalent to incorporating the prior knowledge as a regularizer, suggesting that the strategy is well motivated. The process of creating virtual examples in real-world pattern recognition tasks is highly nontrivial. We provide demonstrative examples from object recognition and speech recognition to illustrate the idea.
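
As a concrete illustration of the idea summarized above, the sketch below expands a small labeled image set with virtual examples generated by a transformation under which the labels are assumed to be invariant (here, a horizontal mirror flip). The function name, the choice of transformation, and the toy data are illustrative assumptions, not the implementation used in the paper; the appropriate transformations depend on the prior knowledge available for the task.

```python
import numpy as np

def make_virtual_examples(images, labels):
    """Expand a training set with virtual examples.

    Assumes each image is a 2-D numpy array and that the label is
    invariant under a horizontal mirror flip (a hypothetical prior;
    the right transformations are task-dependent).
    """
    virtual_images, virtual_labels = [], []
    for img, lbl in zip(images, labels):
        virtual_images.append(img)             # original example
        virtual_images.append(np.fliplr(img))  # virtual example from prior knowledge
        virtual_labels.extend([lbl, lbl])      # label unchanged by the transformation
    return np.stack(virtual_images), np.array(virtual_labels)

# Usage: a toy training set of two 4x4 "images" doubles in effective size.
X = [np.arange(16).reshape(4, 4), np.eye(4)]
y = [0, 1]
X_aug, y_aug = make_virtual_examples(X, y)
print(X_aug.shape, y_aug.shape)  # (4, 4, 4) (4,)
```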