Hierarchical temporal video segmentation and content characterization

Abstract
This paper addresses the segmentation of a video sequence into shots, the specification of edit effects, and the subsequent characterization of shots in terms of color and motion content. The proposed scheme uses DC images extracted from MPEG-compressed video and performs unsupervised clustering to extract camera shots. Edit effects such as fade-in, fade-out, and dissolve are specified by analyzing the distribution of the mean value of the luminance components. The visual content of each temporal segment is then represented by key frames selected through similarity analysis of mean color histograms. To characterize similar temporal segments, motion and color characteristics are classified into categories using features derived from the motion vectors of triangular meshes and the mean histograms of video shots.
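
As an illustration of the key-frame step only, the following is a minimal sketch of one way a key frame could be chosen by comparing each frame's color histogram with the mean histogram of its shot. The abstract does not state the similarity measure or how many key frames are kept per segment, so the L1 distance, the single key frame per shot, and the function names (mean_histogram, l1_distance, select_key_frame) are assumptions made for this example and are not taken from the paper.

```python
def mean_histogram(histograms):
    """Element-wise mean of a list of equal-length, normalized histograms."""
    n = len(histograms)
    bins = len(histograms[0])
    return [sum(h[b] for h in histograms) / n for b in range(bins)]


def l1_distance(h1, h2):
    """Sum of absolute bin differences (assumed similarity measure)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def select_key_frame(histograms):
    """Return the index of the frame closest to the shot's mean histogram.

    Assumption: one key frame per shot, chosen by minimum L1 distance.
    """
    mean_hist = mean_histogram(histograms)
    distances = [l1_distance(h, mean_hist) for h in histograms]
    return min(range(len(histograms)), key=distances.__getitem__)


# Toy shot of three frames, each described by a 3-bin normalized histogram.
# In the scheme described above, these would come from the DC images.
shot = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
key_index = select_key_frame(shot)
```

In this toy input the middle frame lies closest to the mean histogram, so it is the one selected. A different distance (for example histogram intersection) or several key frames per segment would fit the same structure.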