Abstract
Detecting and extracting commercial breaks from a TV program is important for efficient video storage and transmission. In this work, we approach this problem by using both visual and audio information. Commercial breaks have several distinctive characteristics, such as a restricted temporal length, a high cut frequency, a high level of action, and delimiting black frames and silences, which can be used to separate them from regular TV programs. We therefore propose a feature-based commercial break detection system. We first perform a coarse-level detection of commercial breaks using purely visual information, since the high activity and high cut frequency are reflected in the statistics of measurable features. In the second step, we refine the detected break boundaries by integrating audio cues, exploiting the short period of silence that separates commercial breaks from the TV program. Two audio features, the short-time energy and the short-time average zero-crossing rate, are extracted for silence detection. In the last step, we return to the visual domain to achieve frame-level precision by locating the black frames. Extensive experiments show that combining visual and audio information yields accurate commercial break detection results. © (2000) COPYRIGHT SPIE--The International Society for Optical Engineering.
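The two audio features named in the abstract can be illustrated with a minimal sketch. The abstract does not give frame sizes, thresholds, or the decision rule, so the function names, the default values of `frame_len`, `hop`, `energy_thresh`, and `zcr_thresh`, and the rule "a frame is silent when both features fall below their thresholds" are all assumptions made here for illustration, not the authors' method.

```python
def short_time_features(samples, frame_len=256, hop=128):
    """Return (energies, zcrs), one value per frame, for a mono signal.

    samples is a list of floats. frame_len and hop are in samples;
    both defaults are illustrative assumptions.
    """
    energies, zcrs = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean squared amplitude over the frame.
        energies.append(sum(x * x for x in frame) / frame_len)
        # Short-time average zero-crossing rate: fraction of adjacent
        # sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs


def silent_frames(samples, energy_thresh=1e-4, zcr_thresh=0.1,
                  frame_len=256, hop=128):
    """Flag each frame as silent (True) or not (False).

    Assumed rule: silent when both energy and zero-crossing rate are
    below their thresholds. The thresholds are hypothetical and would
    need tuning on real broadcast audio.
    """
    energies, zcrs = short_time_features(samples, frame_len, hop)
    return [e < energy_thresh and z < zcr_thresh
            for e, z in zip(energies, zcrs)]
```

A run of consecutive silent frames near a coarse visual boundary would then be a candidate for the refined break boundary; how long that run must be is another parameter the abstract leaves unspecified.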