Analysis of symbolic sequences using the Jensen-Shannon divergence
- 25 March 2002
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review E
- Vol. 65 (4), 041905
- https://doi.org/10.1103/physreve.65.041905
Abstract
We study statistical properties of the Jensen-Shannon divergence D, which quantifies the difference between probability distributions, and which has been widely applied to analyses of symbolic sequences. We present three interpretations of D in the framework of statistical physics, information theory, and mathematical statistics, and obtain approximations of the mean, the variance, and the probability distribution of D in random, uncorrelated sequences. We present a segmentation method based on D that is able to segment a nonstationary symbolic sequence into stationary subsequences, and apply this method to DNA sequences, which are known to be nonstationary on a wide range of different length scales.
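The abstract does not give formulas, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the standard weighted form of the Jensen-Shannon divergence, D = H(w1 p1 + w2 p2) - w1 H(p1) - w2 H(p2), with weights proportional to the subsequence lengths and entropies measured in bits. The function names (shannon_entropy, jensen_shannon, best_cut) are hypothetical, and the sketch finds a single cut point only; it includes no significance test or stopping criterion.

```python
import math
from collections import Counter


def shannon_entropy(counts, total):
    """Shannon entropy in bits of the distribution given by symbol counts."""
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def jensen_shannon(left, right):
    """Length-weighted Jensen-Shannon divergence of two non-empty sequences.

    Assumption: weights are n1/n and n2/n, so the mixture distribution
    is simply the symbol frequencies of the concatenated sequence.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    c1, c2 = Counter(left), Counter(right)
    h_mix = shannon_entropy(c1 + c2, n)
    return (h_mix
            - (n1 / n) * shannon_entropy(c1, n1)
            - (n2 / n) * shannon_entropy(c2, n2))


def best_cut(seq):
    """Return (position, D) for the cut that maximizes D.

    Brute force, quadratic in len(seq); returns (None, 0.0) when no cut
    yields a positive divergence.
    """
    best_pos, best_d = None, 0.0
    for i in range(1, len(seq)):
        d = jensen_shannon(seq[:i], seq[i:])
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


if __name__ == "__main__":
    # Toy sequence with an obvious change in composition halfway through.
    print(best_cut("AAAATTTT"))
```

Applying best_cut recursively to the two resulting halves would give a multi-segment partition, but deciding when to stop splitting requires a significance criterion, which this sketch deliberately leaves out.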