The estimation of statistical parameters for local alignment score distributions
- 15 January 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 29 (2), 351-361
- https://doi.org/10.1093/nar/29.2.351
Abstract
The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution's parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described 'island' method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.
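As a rough illustration of the extreme-value description in the abstract, the sketch below uses the standard Gumbel form P(S >= x) = 1 - exp(-K m n e^(-lambda x)) for sequences of lengths m and n, and recovers lambda and K from simulated scores by a simple method-of-moments fit. This is not the island method the paper describes; the function names and the numerical values (lambda = 0.27, K = 0.04, m = n = 300) are hypothetical placeholders chosen only for the example, and the simulated scores are continuous and free of edge effects, unlike real alignment scores.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def evd_pvalue(score, m, n, lam, K):
    """P(S >= score) under the Gumbel form 1 - exp(-K*m*n*exp(-lam*score))."""
    expected_hits = K * m * n * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-expected_hits)


def sample_gumbel_scores(count, m, n, lam, K, seed=0):
    """Draw simulated scores by inverse-transform sampling (illustrative only)."""
    rng = random.Random(seed)
    mu = math.log(K * m * n) / lam  # location of the distribution
    scores = []
    while len(scores) < count:
        u = rng.random()
        if u == 0.0:  # avoid log(0); random() can return exactly 0.0
            continue
        scores.append(mu - math.log(-math.log(u)) / lam)
    return scores


def fit_gumbel_moments(scores, m, n):
    """Method-of-moments estimate of (lam, K); a stand-in for curve fitting."""
    count = len(scores)
    mean = sum(scores) / count
    var = sum((s - mean) ** 2 for s in scores) / (count - 1)
    lam = math.pi / math.sqrt(6.0 * var)   # Gumbel variance is pi^2 / (6 lam^2)
    mu = mean - EULER_GAMMA / lam          # Gumbel mean is mu + gamma / lam
    K = math.exp(lam * mu) / (m * n)       # from mu = ln(K*m*n) / lam
    return lam, K


if __name__ == "__main__":
    m = n = 300                      # assumed sequence lengths
    true_lam, true_K = 0.27, 0.04    # assumed values, not taken from the paper
    scores = sample_gumbel_scores(20000, m, n, true_lam, true_K)
    lam_hat, K_hat = fit_gumbel_moments(scores, m, n)
    print("estimated lambda:", lam_hat)
    print("estimated K:", K_hat)
    print("P(S >= 50):", evd_pvalue(50.0, m, n, lam_hat, K_hat))
```

Because K enters through exp(lambda * mu), small errors in the fitted lambda are amplified in K, which is one reason a moment fit like this one is only a starting point for parameter estimation.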