Constructing Multigenome Views of Whole Microbial Genomes
- 1 January 1998
- journal article
- research article
- Published by Mary Ann Liebert Inc in Microbial & Comparative Genomics
- Vol. 3 (3), 177-192
- https://doi.org/10.1089/omi.1.1998.3.177
Abstract
We have designed and implemented a system to carry out cross-genome comparisons of open reading frames (ORFs) from multiple genomes. This implementation includes a genome profiling system that allows us to explore pairwise comparisons at different levels of match similarity and ask biologically motivated queries involving number and identity of ORFs, their function, functional category, distribution in genomes or in biological domains, and statistics on their matches and match families. This analysis required precise definition of new classification terms and concepts. We define the terms genomic signature, summary signature, biologic domain signature, domain class, match level, match family, and extended match family, then use these terms to define concepts, including genomically universal proteins and proteins characteristic of sets of genomes. We initiate an analysis based on automated FASTA (Pearson, 1996) comparison of 22,419 conceptually translated protein sequences from nine microbial genomes.
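As a rough illustration of the kind of query the abstract describes, the sketch below takes precomputed pairwise match records and, for a chosen match level, reports which genomes each ORF has matches in and which ORFs have matches in every genome. The record layout, the `identity` score, the threshold values, the toy data, and the function names are all assumptions made for illustration. The abstract does not give the paper's own definitions of match level, genomic signature, or genomically universal proteins, and they may differ from this reading.

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    """One hypothetical pairwise hit between ORFs of two genomes."""
    query_orf: str
    query_genome: str
    target_genome: str
    identity: float  # assumed similarity score in [0, 1]


def genomes_matched(matches, match_level):
    """Map each query ORF to the set of genomes where it has a hit
    at or above match_level. The ORF's own genome is always included."""
    hit_genomes = defaultdict(set)
    for m in matches:
        hit_genomes[m.query_orf].add(m.query_genome)
        if m.identity >= match_level:
            hit_genomes[m.query_orf].add(m.target_genome)
    return dict(hit_genomes)


def universal_orfs(matches, all_genomes, match_level):
    """ORFs with a hit in every genome (an assumed reading of 'universal')."""
    wanted = set(all_genomes)
    return sorted(
        orf
        for orf, found in genomes_matched(matches, match_level).items()
        if found >= wanted
    )


# Toy data, invented for illustration only.
GENOMES = ["A", "B", "C"]
MATCHES = [
    Match("a1", "A", "B", 0.80),
    Match("a1", "A", "C", 0.45),
    Match("a2", "A", "B", 0.30),
]
```

Calling `universal_orfs(MATCHES, GENOMES, match_level)` with progressively stricter thresholds shows how the set of ORFs shared across all genomes shrinks as the required similarity rises, which is the sort of exploration across match levels that the abstract mentions.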