NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes
Open Access
- 1 July 2006
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 34 (Web Server), W394-W399
- https://doi.org/10.1093/nar/gkl244
Abstract
Microbiologists conducting surveys of bacterial and archaeal diversity often require comparative alignments of thousands of 16S rRNA genes collected from a sample. The computational resources and bioinformatics expertise required to construct such an alignment have inhibited high-throughput analysis. It was hypothesized that an online tool could be developed to efficiently align thousands of 16S rRNA genes via the NAST (Nearest Alignment Space Termination) algorithm for creating multiple sequence alignments (MSA). The tool was implemented with a web interface at http://greengenes.lbl.gov/NAST. Each user-submitted sequence is compared with Greengenes' 'Core Set', comprising approximately 10,000 aligned non-chimeric sequences representative of the currently recognized diversity among bacteria and archaea. User sequences are oriented and paired with their closest match in the Core Set, which serves as a template for inserting gap characters. Non-16S data (sequence from vector or surrounding genomic regions) are removed from the returned alignment. From the resulting MSA, distance matrices can be calculated for diversity estimates and organisms can be classified by taxonomy. The ability to align and categorize large sequence sets using a simple interface has enabled researchers with various experience levels to obtain bacterial and archaeal community profiles.
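The template step described in the abstract can be illustrated with a minimal Python sketch. It is not the NAST implementation: the function names, the k-mer based template search, the Needleman-Wunsch scoring values, and the toy sequences are all assumptions made for illustration. The sketch picks the closest template from a tiny aligned "core set", aligns the query pairwise to the ungapped template, and then copies the template's gap pattern onto the query so that the result has the same number of columns as the core alignment. Query bases with no partner in the template (for example flanking vector sequence) are simply dropped here; the abstract does not say how the published algorithm treats insertions inside the gene, and sequence orientation is not handled.

```python
from typing import Dict, Tuple

GAP = "-"


def kmers(seq: str, k: int = 4) -> set:
    """Return the set of all k-mers in an ungapped sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def closest_template(query: str, core_set: Dict[str, str], k: int = 4) -> str:
    """Name of the aligned template sharing the most k-mers with the query.

    Assumption: shared k-mer count stands in for whatever similarity
    search the real server uses.
    """
    q = kmers(query, k)
    return max(
        core_set,
        key=lambda name: len(q & kmers(core_set[name].replace(GAP, ""), k)),
    )


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> Tuple[str, str]:
    """Plain Needleman-Wunsch global alignment (illustrative scores)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:  # trace back from the bottom-right corner
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(GAP); i -= 1
        else:
            out_a.append(GAP); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def project_onto_template(query: str, aligned_template: str) -> str:
    """Give the query the same gap pattern (and length) as the template."""
    template = aligned_template.replace(GAP, "")
    q_aln, t_aln = global_align(query, template)
    # Keep one query character per template base; query bases aligned to a
    # gap in the template are dropped in this simplified sketch.
    paired = iter(q for q, t in zip(q_aln, t_aln) if t != GAP)
    return "".join(GAP if col == GAP else next(paired)
                   for col in aligned_template)


# Hypothetical two-sequence "core set" sharing an 11-column alignment.
core = {"A": "ACG-TAC--GT", "B": "TT-GGCC-AAT"}
query = "ACGTTCGT"
best = closest_template(query, core)
print(best, project_onto_template(query, core[best]))  # A ACG-TTC--GT
```

Because every projected sequence ends up with the column count of the core alignment, sequences processed independently can be stacked into one MSA without realigning them against each other, which is what makes the approach practical for thousands of sequences.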