The TIGR Plant Transcript Assemblies database
Open Access
- 6 November 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Database), D846-D851
- https://doi.org/10.1093/nar/gkl785
Abstract
The TIGR Plant Transcript Assemblies (TA) database uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed sequence tags (ESTs) and full-length and partial cDNAs, but exclude computationally predicted gene sequences. The TA database includes all plant species for which more than 1000 EST or cDNA sequences are publicly available. The EST and cDNA sequences are first clustered based on an all-versus-all pairwise sequence comparison, followed by the generation of consensus sequences (TAs) from individual clusters. The clustering and assembly procedures use the TGICL tool, Megablast and the CAP3 assembler. The UniProt Reference Clusters (UniRef100) protein database is used as the reference database for the functional annotation of the assemblies. The transcription orientation of each TA is determined based on the orientation of the alignment with the best protein hit. The TA sequences and annotation are available via web interfaces and FTP downloads. Assemblies can be retrieved by a text-based keyword search or a sequence-based BLAST search. The current version of the TA database is Release 2 (July 17, 2006) and includes a total of 215 plant species.
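The three decision steps named in the abstract (species selection, clustering from pairwise comparisons, and orientation from the best protein hit) can be illustrated with a minimal Python sketch. This is not the TIGR pipeline itself, which relies on TGICL, Megablast and CAP3. The function names, the input data structures, the use of single-linkage clustering via union-find, and the use of an alignment score to pick the "best" protein hit are all assumptions made for illustration; consensus generation (the CAP3 step) is not modelled.

```python
from collections import defaultdict

# Threshold taken from the abstract: species need MORE than 1000 sequences.
MIN_SEQUENCES = 1000


def eligible_species(counts):
    """Return species whose EST/cDNA count exceeds the threshold.

    `counts` is a hypothetical dict: species name -> number of sequences.
    """
    return sorted(s for s, n in counts.items() if n > MIN_SEQUENCES)


def cluster(seq_ids, pairwise_hits):
    """Group sequences so that any two linked by a pairwise hit share a cluster.

    Assumption: single-linkage clustering with union-find. `pairwise_hits`
    is a hypothetical list of (id_a, id_b) pairs that already passed
    whatever similarity criteria the all-versus-all comparison applies.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairwise_hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    return sorted(sorted(g) for g in groups.values())


def orientation(protein_hits):
    """Return the strand of the highest-scoring protein alignment, or None.

    `protein_hits` is a hypothetical list of dicts with "score" and
    "strand" keys; an assembly without hits stays unoriented.
    """
    if not protein_hits:
        return None
    best = max(protein_hits, key=lambda h: h["score"])
    return best["strand"]


if __name__ == "__main__":
    print(eligible_species({"species A": 1001, "species B": 1000}))
    print(cluster(["e1", "e2", "e3", "c1"], [("e1", "e2"), ("e2", "c1")]))
    print(orientation([{"score": 50, "strand": "+"}, {"score": 80, "strand": "-"}]))
```

Union-find is used here because it turns a list of pairwise links into clusters in near-linear time, which matters when the input is an all-versus-all comparison; a production pipeline would apply further overlap and identity criteria before accepting a link.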