The TIGR Plant Transcript Assemblies database

Open Access

6 November 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 35 (Database), D846-D851
https://doi.org/10.1093/nar/gkl785

Abstract

The TIGR Plant Transcript Assemblies (TA) database uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed sequence tags (ESTs) and full-length and partial cDNAs, but exclude computationally predicted gene sequences. The TA database includes all plant species for which more than 1000 EST or cDNA sequences are publicly available. The EST and cDNA sequences are first clustered based on an all-versus-all pairwise sequence comparison, followed by the generation of consensus sequences (TAs) from individual clusters. The clustering and assembly procedures use the TGICL tool, Megablast and the CAP3 assembler. The UniProt Reference Clusters (UniRef100) protein database is used as the reference database for the functional annotation of the assemblies. The transcription orientation of each TA is determined based on the orientation of the alignment with the best protein hit. The TA sequences and annotation are available via web interfaces and FTP downloads. Assemblies can be retrieved by a text-based keyword search or a sequence-based BLAST search. The current version of the TA database is Release 2 (July 17, 2006) and includes a total of 215 plant species.
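
Two steps described in the abstract, grouping sequences from pairwise comparisons and orienting an assembly from its best protein hit, can be illustrated with a minimal Python sketch. This is not the TGICL, Megablast or CAP3 implementation: the function names, the input shapes (pairs of sequence IDs, hit dictionaries with `score` and `frame` keys) and the use of simple transitive grouping are assumptions made for illustration, and the overlap criteria and consensus generation of the real pipeline are not reproduced.

```python
from collections import defaultdict


def cluster_by_pairwise_hits(seq_ids, hits):
    """Group sequences by transitive closure over pairwise hits (union-find).

    `hits` is an iterable of (id_a, id_b) pairs that are assumed to have
    already passed some overlap filter; the filter itself is not modelled.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


def orientation_from_best_hit(protein_hits):
    """Return '+' or '-' from the frame of the highest-scoring protein hit.

    Each hit is a dict with a numeric 'score' and a signed 'frame'
    (positive = forward strand). Returns None when there are no hits.
    """
    if not protein_hits:
        return None
    best = max(protein_hits, key=lambda h: h["score"])
    return "+" if best["frame"] > 0 else "-"


# Hypothetical inputs: three overlapping ESTs and one unrelated cDNA.
example_clusters = cluster_by_pairwise_hits(
    ["est1", "est2", "est3", "cdna1"],
    [("est1", "est2"), ("est2", "est3")],
)
example_orientation = orientation_from_best_hit(
    [{"score": 50, "frame": 2}, {"score": 120, "frame": -1}]
)
print(example_clusters)
print(example_orientation)
```

In this sketch a sequence with no qualifying pairwise hit remains a singleton cluster, and an assembly without any protein hit is left unoriented rather than assigned a guessed strand.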

