PipeOnline 2.0: automated EST processing and functional data sorting

Abstract
Expressed sequence tags (ESTs) are generated and deposited in the public domain as redundant, unannotated, single‐pass reactions with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA‐sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling redundant input files and rewriting their consensus sequences into a unigene EST data set and finally, through translation and amino acid sequence similarity searches, annotating the unigenes with information from public databases and with functional data. PipeOnline generates an annotated database that retains the processed unigene sequence, clone/file history, alignments with similar sequences and, if available, a proposed functional classification. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records can be examined through a function‐ordered browser or keyword queries, with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and an alert service. PipeOnline is available at http://stress‐genomics.org.
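The order of the processing stages named above (quality calling, vector removal, collapsing redundant reads into unigenes, translation) can be illustrated with a minimal Python sketch. This is not PipeOnline's implementation. Every function name here is hypothetical, and each step is a deliberately simplified stand-in: end trimming by a quality threshold instead of real base calling, prefix matching instead of vector screening, and substring containment instead of sequence assembly. Similarity searching and functional annotation are omitted.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def trim_low_quality(seq, quals, min_q=20):
    """Trim bases with quality below min_q from both ends of a read."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]


def strip_vector(seq, vector):
    """Remove a leading vector sequence if the read starts with it (toy screen)."""
    return seq[len(vector):] if vector and seq.startswith(vector) else seq


def collapse_redundant(reads):
    """Group reads contained in a longer read.

    Returns {unigene_sequence: [clone ids]}, so the clone history of each
    unigene is retained. Assumption: containment replaces real assembly.
    """
    unigenes = {}
    # Visit the longest reads first so they become the representatives.
    for clone_id, seq in sorted(reads.items(), key=lambda kv: -len(kv[1])):
        for rep in unigenes:
            if seq in rep:
                unigenes[rep].append(clone_id)
                break
        else:
            unigenes[seq] = [clone_id]
    return unigenes


def translate(seq, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)
```

In such a sketch, each read would pass through `trim_low_quality` and `strip_vector` before `collapse_redundant` is applied to the whole collection, and the resulting unigene sequences would then be translated as input for a similarity search.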