Operon prediction in Pyrococcus furiosus

Abstract
Identification of operons in the hyperthermophilic archaeon Pyrococcus furiosus represents an important step to understanding the regulatory mechanisms that enable the organism to adapt and thrive in extreme environments. We have predicted operons in P.furiosus by combining the results from three existing algorithms using a neural network (NN). These algorithms use intergenic distances, phylogenetic profiles, functional categories and gene-order conservation in their operon prediction. Our method takes as inputs the confidence scores of the three programs, and outputs a prediction of whether adjacent genes on the same strand belong to the same operon. In addition, we have applied Gene Ontology (GO) and KEGG pathway information to improve the accuracy of our algorithm. The parameters of this NN predictor are trained on a subset of all experimentally verified operon gene pairs of Bacillus subtilis . It subsequently achieved 86.5% prediction accuracy when applied to a subset of gene pairs for Escherichia coli , which is substantially better than any of the three prediction programs. Using this new algorithm, we predicted 470 operons in the P.furiosus genome. Of these, 349 were validated using DNA microarray data.