Novel protein families in archaean genomes

Abstract
In a quest for novel functions in archaea, all archaean hypothetical open reading frames (ORFs), as annotated in the Swiss-Prot protein sequence database, were used to search the latest databases for the identification of characterized homologues. Of the 95 hypothetical archaean ORFs, 25 were found to be homologous to another hypothetical archaean ORF, while 36 were homologous to non-archaean proteins, of which as many as 30 were homologous to a characterized protein family. Thus the level of sequence similarity in this set reaches 64%, while the level of function assignment is only 32%. Of the ORFs with predicted functions, 12 homologies are reported here for the first time and represent nine new functions and one gene duplication at an acetyl-CoA synthetase locus. The novel functions include components of the transcriptional and translational apparatus, such as ribosomal proteins, modification enzymes and a translation initiation factor. In addition, new enzymes are identified in archaea, such as cobyric acid synthase, dCTP deaminase and the first archaean homologues of a new subclass of ATP binding proteins found in fungi. Finally, it is shown that the putative laminin receptor family of eukaryotes and an archaean homologue belong to the previously characterized ribosomal protein family S2 from eubacteria. From the present and previous work, the major implication is that archaea seem to have a mode of expression of genetic information rather similar to eukaryotes, while eubacteria may have proceeded into unique ways of transcription and translation. In addition, with the detection of proteins in various metabolic and genetic processes in archaea, we can further predict the presence of additional proteins involved in these processes.
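
The two percentages quoted above can be recovered from the counts in the abstract. The short sketch below assumes that the 25 ORFs with only archaean hypothetical homologues and the 36 ORFs with non-archaean homologues are disjoint groups, and that both percentages are taken over all 95 ORFs; the abstract does not state this explicitly, and the variable names are illustrative only.

```python
# Counts as given in the abstract.
total_orfs = 95
archaean_hypothetical_hits = 25   # homologous to another hypothetical archaean ORF
non_archaean_hits = 36            # homologous to non-archaean proteins
characterized_hits = 30           # subset of the 36 with a characterized family


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Assumption: the two homology groups do not overlap, so they can be summed.
similarity_level = percent(archaean_hypothetical_hits + non_archaean_hits, total_orfs)
assignment_level = percent(characterized_hits, total_orfs)

print(f"sequence similarity: {similarity_level}%")
print(f"function assignment: {assignment_level}%")
```

Under these assumptions, 61 of 95 ORFs have a detectable homologue and 30 of 95 receive a functional assignment, which agrees with the reported 64% and 32%.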