Coronavirus genome: prediction of putative functional domains in the non-structural polyprotein by comparative amino acid sequence analysis

Abstract
Amino acid sequences of 2 giant non-structural polyproteins (F1 and F2) of infectious bronchitis virus (IBV), a member of Coronaviridae, were compared, by computer-assisted methods, to sequences of a number of other positive strand RNA viral and cellular proteins. By this approach, juxtaposed putative RNA-dependent RNA polymerase, nucleic acid binding (“finger”-like) and RNA helicase domains were identified in F2. Together, these domains might constitute the core of the protein complex involved in the primer-dependent transcription, replication and recombination of coronaviruses. In F1, two cysteine protease-like domains and a growth factor-like one were revealed. One of the putative proteases of IBV is similar to 3C proteases of picornaviruses and related enzymes of como-, nepo- and potyviruses. A search of the IBV F1 and F2 sequences for sites similar to those cleaved by the latter proteases, and intercomparison of the surrounding sequence stretches, revealed 13 dipeptides Q/S(G) which are probably cleaved by the coronavirus 3C-like protease. Based on these observations, a partial tentative scheme for the functional organization and expression strategy of the non-structural polyproteins of IBV was proposed. It implies that, despite the general similarity to other positive strand RNA viruses, and particularly to potyviruses, coronaviruses possess a number of unique structural and functional features.
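As a rough illustration of the first step of the cleavage-site search described above, the following minimal sketch scans a protein sequence for Q/S and Q/G dipeptides. It is not the authors' procedure: the function name `find_candidate_sites` and the toy sequence are hypothetical, and the sketch omits the comparison of surrounding sequence stretches that the abstract says was used to narrow the candidates.

```python
import re

def find_candidate_sites(seq, pattern=r"Q(?=[SG])"):
    """Return (1-based position of Q, dipeptide) for every Q followed by S or G.

    The lookahead keeps the match at a single residue, so adjacent or
    overlapping candidates are all reported.
    """
    return [(m.start() + 1, seq[m.start():m.start() + 2])
            for m in re.finditer(pattern, seq)]

# Hypothetical toy sequence for illustration only; this is not IBV F1 or F2.
toy_sequence = "MAQSLLKQGTTAQAVVQS"

for position, dipeptide in find_candidate_sites(toy_sequence):
    print(position, dipeptide)
```

A dipeptide scan of this kind will over-predict on a real polyprotein, which is why flanking-sequence similarity is needed before a site is treated as a plausible cleavage target.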