Large-scale comparative analysis of splicing signals and their corresponding splicing factors in eukaryotes

Abstract
Introns are among the hallmarks of eukaryotic genes. Splicing of introns is directed by three main splicing signals: the 5′ splice site (5′ss), the branch site (BS), and the polypyrimidine tract/3′ splice site (PPT-3′ss). To study the evolution of these splicing signals, we conducted a systematic comparative analysis of them in over 1.2 million introns from 22 eukaryotes. Our analyses suggest that all of these signals have evolved dramatically: the PPT is weak among most fungi, intermediate in plants and protozoans, and strongest in metazoans. Within metazoans, it shows a gradual strengthening from Caenorhabditis elegans to human. The 5′ss and the BS were found to be degenerate among most organisms but highly conserved among some fungi. A maximum parsimony-based algorithm for reconstructing ancestral position-specific scoring matrices suggested that the ancestral 5′ss and BS were degenerate, as in metazoans. To shed light on the evolutionary variation in splicing signals, we analyzed the evolutionary changes in the factors that bind these signals. Our analysis reveals coevolution of splicing signals and their corresponding splicing factors: the strength of the PPT is correlated with changes in key residues in its corresponding splicing factor, U2AF2; a limited correlation was found between changes in the 5′ss and in the U1 snRNA that binds it; and no correlation was found between the BS and U2 snRNA. Thus, although the basic ability of eukaryotes to splice introns has remained conserved throughout evolution, the splicing signals and their corresponding splicing factors have evolved considerably, uniquely shaping the splicing mechanisms of different organisms.