Intron–Exon Organization and Phylogeny in a Large Superfamily, the Paralogous Cytochrome P450 Genes of Arabidopsis thaliana

Abstract
The cytochrome P450 gene superfamily is represented by 80 genes in animal genomes and perhaps more than 300 genes in plant genomes. We analyzed about half of all Arabidopsis P450 genes, a very large dataset of truly paralogous genes. Sequence alignments were used to draw phylogenetic trees, and this information was compared with the intron–exon organization of each P450 gene. We found 60 unique intron positions, of which 37 were phase 0 introns. Our results confirm the polyphyletic origin of plant P450 genes. One group of these genes, the A-type P450s, is plant specific and characterized by a simple organization, with one highly conserved intron. Closely related A-type P450 genes are often clustered in the genome, with as many as a dozen genes (e.g., of the CYP71 subfamily) on a short stretch of chromosome. The other P450 genes (non-A-type) form several distinct clades and are characterized by numerous introns. One such clade contains the two CYP51 genes, which are thought to encode obtusifoliol 14α-demethylase. The two CYP51 genes have a single intron that is not shared with CYP51 genes from vertebrates or fungi, or with any other Arabidopsis P450 gene. Only a few of the Arabidopsis P450 genes are intronless (e.g., the CYP710A and CYP96A subfamilies). There was a relatively good correlation between intron conservation and phylogenetic relationships among members of the P450 subfamilies. Gene organization appears to be a useful tool in establishing the evolutionary relatedness of P450 genes, which may help in predictions of P450 function.
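
For readers unfamiliar with the term, intron phase describes where an intron interrupts the reading frame: a phase 0 intron falls between two codons, a phase 1 intron after the first base of a codon, and a phase 2 intron after the second. The following is a minimal illustrative sketch of that standard definition, not the procedure used in this study; the function name `intron_phases` and the example exon lengths are hypothetical.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1, or 2) of each intron in a gene.

    coding_exon_lengths: coding-sequence lengths (in bases) of the
    exons in 5'-to-3' order, counted from the start codon. An intron
    follows every exon except the last one.
    """
    phases = []
    coding_bases_so_far = 0
    for length in coding_exon_lengths[:-1]:
        if length <= 0:
            raise ValueError("exon coding lengths must be positive")
        coding_bases_so_far += length
        # The remainder modulo 3 is the number of bases of the
        # interrupted codon that lie upstream of the intron.
        phases.append(coding_bases_so_far % 3)
    return phases


# Hypothetical gene with three coding exons of 300, 151, and 200 bases.
example_phases = intron_phases([300, 151, 200])
print(example_phases)
```

Two genes can then be said to share an intron position when the intron falls at the same aligned site with the same phase, which is the kind of comparison that makes intron positions usable as phylogenetic characters.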