Use of genomic history to improve phylogeny and understanding of births and deaths in a gene family

Abstract
Polyploidy events have played an important role in the evolution of angiosperm genomes. Here, we demonstrate how genomic histories can increase phylogenetic resolution in a gene family, specifically the expansin superfamily of cell wall proteins. There are 36 expansins in Arabidopsis and 58 in rice. Traditional sequence-based phylogenetic trees yield poor resolution below the family level. To improve upon these analyses, we searched for gene colinearity (microsynteny) between Arabidopsis and rice genomic segments containing expansin genes. Multiple rounds of genome duplication and extensive gene loss have obscured synteny. However, by simultaneously aligning groups of up to 10 potentially orthologous segments from the two species, we traced the history of 49 out of 63 expansin-containing segments back to the ancestor of monocots and eudicots. Our results indicate that this ancestor had 15 to 17 expansin genes, each ancestral to an extant clade. Some clades have strikingly different growth patterns in the rice and Arabidopsis lineages, with more than half of all rice expansins arising from two ancestral genes. Segmental duplications, most of them part of polyploidy events, account for 12 out of 21 new expansin genes in Arabidopsis and 16 out of 44 in rice. Tandem duplications explain most of the rest. We also estimate a minimum of 28 gene deaths in the Arabidopsis lineage and 9 in rice. This analysis greatly clarifies expansin evolution since the last common ancestor of monocots and eudicots, and the method should be broadly applicable to other gene families.