A Catalog of Structural and Gene Copy Number Variations of Cultivated Rice

Abstract
The extent of structural variants (SVs), and in particular copy number variations (CNVs), in plant and animal genomes remains unknown, mainly because truly high-quality genomes are lacking at population scale. Here, we de novo assembled and annotated 31 gold-standard reference genomes from varieties representing all major cultivated rice subtypes, and accurately detected 107,251 non-redundant SVs affecting 644.91 Mbp. A position-resolved pan-genome comprising 66,636 genes enabled our discovery that more than 38% of protein-coding genes in cultivated rice harbor CNVs, a far greater proportion than previously estimated. Illustrating the functional consequences of these variations, CNVs of Awn3-1, OsVIL1 and OsMADS18, as well as a 345 kb inversion, likely contributed to major events in rice evolution, domestication and environmental adaptation. Beyond fully resolving the SVs of cultivated rice and comprehensively cataloguing CNVs among protein-coding genes, our study suggests that SV-affected genes likely contribute to many mechanisms underlying domestication and phenotypic variation in rice.