Tracing Cattle Breeds with Principal Components Analysis Ancestry Informative SNPs
Open Access
- 7 April 2011
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 6 (4), e18007
- https://doi.org/10.1371/journal.pone.0018007
Abstract
The recent release of the Bovine HapMap dataset represents the most detailed survey of bovine genetic diversity to date, providing an important resource for the design and development of livestock production. We studied this dataset, comprising more than 30,000 Single Nucleotide Polymorphisms (SNPs) for 19 breeds (13 taurine, three zebu, and three hybrid breeds), seeking to identify small panels of genetic markers that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal Components Analysis and algorithms that we have recently described for the selection of Ancestry Informative Markers from genomewide datasets, we present a decision-tree which can be used to accurately infer the origin of individual cattle. In doing so, we present a thorough examination of population genetic structure in modern bovine breeds. Performing extensive cross-validation experiments, we demonstrate that 250-500 carefully selected SNPs suffice in order to achieve close to 100% prediction accuracy of individual ancestry, when this particular set of 19 breeds is considered. Our methods, coupled with the dense genotypic data that is becoming increasingly available, have the potential to become a valuable tool and have considerable impact in worldwide livestock production. They can be used to inform the design of studies of the genetic basis of economically important traits in cattle, as well as breeding programs and efforts to conserve biodiversity. Furthermore, the SNPs that we have identified can provide a reliable solution for the traceability of breed-specific branded products.
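The general idea of using Principal Components Analysis to pick a small panel of informative SNPs can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes one common approach: score each SNP by its squared loadings on the top principal components and keep the highest-scoring ones. The genotype data are synthetic, and the function names (`pca_snp_scores`, `select_snps`), the two simulated breeds, and all parameter values are hypothetical.

```python
import numpy as np


def pca_snp_scores(genotypes, k):
    """Score each SNP by its summed squared loadings on the top-k principal axes."""
    # Center each SNP column so the SVD corresponds to PCA.
    centered = genotypes - genotypes.mean(axis=0)
    # Rows of vt are the principal axes expressed in SNP space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (vt[:k] ** 2).sum(axis=0)


def select_snps(genotypes, k, panel_size):
    """Return the indices of the panel_size highest-scoring SNPs."""
    scores = pca_snp_scores(genotypes, k)
    return np.argsort(scores)[::-1][:panel_size]


# Synthetic data (assumption): two breeds, 200 SNPs coded as 0/1/2 allele counts.
rng = np.random.default_rng(0)
n_per_breed, n_snps = 50, 200
informative = np.arange(10)  # hypothetical: only the first 10 SNPs differ by breed

freq_a = np.full(n_snps, 0.5)
freq_b = np.full(n_snps, 0.5)
freq_a[informative] = 0.95
freq_b[informative] = 0.05

breed_a = rng.binomial(2, freq_a, size=(n_per_breed, n_snps))
breed_b = rng.binomial(2, freq_b, size=(n_per_breed, n_snps))
genotypes = np.vstack([breed_a, breed_b]).astype(float)

# With two breeds, a single principal component is enough to separate them.
panel = select_snps(genotypes, k=1, panel_size=10)
print(sorted(panel.tolist()))
```

On real data the number of components and the panel size would have to be chosen by cross-validation, as the abstract describes for its panels of 250-500 SNPs.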