Comparison of measures of marker informativeness for ancestry and admixture mapping

Open Access

20 December 2011

journal article
research article
Published by Springer Nature in BMC Genomics

Vol. 12 (1), 622
https://doi.org/10.1186/1471-2164-12-622

Abstract

Background: Admixture mapping is a powerful gene mapping approach for an admixed population formed from ancestral populations with different allele frequencies. The power of this method relies on the ability of ancestry informative markers (AIMs) to infer ancestry along the chromosomes of admixed individuals. In this study, more than one million SNPs from HapMap databases, together with simulated data, were interrogated in admixed populations using several measures of ancestry informativeness: Fisher Information Content (FIC), Shannon Information Content (SIC), F statistics (F_ST), the Informativeness for Assignment Measure (I_n), and the Absolute Allele Frequency Difference (delta, δ). The objectives were to compare these measures of informativeness for selecting SNP markers for ancestry inference, and to determine how accurately the AIM panels selected by each measure estimate the contributions of the ancestral populations to the admixed population.

Results: F_ST and I_n had the highest Spearman correlation and the best agreement as measured by Kappa statistics based on deciles. Although the different measures of marker informativeness performed comparably well, analyses based on the top 1 to 10% of ranked informative markers in simulated data showed that I_n was better at estimating ancestry for an admixed population.

Conclusions: Although millions of SNPs have been identified, only a small subset needs to be genotyped to predict ancestry accurately, with a minimal error rate and in a cost-effective manner. In this article, we compared various methods for selecting ancestry informative SNPs using simulations as well as SNP genotype data from samples of admixed populations, and showed that the I_n measure estimates ancestry proportion in an admixed population with lower bias and mean square error.
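To make three of the measures named in the abstract concrete, the following is a minimal sketch for a single biallelic SNP and two ancestral populations weighted equally. The function names are hypothetical, and the formulas are the common textbook definitions (δ as the absolute frequency difference, F_ST as (H_T - H_S)/H_T from expected heterozygosities, and I_n following Rosenberg et al.'s definition with natural logarithms). The abstract does not state which estimators the authors used, so this should not be read as their implementation.

```python
import math


def delta(p1: float, p2: float) -> float:
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def fst(p1: float, p2: float) -> float:
    """Textbook F_ST = (H_T - H_S) / H_T for two equally weighted populations.

    Assumption: a simple heterozygosity-based definition, not necessarily
    the estimator used in the article.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total expected heterozygosity
    if h_t == 0:
        return 0.0                                     # monomorphic SNP carries no information
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def _xlogx(x: float) -> float:
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def informativeness_in(p1: float, p2: float) -> float:
    """Informativeness for assignment (I_n) for a biallelic SNP, K = 2."""
    total = 0.0
    # Loop over the two alleles: frequencies (p1, p2) and (1 - p1, 1 - p2).
    for a1, a2 in ((p1, p2), (1 - p1, 1 - p2)):
        mean = (a1 + a2) / 2
        total += -_xlogx(mean) + (_xlogx(a1) + _xlogx(a2)) / 2
    return total


if __name__ == "__main__":
    for p1, p2 in [(0.9, 0.1), (0.6, 0.4), (0.5, 0.5)]:
        print(p1, p2, delta(p1, p2), fst(p1, p2), informativeness_in(p1, p2))
```

Under these definitions, a SNP fixed for different alleles in the two populations reaches the maximum of each measure (δ = 1, F_ST = 1, I_n = ln 2), while a SNP with identical frequencies scores zero on all three. Ranking SNPs by any one of these scores and keeping the top few percent corresponds to the panel-selection step the abstract describes.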
