Comparison of strategies for selecting single nucleotide polymorphisms for case/control association studies

Abstract
It is widely believed that a subset of single nucleotide polymorphisms (SNPs) can capture most of the information for genotype-phenotype association studies that is contained in the complete complement of genetic variation. The question remains how to select that subset of SNPs so as to maximize the power to detect a significant association. In this study, we used a simulation approach to compare three competing methods of site selection: random selection, selection based on pairwise linkage disequilibrium, and selection based on maximizing haplotype diversity. The results indicate that site selection based on maximizing haplotype diversity is preferred over both random selection and selection based on pairwise linkage disequilibrium. The results also indicate that it is more prudent to increase the sample size to improve a study's power than to keep increasing the number of SNPs. These results have direct implications for the design of gene-based and genome-wide association studies.
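
To make the two non-random selection criteria concrete, the following is a minimal illustrative sketch and not the implementation used in this study. It assumes phased haplotypes coded as tuples of 0/1 alleles, takes haplotype diversity to be 1 minus the sum of squared haplotype frequencies, and uses r² as the pairwise linkage disequilibrium measure; the abstract specifies none of these choices. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes, sites):
    """Diversity of the haplotypes restricted to `sites`.

    Assumed definition: 1 - sum(p_h ** 2) over the distinct sub-haplotypes.
    """
    n = len(haplotypes)
    counts = Counter(tuple(h[s] for s in sites) for h in haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def max_diversity_subset(haplotypes, k):
    """Exhaustively pick the k sites whose sub-haplotypes are most diverse.

    Only feasible for a small number of sites; ties go to the first subset.
    """
    n_sites = len(haplotypes[0])
    return max(
        combinations(range(n_sites), k),
        key=lambda sites: haplotype_diversity(haplotypes, sites),
    )


def r_squared(haplotypes, i, j):
    """Pairwise LD (r^2) between biallelic sites i and j, alleles coded 0/1."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    return d * d / denom if denom > 0 else 0.0


# Hypothetical toy sample: sites 0 and 1 always carry the same allele.
haps = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
chosen = max_diversity_subset(haps, 2)
```

In this toy sample, sites 0 and 1 are redundant (r² = 1), so a pair containing both resolves only two haplotypes, whereas pairing either of them with site 2 resolves all four. The exhaustive search is chosen for clarity; a greedy or heuristic search would be needed for realistic numbers of sites.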