Population genetic implications from DNA polymorphism in random human genomic sequences

Abstract
Denaturing high performance liquid chromatography (DHPLC) in combination with dye‐terminator sequencing was used to survey 516 random genomic sequence tagged sites (STSs) for biallelic polymorphisms in 24 representatives of the major ethnic groups residing in the United States. Of the 301 polymorphic STSs (58.3%), 172 contained a single simple sequence polymorphism (SSP), while 78, 35, and 16 contained 2, 3, and 4–6 SSPs, respectively. Of the 541 SSPs identified, 342 (63%), 152 (28%), and 47 (9%) were transitions, transversions, and insertions or deletions, respectively. Only 21% of the STSs contained SSPs with a minor‐allele frequency >20%. The nucleotide diversity estimate for random genomic sequences was on average 50% higher than that for intragenic non‐coding regions of the human genome. The discrepancy in Tajima's D statistic between 22 autosomal genes (D=−1.304±0.622, mean±SD) and random STSs (D=−0.27) suggests that, in the absence of significant mutation rate heterogeneity, the more negative values for genes are a consequence of directional selection rather than population growth. Hum Mutat 20:209–217, 2002.
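For readers unfamiliar with the statistic, the following is a minimal Python sketch of how Tajima's D is conventionally computed from the number of sampled chromosomes, the number of segregating sites, and the mean number of pairwise differences. It follows the standard published formula and is not the authors' code. The function name `tajimas_d`, its parameter names, and the input values in the usage line are illustrative assumptions, not data from this study.

```python
import math


def tajimas_d(n, s, pi):
    """Tajima's D from summary statistics (standard formula, illustrative sketch).

    n  -- number of sampled chromosomes (assumed >= 4; D is undefined below that)
    s  -- number of segregating (polymorphic) sites in the region
    pi -- mean number of pairwise differences over the whole region (not per site)
    """
    if n < 4:
        raise ValueError("Tajima's D requires at least 4 sampled chromosomes")
    if s == 0:
        return float("nan")  # no polymorphism, so the statistic is undefined

    # Harmonic sums over 1..n-1
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Constants of the variance term
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Difference between the pairwise estimate and Watterson's estimate (S / a1),
    # scaled by its estimated standard deviation
    d = pi - s / a1
    return d / math.sqrt(e1 * s + e2 * s * (s - 1))


# Illustrative input only: 10 chromosomes, 16 segregating sites,
# mean pairwise difference of 3.888889
print(round(tajimas_d(10, 16, 3.888889), 3))
```

Under neutrality and constant population size, D is expected to be near zero. Negative values indicate an excess of rare variants, which is why the abstract weighs population growth against directional selection as explanations for the more negative values observed in genes.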