Abstract
Microsatellite markers are widely used in genetic studies, but the relationship between the microsatellite slippage mutation rate and the number of repeat units remains unclear. In this study, we collect microsatellite distributions in the human genome from public sequence databases and observe that there is a threshold size for slippage mutations. We consider a model of microsatellite mutation consisting of point mutations and single stepwise slippage mutations. From two sets of equations based on two stochastic processes and equilibrium assumptions, we estimate slippage mutation rates without assuming any relationship between the slippage mutation rate and the number of repeat units. Expansion and contraction mutation rates are estimated separately by constrained least squares. The estimated slippage mutation rate increases exponentially with the number of repeat units. When slippage mutations occur, expansions are more frequent in short microsatellites and contractions are more frequent in long microsatellites. Our results agree with the length-dependent mutation pattern observed in experimental data, and they explain the scarcity of long microsatellites.
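
The link between length-dependent slippage and the scarcity of long microsatellites can be illustrated with a minimal sketch. It is not the estimation procedure summarized above: it ignores point mutations and the threshold size, and it assumes a functional form for the rates, which the study explicitly avoids. The function names and the parameters `base`, `growth`, and `k_balance` are hypothetical values chosen only for illustration. For a single-step process at equilibrium, the flux from k to k+1 repeat units balances the reverse flux, so the relative frequency of each length follows from the ratio of the expansion rate at k to the contraction rate at k+1.

```python
import math

def slippage_rates(k, base=1e-6, growth=0.4, k_balance=15):
    """Return hypothetical (expansion, contraction) rates at k repeat units.

    Assumptions for illustration only: the total slippage rate grows
    exponentially with k, and the expansion share falls from above 1/2
    to below 1/2 around k_balance.
    """
    total = base * math.exp(growth * k)
    p_expand = 1.0 / (1.0 + math.exp(0.3 * (k - k_balance)))
    return total * p_expand, total * (1.0 - p_expand)

def equilibrium(k_min, k_max):
    """Equilibrium length distribution of a single-step slippage process.

    Uses the balance condition pi[k] * e[k] = pi[k+1] * c[k+1],
    then normalizes the weights to sum to 1.
    """
    weights = [1.0]
    for k in range(k_min, k_max):
        e_k, _ = slippage_rates(k)
        _, c_next = slippage_rates(k + 1)
        weights.append(weights[-1] * e_k / c_next)
    total = sum(weights)
    return {k: w / total for k, w in zip(range(k_min, k_max + 1), weights)}

dist = equilibrium(5, 40)
for k in (5, 14, 25, 40):
    print(k, f"{dist[k]:.3e}")
```

Under these assumed rates, the frequency rises while expansion dominates and then falls steeply once contraction dominates, so very long repeats are rare even though their total slippage rate is the highest.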