Estimating bacterial diversity from clone libraries with flat rank abundance distributions

Abstract
There are a number of parametric and non‐parametric methods for estimating diversity. However, all such methods either employ the proportional abundance of the most abundant taxon in a sample or require that a specific taxon is sampled more than once. Consequently, the available methods cannot be applied to samples consisting entirely of singletons, which might be characteristic of some hyperdiverse communities. Here we present a non‐parametric method that estimates the probability that a given number of unique taxa would be sampled from a community with a particular diversity. We applied this approach to a well‐known data set of 100 unique clones from a sample of Amazonian soil (Borneman and Triplett (1997) Appl Environ Microbiol 63: 2647–2653) and determined the probability that this observation would be made from an environment of a given diversity. On this basis we can state that this observation would be very unlikely (P = 0.006) if the soil diversity was less than 10³, quite unlikely (P = 0.6) if the diversity was less than 10⁴, and probable (P = 0.95) if the diversity was about 10⁵. There are essentially no contestable assumptions in our method. Thus we are able to offer almost unequivocal evidence that bacterial diversity, at least in soils, is very large, together with a method that may be used to interpret samples consisting entirely of singletons from other hyperdiverse communities.
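The kind of calculation described above can be illustrated with a minimal sketch. It assumes that all taxa are equally abundant (a perfectly flat rank abundance distribution) and that clones are drawn independently; both are simplifying assumptions made here for illustration, and the abstract does not state that this is the authors' exact procedure. Under those assumptions, the probability that all n clones are distinct when drawn from S taxa is the product of (1 - i/S) for i from 0 to n - 1. The function name `prob_all_unique` is hypothetical, and the diversities are the thresholds quoted above read as powers of ten (10^3, 10^4, 10^5).

```python
import math


def prob_all_unique(n_clones: int, n_taxa: int) -> float:
    """Probability that n_clones independent draws from n_taxa equally
    abundant taxa are all distinct (assumption: flat abundances)."""
    if n_clones > n_taxa:
        # More clones than taxa: at least one taxon must repeat.
        return 0.0
    # Sum logs of (1 - i / n_taxa) for numerical stability.
    log_p = sum(math.log1p(-i / n_taxa) for i in range(n_clones))
    return math.exp(log_p)


# 100 unique clones, as in the Amazonian soil data set cited above.
for s in (10**3, 10**4, 10**5):
    print(f"S = {s:>6}: P(all 100 clones unique) = {prob_all_unique(100, s):.3f}")
```

Under these assumptions the sketch gives values of roughly 0.006, 0.61 and 0.95 for the three diversities, in line with the probabilities quoted in the abstract. Because uneven abundances make repeated taxa more likely, the equal-abundance case can be read as the most favourable one for observing only singletons.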