Abstract
The pace of technical advancement in microbial genomics has been breathtaking. Since 1995, when the first complete genome sequence of a free-living organism, Haemophilus influenzae, was published,1 a total of 1554 complete bacterial genome sequences (the majority of which are from pathogens) and 112 complete archaeal genome sequences have been determined, and more than 4800 bacterial and 90 archaeal genome sequences are in progress.2 A total of 41 complete eukaryotic genome sequences have been determined (19 from fungi), and more than 1100 are in progress. Complete reference genome sequences are available for 2675 viral species, and for some of these species, a large number of strains have been completely sequenced. Nearly 40,000 strains of influenza virus3 and more than 300,000 strains of human immunodeficiency virus (HIV) type 1 have been partially sequenced.4 However, the selection of microbes and viruses for genome sequencing is heavily biased toward the tiny minority that are amenable to cultivation in the laboratory, numerically dominant in particular habitats of interest (e.g., the human body), and associated with disease.