Abstract
Large-scale genome sequencing projects present tremendous new opportunities for structural biology and molecular biophysics. This explosion of biological information provides novel insights into molecular evolution and molecular genetics, new reagents for molecular biology, and exciting new avenues for molecular medicine. However, to fully realize the value of these genetic blueprints, further investment is required to characterize the biological functions and three-dimensional structures of the corresponding gene products. These efforts, broadly characterized as functional and structural genomics, have the potential to provide a unified understanding of molecular biology from atomic to cellular levels. During the last few years, several international efforts have been initiated with the common goal of genomic-scale three-dimensional (3D) protein structure determination (for a summary of international structural genomics centers and consortia, see http://www.rcsb.org/pdb/strucgen.html#Worldwide). Driven by the availability of many complete genome sequences, recent technological advances in rapid 3D structure analysis (1–5), and the integrative thinking of bioinformatics (6–10), these efforts aim to provide a coarse sampling of the space of 3D protein structures. When proteins are clustered into homologous sequence families, it has been estimated that high-resolution structure determinations of some 15,000–20,000 carefully selected proteins will enable accurate modeling of hundreds of thousands of protein structures (10). As well as being useful in their own right, such models can provide the basis for rapid analysis of x-ray crystallographic or NMR data, facilitating experimental high-resolution structure determinations.
A recent issue of PNAS includes a report (11) from the New York Structural Genomics Research Consortium (NYSGRC) describing the x-ray crystal structures of two proteins involved in sterol/isoprenoid biosynthesis and the amplification of these structural data by homology modeling. This study is particularly noteworthy as a model of the kinds of information and analyses that will be available as recently funded structural genomics centers and consortia around …