Compact units in proteins

Abstract
An explicit measure of geometric compactness called the coefficient of compactness is introduced. This single-value figure of merit identifies those continuous segments of the polypeptide chain having the smallest solvent-accessible surface area for their volume. These segments are the most compact units of the protein, and the larger ones correspond to conventional protein domains. To demonstrate the plausibility of this approach as a method of identifying protein domains, the measure is applied to lysozyme and ribonuclease to discover their constituent compact units. These units are then compared with domains, subdomains, and modules found by other methods. To show the sensitivity of the method, the measure is used to successfully differentiate between native and deliberately misfolded proteins [Novotny, J., Bruccoleri, R., and Karplus, M. (1984) J. Mol. Biol. 177, 787-818]. Methods that utilize only backbone atoms to define domains cannot distinguish between authentic and misfolded molecules because their backbone conformations are virtually superimposable. Compact units identified by this method exhibit a hierarchic organization. Such an organization suggests folding pathways that can be tested experimentally.