Information content in the circular dichroism of proteins

Abstract
A method is presented for predicting the secondary structure of a protein from its circular dichroism (CD) spectrum. Eight types of secondary structure are considered: helix; parallel and antiparallel β strand; types I, II and III β turn; all other β turns combined; and other structures. The method is based on mathematical calculation of orthogonal basis CD spectra from the CD spectra of proteins with known secondary structure. Five basis CD spectra are needed to reconstruct the 16 original protein CD spectra that extend into the vacuum UV region to 178 nm. Thus, one can expect to extract 5 independent pieces of information from the CD spectrum of a protein. Each basis CD spectrum corresponds to a known mixture of secondary structures so that the coefficients that reconstruct the protein CD spectrum can also be used to predict secondary structure. When the same method is applied to protein secondary structure rather than CD, only 5 basis secondary structure vectors are needed to reconstruct the original protein secondary structure vectors. Thus, there are 5 independent superstructures (consisting of a mixture of standard secondary structures) in the proteins studied. There is apparently enough information in the CD spectrum of a protein to predict all types of secondary structure. CD analyses compare favorably with the X-ray data.
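
The idea of orthogonal basis spectra can be illustrated with a minimal sketch. The abstract does not name the algorithm, so the sketch assumes singular value decomposition (SVD) as one common way to obtain an orthogonal basis. All data are synthetic random numbers rather than measured spectra or X-ray fractions, and the names basis_spectra and structure_predictor, the 83-point wavelength grid, and the tolerance are hypothetical choices made for illustration. Only the 16 proteins, 8 structure types and 5 independent components are taken from the text.

```python
import numpy as np


def basis_spectra(cd, rel_tol=1e-8):
    """SVD of a (wavelengths x proteins) CD matrix.

    Keeps only components whose singular value exceeds rel_tol times
    the largest one (an assumed, illustrative cutoff).
    """
    u, s, vt = np.linalg.svd(cd, full_matrices=False)
    k = int(np.sum(s > rel_tol * s[0]))
    return u[:, :k], s[:k], vt[:k, :]


def structure_predictor(cd, structures, rel_tol=1e-8):
    """Return a matrix X with structures ~= X @ cd.

    X is built from the retained SVD components only, i.e. a truncated
    pseudoinverse of the CD matrix applied to the structure matrix.
    """
    u, s, vt = basis_spectra(cd, rel_tol)
    return structures @ vt.T @ np.diag(1.0 / s) @ u.T


# Synthetic stand-in data (NOT real spectra): 16 proteins, 8 structure
# types, and 5 hidden independent components driving both data sets.
rng = np.random.default_rng(0)
n_wavelengths, n_proteins, n_structures, n_latent = 83, 16, 8, 5
P = rng.normal(size=(n_wavelengths, n_latent))   # hidden component spectra
Q = rng.normal(size=(n_latent, n_proteins))      # per-protein coefficients
W = rng.normal(size=(n_structures, n_latent))    # hidden structure mixtures

cd = P @ Q            # (wavelengths x proteins) reference CD matrix
structures = W @ Q    # (structure types x proteins) reference structures

# Orthogonal basis spectra and the number of significant components.
u, s, vt = basis_spectra(cd)

# Predict the structure vector of a new synthetic "protein" from its spectrum.
X = structure_predictor(cd, structures)
q_new = rng.normal(size=n_latent)
predicted = X @ (P @ q_new)
expected = W @ q_new
```

In this constructed example the number of retained columns of u equals the number of hidden components, which mirrors the statement that five basis spectra reconstruct the 16 original spectra. With measured data the singular values decay gradually, so the cutoff becomes a judgment about noise rather than a fixed tolerance.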