Abstract
This work presents a method to compare local clusters of interacting residues as observed in a known three-dimensional protein structure with corresponding clusters inferred from homologous protein sequences, assuming conserved protein folding. For this purpose the local environment of a selected residue in a known protein structure is defined as the ensemble of amino acids in contact with it in the folded state. Using a multiple sequence alignment to identify corresponding residues in homologous proteins, a detailed comparison can be performed between the local environment of a selected amino acid in the template protein structure and the expected local environments at the sets of equivalent residues, derived from the aligned protein sequences. The comparison makes it possible to detect conserved local features such as hydrogen bonding or complementarity in residue substitution. A global measure of environmental similarity is also defined, to search for conserved amino acid clusters subject to functional or structural constraints. The proposed approach is useful for investigating protein function as well as for site-directed mutagenesis experiments, where appropriate amino acid substitutions can be suggested by observing naturally occurring protein variants.
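The idea of transferring a local environment from the template structure to aligned homologs can be illustrated with a minimal sketch. It is not the authors' implementation: the contact criterion (one representative point per residue and an 8 Å distance cutoff), the function names, and the toy coordinates and sequences are all assumptions made for illustration only.

```python
from math import dist

def local_environment(coords, center, cutoff=8.0):
    """Indices of residues within `cutoff` of residue `center`.

    Assumption: one representative point per residue (e.g. C-alpha)
    and a simple distance cutoff stand in for "in contact".
    """
    return [j for j, xyz in enumerate(coords)
            if j != center and dist(coords[center], xyz) <= cutoff]

def residue_to_column(aligned_template):
    """Alignment column of each ungapped template residue, in order."""
    return [col for col, ch in enumerate(aligned_template) if ch != "-"]

def inferred_environments(alignment, template, coords, center, cutoff=8.0):
    """For every aligned sequence, return the residue equivalent to
    `center` and the residues equivalent to its structural neighbours,
    assuming the fold (and hence the contact pattern) is conserved."""
    cols = residue_to_column(alignment[template])
    env = local_environment(coords, center, cutoff)
    return {name: (seq[cols[center]], [seq[cols[j]] for j in env])
            for name, seq in alignment.items()}

# Hypothetical toy data: four template residues on a line, three aligned rows.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
alignment = {
    "template": "AK-LV",
    "homolog1": "AR-IV",
    "homolog2": "SKGL-",
}
envs = inferred_environments(alignment, "template", coords, center=1)
for name, (central, neighbours) in envs.items():
    print(name, central, neighbours)
```

Each entry pairs the residue equivalent to the selected template position with the residues found at the positions of its structural neighbours, which is the raw material for comparing substitutions across naturally occurring variants. A gap character in a homolog would appear in the output as "-" and would need separate handling.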