Correlations between Similar Sets of Measurements

Abstract
The correlation between, for instance, father-and-son pairs with respect to one variable (e.g., height) is easily calculated. When we consider $p$ variables simultaneously, let $X_1, \ldots, X_p$ be a father's measurements, and let $Y_1, \ldots, Y_p$ be his son's measurements of the same $p$ variables. The problem is to find the father-son correlation with respect to the $p$ variables as a whole. If $U = \alpha_1 X_1 + \cdots + \alpha_p X_p$ and $V = \alpha_1 Y_1 + \cdots + \alpha_p Y_p$ are "linear characteristics", the problem then reduces to finding a set of coefficients $(\alpha_1, \ldots, \alpha_p)$ that maximizes the correlation between $U$ and $V$. The present problem is thus a restricted case of the general problem of finding the maximum correlation between two sets of variables linearly combined as $U = \alpha_1 X_1 + \cdots + \alpha_p X_p$ and $V = b_1 Y_1 + \cdots + b_q Y_q$. In the restricted case, let $\Sigma_{XX}$ be the $p \times p$ covariance matrix of $X$ and $\Sigma_{YY}$ be the $p \times p$ covariance matrix of $Y$; these two matrices are symmetric and positive definite, and we assume $\Sigma_{XX} = \Sigma_{YY} = S$. Next, let $\Sigma_{XY}$ be the $p \times p$ matrix of covariances of each $X_i$ with each $Y_j$, and construct the symmetric matrix $T = \tfrac{1}{2}(\Sigma_{XY} + \Sigma_{YX})$, where $\Sigma_{YX} = \Sigma_{XY}'$. Then the correlation between $U$ and $V$ for a particular set of coefficients is $\rho(a) = a'Ta / a'Sa$, which is to be maximized with respect to $a = (\alpha_1, \ldots, \alpha_p)'$. The solution yields a set of canonical correlations $\rho_1 \geq \rho_2 \geq \cdots \geq \rho_p$ and corresponding pairs of canonical variates $(a_1'X, a_1'Y), (a_2'X, a_2'Y), \ldots, (a_p'X, a_p'Y)$. It is shown that the desired coefficients are eigenvectors corresponding to the eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$ of the matrix $S^{-1}T$.
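As a minimal sketch of the eigenvalue step, the following Python code finds the eigenvalues and eigenvectors of S^{-1}T for a symmetric T and a positive definite S. It reduces the problem to an ordinary symmetric one through a Cholesky factor of S; that route is a numerical convenience chosen for this illustration, not necessarily the computation used in the paper. The function names and the 2 x 2 matrices are hypothetical values made up for the example.

```python
import numpy as np

def restricted_canonical_correlations(S, T):
    """Solve T a = lambda S a for symmetric T and positive definite S.

    Returns the eigenvalues in descending order and a matrix whose
    columns are the matching coefficient vectors a_k, scaled so that
    a_k' S a_k = 1.
    """
    L = np.linalg.cholesky(S)          # S = L L'
    L_inv = np.linalg.inv(L)
    M = L_inv @ T @ L_inv.T            # symmetric, same eigenvalues as S^{-1} T
    eigvals, W = np.linalg.eigh(M)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]  # reorder to descending
    eigvals = eigvals[order]
    A = L_inv.T @ W[:, order]          # map back: a_k = L'^{-1} w_k
    return eigvals, A

def rho(a, S, T):
    """Correlation between U = a'X and V = a'Y, i.e. a'Ta / a'Sa."""
    return (a @ T @ a) / (a @ S @ a)

# Hypothetical illustrative matrices (not taken from the paper).
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Sigma_xy = np.array([[0.5, 0.3],
                     [0.2, 0.4]])
T = 0.5 * (Sigma_xy + Sigma_xy.T)      # symmetrized cross-covariance

lams, A = restricted_canonical_correlations(S, T)
for k in range(len(lams)):
    print(k + 1, lams[k], rho(A[:, k], S, T))
```

For each column a_k, the printed ratio a_k'Ta_k / a_k'Sa_k coincides with the corresponding eigenvalue, which is the sense in which the eigenvalues of S^{-1}T give the canonical correlations.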
Maximum likelihood estimates of the elements of the matrices $S$ and $T$ are given when $\Sigma_{XY} = \Sigma_{YX}$. These turn out to be simply the familiar estimates of the elements of an arbitrary covariance matrix, averaged whenever necessary to obtain the hypothesized symmetry. A numerical example with $p = 2$ variables has been worked out.
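One plausible reading of "averaged whenever necessary" can be sketched in Python as follows. The sketch assumes n paired observations stored as rows of two arrays, separate sample means for the two groups, and the maximum likelihood divisor n; these details, the function name, and the synthetic data are assumptions made for illustration and are not the paper's numerical example.

```python
import numpy as np

def averaged_estimates(X, Y):
    """Estimate S and T from paired samples X, Y of shape (n, p).

    The usual covariance estimates are computed first and then
    averaged to impose Sigma_XX = Sigma_YY and a symmetric
    cross-covariance matrix.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)            # center each group at its own mean
    Yc = Y - Y.mean(axis=0)
    S_xx = Xc.T @ Xc / n               # divisor n (ML form), an assumption
    S_yy = Yc.T @ Yc / n
    S_xy = Xc.T @ Yc / n
    S_hat = 0.5 * (S_xx + S_yy)        # common within-set covariance
    T_hat = 0.5 * (S_xy + S_xy.T)      # symmetrized cross-covariance
    return S_hat, T_hat

# Synthetic data for illustration only, with p = 2 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = 0.6 * X + 0.8 * rng.normal(size=(200, 2))

S_hat, T_hat = averaged_estimates(X, Y)
print(S_hat)
print(T_hat)
```

Both returned matrices are symmetric by construction, so they can be passed directly to any routine that extracts the eigenvalues of the product of the inverse of S_hat with T_hat.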