Abstract
This paper addresses how best to evaluate the significance of data bearing on the question of whether a difference in conditions, A versus B, has influenced the frequency ($p$) of the events under consideration (e.g., mutations or crossovers), when the data come from several or many experiments or series of observations ($1, 2, 3, \ldots, k$) in which the numbers of observations ($n$) taken under the two conditions vary irregularly from experiment to experiment, with no fixed ratio of $n_A$ to $n_B$. It is shown that under these circumstances the frequencies found under conditions A and B may be averaged by weighting the frequencies found in each series ($p_{1A}, \ldots, p_{kA}$ and $p_{1B}, \ldots, p_{kB}$, respectively) by the harmonic means ($n_{h1}, \ldots, n_{hk}$) of the numbers of observations ($n_{1A}$ and $n_{1B}, \ldots, n_{kA}$ and $n_{kB}$) found under the two conditions in each series; for series $i$, $n_{hi} = 2 n_{iA} n_{iB}/(n_{iA} + n_{iB})$. That is, $\bar{p}_A = \sum_{i=1}^{k} (n_{hi}\, p_{iA}) / \sum_{i=1}^{k} n_{hi}$, and similarly for $\bar{p}_B$. The difference, $d$, between the averages $\bar{p}_A$ and $\bar{p}_B$ then has the standard error $e_d = \sqrt{\sum_{i=1}^{k} (2\, n_{hi}\, p_i\, q_i)} \,/\, \sum_{i=1}^{k} n_{hi}$. Here $p_i$ represents $(P_{iA} + P_{iB})/(n_{iA} + n_{iB})$, where $P_{iA}$ and $P_{iB}$ are the absolute numbers of "events" found, and $q_i = 1 - p_i$, which is frequently so near one that it may be neglected or represented by an average value of $q$. The significance of the difference in the results under the two conditions may be gauged from the value of $d/e_d$ (that is, $x$); this may be derived more directly by the formula $x = \sum_{i=1}^{k} D_{hi} \,/\, \sqrt{\sum_{i=1}^{k} (2\, n_{hi}\, p_i\, q_i)}$, where $D_{hi}$ represents $n_{hi}(p_{iA} - p_{iB})$.
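The weighting scheme described above can be illustrated with a minimal Python sketch. It assumes that each series contributes binomial variance, so that the variance of $n_h(p_A - p_B)$ within a series is $2 n_h p q$ and the standard error of $d$ carries a square root over the summed variances. The function name `harmonic_weighted_test`, the tuple layout, and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from math import sqrt


def harmonic_weighted_test(series):
    """Combine several A-versus-B series using harmonic-mean weights.

    `series` is a list of (events_a, n_a, events_b, n_b) tuples, one per
    experiment. Returns (mean_a, mean_b, d, se, x), where x = d / se.
    """
    sum_nh = 0.0     # sum of harmonic-mean weights n_h
    sum_nh_pa = 0.0  # sum of n_h * p_A
    sum_nh_pb = 0.0  # sum of n_h * p_B
    sum_var = 0.0    # sum of 2 * n_h * p * q (assumed binomial variance)

    for events_a, n_a, events_b, n_b in series:
        n_h = 2.0 * n_a * n_b / (n_a + n_b)      # harmonic mean of n_A, n_B
        p_a = events_a / n_a                     # frequency under A
        p_b = events_b / n_b                     # frequency under B
        p = (events_a + events_b) / (n_a + n_b)  # pooled frequency
        q = 1.0 - p

        sum_nh += n_h
        sum_nh_pa += n_h * p_a
        sum_nh_pb += n_h * p_b
        sum_var += 2.0 * n_h * p * q

    if sum_var == 0.0:
        raise ValueError("no variance: every series has p = 0 or p = 1")

    mean_a = sum_nh_pa / sum_nh
    mean_b = sum_nh_pb / sum_nh
    d = mean_a - mean_b
    se = sqrt(sum_var) / sum_nh
    x = d / se  # algebraically equal to sum(D_h) / sqrt(sum_var)
    return mean_a, mean_b, d, se, x


# Hypothetical counts: (events_A, n_A, events_B, n_B) for three series.
example = [(12, 1000, 30, 2500), (8, 400, 10, 900), (20, 3000, 9, 800)]
mean_a, mean_b, d, se, x = harmonic_weighted_test(example)
print(f"mean_A={mean_a:.5f} mean_B={mean_b:.5f} d={d:.5f} se={se:.5f} x={x:.3f}")
```

With a single series, this reduces to the familiar pooled two-proportion comparison, since $2 p q / n_h = p q\,(1/n_A + 1/n_B)$.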