where var(YQ, j) is the variance across the genes of the gene cluster in sample j (both equations are taken from ref. [26]). Under the null hypothesis, ZE(j) approximately follows a Student's t distribution, and larger values indicate tighter co-regulation of the clustered genes within sample j. Once ZE(j) has been computed for each sample, the correlation coefficient, s.g.i., is computed incrementally, using a scoring function that minimizes the number of nonquery genes that score higher than the query genes. The final s.g.i., which is proportional to the Euclidean distance, is then based on the most informative experiments.
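
As a rough illustration of the per-sample scoring step, the following Python sketch computes the within-cluster variance for each sample and ranks samples by a t-like score. It makes several assumptions. The exact form of ZE(j) is defined by the equations of ref. [26] and is not reproduced here, so the statistic below is a generic one-sample t-like ratio used only as a stand-in. The function names `sample_scores` and `most_informative`, the genes-by-samples layout of `Y`, the small constant `eps`, and the choice to keep the `k` highest-scoring samples are all hypothetical.

```python
import numpy as np


def sample_scores(Y, query_idx, eps=1e-12):
    """Return (variance, t-like score) for every sample.

    Y         : 2-D array, rows are genes, columns are samples (assumed layout).
    query_idx : row indices of the genes in the query cluster.
    eps       : small constant that avoids division by zero (assumption).
    """
    YQ = Y[query_idx, :]                 # expression of the clustered genes only
    n = YQ.shape[0]                      # number of genes in the cluster
    mean = YQ.mean(axis=0)               # per-sample mean over cluster genes
    var = YQ.var(axis=0, ddof=1)         # per-sample variance, i.e. var(YQ, j)
    # Stand-in for ZE(j): a generic t-like ratio, NOT the formula of ref. [26].
    z = np.sqrt(n) * mean / np.sqrt(var + eps)
    return var, z


def most_informative(z, k):
    """Indices of the k samples with the largest score (assumed selection rule)."""
    return np.argsort(-z)[:k]


# Toy data: three query genes (rows 0-2) and one nonquery gene, two samples.
Y = np.array([
    [2.0,  1.0],
    [2.2, -1.0],
    [1.8,  0.0],
    [0.0,  5.0],
])
var, z = sample_scores(Y, [0, 1, 2])
top = most_informative(z, k=1)
```

In this toy input the query genes agree closely in the first sample and scatter around zero in the second, so the first sample receives the higher score and is the one retained.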