
Chunk #37 — Online Methods — Subjects and biological sampling

Source
Heritability and genomics of gene expression in peripheral blood.
Embedded
yes

Text

The large sample size enabled additional QC metrics involving inter-sample comparisons. First, samples showing sex inconsistency were removed (based on chrX and chrY probesets). Second, we examined the pairwise correlation matrix of expression profiles. Using $r_{ij}$ as the correlation between arrays $i$ and $j$, we computed $\bar{r}_i = \sum_j r_{ij}/n$, the average correlation of array $i$ with all others of the total $n$ arrays. A lower $\bar{r}_i$ corresponds to lower quality; these values were expressed in terms of median absolute deviations, $D_i = (\bar{r}_i - \bar{\bar{r}})/\mathrm{median}(|\bar{r}_i - \bar{\bar{r}}|)$, to provide a sense of distance from the grand correlation mean $\bar{\bar{r}}$. Third, we verified sample identity between U219 gene expression and Affymetrix 6.0 genotypes (see below), having previously discovered up to 5% genotype-expression mismatch rates in published eQTL studies (ref. 19). Briefly, 500 of the most significant SNP-transcript local eQTL pairs (ref. 19) were used to estimate a posterior probability for a match between gene expression and genotype profiles (similar to reference 79). This approach identified sex-mismatched samples and additional samples of poor quality.
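
The second QC step (average inter-array correlation and its scaled deviation $D_i$) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes Pearson correlation, excludes each array's self-correlation and averages over the remaining $n-1$ arrays, and takes the grand mean $\bar{\bar{r}}$ as the mean of the $\bar{r}_i$. The text does not specify any of these choices, and the function name `correlation_qc` is hypothetical.

```python
import numpy as np


def correlation_qc(expr):
    """Per-array correlation QC (illustrative sketch).

    expr: 2-D array, rows = probesets, columns = arrays (samples).
    Returns (r_bar, d): the mean correlation of each array with the
    others, and its deviation from the grand mean in units of the
    median absolute deviation.
    """
    expr = np.asarray(expr, dtype=float)
    n = expr.shape[1]

    # n x n Pearson correlation matrix between arrays (columns).
    r = np.corrcoef(expr, rowvar=False)

    # Assumption: drop the self-correlation on the diagonal and
    # average over the other n - 1 arrays.
    r_bar = (r.sum(axis=1) - np.diag(r)) / (n - 1)

    # Assumption: the grand correlation mean is the mean of r_bar.
    grand_mean = r_bar.mean()
    deviation = r_bar - grand_mean

    # Scale by the median absolute deviation from the grand mean.
    # Note: this is zero (division undefined) if at least half of
    # the arrays sit exactly at the grand mean.
    mad = np.median(np.abs(deviation))
    d = deviation / mad
    return r_bar, d
```

Arrays with strongly negative `d` correlate poorly with the rest of the data set and would be candidates for exclusion. The cutoff is a judgment call that the text does not state.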