Chunk #11 — METHODS — Regression model for signal adjustments

Source: Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms.
Embedded: yes

Text

We developed a simple statistical method to adjust signal intensity values at each marker for samples affected by genomic waves. Unlike smoothing-based regression methods, which borrow information from neighboring markers, our method adjusts each marker separately, regardless of the signal intensities at neighboring markers, and therefore eliminates concerns about smoothing out true CNV boundaries. Suppose there are M markers in a genotyped sample (for example, M = ∼550K for the Illumina HumanHap550 array). We collect all m autosomal markers that are at least 1 Mb away from each other (for example, m = ∼3K for the Illumina HumanHap550 array). This selection reduces the number of response values in the regression model and eliminates the potential dependence between markers due to colocalization in the same genomic region. For each of the m markers, we collect its LRR value as Lj (j = 1, …, m) and the average GC percentage in the 1 Mb window around the marker as Gj, then fit a linear regression model:

$$L_j = \alpha + \beta G_j + \varepsilon_j$$

where εj is the error term and the model parameters α and β are estimated by the least-squares method. To
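
The marker selection and least-squares fit described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `select_spaced_markers` and `fit_gc_model` are hypothetical, the greedy left-to-right scan is only one possible way to obtain markers at least 1 Mb apart, the input is assumed to be already restricted to autosomes, and the GC percentage of the 1 Mb window around each marker is assumed to be precomputed.

```python
import numpy as np


def select_spaced_markers(chrom, pos, min_gap=1_000_000):
    """Return indices of markers spaced at least `min_gap` bp apart per chromosome.

    Assumption: a greedy left-to-right scan; the source does not say how
    the spaced subset of markers is chosen.
    """
    chrom = np.asarray(chrom)
    pos = np.asarray(pos)
    keep = []
    for c in np.unique(chrom):
        idx = np.flatnonzero(chrom == c)
        idx = idx[np.argsort(pos[idx], kind="stable")]  # order by position
        last = None
        for i in idx:
            if last is None or pos[i] - last >= min_gap:
                keep.append(int(i))
                last = pos[i]
    return np.array(keep, dtype=int)


def fit_gc_model(lrr, gc):
    """Ordinary least-squares fit of LRR = alpha + beta * GC; returns (alpha, beta)."""
    lrr = np.asarray(lrr, dtype=float)
    gc = np.asarray(gc, dtype=float)
    ok = np.isfinite(lrr) & np.isfinite(gc)  # drop markers with missing values
    design = np.column_stack([np.ones(ok.sum()), gc[ok]])  # intercept + GC
    coef, *_ = np.linalg.lstsq(design, lrr[ok], rcond=None)
    return float(coef[0]), float(coef[1])
```

With per-marker arrays `chrom`, `pos`, `lrr` and `gc` for one sample, a call such as `idx = select_spaced_markers(chrom, pos)` followed by `alpha, beta = fit_gc_model(lrr[idx], gc[idx])` would give the two fitted parameters for that sample.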