
Chunk #5 — Methods — Definitions of Quality Measures

Source
Quality control and quality assurance in genotypic data for genome-wide association studies.

Text

Missing call rate is either the fraction of missing calls per SNP over samples or the fraction per sample over SNPs. On a scatterplot of the normalized allelic probe intensities produced by SNP assays, θ is defined to be the polar coordinate angle of a point (i.e. a sample-SNP combination) and R is defined to be the sum of those intensities. BAlleleFreq (BAF) is a measure of allelic imbalance, defined as an estimate of the allelic frequency in a population of cells from a single individual. LogRRatio (LRR) is a measure of relative intensity, the logarithm (base 2) of the observed value of R divided by the expected value [Peiffer, et al. 2006]. The genotype cluster plot of a SNP displays either the intensities of the two alleles or R versus θ and the genotype calls of each sample. The confidence score is a measure related to the distance between a given data point and the centroid of the nearest genotype cluster in a cluster plot. The heterozygosity of a sample is the fraction of non-missing genotype calls that are
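As a rough illustration of the measures defined above, the following minimal sketch computes the two missing call rates, θ and R for a single sample-SNP point, and LRR. The function names, the use of `None` as the missing-call marker, and the genotype strings are assumptions made for this example and do not come from the source. θ is returned in radians here; genotyping software may rescale the angle, which this sketch does not attempt to reproduce.

```python
import math

MISSING = None  # hypothetical marker for a missing genotype call


def missing_call_rate_per_snp(calls):
    """Fraction of missing calls per SNP, taken over samples.

    calls[i][j] is the genotype call for sample i at SNP j.
    """
    n_samples = len(calls)
    n_snps = len(calls[0])
    return [
        sum(calls[i][j] is MISSING for i in range(n_samples)) / n_samples
        for j in range(n_snps)
    ]


def missing_call_rate_per_sample(calls):
    """Fraction of missing calls per sample, taken over SNPs."""
    return [sum(call is MISSING for call in row) / len(row) for row in calls]


def theta_and_r(x, y):
    """Polar angle (radians) and summed intensity of one sample-SNP point.

    x and y are the normalized intensities of the two allelic probes.
    """
    theta = math.atan2(y, x)
    r = x + y
    return theta, r


def log_r_ratio(r_observed, r_expected):
    """Base-2 logarithm of observed R divided by expected R."""
    return math.log2(r_observed / r_expected)


# Toy data: 4 samples (rows) by 3 SNPs (columns).
calls = [
    ["AA", "AB", MISSING],
    ["AB", MISSING, MISSING],
    ["BB", "AB", "AA"],
    ["AA", "AB", MISSING],
]

per_snp = missing_call_rate_per_snp(calls)
per_sample = missing_call_rate_per_sample(calls)
theta, r = theta_and_r(1.0, 1.0)
lrr = log_r_ratio(2.0, 1.0)
```

In this toy matrix the third SNP is missing in three of four samples, so its per-SNP missing call rate is 0.75, while the third sample has no missing calls at all. An observed R twice the expected value gives an LRR of 1.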