
Chunk #19 — Methods — Encode Data

Source
A groupwise association test for rare mutations using a weighted sum statistic.
Embedded
yes

Text

To evaluate the weighted-sum method on rare variants with the frequency spectrum of a naturally occurring population, we used resequencing data from the ENCODE III project (ftp://ftp.hgsc.bcm.tmc.edu/pub/data/Encode). In this project, ten 100 kb ENCODE regions were resequenced in different human populations, and all substitutions were identified (see http://www.hgsc.bcm.tmc.edu/projects/human/). To mimic a disease-resequencing study, we grouped all exonic variants within each ENCODE region and compared the number of rare variants between the two largest populations: the African YRI population (120 individuals, including 60 individuals from HapMap phases I and II) and the Central European CEU population (119 individuals, including 60 individuals from HapMap phases I and II). Only variants that passed the quality-control filter of the ENCODE III study were used (see http://www.hgsc.bcm.tmc.edu/projects/human/). The genotype data were downloaded as ENCODE III draft release I (on August 11, 2008), and the “Gencode Ref (encodeGencodeGeneKnownMar07)” track in the UCSC Genome Browser [37] was used to define exon positions in each ENCODE region. Exonic variants were reported for only five of the ten ENCODE regions; hence, only these five regions were used.
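
The filtering and grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`region`, `pos`, `passed_qc`), the half-open exon coordinates, and the function names are all assumptions made for the example, and the region labels are placeholders rather than real ENCODE identifiers.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def is_exonic(pos, merged_exons, starts):
    """Return True if pos lies inside one of the merged exon intervals."""
    i = bisect_right(starts, pos) - 1
    return i >= 0 and merged_exons[i][0] <= pos < merged_exons[i][1]


def group_exonic_variants(variants, exons_by_region):
    """Group QC-passing exonic variants by region.

    variants: iterable of dicts with hypothetical keys
        'region', 'pos', 'passed_qc'.
    exons_by_region: dict mapping region label -> list of (start, end).
    Regions with no exonic variants do not appear in the result.
    """
    index = {}
    for region, exons in exons_by_region.items():
        merged = merge_intervals(exons)
        index[region] = (merged, [start for start, _ in merged])

    grouped = defaultdict(list)
    for variant in variants:
        if not variant["passed_qc"]:
            continue  # drop variants that failed the quality-control filter
        if variant["region"] not in index:
            continue  # no exon annotation available for this region
        merged, starts = index[variant["region"]]
        if is_exonic(variant["pos"], merged, starts):
            grouped[variant["region"]].append(variant)
    return dict(grouped)


# Toy input (made-up coordinates, for illustration only).
toy_exons = {"region_A": [(100, 200), (150, 300)], "region_B": [(10, 20)]}
toy_variants = [
    {"region": "region_A", "pos": 120, "passed_qc": True},
    {"region": "region_A", "pos": 150, "passed_qc": False},
    {"region": "region_B", "pos": 25, "passed_qc": True},
]
toy_grouped = group_exonic_variants(toy_variants, toy_exons)
```

In this toy input only `region_A` ends up with an exonic variant, so `region_B` is absent from the result, mirroring how regions without reported exonic variants were excluded from the analysis.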