Chunk #18 — 3 Results — 3.3 Genome partition of common genetic variants — 3.3.2 Partitioning the genome into genic and intergenic regions

Source: Exploring the genetic architecture of alcohol dependence in African-Americans via analysis of a genomewide set of common variants.
Embedded: yes

Text

Based on annotations from the UCSC Genome Browser hg19 assembly [18], we used ANNOVAR [30] to annotate all SNPs. Each SNP was mapped to one of four categories: exon, intron, intergenic, or other (e.g., downstream, upstream, splicing). For simplicity, we then partitioned the entire genome into genic and intergenic regions. Let d denote the smallest distance between a SNP and any gene; a SNP was assigned to the intergenic region if d ≥ τ, where τ is the distance threshold, and to the genic region otherwise. For example, with τ = 10 kb, a SNP was assigned to the intergenic region only if it was not within 10 kb of any gene. At this threshold, 9% of the SNPs initially in the intergenic region were reassigned to the genic region. The partitioning shown in the left panel of Figure 3 corresponds to τ = 10 kb. By construction, more SNPs were assigned to the genic region as the threshold τ increased. In the following analysis, we varied τ from 0 kb to 50 kb.
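The threshold rule can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (`min_gene_distance`, `assign_region`, `genic_fraction`), the toy coordinates, and the representation of genes as inclusive (start, end) intervals on a single chromosome are all assumptions made for illustration. The sketch also assumes that a SNP lying inside a gene (d = 0) is always genic, so that τ = 0 kb remains a meaningful setting.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (start, end) in bp, one chromosome.
Gene = Tuple[int, int]


def min_gene_distance(pos: int, genes: List[Gene]) -> int:
    """Smallest distance d (bp) from a SNP to any gene; 0 if inside a gene."""
    if not genes:
        raise ValueError("at least one gene interval is required")
    # For each gene, the distance is the gap to the nearer boundary,
    # or 0 when the position lies within the interval.
    return min(max(start - pos, pos - end, 0) for start, end in genes)


def assign_region(d: int, tau: int) -> str:
    """Apply the rule from the text: intergenic if d >= tau, else genic.

    Assumption (not stated in the text): a SNP inside a gene (d == 0)
    is always genic, which keeps tau = 0 from labeling every SNP intergenic.
    """
    if d == 0 or d < tau:
        return "genic"
    return "intergenic"


def genic_fraction(positions: List[int], genes: List[Gene], tau: int) -> float:
    """Fraction of SNPs assigned to the genic region at threshold tau."""
    labels = [assign_region(min_gene_distance(p, genes), tau) for p in positions]
    return labels.count("genic") / len(labels)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_genes = [(100_000, 120_000), (200_000, 230_000)]
    toy_snps = [110_000, 125_000, 160_000, 195_000, 300_000]
    for tau_kb in (0, 10, 50):
        frac = genic_fraction(toy_snps, toy_genes, tau_kb * 1_000)
        print(f"tau = {tau_kb} kb: genic fraction = {frac:.2f}")
```

On the toy data, the genic fraction rises as τ grows, which mirrors the monotonic behavior described in the text. A real analysis would take d from the annotation output rather than recomputing it, and would handle each chromosome separately.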