Chunk #30 — Materials and Methods — Inference of Common Deletion for Reference Sample

Source: Haplotypes with copy number and single nucleotide polymorphisms in CYP2A6 locus are associated with smoking quantity in a Japanese population.
Embedded: yes

Text

We used the 89 Japanese subjects from the 1000 Genomes Project as a reference, because the deletion frequency at the CYP2A6 locus may vary between different ethnic groups [14]. The common deletion polymorphism for the reference sample was inferred, and the breakpoints were estimated for the commonly deleted region. We assessed the “depth of coverage” for each subject, which was obtained from the alignment read data available from the 1000 Genomes Project. The depth of coverage for each subject was extracted using GATK software [30] and normalized to the mean coverage on chromosome 19 (Figure S6). The normalized depth in the 41.35–41.38 Mb region on chromosome 19 was used to calculate singular values that reflect the absolute numbers of copies for the individuals at the region (Figure S9). Then, a standard Gaussian mixture model was fitted to the singular values, and the copy number was inferred for each individual (a similar method was established previously [31]). The accuracy of the results was evaluated using the TaqMan assay, and the concordance rate for the 45 (out of 89) subjects tested was 100% (Figure S9).
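The inference steps described above (normalization, singular value decomposition, Gaussian mixture fitting) can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: the per-subject "singular values" are read here as scores on the leading singular vector of the normalized depth matrix, rescaled to the depth scale; the mixture has three components initialized at normalized depths of 0, 0.5 and 1.0 (0, 1 and 2 copies, assuming the chromosome 19 mean reflects two copies); and the input is synthetic. Depth extraction with GATK is not shown, and all function names are hypothetical.

```python
import math
import random

def normalize_depth(region_depth, chrom_mean):
    """Divide each subject's per-window depth by that subject's chromosome-wide mean."""
    return [[d / m for d in row] for row, m in zip(region_depth, chrom_mean)]

def leading_scores(matrix, iters=200):
    """Per-subject score on the leading right singular vector (power iteration).

    Scores are divided by sum(v), so a subject with uniform depth d scores d.
    """
    n_cols = len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        mv = [sum(a * b for a, b in zip(row, v)) for row in matrix]        # M v
        w = [sum(row[j] * s for row, s in zip(matrix, mv)) for j in range(n_cols)]  # M^T (M v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scale = sum(v)
    return [sum(a * b for a, b in zip(row, v)) / scale for row in matrix]

def fit_gmm_1d(values, init_means, iters=100, min_var=1e-4):
    """Fit a one-dimensional Gaussian mixture by EM; returns means and responsibilities."""
    k = len(init_means)
    means = list(init_means)
    variances = [0.01] * k          # assumed starting variance
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each value
        resp = []
        for x in values:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and (floored) variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue            # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, min_var)
    return means, resp

def infer_copy_number(scores, init_means=(0.0, 0.5, 1.0)):
    """Assign each subject the copy number of its most probable component."""
    means, resp = fit_gmm_1d(scores, init_means)
    order = sorted(range(len(means)), key=lambda j: means[j])
    copies_of = {comp: copies for copies, comp in enumerate(order)}
    return [copies_of[max(range(len(r)), key=lambda j: r[j])] for r in resp]

if __name__ == "__main__":
    # Synthetic example only: 12 hypothetical subjects, 20 windows in the region.
    random.seed(0)
    true_copies = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0]
    chrom_mean = [random.uniform(20.0, 40.0) for _ in true_copies]
    depth = [[max(0.0, m * (c / 2 + random.gauss(0.0, 0.03))) for _ in range(20)]
             for c, m in zip(true_copies, chrom_mean)]
    scores = leading_scores(normalize_depth(depth, chrom_mean))
    print(infer_copy_number(scores))
```

Ordering the fitted components by their means before labeling keeps the copy-number assignment independent of the order in which the components happen to converge. In practice the number of components and their starting values would need to be chosen from the observed distribution of scores rather than fixed in advance.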