Chunk #32 — Methods — Genotyping (‘gtarray’) — Genotype calling and imputation

Source
Common genetic variation drives molecular heterogeneity in human iPSCs.

Text

After primary quality control, the Genotyping (GT) module of the GenomeStudio software (Illumina, CA, USA) was used to call the genotypes. For each probe, the GT module estimates the Log R ratio and B-allele frequency for each sample using a clustering model applied to the distribution of signal intensities. These statistics are used internally by GenomeStudio to assign the sample genotypes for each marker. Variant coverage was further increased using statistical imputation and phasing. We constructed a reference panel of haplotypes from a combination of SNPs and small insertions and deletions (indels) in the UK10K cohorts and 1000 Genomes Phase 1 data [51,52]. Samples were independently imputed using IMPUTE2 v2.3.1 [53] and subsequently phased using SHAPEIT v2.r790 [54]. This analysis was done in chunks averaging 5 Mb, with 300 kb buffer regions on each side. IMPUTE2 was used with its default MCMC options (-Ne 20000 -k 80) for autosomes and with -Ne 15000 -k 100 for the X chromosome. SHAPEIT was run without MCMC iteration (-no-mcmc) so that each sample was phased independently, using the reference panel as the haplotype scaffold.
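
The chunking scheme described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It only shows one plausible way to split a chromosome into windows averaging about 5 Mb, pad each window by 300 kb, and pick the -Ne and -k values quoted in the text. The names `Chunk`, `make_chunks`, `buffered_span` and `impute2_args` are hypothetical. The `-int` and `-buffer` arguments are assumed here to be how the interval and the buffer size (in kb) are passed to IMPUTE2, since the text does not state how this was done. Input and output file arguments are deliberately omitted.

```python
from dataclasses import dataclass

CHUNK_BP = 5_000_000   # target average chunk size from the text (5 Mb)
BUFFER_KB = 300        # buffer on each side from the text (300 kb)

# -Ne and -k values as quoted in the text.
IMPUTE2_OPTS = {
    "autosome": {"Ne": 20000, "k": 80},
    "X": {"Ne": 15000, "k": 100},
}


@dataclass(frozen=True)
class Chunk:
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def make_chunks(chrom, chrom_length, chunk_bp=CHUNK_BP):
    """Split a chromosome into contiguous windows averaging about chunk_bp."""
    n = max(1, round(chrom_length / chunk_bp))
    # Evenly spaced 1-based boundaries; the last window ends at chrom_length.
    bounds = [1 + (i * chrom_length) // n for i in range(n)] + [chrom_length + 1]
    return [Chunk(chrom, bounds[i], bounds[i + 1] - 1) for i in range(n)]


def buffered_span(chunk, chrom_length, buffer_kb=BUFFER_KB):
    """Return the chunk extended by the buffer, clipped to the chromosome."""
    pad = buffer_kb * 1000
    return max(1, chunk.start - pad), min(chrom_length, chunk.end + pad)


def impute2_args(chunk, buffer_kb=BUFFER_KB):
    """Build the interval and model arguments for one chunk.

    Assumption: the interval is passed with -int and the buffer (in kb)
    with -buffer; the text does not say how the authors did this.
    """
    opts = IMPUTE2_OPTS["X" if chunk.chrom == "X" else "autosome"]
    return [
        "-int", str(chunk.start), str(chunk.end),
        "-buffer", str(buffer_kb),
        "-Ne", str(opts["Ne"]),
        "-k", str(opts["k"]),
    ]
```

For example, `make_chunks("20", length)` followed by `impute2_args(chunk)` for each chunk yields one argument list per window, and `buffered_span` shows which positions would inform the imputation of that window. Using equal-width windows is a simplification; a real pipeline might instead place boundaries to balance the number of variants per chunk.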