Chunk #19 — Discussion

Source: GAWMerge expands GWAS sample size and diversity by combining array-based genotyping and whole-genome sequencing.
Embedded: yes

Text

GAWMerge was developed using TOPMed WGS and array-genotyped data, although it can be applied to any case-only array-genotyped data together with other WGS data resources (e.g., UK Biobank [16], Gabriella Miller Kids First, and/or GenomeAsia 100K [17] data). To incorporate new data, it will be important to identify the phenotypic data that will be used to match controls to the available cases. For example, we selected controls based on the smoking status of the cohorts to minimize bias due to smoking. Additional phenotypic and clinical data, such as sex and age distributions, should be considered when selecting the most appropriate controls to combine with available cases. In this study, we combined cases and controls of the same ancestry to minimize bias. Further work is needed to evaluate GAWMerge for mega-analysis GWAS [29]. GAWMerge was developed with imputation using the 1000 Genomes reference panel, although the method can be applied using other reference panels, such as the TOPMed reference panel on the Michigan Imputation Server [19]. Since TOPMed samples are used as controls in GAWMerge, there will be sample overlap between the input data