Chunk #38 — Methods — Data pre-processing

Source: Functional mapping and annotation of genetic associations with FUMA.
Embedded: yes

Text

All genetic data sets used in this study are based on the hg19 human assembly, and rsIDs were mapped to dbSNP build 146 where necessary. To compute minor allele frequencies and LD structure, we used data from the 1000 Genomes Project [27] phase 3 (1000G). Minor allele frequency and r² of pairwise SNPs (minimum r² = 0.05; maximum distance between a pair of SNPs 1 Mb) were pre-computed using PLINK [26] for each of the available populations (AFR, AMR, EAS, EUR, and SAS). Functional annotations of SNPs were obtained from the following three repositories: CADD [13], RegulomeDB [14], and the core 15-state model of chromatin [9,10,15]. Cis-eQTL information was obtained from the following four data repositories: GTEx portal v6 [8], Blood eQTL browser [16], BIOS QTL Browser [17], and BRAINEAC [18]; genes were mapped to Ensembl gene IDs where necessary (Supplementary Note 2). Pre-processed Hi-C data for 14 tissue types and seven cell lines were obtained from GSE87112 [11] (Supplementary Note 3). Predicted enhancer and promoter regions for 111 epigenomes were obtained from the Roadmap Epigenomics Project [10]. Genomic coordinate of GWAS catalog [1] reported SNPs was
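
The pre-computation of minor allele frequency and pairwise r² described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on PLINK. It assumes genotypes are coded as alternate-allele counts (0, 1, or 2) and approximates r² by the squared Pearson correlation of those counts. The function names, the tuple layout, and the toy SNP identifiers are hypothetical; only the two thresholds (minimum r² of 0.05, maximum distance of 1 Mb) come from the text.

```python
# Illustrative sketch only: genotype coding, function names, and toy IDs
# are assumptions, not part of the original pipeline (which used PLINK).

def minor_allele_frequency(dosages):
    """Return the MAF for one SNP given per-individual allele counts (0/1/2)."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)


def r_squared(x, y):
    """Squared Pearson correlation between two allele-count vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP carries no LD information.
        return 0.0
    return (cov * cov) / (var_x * var_y)


def pairwise_ld(snps, min_r2=0.05, max_dist=1_000_000):
    """Return (id_a, id_b, r2) for SNP pairs on one chromosome.

    snps: list of (snp_id, position, dosages) tuples.
    Pairs further apart than max_dist or with r2 below min_r2 are dropped.
    """
    ordered = sorted(snps, key=lambda s: s[1])
    pairs = []
    for i, (id_a, pos_a, g_a) in enumerate(ordered):
        for id_b, pos_b, g_b in ordered[i + 1:]:
            if pos_b - pos_a > max_dist:
                break  # positions are sorted, so later SNPs are farther still
            r2 = r_squared(g_a, g_b)
            if r2 >= min_r2:
                pairs.append((id_a, id_b, r2))
    return pairs


# Toy data with made-up identifiers and positions.
toy_snps = [
    ("snpA", 100, [0, 1, 2, 1, 0, 2]),
    ("snpB", 5_000, [0, 1, 2, 1, 0, 2]),
    ("snpC", 2_000_000, [0, 1, 2, 1, 0, 2]),
    ("snpD", 6_000, [1, 1, 1, 1, 1, 1]),
]
ld_pairs = pairwise_ld(toy_snps)
```

In this toy example only the pair snpA and snpB is retained: snpC lies more than 1 Mb from the other SNPs, and snpD is monomorphic. In practice such a table would be computed separately for each reference population, since both allele frequencies and LD differ between populations.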