We identified single-nucleotide substitutions and small indel variants in the 1,739 samples using the reference-model (gVCF-based) workflow for joint analysis in GATK (ref. 60). Variants were called individually in each sample using HaplotypeCaller in ‘-ERC GVCF’ mode, which produces a record of genotype likelihoods and annotations at each site in the genome. A gVCF file was created for every sample, and all gVCFs were then jointly genotyped to identify the variants in the cohort. We followed the GATK-recommended best practices for variant quality score recalibration (VQSR; ref. 61) to create the final VCF file, setting the truth-sensitivity threshold so that 99% of the true sites in the training set were retained. Multi-allelic variants are reported in the GenomeAsia browser but were not included in the analysis. The VCF files were compressed using bgzip and indexed using tabix.
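
The workflow described above can be outlined as a sequence of command lines. The following Python sketch only assembles the commands and does not run them. It assumes GATK4-style syntax (the GATK version is not stated here), and all file names, the `REFERENCE` path and the helper function names are hypothetical. `CombineGVCFs` is used as one possible way to merge gVCFs before `GenotypeGVCFs`; the actual merging step used for the cohort is not specified. The `VariantRecalibrator` model-building step is omitted because it depends on training resources that are not listed in this section.

```python
from pathlib import Path

REFERENCE = "reference.fasta"  # hypothetical reference genome path


def haplotypecaller_cmd(bam, ref=REFERENCE):
    """Per-sample calling in '-ERC GVCF' mode; writes one gVCF per sample."""
    sample = Path(bam).stem
    return ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
            "-O", f"{sample}.g.vcf.gz", "-ERC", "GVCF"]


def joint_genotyping_cmds(gvcfs, ref=REFERENCE, prefix="cohort"):
    """Merge all per-sample gVCFs, then genotype them jointly (assumed approach)."""
    combine = ["gatk", "CombineGVCFs", "-R", ref]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", f"{prefix}.g.vcf.gz"]
    genotype = ["gatk", "GenotypeGVCFs", "-R", ref,
                "-V", f"{prefix}.g.vcf.gz", "-O", f"{prefix}.vcf.gz"]
    return [combine, genotype]


def apply_vqsr_cmd(vcf, recal, tranches, out, mode="SNP", ref=REFERENCE):
    """Apply a previously built VQSR model at 99% truth sensitivity."""
    return ["gatk", "ApplyVQSR", "-R", ref, "-V", vcf,
            "--recal-file", recal, "--tranches-file", tranches,
            "--truth-sensitivity-filter-level", "99.0",
            "-mode", mode, "-O", out]


def compress_and_index_cmds(vcf):
    """bgzip an uncompressed VCF, then index the result with tabix."""
    return [["bgzip", vcf], ["tabix", "-p", "vcf", f"{vcf}.gz"]]


if __name__ == "__main__":
    bams = ["sampleA.bam", "sampleB.bam"]  # hypothetical inputs
    per_sample = [haplotypecaller_cmd(b) for b in bams]
    gvcfs = [cmd[cmd.index("-O") + 1] for cmd in per_sample]
    steps = per_sample + joint_genotyping_cmds(gvcfs)
    steps.append(apply_vqsr_cmd("cohort.vcf.gz", "cohort.snp.recal",
                                "cohort.snp.tranches", "cohort.snp.vqsr.vcf"))
    steps += compress_and_index_cmds("cohort.snp.vqsr.vcf")
    for step in steps:
        print(" ".join(step))
```

Each helper returns an argument list that could be passed to `subprocess.run` once the paths are replaced with real files. In practice the recalibration would be applied separately for SNPs and indels by changing `mode`.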