In total, we identified 153 de novo coding variants in the TSAICG data. For each variant, a Python script developed by Samocha and colleagues (DeNovoFinder) (De Rubeis et al., 2014) estimated both the relative probability of a true de novo event versus an inherited variant and the likelihood of validation given a variety of quality control metrics. The script estimates the relative probability of a true de novo event (p_dn) from the genotype likelihoods of all trio members (PL, normalized Phred-scaled likelihoods), the allele frequency, and the average mutation rate per genome. We assigned each de novo variant one of three levels of validation likelihood (high, medium, or low) based on a combination of p_dn, the allele balance and read depth of all trio members, and the allele frequency. Samocha and colleagues’ work shows that de novo SNVs and indels with a high likelihood of validation have validation rates of 97.3% and 92.3%, respectively. Therefore, we carried out validation on all low-likelihood de novo variants and a subset of medium and