Following previous work in ASD (Sanders et al., 2012), we estimated the number of risk genes (C) using the framework developed for the ‘unseen species’ problem. This estimate requires four parameters: (1) the number of risk-associated variants (d), (2) the total number of observed risk genes (c), (3) the number of genes mutated exactly once (c1), and (4) the probability that a newly added variant hits a previously mutated gene (u).
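
As an illustration of how these four quantities could combine, the following minimal sketch assumes a coverage-based estimator of the Chao-Lee type, which is a common choice for the unseen species problem. The text above does not state the formula, so everything here is an assumption: u is taken as 1 - c1/d, C is taken as c/u plus a heterogeneity correction, and the function name `estimate_risk_genes` and the parameter `cv_squared` (squared coefficient of variation of per-gene mutation probabilities) are hypothetical names introduced for this example.

```python
def estimate_risk_genes(d, c, c1, cv_squared=0.0):
    """Sketch of a coverage-based 'unseen species' estimate of C.

    d          -- number of risk-associated variants
    c          -- number of distinct genes observed with a variant
    c1         -- number of genes mutated exactly once
    cv_squared -- ASSUMED heterogeneity term (hypothetical parameter);
                  0.0 treats all risk genes as equally likely to be hit
    """
    if d <= 0:
        raise ValueError("d must be positive")
    if not 0 <= c1 <= c <= d:
        raise ValueError("expected 0 <= c1 <= c <= d")

    # ASSUMPTION: u is estimated as the fraction of variants that did
    # not land in a singleton gene (a Good-Turing style coverage).
    u = 1.0 - c1 / d
    if u <= 0:
        raise ValueError("no gene was hit more than once; C is undefined")

    # ASSUMPTION: Chao-Lee style form, C = c/u + cv^2 * d * (1 - u) / u.
    return c / u + cv_squared * d * (1.0 - u) / u


if __name__ == "__main__":
    # Toy numbers, not data from the study.
    print(estimate_risk_genes(d=100, c=75, c1=50))  # prints 150.0
```

With the toy inputs, half of the variants fall in singleton genes, so u = 0.5 and the 75 observed genes scale to an estimated 150. Setting `cv_squared` above zero raises the estimate, reflecting that unevenly targeted genes leave more of the gene set unobserved.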