We generated eight iPSC lines from 22q11.2 DS patients and seven from healthy individuals, and obtained RNA-seq data from a total of 19 differentiating neuron samples (seven controls plus two duplicates, for a total of nine samples; eight 22q11.2 DS SZ and SAD patients plus two duplicates, for a total of ten samples). The number of reads obtained from the RNA-seq runs and the fraction that could be aligned to the human genome were comparable across the 19 samples (Additional file 5). A total of 14,549 transcripts were expressed in our samples, including 12,981 protein-coding transcripts and 512 long intergenic non-coding RNAs (lincRNAs). Clustering analysis of the samples based on their raw FPKMs yielded two groups, which likely reflect heterogeneity in neural differentiation (see Methods for details). We then applied a batch correction method to account for this expression variation and used the corrected expression values in DESeq2 [52] to determine differential gene expression. The entire gene list with corrected expression values can be found in Additional file 6. After filtering out lowly expressed genes (mean FPKM