
Chunk #55 — STAR★METHODS — METHOD DETAILS — Burden Analyses — Definition of Coding Portion of Exome

Source
De Novo Coding Variants Are Strongly Associated with Tourette Disorder.

Text

We defined the coding portion of the exome by restricting the RefSeq hg19 gene definitions (downloaded from the UCSC Genome Browser “Table Browser” tool) to coding exons only (i.e., excluding UTRs, etc.), and then merging all overlapping or book-ended intervals with bedtools sort followed by bedtools merge (Quinlan and Hall, 2010). We then calculated the combined size by summing the interval lengths with awk. The resulting “coding exome” is 33,828,798 bp. The final BED file, along with the commands to create it and to calculate the total size, is available on Bitbucket (https://bitbucket.org/willseylab/tourette_phase1).
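
The merge-and-sum logic described above can be illustrated with a minimal Python sketch. This is not the authors' code (their actual bedtools and awk commands are in the Bitbucket repository); it only mimics the behavior of sorting, merging overlapping or book-ended intervals, and summing their lengths. The function names `merge_intervals` and `total_size` and the toy intervals are hypothetical, and BED coordinates are assumed to be 0-based and half-open, so an interval's length is `end - start`.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals.

    Hypothetical helper: a pure-Python stand-in for sorting and then
    merging BED intervals, not the authors' pipeline.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlapping (start < cur_end) or book-ended (start == cur_end):
                # extend the current interval.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end))
    return merged


def total_size(intervals):
    """Sum interval lengths, assuming 0-based half-open coordinates."""
    return sum(end - start for _, start, end in intervals)


# Toy coding-exon intervals (made up for illustration only).
toy = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),  # overlaps the previous interval
    ("chr1", 250, 300),  # book-ended with the previous interval
    ("chr1", 400, 450),
    ("chr2", 10, 20),
]

merged = merge_intervals(toy)
print(merged)              # [('chr1', 100, 300), ('chr1', 400, 450), ('chr2', 10, 20)]
print(total_size(merged))  # 260
```

Merging before summing matters because many RefSeq transcripts share exons; summing the unmerged intervals would count the same bases more than once.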