While the input function for initial PheWAS has been individual SNPs, other possible inputs include all SNPs across a gene or all variants in a pathway. Further, the input function need not be genetic: an investigator might ask which diseases are overrepresented after exposure to a drug, or search for pairwise overlaps of genetically defined heritability across all other conditions in the phenome.45 A topology-based network approach46 in 11,210 subjects with diabetes identified clinical characteristics and associated genotypes for three subsets: one with diabetic nephropathy and retinopathy; a second with cancer and cardiovascular diseases; and a third associated with cardiovascular disease, neurological disease, allergy, and HIV infection. These data support the long-held clinical impression that “type II diabetes” is a label for a disease that has a range of identifiable subsets, and they point to how high-dimensional EHR data, coupled eventually to genotype data to further refine subset definitions, may assist in identifying such subsets for individualized therapy.
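To make the non-genetic input concrete, the question of which diseases are overrepresented after exposure to a drug can be sketched as a phenome-wide scan in which exposure status replaces genotype. The following is a minimal illustration only, not a method taken from the cited studies: the function names, patient identifiers, and phenotype labels are hypothetical, the toy data are invented, and a one-sided Fisher's exact test with Bonferroni correction stands in for the covariate-adjusted regression models that PheWAS analyses commonly use.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment in a 2x2 table.

    a: exposed with phenotype      b: exposed without phenotype
    c: unexposed with phenotype    d: unexposed without phenotype
    Returns P(X >= a) under the hypergeometric null.
    """
    n_exposed = a + b
    n_with_phenotype = a + c
    n_total = a + b + c + d
    upper = min(n_exposed, n_with_phenotype)
    tail = sum(
        comb(n_with_phenotype, k) * comb(n_total - n_with_phenotype, n_exposed - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_exposed)


def exposure_phewas(records, exposed_ids):
    """Scan every phenotype for overrepresentation among exposed patients.

    records: dict mapping patient id -> set of phenotype labels (hypothetical format)
    exposed_ids: set of patient ids exposed to the drug
    Returns a list of (phenotype, p_value) sorted by ascending p-value.
    """
    phenotypes = set().union(*records.values())
    results = []
    for phenotype in phenotypes:
        a = sum(1 for pid, codes in records.items()
                if pid in exposed_ids and phenotype in codes)
        c = sum(1 for pid, codes in records.items()
                if pid not in exposed_ids and phenotype in codes)
        b = len(exposed_ids & records.keys()) - a
        d = len(records.keys() - exposed_ids) - c
        results.append((phenotype, fisher_one_sided(a, b, c, d)))
    # Sort by p-value, breaking ties by name so the order is deterministic.
    return sorted(results, key=lambda item: (item[1], item[0]))


# Invented toy cohort: eight patients, four of them exposed.
records = {
    "p1": {"rash", "gout"},
    "p2": {"rash"},
    "p3": {"rash"},
    "p4": {"rash", "asthma"},
    "p5": {"asthma"},
    "p6": {"gout"},
    "p7": set(),
    "p8": {"asthma"},
}
exposed_ids = {"p1", "p2", "p3", "p4"}

results = exposure_phewas(records, exposed_ids)

# Bonferroni correction: divide alpha by the number of phenotypes tested.
threshold = 0.05 / len(results)
significant = [name for name, p in results if p < threshold]
```

In this toy cohort only the phenotype carried by every exposed patient and no unexposed patient passes the corrected threshold. The same loop structure would apply if the exposure indicator were replaced by carrier status for a gene-level or pathway-level variant set.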