First, we imputed missing phenotype values by probabilistic principal components analysis (PPCA) [37], [38] with five principal components, using the R [39] package ppca [40]. This approach exploits correlations among phenotypes to impute missing values. In the final dataset, 98.74% of phenotype values were observed and 1.26% were imputed.
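To illustrate how correlations among phenotypes can drive imputation, the following is a minimal sketch in Python with NumPy. It is not the R code used for the analysis, and the function name `ppca_impute`, its parameters, and the toy data are all hypothetical. The sketch assumes missing values are coded as NaN, that every column has at least one observed value, and that the number of components is smaller than the number of phenotypes. It also uses a simplified scheme: it alternates between refitting the closed-form maximum-likelihood PPCA model on the completed matrix and replacing each missing entry with its conditional mean given the observed entries in the same row. This approximates, but is not identical to, the full EM algorithm for PPCA with missing data.

```python
import numpy as np


def ppca_impute(X, n_components=5, n_iter=100, tol=1e-6):
    """Approximate PPCA imputation (hypothetical helper, not the ppca R package).

    X is an (n_samples, n_features) array with NaN marking missing values.
    Returns the completed matrix and the boolean mask of imputed entries.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initialise missing entries with the column means of observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    rows_with_gaps = np.flatnonzero(missing.any(axis=1))

    for _ in range(n_iter):
        # Closed-form maximum-likelihood PPCA fit on the completed matrix.
        mu = filled.mean(axis=0)
        eigval, eigvec = np.linalg.eigh(np.cov(filled, rowvar=False))
        order = np.argsort(eigval)[::-1]
        eigval, eigvec = eigval[order], eigvec[:, order]

        # Noise variance: average of the discarded eigenvalues.
        sigma2 = max(eigval[n_components:].mean(), 1e-12)
        scale = np.sqrt(np.maximum(eigval[:n_components] - sigma2, 0.0))
        W = eigvec[:, :n_components] * scale  # loading matrix

        previous = filled.copy()
        for i in rows_with_gaps:
            m, o = missing[i], ~missing[i]
            W_obs, W_mis = W[o], W[m]
            # Posterior mean of the latent vector given the observed entries.
            M = W_obs.T @ W_obs + sigma2 * np.eye(n_components)
            z = np.linalg.solve(M, W_obs.T @ (X[i, o] - mu[o]))
            # Conditional mean of the missing entries under the fitted model.
            filled[i, m] = mu[m] + W_mis @ z

        if np.max(np.abs(filled - previous)) < tol:
            break

    return filled, missing


if __name__ == "__main__":
    # Toy data only: correlated "phenotypes" generated from 5 latent factors.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(300, 5))
    loadings = rng.normal(size=(5, 12))
    data = latent @ loadings + 0.1 * rng.normal(size=(300, 12))
    data[rng.random(data.shape) < 0.0126] = np.nan

    completed, mask = ppca_impute(data, n_components=5)
    print(f"imputed fraction: {mask.mean():.4f}")
```

Observed values are never modified; only the entries flagged in the returned mask are filled, so the fraction of imputed values can be read directly from that mask.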