The family members are then compared at each sequence position, and the identity of a position is kept only when at least 70% of the members have the same nucleotide at that position. Positions that cannot form a consensus are replaced by an ‘N’ and are considered undefined. The resulting consensus sequences are referred to as SSCSs (Supplementary Fig. 2). We experimented with requiring more than three members per family or >70% sequence agreement, but found that this reduces data yield without any appreciable change in the method’s accuracy (Supplementary Fig. 3).
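
The per-position consensus rule described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the function name `single_strand_consensus` and the parameters `min_members` and `min_agreement` are hypothetical, and the sketch assumes that reads have already been grouped into a family and trimmed to equal length. The default minimum of three members is inferred from the text, which treats "more than three" as the stricter alternative.

```python
from collections import Counter


def single_strand_consensus(family, min_members=3, min_agreement=0.7):
    """Collapse one family of equal-length reads into a consensus string.

    Returns None when the family has fewer than `min_members` reads.
    Names and defaults are illustrative assumptions, not the published code.
    """
    if len(family) < min_members:
        return None
    if len({len(read) for read in family}) != 1:
        raise ValueError("all reads in a family must have the same length")

    consensus = []
    # zip(*family) yields one tuple of bases per sequence position.
    for column in zip(*family):
        base, count = Counter(column).most_common(1)[0]
        # Keep the most common base only if enough members agree on it;
        # otherwise the position is undefined and written as 'N'.
        if count / len(column) >= min_agreement:
            consensus.append(base)
        else:
            consensus.append("N")
    return "".join(consensus)


if __name__ == "__main__":
    # 3 of 4 reads agree at the last position (75%), so the base is kept.
    print(single_strand_consensus(["ACGT", "ACGT", "ACGA", "ACGT"]))
    # Only 2 of 3 reads agree at the last two positions (67%), so both become 'N'.
    print(single_strand_consensus(["ACGT", "ACGA", "ACCT"]))
```

Raising `min_members` or `min_agreement` in this sketch makes more families return `None` or more positions become ‘N’, which corresponds to the reduction in data yield noted above.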