Chunk #8 — Review — Limitations of existing reference sets

Source
Context and the human microbiome.

Text

Although the HMP generated an incredible volume of data, numerous design, technical, and access decisions affecting the HMP dataset have made reuse challenging. For instance, the decision to sample a few people extensively, rather than a large number of people minimally (i.e., a cross-sectional study design), led to observation of only a small fraction of the diversity present within the population [28] and resulted in small sample sizes for different stratifications in the dataset [36], effectively removing the potential to observe demographic or regional differences. The choice to sequence multiple loci within the 16S rRNA gene resulted in data that are impractical to combine, because amplification performance differs between primers and introduces technical bias [31, 37]. Furthermore, because the study design was not sufficient to elucidate the effect of employing multiple sequencing centers (an effect that has been observed in other contexts; see the Microbiome Quality Control Project (MBQC) [38]), this issue must still be actively evaluated to assess the potential for technical biases. Host information, such as age and sex, is nearly prohibitively difficult to access, requiring dean-level signatures for each individual