Chunk #21 — Findings — Improvements in PLINK 1.9 — Memory efficiency

Source: Second-generation PLINK: rising to the challenge of larger and richer datasets.
Embedded: yes

Text

To make it possible for PLINK 1.9 to handle the huge datasets that benefit the most from these speed improvements, the program core no longer keeps the main genomic data matrix in memory; instead, most of its functions load data for only a single variant, or a small window of variants, at a time. Sample × sample matrix computations still normally require additional memory proportional to the square of the sample size, but --parallel gets around this:
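
The two ideas in this paragraph can be illustrated with a minimal Python sketch. It is not PLINK's implementation, and every name in it (stream_variants, row_blocks, accumulate_block) is hypothetical. It assumes genotypes arrive one variant at a time as a list of allele counts per sample, uses a plain product of allele counts as a placeholder for whatever sample × sample statistic is being computed, and splits the matrix into equal-sized row blocks, which may differ from how the real --parallel option partitions the work.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def row_blocks(n_samples: int, n_jobs: int) -> List[Tuple[int, int]]:
    """Split rows 0..n_samples-1 into n_jobs contiguous half-open ranges."""
    base, extra = divmod(n_samples, n_jobs)
    blocks = []
    start = 0
    for job in range(n_jobs):
        size = base + (1 if job < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def accumulate_block(
    variants: Iterable[Sequence[int]],
    n_samples: int,
    row_start: int,
    row_end: int,
) -> List[List[int]]:
    """Accumulate rows [row_start, row_end) of a sample x sample matrix.

    Only this row block is allocated, so memory scales with
    (row_end - row_start) * n_samples rather than n_samples ** 2.
    """
    block = [[0] * n_samples for _ in range(row_end - row_start)]
    for genotypes in variants:
        # Only the current variant's genotypes are held here; the full
        # variant x sample matrix is never materialized by this function.
        for i in range(row_start, row_end):
            gi = genotypes[i]
            if gi == 0:
                continue  # a zero allele count contributes nothing
            row = block[i - row_start]
            for j in range(n_samples):
                row[j] += gi * genotypes[j]
    return block


def stream_variants() -> Iterator[List[int]]:
    """Hypothetical stand-in for reading one variant at a time from disk."""
    yield [0, 1, 2, 1]
    yield [2, 0, 1, 1]
    yield [1, 1, 0, 2]


if __name__ == "__main__":
    n_samples = 4
    for start, end in row_blocks(n_samples, n_jobs=2):
        # Each "job" makes its own pass over the variants and could run
        # on a separate machine, writing its rows to a separate file.
        print((start, end), accumulate_block(stream_variants(), n_samples, start, end))
```

In this sketch each job holds only its own rows, so its extra memory shrinks roughly in proportion to the number of jobs, at the cost of every job making its own pass over the variant data. Concatenating the row blocks in order gives the same matrix that a single full-size run would produce.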