
Chunk #46 — Findings — Improvements in PLINK 1.9 — Newly integrated third-party software — Multithreaded gzip

Source
Second-generation PLINK: rising to the challenge of larger and richer datasets.

Text

For many purposes, compressed text files strike a good balance between ease of interpretation, loading speed, and resource consumption. However, the computational cost of generating them is fairly high; it is not uncommon for data compression to take longer than all other operations combined. To make a dent in this bottleneck, we have written a simple multithreaded compression library function based on Mark Adler’s excellent pigz program [34], and routed most of PLINK 1.9’s gzipping through it. See parallel_compress() in pigz.c for details.
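The general idea of block-parallel gzip compression can be sketched as follows. This is a minimal illustration, not the code in pigz.c and not PLINK 1.9's parallel_compress(): the function name `parallel_gzip`, the `threads`, `chunk_size`, and `level` parameters, and the 128 KiB block size are all assumptions made for the example. It also swaps in a simpler technique than pigz: each block is compressed as an independent gzip member and the members are concatenated, which the gzip format permits, whereas pigz emits a single stream and primes each block with the tail of the previous one to preserve compression ratio.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

# Hypothetical block size for this sketch; not taken from PLINK.
CHUNK_SIZE = 128 * 1024


def parallel_gzip(data: bytes, threads: int = 4,
                  chunk_size: int = CHUNK_SIZE, level: int = 6) -> bytes:
    """Compress `data` as concatenated gzip members, one per block.

    Simplification (assumption): blocks are compressed independently,
    so the ratio is slightly worse than a single-stream compressor.
    """
    # Split the input into fixed-size blocks; keep one empty block for
    # empty input so the output is still a valid gzip file.
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]

    def compress_block(block: bytes) -> bytes:
        # zlib releases the GIL while compressing, so threads can
        # run on separate cores.
        return gzip.compress(block, compresslevel=level)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map yields results in input order, so the members are
        # joined in the same order as the original blocks.
        members = list(pool.map(compress_block, chunks))

    return b"".join(members)


if __name__ == "__main__":
    # Made-up tab-delimited text, standing in for a large report file.
    text = b"rs0001\t1\t0.0500\n" * 200_000
    packed = parallel_gzip(text)
    # Any standard gzip reader decodes concatenated members in sequence.
    assert gzip.decompress(packed) == text
    print(len(text), "bytes ->", len(packed), "bytes")
```

Because the output is a sequence of ordinary gzip members, standard tools such as `gunzip` or `zcat` should read it without knowing it was produced in parallel. The cost of this shortcut is a small loss in compression ratio at block boundaries, which is the trade-off the dictionary priming in pigz is designed to avoid.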