SS, as provided by authors, are available via FTP from the GWAS Catalog. However, the files provided are highly heterogeneous with respect to format and content. Organization of individual SS by study does not enable users to query across multiple studies; for example, retrieving all P-values from a given genomic region associated with a particular trait is a common query that this organizational model does not support. To address the heterogeneous data formatting and content, we analysed the variety of SS files provided to the Catalog. Files were commonly provided in tab- or comma-delimited format, but column labels were poorly standardized. For example, ‘Chr’, ‘CHR’ and ‘Chromosome’ columns were all used, whereas in other cases the ‘LOC’ column would contain both the chromosome and base pair location separated by a colon. Sometimes variants were reported by rsID, sometimes by chromosome and base pair location referencing different genome builds, and sometimes by a combination of both. Therefore, we propose a standard set of fields and a standard format, and we have developed a harmonization and QC process in collaboration