Do baseline P-values follow a uniform distribution in randomised trials?
- Authors: Bland, Martin
- Year: 2013
- Journal: PLoS ONE
- PMID: 24098419
- DOI: 10.1371/journal.pone.0076010
- PMCID: PMC3788030
BACKGROUND: The theory has been put forward that if a null hypothesis is true, P-values should follow a Uniform distribution. This can be used to check the validity of randomisation.
METHOD: The theory was tested by simulation for two sample t tests for data from a Normal distribution and a Lognormal distribution, for two sample t tests which are not independent, and for chi-squared and Fisher's exact test using small and using large samples.
RESULTS: For the two sample t test with Normal data the distribution of P-values was very close to the Uniform. When using Lognormal data this was no longer true, and the distribution had a pronounced mode. For correlated tests, even using data from a Normal distribution, the distribution of P-values varied from simulation run to simulation run, but did not look close to Uniform in any realisation. For binary data in a small sample, only a few probabilities were possible and the distribution was very uneven. With a sample of two groups of 1,000 observations, there was great unevenness in the histogram and a poor fit to the Uniform.
CONCLUSIONS: The notion that P-values for comparisons of groups using baseline data in randomised clinical trials should follow a Uniform distribution if the randomisation is valid has been found to be true only in the context of independent variables which follow a Normal distribution, not for Lognormal data, correlated variables, or binary data using either chi-squared or Fisher's exact tests. This should not be used as a check for valid randomisation.
Distribution of P-values for 10,000 two sample t tests for Normal data. Means were compared between two groups of 10 observations from a Standard Normal distribution.
Distribution of P-values for 10,000 two sample t tests for Lognormal data. Means were compared between two groups of 10 observations from a Lognormal distribution.
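The two simulations described in the captions above can be sketched as follows. This is a minimal illustration, not the author's code: it assumes an equal-variance two-sample t test, takes the Lognormal data to be the exponential of a Standard Normal variable (the record does not give the Lognormal parameters), and uses only the Python standard library. The helper names (`two_sided_t_p`, `t_test_p`, `simulate`, `decile_counts`) are hypothetical. The P-value uses a closed-form finite sum for Student's t that is valid only for even degrees of freedom, which holds here (10 + 10 - 2 = 18).

```python
import math
import random


def two_sided_t_p(t, df):
    """Two-sided P-value for Student's t (closed form, even df only)."""
    if df % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    theta = math.atan(abs(t) / math.sqrt(df))
    cos2 = math.cos(theta) ** 2
    term = total = 1.0
    # P(|T| <= t) = sin(theta) * [1 + (1/2)cos^2 + (1*3)/(2*4)cos^4 + ...]
    for k in range(1, df // 2):
        term *= (2 * k - 1) / (2 * k) * cos2
        total += term
    return max(0.0, 1.0 - math.sin(theta) * total)


def t_test_p(x, y):
    """Equal-variance two-sample t test; returns the two-sided P-value."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    se = math.sqrt(ss / df * (1 / nx + 1 / ny))
    return two_sided_t_p((mx - my) / se, df)


def simulate(draw, n_tests=10_000, n=10, seed=1):
    """Run n_tests independent t tests on two groups of n draws each."""
    rng = random.Random(seed)
    return [
        t_test_p([draw(rng) for _ in range(n)], [draw(rng) for _ in range(n)])
        for _ in range(n_tests)
    ]


def decile_counts(pvals):
    """Count P-values in ten equal-width bins covering [0, 1]."""
    counts = [0] * 10
    for p in pvals:
        counts[min(int(p * 10), 9)] += 1
    return counts


normal_p = simulate(lambda rng: rng.gauss(0, 1))
# Assumption: Lognormal data are exp(Z) with Z Standard Normal.
lognormal_p = simulate(lambda rng: rng.lognormvariate(0, 1))

print("Normal   :", decile_counts(normal_p))
print("Lognormal:", decile_counts(lognormal_p))
```

Under a Uniform distribution each decile would hold about 1,000 of the 10,000 P-values, so the two printed rows give a rough numerical counterpart to the histograms the captions describe.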
P-values from four realisations of 10,000 correlated t tests for Normal data. Means were compared between two groups of 10 observations from a Standard Normal distribution where each test used variables with correlation 0.5 with the other variables.
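One way to picture the correlated case is sketched below. The record does not say how the correlated variables were generated, so this sketch makes an assumption: each of the 20 subjects carries one shared Standard Normal component, and every variable is that component mixed with independent noise so that any two variables have correlation 0.5. The function names and the seed are hypothetical, and the t test P-value again uses a closed form valid only for even degrees of freedom (18 here).

```python
import math
import random


def two_sided_t_p(t, df):
    """Two-sided P-value for Student's t (closed form, even df only)."""
    if df % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    theta = math.atan(abs(t) / math.sqrt(df))
    cos2 = math.cos(theta) ** 2
    term = total = 1.0
    for k in range(1, df // 2):
        term *= (2 * k - 1) / (2 * k) * cos2
        total += term
    return max(0.0, 1.0 - math.sin(theta) * total)


def t_test_p(x, y):
    """Equal-variance two-sample t test; returns the two-sided P-value."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    se = math.sqrt(ss / df * (1 / nx + 1 / ny))
    return two_sided_t_p((mx - my) / se, df)


def correlated_realisation(rng, n_vars=10_000, n=10, rho=0.5):
    """One realisation: n_vars variables measured on the same 2n subjects.

    Assumption: variable j for subject i is sqrt(rho) * shared_i plus
    sqrt(1 - rho) * noise_ij, giving unit variance and pairwise
    correlation rho between variables.
    """
    shared = [rng.gauss(0, 1) for _ in range(2 * n)]
    a, b = math.sqrt(rho), math.sqrt(1 - rho)
    pvals = []
    for _ in range(n_vars):
        values = [a * z + b * rng.gauss(0, 1) for z in shared]
        pvals.append(t_test_p(values[:n], values[n:]))
    return pvals


rng = random.Random(2013)
realisations = [correlated_realisation(rng) for _ in range(4)]

for i, pvals in enumerate(realisations, start=1):
    share_small = sum(p < 0.05 for p in pvals) / len(pvals)
    print(f"realisation {i}: share of P < 0.05 = {share_small:.3f}")
```

Because all tests within a realisation share the same subjects, the printed shares can differ from run to run, which is the kind of realisation-to-realisation variation the caption refers to.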
Distribution of P-values for chi-squared tests and Fisher’s exact tests for two by two tables. Chi-squared and Fisher’s exact test, both two-sided and one-sided, were calculated for the comparison of two samples of size 10 with a binary outcome variable with probability 0.5 of being 0 and 0.5 of being 1.
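For the small-sample case no simulation is needed to see why only a few P-values can occur: every possible table can be enumerated. The sketch below does this for the two-sided Fisher's exact test only, under the assumption that "two-sided" means summing the probabilities of all tables that are no more probable than the observed one (a common convention; the record does not state which was used). The function names are hypothetical and exact rational arithmetic is used to avoid rounding ties.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb

N_PER_GROUP = 10  # two samples of size 10, as in the caption


def fisher_two_sided(a, b, n=N_PER_GROUP):
    """Two-sided Fisher exact P-value for a and b successes out of n each."""
    m = a + b  # fixed column total under the conditional test
    lo, hi = max(0, m - n), min(n, m)
    # Unnormalised hypergeometric weights for each possible first-group count.
    weights = {x: comb(n, x) * comb(n, m - x) for x in range(lo, hi + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return Fraction(extreme, comb(2 * n, m))


def p_value_distribution(n=N_PER_GROUP):
    """Exact distribution of the P-value when both groups are Binomial(n, 0.5)."""
    dist = defaultdict(Fraction)
    total = 4 ** n  # 2**n outcomes per group
    for a in range(n + 1):
        for b in range(n + 1):
            prob = Fraction(comb(n, a) * comb(n, b), total)
            dist[fisher_two_sided(a, b, n)] += prob
    return dict(sorted(dist.items()))


dist = p_value_distribution()
print("distinct P-values:", len(dist))
for p, prob in dist.items():
    print(f"P = {float(p):.4f}  occurs with probability {float(prob):.4f}")
```

The output lists every attainable P-value with its probability under the null hypothesis, which can be compared directly with the flat density a Uniform distribution would require.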
P-values for chi-squared and Fisher’s exact tests for large samples with fixed size groups. Chi-squared and Fisher’s exact test, both two-sided and one-sided, were calculated for 10,000 comparisons of two samples of size 1,000 with a binary outcome variable with probability 0.5 of being 0 and 0.5 of being 1.
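The large-sample comparison can be sketched in the same spirit. This version covers only the two-sided Pearson chi-squared test without continuity correction, which is an assumption, since the record does not say whether a correction was applied, and it leaves Fisher's exact test out. The function names and the seed are hypothetical. With one degree of freedom the P-value is `erfc(sqrt(chi2 / 2))`, so the standard library is enough.

```python
import math
import random


def chi2_p(a, b, n):
    """Pearson chi-squared P-value (1 df, no continuity correction).

    a and b are the numbers of 1s in two groups of n observations each.
    """
    total = a + b
    if total == 0 or total == 2 * n:
        return 1.0  # no variation in the outcome, so no evidence of difference
    chi2 = 2 * n * (a - b) ** 2 / (total * (2 * n - total))
    return math.erfc(math.sqrt(chi2 / 2))


def binomial_half(rng, n):
    """Number of 1s among n fair coin flips, drawn as n random bits."""
    return bin(rng.getrandbits(n)).count("1")


def simulate(n_tests=10_000, n=1_000, seed=1):
    """P-values for n_tests comparisons of two Binomial(n, 0.5) groups."""
    rng = random.Random(seed)
    return [
        chi2_p(binomial_half(rng, n), binomial_half(rng, n), n)
        for _ in range(n_tests)
    ]


pvals = simulate()
ties = sum(p == 1.0 for p in pvals)  # both groups had identical counts
print("P-values exactly equal to 1:", ties)
print("distinct P-values among 10,000 tests:", len(set(pvals)))
```

Even with 1,000 observations per group the counts are integers, so some P-values (for example exactly 1, when the two groups have the same count) recur many times. That discreteness is one plausible source of the uneven histogram the abstract reports.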
| Name | Type |
|---|---|
| age | phenotype |
| baseline variable | phenotype |
| Central Allocation Centres | cohort |
| clinical trial | cohort |
| clinical trials | cohort |
| Local Allocation Centres | cohort |
| meta-analysis | cohort |
| Normal distribution | drug |
| Pocock et al. study | cohort |
| P-value | drug |
| Randomised Clinical Trials | cohort |
| Randomised groups | cohort |
| Skewed distribution | drug |
| Two-sample t test | drug |
| Uniform distribution | drug |
| Citation | PMID | DOI | PMCID | Status |
|---|---|---|---|---|
| Altman DG (1985) Comparability of randomised groups. Statistician 34: 125–136. | — | — | — | — |
| Altman DG, Doré CJ (1990) Randomisation and Baseline Comparisons in Clinical Trials. Lancet 335: 149–153. | 1967441 | 10.1016/0140-6736(90)90014-v | — | — |
| Berger VW (2010) Testing for baseline balance: Can we finally get it right? J Clin Epidemiol 63: 939–940. | 20456920 | 10.1016/j.jclinepi.2010.02.014 | PMC2904824 | — |
| Berger VW, Exner DV (1999) Detecting selection bias in randomized clinical trials. Control Clin Trials 20: 319–327. | 10440559 | 10.1016/s0197-2456(99)00014-8 | — | — |
| Gardner MJ, Altman DG (1986) Confidence intervals rather than P values: estimation rather than hypothesis testing. BMJ 292: 746–50. | 3082422 | 10.1136/bmj.292.6522.746 | PMC1339793 | — |
| Kennedy A, Grant A (1997) Subversion of allocation in a randomised controlled trial. Control Clin Trials 18: S77–S78. | — | — | — | — |
| Pocock SJ, Assmann SE, Enos LE, Kasten LE (2002) Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med 21: 2917–2930. | 12325108 | 10.1002/sim.1296 | — | — |
| Roberts C, Torgerson DJ (1999) Understanding controlled trials: Baseline imbalance in randomised controlled trials. BMJ 319: 185. | 10406763 | 10.1136/bmj.319.7203.185 | PMC1116277 | — |
| Schulz KF, Chalmers I, Grimes DA, Altman DG (1994) Assessing the quality of randomization from reports of controlled trials published in obstetrics and gynecology journals. JAMA 272: 125–128. | 8015122 | — | — | — |
| Senn S (1994) Testing for baseline balance in clinical trials. Stat Med 13: 1715–1726. | 7997705 | 10.1002/sim.4780131703 | — | — |
| The CONSORT statement. Available: http://www.consort-statement.org/consort-statement/. Accessed 2013 Sept. 10. | — | — | — | — |
In this knowledge base
| Title | Year | PMID |
|---|---|---|
| A Brief Critique of the TATES Procedure. | 2018 | 29468442 |
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Response: Integrity of randomized clinical trials: Performance of integrity tests and checklists requires assessment. | Chien PFW | — | 2023 | → |
| Automated detection of over- and under-dispersion in baseline tables in randomised controlled trials. | Barnett A | — | 2022 | → |
| Parasites make hosts more profitable but less available to predators | Prosnier L et al. | — | 2022 | — |
| Methods to assess research misconduct in health-related research: A scoping review. | Bordewijk EM et al. | — | 2021 | → |
| Diagnosing fraudulent baseline data in clinical trials. | Proschan MA et al. | — | 2020 | → |
| Effects of physical activity interventions on cognitive outcomes and academic performance in adolescents and young adults: A meta-analysis. | Haverkamp BF et al. | — | 2020 | → |
| Reconceptualizing the *p*-value from a likelihood ratio test: a probabilistic pairwise comparison of models based on Kullback-Leibler discrepancy measures. | Riedle B et al. | — | 2020 | → |
| Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables. | Bolland MJ et al. | — | 2019 | → |
| No evidence for a bilingual executive function advantage in the nationally representative ABCD study. | Dick AS et al. | — | 2019 | → |
| Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing. | Mou T et al. | — | 2019 | → |
| Rounding, but not randomization method, non-normality, or correlation, affected baseline P-value distributions in randomized trials. | Bolland MJ et al. | — | 2019 | → |
| A Brief Critique of the TATES Procedure. | Aliev F et al. | — | 2018 | → |
| Deviations from Expectations: A Commentary on Aliev et al. | van der Sluis S et al. | — | 2018 | → |
| Network-Based Approaches for Pathway Level Analysis. | Nguyen T et al. | — | 2018 | → |
| Statistical Approach for Gene Set Analysis with Trait Specific Quantitative Trait Loci. | Das S et al. | — | 2018 | → |
| Correlation among baseline variables yields non-uniformity of p-values. | Betensky RA et al. | — | 2017 | → |
| DANUBE: Data-driven meta-ANalysis using UnBiased Empirical distributions-applied to biological pathway analysis. | Nguyen T et al. | — | 2017 | → |
| The distribution of P-values in medical research articles suggested selective reporting associated with statistical significance. | Perneger TV et al. | — | 2017 | → |
| Conducting Meta-Analyses Based on p Values: Reservations and Recommendations for Applying p-Uniform and p-Curve. | van Aert RC et al. | — | 2016 | → |
| Social relationships and cognitive decline: a systematic review and meta-analysis of longitudinal cohort studies. | Kuiper JS et al. | — | 2016 | → |
| The Statistical Value of Raw Fluorescence Signal in Luminex xMAP Based Multiplex Immunoassays. | Breen EJ et al. | — | 2016 | → |
| The distribution of probability values in medical abstracts: an observational study. | Ginsel B et al. | — | 2015 | → |
| A methodological review of recent meta-analyses has found significant heterogeneity in age between randomized groups. | Clark L et al. | — | 2014 | → |