Chunk #36 — Methods — Fitting a distribution to exposure data

Source: Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016.
Embedded: yes

Text

We used an ensemble technique in which a model selection algorithm chooses the best model for each risk factor.16 We drew the initial set of candidate models from commonly used PDF families. We fitted each candidate PDF family to each dataset using the MoM and used the Kolmogorov-Smirnov test17 as the measure of GoF. Preliminary analysis showed that the GoF ranking of PDF families varied across datasets for any particular risk factor, and that combining the predictions of differently fitted PDF families could dramatically improve the GoF for each dataset. Therefore, we developed a new model for prediction using the ensemble of candidate models: a weighted linear combination of all candidate models, {f}, in which the set of weights {w} sums to one and the values of the weights are determined by a second GoF criterion with its own validation process. Because of basic differences among risk factors, their distributions, and the risk attribution process, the model selection process was often slightly different for each
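
The ensemble idea described above (fit each candidate family by the MoM, score it with the Kolmogorov-Smirnov statistic, then combine the fits with weights that sum to one) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the three candidate families (normal, lognormal, Gumbel), the synthetic data, the function names, and in particular the weighting rule. The text says the weights come from a second GoF criterion with its own validation process, which is not specified here, so the sketch substitutes a simple inverse-KS weighting.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal(data):
    # Method of moments: match the sample mean and standard deviation.
    mu, sigma = statistics.fmean(data), statistics.pstdev(data)
    return lambda x: normal_cdf(x, mu, sigma)


def fit_lognormal(data):
    # Method of moments: solve for the log-scale parameters from mean and variance.
    m, v = statistics.fmean(data), statistics.pvariance(data)
    s2 = math.log(1.0 + v / m ** 2)
    mu, sigma = math.log(m) - 0.5 * s2, math.sqrt(s2)

    def cdf(x):
        return normal_cdf(math.log(x), mu, sigma) if x > 0 else 0.0

    return cdf


def fit_gumbel(data):
    # Method of moments for the Gumbel (maximum) distribution.
    beta = statistics.pstdev(data) * math.sqrt(6.0) / math.pi
    loc = statistics.fmean(data) - EULER_GAMMA * beta

    def cdf(x):
        z = max((x - loc) / beta, -700.0)  # clamp to avoid overflow in exp
        return math.exp(-math.exp(-z))

    return cdf


def ks_statistic(data, cdf):
    # Largest gap between the empirical CDF and the fitted CDF.
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def fit_ensemble(data, fitters):
    cdfs = {name: fit(data) for name, fit in fitters.items()}
    ks = {name: ks_statistic(data, cdf) for name, cdf in cdfs.items()}
    # ASSUMPTION: inverse-KS weights, normalised to sum to one.
    # This stands in for the unspecified second GoF criterion.
    inv = {name: 1.0 / max(d, 1e-12) for name, d in ks.items()}
    total = sum(inv.values())
    weights = {name: v / total for name, v in inv.items()}

    def ensemble_cdf(x):
        # Weighted linear combination of the candidate CDFs.
        return sum(weights[name] * cdfs[name](x) for name in cdfs)

    return weights, ks, ensemble_cdf


# Hypothetical candidate set and synthetic exposure data.
FITTERS = {"normal": fit_normal, "lognormal": fit_lognormal, "gumbel": fit_gumbel}
random.seed(0)
data = [random.lognormvariate(0.0, 0.5) for _ in range(500)]

weights, ks, ensemble_cdf = fit_ensemble(data, FITTERS)
for name in FITTERS:
    print(f"{name:10s} KS={ks[name]:.4f} weight={weights[name]:.3f}")
print(f"ensemble   KS={ks_statistic(data, ensemble_cdf):.4f}")
```

Because the ensemble CDF is a convex combination of the candidate CDFs, its KS statistic cannot exceed that of the worst candidate; whether it beats the best single candidate depends on the data and on how the weights are chosen. In practice, `scipy.stats.kstest` could replace the hand-written `ks_statistic` if SciPy is available.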