Chunk #12 — Methods — Data

Source: Environmental risk score as a new tool to examine multi-pollutants in epidemiologic research: an example from the NHANES study using serum lipid levels.
Embedded: yes

Text

above the four outcome variables included total cholesterol, HDL, LDL, and triglycerides. Important covariates were chosen a priori and included age, sex, race/ethnicity (Mexican American, Other Hispanic, non-Hispanic white, non-Hispanic black, Other), education (categorized as less than high school diploma, high school diploma, and greater than high school diploma), BMI, and NHANES cycle. We selected education as an indicator of socioeconomic status because it is widely used and has less missing data than other proxies, such as household income or poverty income ratio. We also considered 21 blood measures of micronutrients (vitamins and isoflavone compounds), some of which were identified as predictors of serum lipids in the previous EWAS [6]. We imputed missing data with a sequential imputation strategy using IVEWARE, in which the variables to be imputed were treated as the outcomes and all other variables were used as predictors [34], [35]. Because we used the data solely for illustrative purposes, we used only one imputed dataset. The distributions of the data before and after imputation were similar (see File S1 for more details). The sample sizes after imputation were 10,818 for the stage 1 sample and 4,615 for the stage 2 sample. We applied logarithmic transformation with base 10