To vary nX, X values were randomly set to missing. IV strength in a given data set is measured by the F statistic from the first-stage regression of X on G. IVs with an average first-stage F value less than 10 are conventionally considered weak, although this threshold is arbitrary, and some bias persists even for nonweak IVs (9). F is defined as the ratio of the variance explained by the model to the residual variance in the model, each divided by its degrees of freedom. F can be expressed as a function of the first-stage R², the sample size (n), and the number of IVs (k):

$$F = \frac{R^2\,(n - 1 - k)}{(1 - R^2)\,k} \qquad (4)$$
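
As a rough illustration of how F depends on the first-stage R², n, and k, the sketch below assumes the standard identity for a linear regression with an intercept, F = [R² / (1 − R²)] × [(n − k − 1) / k]. The function name `first_stage_f` and the example values are hypothetical and are not taken from the study.

```python
def first_stage_f(r2: float, n: int, k: int) -> float:
    """Approximate first-stage F statistic from R^2, sample size n, and k IVs.

    Assumes an ordinary least-squares first stage with an intercept, so the
    residual degrees of freedom are n - k - 1. (Illustrative helper only.)
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if k < 1:
        raise ValueError("k must be at least 1")
    if n <= k + 1:
        raise ValueError("n must exceed k + 1")
    # Explained-to-residual variance ratio, scaled by the degrees of freedom.
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)


if __name__ == "__main__":
    # One IV explaining 1% of the variance in X among 1,000 subjects.
    print(round(first_stage_f(r2=0.01, n=1000, k=1), 2))  # 10.08
    # Same R^2 spread over 10 IVs: F drops roughly tenfold.
    print(round(first_stage_f(r2=0.01, n=1000, k=10), 2))
```

Under this identity, holding R² fixed while adding IVs lowers F, and shrinking n (for example, by setting X values to missing) lowers F roughly in proportion.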