Chunk #12 — 2 Algorithms for the Lasso, Ridge Regression and the Elastic Net

Source: Regularization Paths for Generalized Linear Models via Coordinate Descent.
Embedded: yes

Text

Consider a coordinate descent step for solving (1). That is, suppose we have estimates β̃0 and β̃ℓ for ℓ ≠ j, and we wish to partially optimize with respect to βj. Denote by R(β0, β) the objective function in (1). We would like to compute the gradient at βj = β̃j, which only exists if β̃j ≠ 0. If β̃j > 0, then

$$\left.\frac{\partial R}{\partial \beta_j}\right|_{\beta=\tilde{\beta}} = -\frac{1}{N}\sum_{i=1}^{N} x_{ij}\,\bigl(y_i - \tilde{\beta}_0 - x_i^T\tilde{\beta}\bigr) + \lambda(1-\alpha)\beta_j + \lambda\alpha. \tag{4}$$
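
As a sanity check on equation (4), the following minimal sketch evaluates the partial derivative for a coordinate with β̃j > 0 and compares it with a central finite difference of the objective. Objective (1) is not reproduced in this section, so the code assumes the usual elastic-net form, squared-error loss divided by 2N plus λ[(1 − α)/2 · ‖β‖² + α‖β‖₁]. The function names and the small data set are hypothetical and chosen only for illustration.

```python
def residuals(beta0, beta, X, y):
    """Residuals y_i - beta0 - x_i^T beta for every observation."""
    return [
        yi - beta0 - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]


def objective(beta0, beta, X, y, lam, alpha):
    """Assumed form of objective (1): loss / (2N) plus elastic-net penalty."""
    n = len(y)
    loss = sum(r * r for r in residuals(beta0, beta, X, y)) / (2 * n)
    ridge = 0.5 * (1 - alpha) * sum(b * b for b in beta)
    lasso = alpha * sum(abs(b) for b in beta)
    return loss + lam * (ridge + lasso)


def gradient_positive_branch(j, beta0, beta, X, y, lam, alpha):
    """Partial derivative in equation (4); defined only for beta[j] > 0."""
    if beta[j] <= 0:
        raise ValueError("equation (4) applies only when beta[j] > 0")
    n = len(y)
    r = residuals(beta0, beta, X, y)
    data_term = -sum(xi[j] * ri for xi, ri in zip(X, r)) / n
    return data_term + lam * (1 - alpha) * beta[j] + lam * alpha


# Hypothetical toy data: 4 observations, 2 predictors.
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3], [2.0, 1.0]]
y = [1.0, 0.2, -0.7, 2.5]
beta0, beta = 0.1, [0.5, -0.3]
lam, alpha, j = 0.2, 0.7, 0

analytic = gradient_positive_branch(j, beta0, beta, X, y, lam, alpha)

# Central finite difference in coordinate j; beta[j] stays positive
# because eps is much smaller than beta[j].
eps = 1e-6
up, down = beta.copy(), beta.copy()
up[j] += eps
down[j] -= eps
numeric = (
    objective(beta0, up, X, y, lam, alpha)
    - objective(beta0, down, X, y, lam, alpha)
) / (2 * eps)

print(abs(analytic - numeric) < 1e-6)
```

The check refuses coordinates with β̃j ≤ 0 because the derivation above covers only the positive branch, where the absolute-value term contributes the constant λα.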