Chunk #42 — 4 Regularized Multinomial Regression — 4.1 Regularization and Parameter Ambiguity

Source: Regularization Paths for Generalized Linear Models via Coordinate Descent.
Embedded: yes

Text

As was pointed out earlier, if $\{\beta_{0\ell}, \beta_\ell\}_1^K$ characterizes a fitted model for (20), then $\{\beta_{0\ell} - c_0, \beta_\ell - c\}_1^K$ gives an identical fit ($c$ is a $p$-vector). Although this means that the log-likelihood part of (21) is insensitive to $(c_0, c)$, the penalty is not. In particular, we can always improve an estimate $\{\beta_{0\ell}, \beta_\ell\}_1^K$ (w.r.t. (21)) by solving

$$\min_{c \in \mathbb{R}^p} \sum_{\ell=1}^K P_\alpha(\beta_\ell - c). \tag{27}$$
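To make problem (27) concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the elastic-net form of the penalty, P_α(β) = Σ_j [½(1−α)β_j² + α|β_j|], which is defined elsewhere in the paper and not in this passage. Under that assumption the problem separates over the p coordinates of c, and each one-dimensional problem is convex with its minimizer between the smallest and largest of the K class coefficients. The function names `elastic_net_penalty` and `recenter`, and the use of a plain ternary search, are illustrative choices only.

```python
def elastic_net_penalty(beta, alpha):
    """Assumed penalty: sum_j [0.5*(1-alpha)*beta_j^2 + alpha*|beta_j|]."""
    return sum(0.5 * (1 - alpha) * b * b + alpha * abs(b) for b in beta)


def recenter(betas, alpha, iters=200):
    """Solve min_c sum_l P_alpha(beta_l - c) one coordinate at a time.

    betas: list of K coefficient vectors (one per class), each of length p.
    Returns (c, shifted), where shifted[l] = betas[l] - c.
    """
    K, p = len(betas), len(betas[0])
    c = []
    for j in range(p):
        col = [betas[l][j] for l in range(K)]

        def objective(t):
            # Contribution of coordinate j to the summed penalty after shifting by t.
            return sum(0.5 * (1 - alpha) * (b - t) ** 2 + alpha * abs(b - t)
                       for b in col)

        # The objective is convex in t, so ternary search on [min, max] finds a minimizer.
        lo, hi = min(col), max(col)
        for _ in range(iters):
            m1 = lo + (hi - lo) / 3
            m2 = hi - (hi - lo) / 3
            if objective(m1) <= objective(m2):
                hi = m2
            else:
                lo = m1
        c.append(0.5 * (lo + hi))
    shifted = [[b[j] - c[j] for j in range(p)] for b in betas]
    return c, shifted


# Hypothetical coefficients for K = 3 classes and p = 2 features.
betas = [[1.0, 0.0], [2.0, 0.0], [6.0, 3.0]]
c_ridge, _ = recenter(betas, alpha=0.0)   # pure ridge: coordinate-wise mean
c_lasso, _ = recenter(betas, alpha=1.0)   # pure lasso: coordinate-wise median
c_mix, shifted = recenter(betas, alpha=0.5)
before = sum(elastic_net_penalty(b, 0.5) for b in betas)
after = sum(elastic_net_penalty(b, 0.5) for b in shifted)
```

In this sketch, α = 0 recovers the coordinate-wise mean of the class coefficients and α = 1 a coordinate-wise median, and the total penalty `after` never exceeds `before`, because c = 0 is itself a feasible shift.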