As pointed out earlier, if $\{\beta_{0\ell}, \beta_\ell\}_1^K$ characterizes a fitted model for (20), then $\{\beta_{0\ell} - c_0, \beta_\ell - c\}_1^K$ gives an identical fit, where $c$ is a $p$-vector. Although this means that the log-likelihood part of (21) is insensitive to $(c_0, c)$, the penalty is not. In particular, we can always improve an estimate $\{\beta_{0\ell}, \beta_\ell\}_1^K$ (w.r.t. (21)) by solving

$$\min_{c \in \mathbb{R}^p} \sum_{\ell=1}^{K} P_\alpha(\beta_\ell - c). \tag{27}$$
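The recentering problem (27) can be illustrated numerically. The sketch below assumes the elastic-net form $P_\alpha(\beta) = (1-\alpha)\tfrac{1}{2}\|\beta\|_2^2 + \alpha\|\beta\|_1$, which is not restated in this passage. Under that assumption the objective is a sum over coordinates, so each component $c_j$ can be found by a separate one-dimensional convex minimization. The function names and the ternary search are illustrative choices, not the authors' procedure.

```python
def elastic_net_penalty(beta, alpha):
    # Assumed form: (1 - alpha) * 0.5 * ||beta||_2^2 + alpha * ||beta||_1.
    return sum((1 - alpha) * 0.5 * b * b + alpha * abs(b) for b in beta)


def total_penalty(betas, c, alpha):
    # Objective of (27): sum over classes of P_alpha(beta_l - c).
    return sum(
        elastic_net_penalty([b - cj for b, cj in zip(beta, c)], alpha)
        for beta in betas
    )


def recenter_coordinate(values, alpha, iters=100):
    # Minimize sum_l [(1 - alpha)/2 * (v_l - t)^2 + alpha * |v_l - t|] over t.
    # The objective is convex and a minimizer lies in [min(values), max(values)],
    # so a ternary search on that interval converges to a minimizer.
    def objective(t):
        return sum(
            (1 - alpha) * 0.5 * (v - t) ** 2 + alpha * abs(v - t) for v in values
        )

    lo, hi = min(values), max(values)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) <= objective(m2):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # every minimizer lies in (m1, hi]
    return 0.5 * (lo + hi)


def recenter(betas, alpha):
    # betas: list of K coefficient vectors, each of length p (hypothetical layout).
    p = len(betas[0])
    return [recenter_coordinate([beta[j] for beta in betas], alpha) for j in range(p)]


# Hypothetical coefficients for K = 3 classes and p = 2 features.
betas = [[1.0, 0.0], [2.0, 0.0], [6.0, 3.0]]
c = recenter(betas, alpha=0.5)
print(c, total_penalty(betas, c, 0.5), total_penalty(betas, [0.0, 0.0], 0.5))
```

With this assumed penalty, $\alpha = 0$ reduces each coordinate problem to least squares, whose solution is the mean of the $\beta_{j\ell}$, while $\alpha = 1$ gives a least-absolute-deviations problem solved by a median. Intermediate values of $\alpha$ give a point between the two.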