Coordinate descent is ideally set up to exploit such sparsity, in an obvious way. The O(N) inner-product operations in either the naive or covariance updates can exploit the sparsity, by summing over only the non-zero entries. Note that in this case scaling of the variables will not alter the sparsity, but centering will. So scaling is performed up front, but the centering is incorporated in the algorithm in an efficient and obvious manner.