Abstract
Many statistical $M$-estimators are based on convex optimization problems formed by the combination of a data-dependent loss function with a norm-based regularizer. We analyze the convergence rates of projected gradient and composite gradient methods for solving such problems, working within a high-dimensional framework that allows the ambient dimension $d$ to grow with (and possibly exceed) the sample size $n$. Our theory identifies conditions under which projected gradient descent enjoys globally linear convergence up to the statistical precision of the model, meaning the typical distance between the true unknown parameter $\theta^{*}$ and an optimal solution $\widehat{\theta}$. By establishing these conditions with high probability for numerous statistical models, our analysis applies to a wide range of $M$-estimators, including sparse linear regression using Lasso; group Lasso for block sparsity; log-linear models with regularization; low-rank matrix recovery using nuclear norm regularization; and matrix decomposition using a combination of the nuclear and $\ell_{1}$ norms. Overall, our analysis reveals interesting connections between statistical and computational efficiency in high-dimensional estimation.
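As a rough illustration of the kind of method the abstract analyzes, the following minimal sketch runs projected gradient descent on a least-squares loss constrained to an $\ell_{1}$-ball, one standard formulation of the Lasso. It is not the authors' implementation. The function names (`project_l1_ball`, `projected_gradient_lasso`, `loss`), the synthetic data, the oracle choice of radius, and the conservative trace-based step size are all assumptions made for this example.

```python
import math
import random


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}, radius > 0 (sort-based)."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)
    u = sorted((abs(x) for x in v), reverse=True)
    cumsum, tau = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - radius) / k
        if u_k > t:  # holds for a prefix of the sorted entries; keep the last hit
            tau = t
    # Soft-threshold every coordinate by tau, preserving signs.
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def loss(X, y, theta):
    """Least-squares loss (1 / 2n) * ||y - X theta||^2."""
    n = len(X)
    return sum((sum(a * b for a, b in zip(row, theta)) - y_i) ** 2
               for row, y_i in zip(X, y)) / (2 * n)


def projected_gradient_lasso(X, y, radius, steps=200):
    """Projected gradient descent for least squares over an l1-ball."""
    n, d = len(X), len(X[0])
    # Assumed step size 1 / gamma_u, where gamma_u = trace(X^T X / n) is a
    # conservative upper bound on the largest eigenvalue of the Hessian.
    gamma_u = sum(x * x for row in X for x in row) / n
    theta = [0.0] * d
    for _ in range(steps):
        resid = [sum(a * b for a, b in zip(row, theta)) - y_i
                 for row, y_i in zip(X, y)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]
        step = [t - g / gamma_u for t, g in zip(theta, grad)]
        theta = project_l1_ball(step, radius)
    return theta


# Synthetic high-dimensional instance (d > n) with a 3-sparse parameter.
random.seed(0)
n, d = 30, 50
theta_star = [1.0, -2.0, 1.5] + [0.0] * (d - 3)
X = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [sum(a * b for a, b in zip(row, theta_star)) + 0.1 * random.gauss(0.0, 1.0)
     for row in X]
radius = sum(abs(t) for t in theta_star)  # oracle radius, for illustration only
theta_hat = projected_gradient_lasso(X, y, radius)
```

In practice the radius would have to be chosen from data rather than from the unknown $\theta^{*}$. A tighter smoothness estimate, such as the largest eigenvalue of $X^{\top}X/n$, would allow larger steps than the trace bound used here.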
Citation
Alekh Agarwal, Sahand Negahban, Martin J. Wainwright. "Fast global convergence of gradient methods for high-dimensional statistical recovery." Ann. Statist. 40 (5) 2452-2482, October 2012. https://doi.org/10.1214/12-AOS1032
Information