Electronic Journal of Statistics

Performance bounds for parameter estimates of high-dimensional linear models with correlated errors

Wei-Biao Wu and Ying Nian Wu

Full-text: Open access

Abstract

This paper develops a systematic theory for high-dimensional linear models with dependent errors and/or dependent covariates. To study properties of estimates of the regression parameters, we adopt the framework of functional dependence measures ([43]). For the covariates two schemes are addressed: the random design and the deterministic design. For the former we apply the constrained $\ell_{1}$ minimization approach, while for the latter the Lasso estimation procedure is used. We provide a detailed characterization of how the error rates of the estimates depend on the moment conditions that control the tail behaviors, the dependencies of the underlying processes that generate the errors and the covariates, the dimension, and the sample size. Our theory substantially extends earlier results by allowing dependent and/or heavy-tailed errors and covariates. As our main tools, we derive exponential tail probability inequalities for dependent sub-Gaussian errors and Nagaev-type inequalities for dependent non-sub-Gaussian errors that arise from linear or nonlinear processes.
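The Lasso procedure analyzed for the deterministic design can be illustrated with a short simulation. The sketch below (hypothetical, not code from the paper) fits the Lasso by cyclic coordinate descent on data from a sparse linear model whose errors follow an AR(1) process, the kind of dependent-error setting the paper's bounds cover; the regularization level is chosen at the familiar $\sqrt{\log p/n}$ rate, with the constant 0.2 an arbitrary illustrative choice.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent:
    minimize (1/(2n)) ||y - X b||_2^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n     # per-coordinate curvature
    r = y.copy()                          # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]           # partial residual excluding coord j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

# Sparse model with AR(1) errors (dependence in the noise process)
rng = np.random.default_rng(0)
n, p, s = 200, 50, 3
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = [3.0, -2.0, 1.5]
eps = np.zeros(n)
for t in range(1, n):                     # eps_t = 0.5 eps_{t-1} + innovation
    eps[t] = 0.5 * eps[t - 1] + rng.standard_normal()
y = X @ beta + eps

lam = 0.2 * np.sqrt(np.log(p) / n)        # sqrt(log p / n) rate
bhat = lasso_cd(X, y, lam)
```

Under short-range dependent errors of this kind, the estimation error behaves much as in the i.i.d. case, which is what makes the rate-style tuning of `lam` sensible here.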

Article information

Source
Electron. J. Statist., Volume 10, Number 1 (2016), 352-379.

Dates
Received: March 2015
First available in Project Euclid: 17 February 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1455715966

Digital Object Identifier
doi:10.1214/16-EJS1108

Mathematical Reviews number (MathSciNet)
MR3466186

Zentralblatt MATH identifier
1333.62172

Keywords
Consistency; dependence-adjusted norm; exponential inequality; functional and predictive dependence measures; high-dimensional time series; impulse response function; Nagaev inequality; predictive persistence; support recovery

Citation

Wu, Wei-Biao; Wu, Ying Nian. Performance bounds for parameter estimates of high-dimensional linear models with correlated errors. Electron. J. Statist. 10 (2016), no. 1, 352--379. doi:10.1214/16-EJS1108. https://projecteuclid.org/euclid.ejs/1455715966



References

  • [1] Azuma, K. (1967). Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 19, 357–367.
  • [2] Barnett, W. A., Chae, U. and Keating, J. (2012). Forecast design in monetary capital stock measurement. Global Journal of Economics, 1, 1250005.
  • [3] Basu, S. and Michailidis, G. (2015). Regularized estimation in sparse high-dimensional time series models. Annals of Statistics, 43, 1535–1567.
  • [4] Bickel, P., Ritov, Y. and Tsybakov, A. (2009). Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics, 37, 1705–1732.
  • [5] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer.
  • [6] Bunea, F., Tsybakov, A. and Wegkamp, M. (2007). Sparsity oracle inequalities for the lasso. Electronic Journal of Statistics, 1, 169–194.
  • [7] Burkholder, D. L. (1973). Distribution function inequalities for martingales. Annals of Probability, 1, 19–42.
  • [8] Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_1$ minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106, 594–607.
  • [9] Candes, E. and Tao, T. (2007). The Dantzig selector: statistical estimation when p is much larger than n (with discussion). Annals of Statistics, 35, 2313–2404.
  • [10] Chernozhukov, V., Chetverikov, D. and Kato, K. (2014). Testing many moment inequalities. http://arxiv.org/abs/1312.7614
  • [11] Chow, Y. S. and Teicher, H. (1988). Probability Theory, 2nd ed. Springer, New York.
  • [12] Davis, R. A., Zang, P. and Zheng, T. (2015). Sparse vector autoregressive modeling. Journal of Computational and Graphical Statistics.
  • [13] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.
  • [14] Friedman, J. H., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1–22.
  • [15] Gupta, S. (2012). A note on the asymptotic distribution of LASSO estimator for correlated data. Sankhya A, 74, 10–28.
  • [16] Han, F. and Liu, H. (2013). A direct estimation of high dimensional stationary vector autoregressions. http://arxiv.org/abs/1307.0293
  • [17] Hsiao, C. (1979). Autoregressive modeling of Canadian money and income data. Journal of the American Statistical Association, 74, 553–560.
  • [18] Kaul, A. (2014). Lasso with long memory regression errors. Journal of Statistical Planning and Inference, 153, 11–26.
  • [19] Kock, A. and Callot, L. (2012). Oracle inequalities for high dimensional vector autoregressions. Research Paper 12, CREATES, Aarhus University.
  • [20] Krolzig, H. M. (2003). General-to-specific model selection procedures for structural vector autoregressions. Oxford Bulletin of Economics and Statistics, 65, 769–801.
  • [21] Lesigne, E. and Volný, D. (2001). Large deviations for martingales. Stochastic Processes and their Applications, 96, 143–159.
  • [22] Liu, H. and Wang, L. (2012). TIGER: A tuning-insensitive approach for optimally estimating Gaussian graphical models. http://arxiv.org/abs/1209.2437
  • [23] Liu, W., Xiao, H. and Wu, W. B. (2013). Probability and moment inequalities under dependence. Statistica Sinica, 23, 1257–1272. doi:10.5705/ss.2011.287.
  • [24] Loh, P.-L. (2015). Statistical consistency and asymptotic normality for high-dimensional robust $M$-estimators. http://arxiv.org/abs/1501.00312
  • [25] Loh, P.-L. and Wainwright, M. J. (2012). High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity. Annals of Statistics, 40, 1637–1664.
  • [26] Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.
  • [27] Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the Lasso. Annals of Statistics, 34, 1436–1462.
  • [28] Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Annals of Statistics, 37, 246–270.
  • [29] Merlevède, F., Peligrad, M. and Rio, E. (2011). A Bernstein type inequality and moderate deviations for weakly dependent sequences. Probability Theory and Related Fields, 141, 435–474.
  • [30] Nagaev, S. V. (1979). Large deviations of sums of independent random variables. Annals of Probability, 7, 745–789.
  • [31] Nardi, Y. and Rinaldo, A. (2011). Autoregressive process modeling via the lasso procedure. Journal of Multivariate Analysis, 102, 528–549.
  • [32] Peligrad, M., Sang, H., Zhong, Y. and Wu, W. B. (2014). Exact moderate and large deviations for linear processes. Statistica Sinica, 24, 957–969.
  • [33] Ravikumar, P., Wainwright, M. J., Raskutti, G. and Yu, B. (2011). High-dimensional covariance estimation by minimizing $\ell_1$-penalized log-determinant divergence. Electronic Journal of Statistics, 5, 935–980.
  • [34] Sims, C. (1980). Macroeconomics and reality. Econometrica, 48, 1–48.
  • [35] Rosenthal, H. P. (1970). On the subspaces of $L_p$ ($p>2$) spanned by sequences of independent random variables. Israel Journal of Mathematics, 8, 273–303.
  • [36] Song, S. and Bickel, P. J. (2011). Large vector autoregressions. http://arxiv.org/abs/1106.3915
  • [37] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58, 267–288.
  • [38] Tong, H. (1990). Nonlinear Time Series Analysis: A Dynamic Approach. Oxford University Press, Oxford.
  • [39] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing: Theory and Applications (Y. Eldar and G. Kutyniok, eds.), Chapter 5, pp. 210–268. Cambridge University Press.
  • [40] Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (Lasso). IEEE Transactions on Information Theory, 55, 2183–2202.
  • [41] Wang, H., Li, G. and Tsai, C.-L. (2007). Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 69, 63–78.
  • [42] Wiener, N. (1958). Nonlinear Problems in Random Theory. MIT Press.
  • [43] Wu, W. B. (2005). Nonlinear system theory: another look at dependence. Proceedings of the National Academy of Sciences, 102, 14150–14154.
  • [44] Wu, W. B. (2011). Asymptotic theory for stationary processes. Statistics and Its Interface, 4, 207–226.
  • [45] Wu, W. B. and Shao, X. (2004). Limit theorems for iterated random functions. Journal of Applied Probability, 41, 425–436.
  • [46] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. Journal of Machine Learning Research, 7, 2541–2567.
  • [47] Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 38, 894–942.
  • [48] Zhang, C. H. and Huang, J. (2008). The sparsity and bias of the Lasso selection in high dimensional linear regression. Annals of Statistics, 36, 1567–1594.
  • [49] Zhang, T. (2009). Some sharp performance bounds for least squares regression with $\ell_1$ regularization. Annals of Statistics, 37, 2109–2144.