The Annals of Statistics

Ridgelets: estimating with ridge functions

Emmanuel J. Candès



Feedforward neural networks, projection pursuit regression, and more generally, estimation via ridge functions have been proposed as an approach to bypass the curse of dimensionality and are now becoming widely applied to approximation or prediction in applied sciences. To address problems inherent to these methods--ranging from the construction of neural networks to their efficiency and capability--Candès [Appl. Comput. Harmon. Anal. 6 (1999) 197-218] developed a new system that allows the representation of arbitrary functions as superpositions of specific ridge functions, the ridgelets.
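A ridge function is constant along hyperplanes orthogonal to a fixed direction: it has the form x ↦ g(a·x) for a direction vector a and a scalar profile g. The sketch below (illustrative names, not from the paper) evaluates such a function, with a sigmoidal profile as used in feedforward neural networks.

```python
import numpy as np

def ridge_function(x, a, g):
    """Evaluate the ridge function x -> g(a . x).

    x : (n, d) array of points; a : (d,) direction vector; g : scalar profile.
    The function varies only along direction a and is constant on
    hyperplanes orthogonal to a.
    """
    return g(x @ a)

# Example: a sigmoidal ridge function in R^2.
pts = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
a = np.array([2.0, -1.0])                      # direction vector
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
vals = ridge_function(pts, a, sigmoid)
# All three points satisfy a . x = 0, i.e. they lie on the same
# hyperplane orthogonal to a, so the ridge function agrees on them.
```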

In a nonparametric regression setting, this article suggests expanding noisy data into a ridgelet series and applying a scalar nonlinearity to the coefficients (damping); this is unlike existing approaches based on stepwise additions of elements. The procedure is simple, constructive, stable and spatially adaptive--and fast algorithms have been developed to implement it.
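The procedure just described is a transform-shrink-invert scheme: expand the noisy data in the system, shrink each coefficient with a scalar nonlinearity, then synthesize. A minimal sketch of this idea, using a random orthonormal basis as a stand-in for the discrete ridgelet transform (which is not implemented here) and soft thresholding as the scalar nonlinearity:

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(c, t):
    """Shrink coefficients toward zero: sign(c) * max(|c| - t, 0)."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

# Stand-in: a random orthonormal matrix Q plays the role of the ridgelet
# system; the true signal is sparse in this basis.
n = 64
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
coeffs_true = np.zeros(n)
coeffs_true[:4] = [5.0, -3.0, 2.0, 1.5]
f = Q @ coeffs_true                             # clean signal
sigma = 0.1
y = f + sigma * rng.standard_normal(n)          # noisy observations

coeffs = Q.T @ y                                # analysis: expand the data
t = sigma * np.sqrt(2 * np.log(n))              # universal threshold
estimate = Q @ soft_threshold(coeffs, t)        # shrink, then synthesize
```

The scalar nonlinearity acts coefficient by coefficient, so the whole estimator is computed in one pass rather than by stepwise addition of elements.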

The ridgelet estimator is nearly optimal for estimating functions with certain kinds of spatial inhomogeneities. In addition, ridgelets help to identify new classes of estimands--corresponding to a new notion of smoothness--that are well suited for ridge function estimation. While the results are stated in a decision-theoretic framework, numerical experiments are also presented to illustrate the practical performance of the methodology.

Article information

Ann. Statist., Volume 31, Number 5 (2003), 1561-1599.

First available in Project Euclid: 9 October 2003


Primary: 62G07: Density estimation 62C20: Minimax procedures
Secondary: 41A30: Approximation by other special function classes

Nonparametric regression; ridgelets; ridge functions; projection pursuit regression; minimax decision theory; Radon transform; spatial inhomogeneities; edges; thresholding of ridgelet coefficients


Candès, Emmanuel J. Ridgelets: estimating with ridge functions. Ann. Statist. 31 (2003), no. 5, 1561--1599. doi:10.1214/aos/1065705119.



  • Barron, A. R. (1991). Complexity regularization with application to artificial neural networks. In Nonparametric Functional Estimation and Related Topics (G. Roussas, ed.) 561--576. Kluwer, Dordrecht.
  • Candès, E. J. (1998). Ridgelets: Theory and applications. Ph.D. dissertation, Dept. Statistics, Stanford Univ.
  • Candès, E. J. (1999a). Harmonic analysis of neural networks. Appl. Comput. Harmon. Anal. 6 197--218.
  • Candès, E. J. (1999b). Monoscale ridgelets for the representation of images with edges. Technical report, Dept. Statistics, Stanford Univ.
  • Candès, E. J. (2001). Ridgelets and the representation of mutilated Sobolev functions. SIAM J. Math. Anal. 33 347--368.
  • Candès, E. J. (2002). New ties between computational harmonic analysis and approximation theory. In Approximation Theory X (C. K. Chui, L. L. Schumaker and J. Stöckler, eds.) 87--153. Vanderbilt Univ. Press, Nashville, TN.
  • Candès, E. J. and Donoho, D. L. (2000). Curvelets---A surprisingly effective nonadaptive representation of objects with edges. In Curve and Surface Fitting (A. Cohen, C. Rabut and L. L. Schumaker, eds.) 105--120. Vanderbilt Univ. Press, Nashville, TN.
  • Cheng, B. and Titterington, D. M. (1994). Neural networks: A review from a statistical perspective (with discussion). Statist. Sci. 9 2--54.
  • Conway, J. H. and Sloane, N. J. A. (1988). Sphere Packings, Lattices and Groups. Springer, New York.
  • Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Math. Control Signals Systems 2 303--314.
  • Deans, S. R. (1983). The Radon Transform and Some of Its Applications. Wiley, New York.
  • Donoho, D. L. (1993). Unconditional bases are optimal bases for data compression and for statistical estimation. Appl. Comput. Harmon. Anal. 1 100--115.
  • Donoho, D. L. (1998). Digital ridgelet transform via rectopolar coordinate transform. Technical report, Dept. Statistics, Stanford Univ.
  • Donoho, D. L. and Johnstone, I. M. (1989). Projection-based approximation and a duality with kernel methods. Ann. Statist. 17 58--106.
  • Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation via wavelet shrinkage. Biometrika 81 425--455.
  • Donoho, D. L. and Johnstone, I. M. (1995). Empirical atomic decomposition. Unpublished manuscript.
  • Donoho, D. L. and Johnstone, I. M. (1998). Minimax estimation via wavelet shrinkage. Ann. Statist. 26 879--921.
  • Efroimovich, S. and Pinsker, M. (1982). Estimation of square-integrable density on the basis of a sequence of observations. Problems Inform. Transmission 17 182--196.
  • Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. J. Amer. Statist. Assoc. 76 817--823.
  • Härdle, W. (1990). Applied Nonparametric Regression. Cambridge Univ. Press.
  • Härdle, W., Kerkyacharian, G., Picard, D. and Tsybakov, A. (1998). Wavelets, Approximation, and Statistical Applications. Lecture Notes in Statist. 129. Springer, New York.
  • Ibragimov, I. A. and Hasminskii, R. Z. (1981). Statistical Estimation. Asymptotic Theory. Springer, New York.
  • Johnstone, I. M. (1999). Wavelets and the theory of nonparametric function estimation. Available at
  • Johnstone, I. M. and Silverman, B. W. (1997). Wavelet threshold estimators for data with correlated noise. J. Roy. Statist. Soc. Ser. B 59 319--351.
  • Jones, L. K. (1997). The computational intractability of training sigmoidal neural networks. IEEE Trans. Inform. Theory 43 167--173.
  • Korostelev, A. P. and Tsybakov, A. B. (1993). Minimax Theory of Image Reconstruction. Lecture Notes in Statist. 82. Springer, New York.
  • Meyer, Y. (1992). Wavelets and Operators. Cambridge Univ. Press.
  • Pinsker, M. (1980). Optimal filtering of square integrable signals in Gaussian white noise. Problems Inform. Transmission 16 120--133.
  • Silverman, B. (1999). Wavelets in statistics: Beyond the standard assumptions. R. Soc. Lond. Philos. Trans. Ser. A Math. Phys. Eng. Sci. 357 2459--2473.
  • Starck, J., Candès, E. and Donoho, D. (2002). The curvelet transform for image denoising. IEEE Trans. Image Process. 11 670--684.
  • Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Ann. Statist. 5 595--645.
  • Vu, V. H. (1998). On the infeasibility of training neural networks with small mean squared error. IEEE Trans. Inform. Theory 44 2892--2900.