Abstract
The goal of regression is to recover, from noisy observations, an unknown underlying function that best links a set of predictors to an outcome. In nonparametric regression, one assumes that the regression function belongs to a prespecified infinite-dimensional function space (the hypothesis space). In the online setting, where the observations arrive in a stream, it is computationally preferable to update an estimate iteratively rather than to refit an entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) for the case where the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal mean squared error (MSE) under a set of simple and direct conditions. The proposed estimator can be constructed at low computational expense in both time and space. We also formally show that Sieve-SGD requires almost minimal memory usage among all statistically rate-optimal estimators.
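To make the idea concrete, the following is a minimal sketch of an online estimator in the spirit described above: a truncated basis expansion (the sieve) whose coefficients are updated by one stochastic gradient step per observation, with the iterates averaged. It is not the authors' implementation. The cosine basis on [0, 1], the class name `SieveSGDSketch`, the parameters `s`, `omega`, `step_scale` and `basis_scale`, and the power-law schedules for the learning rate and the number of basis functions are all assumptions chosen for illustration; none of them is stated in the abstract.

```python
import math


def cosine_basis(j, x):
    """j-th cosine basis function on [0, 1]; j = 1 is the constant function."""
    return 1.0 if j == 1 else math.sqrt(2.0) * math.cos((j - 1) * math.pi * x)


class SieveSGDSketch:
    """Illustrative online sieve estimator (hypothetical, not the paper's code)."""

    def __init__(self, s=2.0, omega=0.51, step_scale=1.0, basis_scale=1.0):
        self.s = s                      # assumed smoothness of the target function
        self.omega = omega              # assumed decay exponent for per-basis weights
        self.step_scale = step_scale    # assumed learning-rate constant
        self.basis_scale = basis_scale  # assumed constant for the sieve size
        self.n = 0                      # number of observations processed
        self.coef = []                  # coefficients of the current SGD iterate
        self.avg_coef = []              # coefficients of the averaged iterate

    @staticmethod
    def _evaluate(coef, x):
        return sum(c * cosine_basis(j + 1, x) for j, c in enumerate(coef))

    def predict(self, x):
        """Prediction from the averaged iterate."""
        return self._evaluate(self.avg_coef, x)

    def update(self, x, y):
        """Process one observation (x, y) and discard it afterwards."""
        self.n += 1
        exponent = 1.0 / (2.0 * self.s + 1.0)
        rate = self.step_scale * self.n ** (-exponent)
        n_basis = max(1, math.floor(self.basis_scale * self.n ** exponent))

        # Grow the sieve: new coefficients start at zero in both iterates.
        while len(self.coef) < n_basis:
            self.coef.append(0.0)
            self.avg_coef.append(0.0)

        # One gradient step on the squared error, with down-weighted
        # higher-order basis functions. The residual is computed first,
        # using the coefficients from before this step.
        residual = y - self._evaluate(self.coef, x)
        for j in range(n_basis):
            weight = (j + 1) ** (-2.0 * self.omega)
            self.coef[j] += rate * residual * weight * cosine_basis(j + 1, x)

        # Running average of the iterates seen so far.
        for j in range(len(self.coef)):
            self.avg_coef[j] += (self.coef[j] - self.avg_coef[j]) / self.n
```

In this sketch each observation is used once and then dropped, so the only state kept is the two coefficient lists. Under the assumed schedule their length grows like a small power of the sample size, which is the kind of slow memory growth the abstract refers to.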
Funding Statement
N. Simon and T. Zhang were both supported by NIH grant R01HL137808.
Acknowledgments
The authors would like to thank the anonymous referees, an associate editor and the editor for their constructive comments that improved an early version of this paper.
Citation
Tianyu Zhang, Noah Simon. "A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids." Ann. Statist. 50 (5) 2848-2871, October 2022. https://doi.org/10.1214/22-AOS2212