Open Access
Density-sensitive semisupervised inference
Martin Azizyan, Aarti Singh, Larry Wasserman
Ann. Statist. 41(2): 751-771 (April 2013). DOI: 10.1214/13-AOS1092

Abstract

Semisupervised methods use labeled data $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ together with unlabeled data $X_{n+1},\ldots,X_{N}$ to make predictions. These methods rely on assumptions that link the marginal distribution $P_{X}$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high-density regions of $P_{X}$. Many of these methods are ad hoc: they have been shown to work in specific examples but lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_{X}$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.

Citation

Martin Azizyan, Aarti Singh, Larry Wasserman. "Density-sensitive semisupervised inference." Ann. Statist. 41 (2) 751-771, April 2013. https://doi.org/10.1214/13-AOS1092

Information

Published: April 2013
First available in Project Euclid: 8 May 2013

zbMATH: 1267.62057
MathSciNet: MR3099120
Digital Object Identifier: 10.1214/13-AOS1092

Subjects:
Primary: 62G15
Secondary: 62G07

Keywords: efficiency, kernel density, nonparametric inference, semisupervised

Rights: Copyright © 2013 Institute of Mathematical Statistics
