Density-sensitive semisupervised inference

Martin Azizyan; Aarti Singh; Larry Wasserman

doi:10.1214/13-AOS1092

Ann. Statist. 41(2): 751-771 (April 2013). DOI: 10.1214/13-AOS1092

Abstract

Semisupervised methods are techniques for using labeled data $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ together with unlabeled data $X_{n+1},\ldots,X_{N}$ to make predictions. These methods invoke assumptions that link the marginal distribution $P_{X}$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high-density regions of $P_{X}$. Many of these methods are ad hoc: they have been shown to work in specific examples but lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_{X}$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
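The abstract does not define the density-sensitive metric, so the following is only an illustrative sketch of how a metric could depend on $P_{X}$ and on $\alpha$. It assumes, hypothetically, that the cost of moving between two points is their Euclidean distance divided by an estimated density raised to the power $\alpha$, so that paths through high-density regions become short. The function names `kde` and `density_sensitive_distances`, the Gaussian kernel, the bandwidth, the midpoint rule for edge costs, and the complete-graph shortest-path approximation are all assumptions made for this example, not the construction analyzed in the paper.

```python
import heapq
import math


def kde(points, x, bandwidth):
    """Gaussian kernel density estimate at x from 1-D sample points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def density_sensitive_distances(points, source, alpha, bandwidth=0.3):
    """Shortest-path distances from points[source] on the complete graph.

    Assumed edge cost (illustrative only):
        |x_i - x_j| / p_hat(midpoint)^alpha
    With alpha = 0 the density is ignored and the result is Euclidean.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:  # Dijkstra's algorithm
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry
        for j in range(n):
            if j == i:
                continue
            mid = 0.5 * (points[i] + points[j])
            # Floor the density estimate to avoid division by zero.
            density = max(kde(points, mid, bandwidth), 1e-12)
            cost = abs(points[i] - points[j]) / density ** alpha
            if d + cost < dist[j]:
                dist[j] = d + cost
                heapq.heappush(heap, (dist[j], j))
    return dist


# Two well-separated clusters of unlabeled points on the real line.
points = [0.0, 0.1, 0.2, 0.3, 2.0, 2.1, 2.2, 2.3]

euclidean = density_sensitive_distances(points, source=0, alpha=0.0)
sensitive = density_sensitive_distances(points, source=0, alpha=1.0)

# Ratio of a cross-cluster distance to a within-cluster distance.
print("alpha = 0:", euclidean[4] / euclidean[3])
print("alpha = 1:", sensitive[4] / sensitive[3])
```

Under these assumptions, raising $\alpha$ stretches distances across the low-density gap relative to distances inside a cluster, which is one way to encode the idea that $f$ varies little within high-density regions. Setting $\alpha = 0$ turns the density dependence off entirely.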

Citation

Martin Azizyan, Aarti Singh, Larry Wasserman. "Density-sensitive semisupervised inference." Ann. Statist. 41(2): 751-771, April 2013. https://doi.org/10.1214/13-AOS1092