Estimating the joint distribution of independent categorical variables via model selection

C. Durot; E. Lebarbier; A.-S. Tocquet

doi:10.3150/08-BEJ155

Bernoulli 15(2): 475-507 (May 2009). DOI: 10.3150/08-BEJ155

Abstract

Assume one observes independent categorical variables or, equivalently, one observes the corresponding multinomial variables. Estimating the distribution of the observed sequence amounts to estimating the expectation of the multinomial sequence. A new estimator for this mean is proposed that is nonparametric, non-asymptotic and implementable even for large sequences. It is a penalized least-squares estimator based on wavelets, with a penalization term inspired by papers of Birgé and Massart. The estimator is proved to satisfy an oracle inequality and to be adaptive in the minimax sense over a class of Besov bodies. The method is embedded in a general framework which also allows us to recover an existing method for segmentation. Beyond theoretical results, a simulation study is reported and an application on real data is provided.
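To make the idea of a penalized least-squares estimator in a wavelet basis concrete, here is a minimal sketch. It is not the authors' estimator. It assumes a Haar basis, a sequence length that is a power of two, joint selection of coefficient vectors across categories, and an illustrative penalty of the form c·D·(1 + log(n/D)) in the spirit of Birgé and Massart. The function names and the constant `c` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(rows):
    """Orthonormal Haar transform along the sequence index.

    rows: list of n vectors, n a power of two.
    Returns (approx, details), with details ordered finest level first.
    """
    approx = [[float(v) for v in r] for r in rows]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([[(a - b) / SQRT2 for a, b in zip(e, o)] for e, o in pairs])
        approx = [[(a + b) / SQRT2 for a, b in zip(e, o)] for e, o in pairs]
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward, going from the coarsest level to the finest."""
    for level in reversed(details):
        out = []
        for a, d in zip(approx, level):
            out.append([(x + y) / SQRT2 for x, y in zip(a, d)])
            out.append([(x - y) / SQRT2 for x, y in zip(a, d)])
        approx = out
    return approx

def penalized_haar_estimate(labels, n_categories, c=2.0):
    """Estimate the mean of the one-hot (multinomial) sequence.

    Assumed penalty (illustrative only): pen(D) = c * D * (1 + log(n / D)),
    where D is the number of detail coefficient vectors kept.
    """
    n = len(labels)
    if n < 2 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two length")
    # Each categorical observation becomes a multinomial (one-hot) vector.
    one_hot = [[1.0 if y == j else 0.0 for j in range(n_categories)] for y in labels]
    approx, details = haar_forward(one_hot)
    # In an orthonormal basis, the best model of dimension D keeps the D
    # coefficient vectors with the largest squared norm.
    scored = sorted(
        ((sum(v * v for v in vec), lev, pos)
         for lev, level in enumerate(details)
         for pos, vec in enumerate(level)),
        reverse=True,
    )
    best_dim, best_crit, gained = 0, 0.0, 0.0
    for dim, (energy, _, _) in enumerate(scored, start=1):
        gained += energy
        crit = -gained + c * dim * (1.0 + math.log(n / dim))
        if crit < best_crit:
            best_dim, best_crit = dim, crit
    keep = {(lev, pos) for _, lev, pos in scored[:best_dim]}
    pruned = [
        [vec if (lev, pos) in keep else [0.0] * n_categories
         for pos, vec in enumerate(level)]
        for lev, level in enumerate(details)
    ]
    return haar_inverse(approx, pruned), best_dim
```

Calling `penalized_haar_estimate(labels, r)` returns one estimated probability vector per position together with the number of detail coefficients kept. Because coefficient vectors are kept or dropped jointly across categories, each estimated vector sums to one, although this sketch does not force individual entries to stay in [0, 1].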

Citation

C. Durot, E. Lebarbier, A.-S. Tocquet. "Estimating the joint distribution of independent categorical variables via model selection." Bernoulli 15 (2) 475-507, May 2009. https://doi.org/10.3150/08-BEJ155

Information

Published: May 2009

First available in Project Euclid: 4 May 2009

zbMATH: 1200.62024

MathSciNet: MR2543871

Digital Object Identifier: 10.3150/08-BEJ155

Keywords: adaptive estimator, categorical variable, least-squares estimator, model selection, multinomial variable, penalized minimum contrast estimator, wavelet

JOURNAL ARTICLE
33 PAGES