Electronic Journal of Statistics Articles (Project Euclid)
http://projecteuclid.org/euclid.ejs
The latest articles from Electronic Journal of Statistics on Project Euclid, a site for mathematics and statistics resources.
Language: en-us
Copyright 2010 Cornell University Library
Euclid-L@cornell.edu (Project Euclid Team)
Thu, 05 Aug 2010 15:41 EDT
Fri, 03 Jun 2011 09:20 EDT
http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif
Project Euclid
http://projecteuclid.org/
The bias and skewness of M-estimators in regression
http://projecteuclid.org/euclid.ejs/1262876992
<strong>Christopher Withers</strong>, <strong>Saralees Nadarajah</strong><p><strong>Source: </strong>Electron. J. Statist., Volume 4, 1--14.</p><p><strong>Abstract:</strong><br/>
We consider M-estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added, proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables.
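As a quick numerical illustration of the skewness effect (an illustrative simulation of our own, not the paper's formulas): with skewed residuals and a skewed regressor, the least-squares slope estimate itself has a skewed sampling distribution even though it remains unbiased.

```python
import numpy as np

# Illustrative simulation: least squares with skewed (centred exponential)
# residuals and a skewed 'independent' variable. The slope estimate is
# unbiased, but its sampling distribution is skewed.
rng = np.random.default_rng(0)
n, reps = 50, 2000
x = rng.exponential(size=n)                 # skewed regressor, fixed across reps
slopes = np.empty(reps)
for r in range(reps):
    eps = rng.exponential(size=n) - 1.0     # centred, skewed residuals
    y = 1.0 + 2.0 * x + eps
    slopes[r] = np.polyfit(x, y, 1)[0]      # least-squares slope

m, s = slopes.mean(), slopes.std()
skew = np.mean((slopes - m) ** 3) / s ** 3
print(f"mean slope {m:.3f} (truth 2), sample skewness {skew:.3f}")
```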
</p>
Thu, 05 Aug 2010 15:41 EDT

Adaptive estimation in the supremum norm for semiparametric mixtures of regressions
https://projecteuclid.org/euclid.ejs/1587693633
<strong>Heiko Werner</strong>, <strong>Hajo Holzmann</strong>, <strong>Pierre Vandekerkhove</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1816--1871.</p><p><strong>Abstract:</strong><br/>
We investigate a flexible two-component semiparametric mixture of regressions model, in which one of the conditional component distributions of the response given the covariate is unknown but assumed symmetric about a location parameter, while the other is specified up to a scale parameter. The location and scale parameters together with the proportion are allowed to depend nonparametrically on covariates. After settling identifiability, we provide local M-estimators for these parameters which converge in the sup-norm at the optimal rates over Hölder-smoothness classes. We also introduce an adaptive version of the estimators based on the Lepski method. Sup-norm bounds show that the local M-estimators properly estimate the functions globally, and are the first step in the construction of useful inferential tools such as confidence bands. In our analysis we develop general results about rates of convergence in the sup-norm as well as adaptive estimation of local M-estimators which might be of independent interest, and which can be applied in various other settings. We investigate the finite-sample behaviour of our method in a simulation study, and give an illustration on a real data set from bioinformatics.
</p>
Thu, 23 Apr 2020 22:01 EDT

Kaplan-Meier V- and U-statistics
https://projecteuclid.org/euclid.ejs/1587693634
<strong>Tamara Fernández</strong>, <strong>Nicolás Rivera</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1872--1916.</p><p><strong>Abstract:</strong><br/>
In this paper, we study Kaplan-Meier V- and U-statistics respectively defined as $\theta (\widehat{F}_{n})=\sum _{i,j}K(X_{[i:n]},X_{[j:n]})W_{i}W_{j}$ and $\theta _{U}(\widehat{F}_{n})=\sum _{i\neq j}K(X_{[i:n]},X_{[j:n]})W_{i}W_{j}/\sum _{i\neq j}W_{i}W_{j}$, where $\widehat{F}_{n}$ is the Kaplan-Meier estimator, $\{W_{1},\ldots ,W_{n}\}$ are the Kaplan-Meier weights and $K:(0,\infty )^{2}\to \mathbb{R}$ is a symmetric kernel. As in the canonical setting of uncensored data, we differentiate between two asymptotic behaviours for $\theta (\widehat{F}_{n})$ and $\theta _{U}(\widehat{F}_{n})$. Additionally, we derive an asymptotic canonical V-statistic representation of the Kaplan-Meier V- and U-statistics. By using this representation we study properties of the asymptotic distribution. Applications to hypothesis testing are given.
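The quantities above are short to sketch in code: the helper below computes the Kaplan-Meier jump weights $W_{i}$ (assuming no tied times, with the usual convention that censored observations get zero weight) and plugs them into the V-statistic $\theta (\widehat{F}_{n})$.

```python
import numpy as np

def km_weights(times, events):
    """Jump sizes W_i of the Kaplan-Meier estimator (assumes no tied times)."""
    order = np.argsort(times)
    d = np.asarray(events, float)[order]    # 1 = observed event, 0 = censored
    n = len(times)
    i = np.arange(n)
    # S(t_{i-1}) = prod_{j<i} ((n-j-1)/(n-j))^{d_j};  W_i = d_i/(n-i) * S(t_{i-1})
    factors = ((n - i - 1) / (n - i)) ** d
    prefix = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    return np.sort(times), d / (n - i) * prefix

def km_v_statistic(times, events, kernel):
    """theta(F_hat) = sum_{i,j} K(X_[i:n], X_[j:n]) W_i W_j."""
    x, w = km_weights(times, events)
    return w @ kernel(x[:, None], x[None, :]) @ w

times = np.array([2.0, 3.0, 1.0, 5.0, 4.0])
events = np.array([1, 1, 1, 0, 1])          # the largest time is censored
theta = km_v_statistic(times, events, np.minimum)   # covariance-type kernel
print(theta)
```

Without censoring the weights reduce to the uniform $1/n$, recovering the canonical V-statistic.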
</p>
Thu, 23 Apr 2020 22:01 EDT

Estimation of linear projections of non-sparse coefficients in high-dimensional regression
https://projecteuclid.org/euclid.ejs/1578366077
<strong>David Azriel</strong>, <strong>Armin Schwartzman</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 174--206.</p><p><strong>Abstract:</strong><br/>
In this work we study estimation of signals when the number of parameters is much larger than the number of observations. A large body of literature assumes a sparse structure for these kinds of problems, where most of the parameters are zero or close to zero. When this assumption does not hold, one can focus on low-dimensional functions of the parameter vector. In this work we study one-dimensional linear projections. Specifically, in the context of high-dimensional linear regression, the parameter of interest is ${\boldsymbol{\beta}}$ and we study estimation of $\mathbf{a}^{T}{\boldsymbol{\beta}}$. We show that $\mathbf{a}^{T}\hat{\boldsymbol{\beta}}$, where $\hat{\boldsymbol{\beta}}$ is the least squares estimator (using the pseudo-inverse when $p>n$), is minimax and admissible. Thus, for linear projections no regularization or shrinkage is needed. This estimator is easy to analyze and confidence intervals can be constructed. We study a high-dimensional dataset from brain imaging where it is shown that the signal is weak, non-sparse and significantly different from zero.
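A minimal sketch of the projection estimator: when $p>n$ the pseudo-inverse gives the minimum-norm least-squares solution, and $\mathbf{a}^{T}\hat{\boldsymbol{\beta}}$ is computed directly from it. The direction $\mathbf{a}$, the dimensions, and the dense signal here are arbitrary illustrative choices.

```python
import numpy as np

# Projection estimator a^T beta_hat with beta_hat the minimum-norm
# (pseudo-inverse) least-squares estimator in the p > n regime.
rng = np.random.default_rng(1)
n, p = 30, 100
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)      # dense, non-sparse signal
y = X @ beta + 0.1 * rng.standard_normal(n)

beta_hat = np.linalg.pinv(X) @ y                # minimum-norm LS solution
a = np.ones(p)                                  # projection direction of interest
print(f"a^T beta_hat = {a @ beta_hat:.3f}, truth = {a @ beta:.3f}")
```

Since $X$ has full row rank almost surely, this $\hat{\boldsymbol{\beta}}$ interpolates the data exactly; no shrinkage is applied, in line with the abstract's message.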
</p>
Mon, 27 Apr 2020 22:02 EDT

Perspective maximum likelihood-type estimation via proximal decomposition
https://projecteuclid.org/euclid.ejs/1578452535
<strong>Patrick L. Combettes</strong>, <strong>Christian L. Müller</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 207--238.</p><p><strong>Abstract:</strong><br/>
We introduce a flexible optimization model for maximum likelihood-type estimation (M-estimation) that encompasses and generalizes a large class of existing statistical models, including Huber’s concomitant M-estimator, Owen’s Huber/Berhu concomitant estimator, the scaled lasso, support vector machine regression, and penalized estimation with structured sparsity. The model, termed perspective M-estimation, leverages the observation that convex M-estimators with concomitant scale as well as various regularizers are instances of perspective functions, a construction that extends a convex function to a jointly convex one in terms of an additional scale variable. These nonsmooth functions are shown to be amenable to proximal analysis, which leads to principled and provably convergent optimization algorithms via proximal splitting. We derive novel proximity operators for several perspective functions of interest via a geometrical approach based on duality. We then devise a new proximal splitting algorithm to solve the proposed M-estimation problem and establish the convergence of both the scale and regression iterates it produces to a solution. Numerical experiments on synthetic and real-world data illustrate the broad applicability of the proposed framework.
</p>
Mon, 27 Apr 2020 22:02 EDT

Bayesian variance estimation in the Gaussian sequence model with partial information on the means
https://projecteuclid.org/euclid.ejs/1578452536
<strong>Gianluca Finocchio</strong>, <strong>Johannes Schmidt-Hieber</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 239--271.</p><p><strong>Abstract:</strong><br/>
Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $\sigma^{2}$ is biased and inconsistent. This raises the question whether the posterior is able to correct the MLE in this case. By developing a new proving strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators outperforming the correctly calibrated MLE in a numerical simulation study.
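The inconsistency of the MLE is easy to reproduce numerically: profiling out the unknown means fits them exactly, so the likelihood only sees residuals from the known coordinates, and the MLE of $\sigma^{2}$ is shrunk by roughly the known fraction. This is a sketch under an i.i.d. sampling setup of our own choosing.

```python
import numpy as np

# Gaussian sequence model X_i = mu_i + sigma*eps_i with a fraction lam of
# the means known. Profiling out the unknown means (mu_hat_i = X_i) makes
# the MLE of sigma^2 biased: E[mle] ~ lam * sigma^2.
rng = np.random.default_rng(2)
n, lam, sigma = 10_000, 0.3, 2.0
known = rng.random(n) < lam
mu = rng.standard_normal(n)
X = mu + sigma * rng.standard_normal(n)

resid_sq = np.where(known, (X - mu) ** 2, 0.0)  # unknown means fitted exactly
mle = resid_sq.sum() / n                        # biased, inconsistent
oracle = ((X - mu) ** 2)[known].mean()          # consistent on known coordinates
print(f"MLE {mle:.3f} vs truth {sigma**2:.1f}; known-coordinate estimate {oracle:.3f}")
```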
</p>
Mon, 27 Apr 2020 22:02 EDT

Assessing prediction error at interpolation and extrapolation points
https://projecteuclid.org/euclid.ejs/1578452537
<strong>Assaf Rabinowicz</strong>, <strong>Saharon Rosset</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 272--301.</p><p><strong>Abstract:</strong><br/>
Common model selection criteria, such as $AIC$ and its variants, are based on in-sample prediction error estimators. However, in many applications involving prediction at interpolation and extrapolation points, in-sample error does not represent the relevant prediction error. In this paper new prediction error estimators, $tAI$ and $Loss(w_{t})$, are introduced. These estimators generalize previous error estimators, but are also applicable for assessing prediction error in cases involving interpolation and extrapolation. Based on these prediction error estimators, two model selection criteria in the same spirit as $AIC$ and Mallows' $C_{p}$ are suggested. The advantages of the suggested methods are demonstrated in a simulation study and a real data analysis involving interpolation and extrapolation in linear mixed models and Gaussian process regression.
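For context, a minimal sketch of the classical in-sample criterion that $tAI$ generalizes: $AIC$ for Gaussian linear regression, $AIC = n\log(\mathrm{RSS}/n) + 2k$ up to constants. The data-generating model is a made-up example, not from the paper.

```python
import numpy as np

# Standard AIC for Gaussian linear regression: n*log(RSS/n) + 2k, where k
# counts the regression coefficients plus the noise variance.
def aic_linear(X, y):
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * (k + 1)

rng = np.random.default_rng(9)
x = rng.standard_normal(300)
y = 1.0 + x + x ** 2 + rng.standard_normal(300)     # true model is quadratic
X_lin = np.column_stack([np.ones(300), x])
X_quad = np.column_stack([np.ones(300), x, x ** 2])
print(aic_linear(X_lin, y), aic_linear(X_quad, y))  # smaller AIC is preferred
```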
</p>
Mon, 27 Apr 2020 22:02 EDT

Asymptotics and optimal bandwidth for nonparametric estimation of density level sets
https://projecteuclid.org/euclid.ejs/1578474214
<strong>Wanli Qiao</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 302--344.</p><p><strong>Abstract:</strong><br/>
Bandwidth selection is crucial in the kernel estimation of density level sets. A risk based on the symmetric difference between the estimated and true level sets is usually used to measure their proximity. In this paper we provide an asymptotic $L^{p}$ approximation to this risk, where $p$ is characterized by the weight function in the risk. In particular the excess risk corresponds to an $L^{2}$ type of risk, and is adopted to derive an optimal bandwidth for nonparametric level set estimation of $d$-dimensional density functions ($d\geq 1$). A direct plug-in bandwidth selector is developed for kernel density level set estimation and its efficacy is verified in numerical studies.
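A plug-in level-set estimate is straightforward to sketch in one dimension: estimate the density with a Gaussian kernel and take $\{x: \hat{f}(x)\geq c\}$. The bandwidth below is a fixed arbitrary choice, not the risk-optimal bandwidth derived in the paper.

```python
import numpy as np

# Plug-in kernel level-set estimation in 1d: threshold a Gaussian KDE at c.
rng = np.random.default_rng(3)
data = rng.standard_normal(500)
h, c = 0.3, 0.15                       # fixed bandwidth and level (illustrative)

grid = np.linspace(-4, 4, 801)
# Gaussian KDE evaluated on the grid
kde = np.exp(-0.5 * ((grid[:, None] - data[None, :]) / h) ** 2)
f_hat = kde.sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

level_set = grid[f_hat >= c]           # estimated upper level set
print(f"estimated level set ~ [{level_set.min():.2f}, {level_set.max():.2f}]")
```

For the standard normal at $c=0.15$ the true level set is roughly $[-1.40, 1.40]$; the symmetric-difference risk in the paper measures exactly the mismatch between these two sets.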
</p>
Mon, 27 Apr 2020 22:02 EDT

Sparse equisigned PCA: Algorithms and performance bounds in the noisy rank-1 setting
https://projecteuclid.org/euclid.ejs/1579078827
<strong>Arvind Prasadan</strong>, <strong>Raj Rao Nadakuditi</strong>, <strong>Debashis Paul</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 345--385.</p><p><strong>Abstract:</strong><br/>
Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a sum-type decision statistic outperforms schemes that utilize the $\ell _{1}$ and $\ell _{2}$ norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance on video data where the interest is in identifying objects.
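The intuition behind a sum-type coordinate selection statistic can be sketched as follows: for a rank-one signal $\theta\,u v^{T}$ with sparse $u$ and equisigned $v$, rows carrying signal have large absolute row sums. This is an illustrative setup of our own (with eigen-SNR taken well above the critical level), not the paper's six algorithms.

```python
import numpy as np

# Sum-type coordinate selection for a sparse x equisigned rank-one signal.
rng = np.random.default_rng(7)
n, m, k = 200, 50, 10
u = np.zeros(n); u[:k] = 1 / np.sqrt(k)                     # sparse left vector
v = np.abs(rng.standard_normal(m)); v /= np.linalg.norm(v)  # equisigned right vector
Y = 40.0 * np.outer(u, v) + rng.standard_normal((n, m))     # strong eigen-SNR

row_stat = np.abs(Y.sum(axis=1)) / np.sqrt(m)   # sum-type decision statistic
selected = np.sort(np.argsort(row_stat)[-k:])   # top-k coordinates
print(selected)
```

Because the entries of $v$ do not cancel, the signal accumulates coherently in the row sums, which is exactly why equisignedness helps this statistic.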
</p>
Mon, 27 Apr 2020 22:02 EDT

Univariate mean change point detection: Penalization, CUSUM and optimality
https://projecteuclid.org/euclid.ejs/1588039326
<strong>Daren Wang</strong>, <strong>Yi Yu</strong>, <strong>Alessandro Rinaldo</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1917--1961.</p><p><strong>Abstract:</strong><br/>
The problem of univariate mean change point detection and localization based on a sequence of $n$ independent observations with piecewise constant means has been intensively studied for more than half a century, and serves as a blueprint for change point problems in more complex settings. We provide a complete characterization of this classical problem in a general framework in which the upper bound $\sigma ^{2}$ on the noise variance, the minimal spacing $\Delta $ between two consecutive change points and the minimal magnitude $\kappa $ of the changes are allowed to vary with $n$. We first show that consistent localization of the change points is impossible in the low signal-to-noise ratio regime $\frac{\kappa \sqrt{\Delta }}{\sigma }\preceq \sqrt{\log (n)}$. In contrast, when $\frac{\kappa \sqrt{\Delta }}{\sigma }$ diverges with $n$ at the rate of at least $\sqrt{\log (n)}$, we demonstrate that two computationally-efficient change point estimators, one based on the solution to an $\ell _{0}$-penalized least squares problem and the other on the popular wild binary segmentation algorithm, are both consistent and achieve a localization rate of the order $\frac{\sigma ^{2}}{\kappa ^{2}}\log (n)$. We further show that this rate is minimax optimal, up to a $\log (n)$ term.
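A minimal sketch of the CUSUM estimator for a single mean change: the estimated change point maximizes the standardized difference between left and right sample means over candidate splits. The simulated signal-to-noise ratio here is well inside the consistent regime.

```python
import numpy as np

def cusum_changepoint(x):
    """Split point maximizing |CUSUM(t)| for a single mean change."""
    n = len(x)
    t = np.arange(1, n)                    # candidate splits 1..n-1
    left = np.cumsum(x)[:-1]               # sum of x[:t] for each t
    total = x.sum()
    # CUSUM(t) = sqrt(t(n-t)/n) * (mean(x[:t]) - mean(x[t:]))
    stat = np.sqrt(t * (n - t) / n) * (left / t - (total - left) / (n - t))
    return int(t[np.argmax(np.abs(stat))])

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 200)])
print(cusum_changepoint(x))                # close to the true change point 200
```

Here $\kappa \sqrt{\Delta }/\sigma = 2\sqrt{200} \gg \sqrt{\log (400)}$, so the localization error is of order $\log(n)/\kappa^{2}$, i.e. a handful of points.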
</p>
Mon, 27 Apr 2020 22:02 EDT

Asymptotic properties of the maximum likelihood and cross validation estimators for transformed Gaussian processes
https://projecteuclid.org/euclid.ejs/1588039327
<strong>François Bachoc</strong>, <strong>José Betancourt</strong>, <strong>Reinhard Furrer</strong>, <strong>Thierry Klein</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1962--2008.</p><p><strong>Abstract:</strong><br/>
The asymptotic analysis of covariance parameter estimation of Gaussian processes has been subject to intensive investigation. However, this asymptotic analysis is very scarce for non-Gaussian processes. In this paper, we study a class of non-Gaussian processes obtained by regular non-linear transformations of Gaussian processes. We provide the increasing-domain asymptotic properties of the (Gaussian) maximum likelihood and cross validation estimators of the covariance parameters of a non-Gaussian process of this class. We show that these estimators are consistent and asymptotically normal, although they are defined as if the process was Gaussian. They do not need to model or estimate the non-linear transformation. Our results can thus be interpreted as a robustness of (Gaussian) maximum likelihood and cross validation towards non-Gaussianity. Our proofs rely on two technical results that are of independent interest for the increasing-domain asymptotic literature of spatial processes. First, we show that, under mild assumptions, coefficients of inverses of large covariance matrices decay at an inverse polynomial rate as a function of the corresponding observation location distances. Second, we provide a general central limit theorem for quadratic forms obtained from transformed Gaussian processes. Finally, our asymptotic results are illustrated by numerical simulations.
</p>
Mon, 27 Apr 2020 22:02 EDT

Consistent model selection criteria and goodness-of-fit test for common time series models
https://projecteuclid.org/euclid.ejs/1588039328
<strong>Jean-Marc Bardet</strong>, <strong>Kare Kamila</strong>, <strong>William Kengne</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2009--2052.</p><p><strong>Abstract:</strong><br/>
This paper studies the model selection problem in a large class of causal time series models, which includes both the ARMA or AR($\infty $) processes, as well as the GARCH or ARCH($\infty $), APARCH, ARMA-GARCH and many other processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on a Portmanteau test. Monte-Carlo experiments and numerical applications on illustrative examples are performed to highlight the obtained asymptotic results. Moreover, using a data-driven choice of the penalty, they show the practical efficiency of this new model selection procedure and Portmanteau test.
</p>
Mon, 27 Apr 2020 22:02 EDT

Parseval inequalities and lower bounds for variance-based sensitivity indices
https://projecteuclid.org/euclid.ejs/1579662085
<strong>Olivier Roustant</strong>, <strong>Fabrice Gamboa</strong>, <strong>Bertrand Iooss</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 386--412.</p><p><strong>Abstract:</strong><br/>
The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol’ sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert basis. In this frame, we revisit the computation of the Sobol’ indices with Parseval equalities and give general lower bounds for these indices obtained by truncation. The case of the eigenfunctions system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put in action both on toy and real life models demonstrating their accuracy.
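For orientation, a Monte Carlo (pick-freeze) estimate of a first-order Sobol' index on a toy function, rather than the chaos-expansion bounds of the paper; the test function and sample size are our own illustrative choices.

```python
import numpy as np

# Pick-freeze estimate of the first-order Sobol' index S_1 for
# f(X1, X2) = X1 + X2^2 with independent standard normal inputs.
# Theory: S_1 = Var(X1) / (Var(X1) + Var(X2^2)) = 1/3.
rng = np.random.default_rng(8)
N = 100_000
x1, x2, x2p = (rng.standard_normal(N) for _ in range(3))
f = lambda a, b: a + b ** 2
y, y1 = f(x1, x2), f(x1, x2p)          # X1 'frozen', X2 resampled
s1 = np.cov(y, y1)[0, 1] / y.var()     # Cov(Y, Y') estimates Var(E[f|X1])
print(f"S_1 estimate: {s1:.3f}")
```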
</p>
Tue, 05 May 2020 22:00 EDT

Recovery of simultaneous low rank and two-way sparse coefficient matrices, a nonconvex approach
https://projecteuclid.org/euclid.ejs/1579662086
<strong>Ming Yu</strong>, <strong>Varun Gupta</strong>, <strong>Mladen Kolar</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 413--457.</p><p><strong>Abstract:</strong><br/>
We study the problem of recovery of matrices that are simultaneously low rank and row and/or column sparse. Such matrices appear in recent applications in cognitive neuroscience, imaging, computer vision, macroeconomics, and genetics. We propose a GDT (Gradient Descent with hard Thresholding) algorithm to efficiently recover matrices with such structure, by minimizing a bi-convex function over a nonconvex set of constraints. We show linear convergence of the iterates obtained by GDT to a region within statistical error of an optimal solution. As an application of our method, we consider multi-task learning problems and show that the statistical error rate obtained by GDT is near optimal compared to minimax rate. Experiments demonstrate competitive performance and much faster running speed compared to existing methods, on both simulations and real data sets.
</p>
Tue, 05 May 2020 22:00 EDT

On polyhedral estimation of signals via indirect observations
https://projecteuclid.org/euclid.ejs/1579834964
<strong>Anatoli Juditsky</strong>, <strong>Arkadi Nemirovski</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 458--502.</p><p><strong>Abstract:</strong><br/>
We consider the problem of recovering a linear image of an unknown signal belonging to a given convex compact signal set from a noisy observation of another linear image of the signal. We develop a simple, generic, efficiently computable “polyhedral” estimate that is nonlinear in the observations, along with computation-friendly techniques for its design and risk analysis. We demonstrate that under favorable circumstances the resulting estimate is provably near-optimal in the minimax sense, the “favorable circumstances” being less restrictive than the weakest assumptions known so far to ensure near-optimality of estimates that are linear in the observations.
</p>
Tue, 05 May 2020 22:00 EDT

Gaussian field on the symmetric group: Prediction and learning
https://projecteuclid.org/euclid.ejs/1580202025
<strong>François Bachoc</strong>, <strong>Baptiste Broto</strong>, <strong>Fabrice Gamboa</strong>, <strong>Jean-Michel Loubes</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 503--546.</p><p><strong>Abstract:</strong><br/>
In the framework of the supervised learning of a real function defined on an abstract space $\mathcal{X}$, Gaussian processes are widely used. The Euclidean case for $\mathcal{X}$ is well known and has been widely studied. In this paper, we explore the less classical case where $\mathcal{X}$ is the noncommutative finite group of permutations (namely the so-called symmetric group $S_{N}$). We provide an application to Gaussian process based optimization of Latin Hypercube Designs. We also extend our results to the case of partial rankings.
</p>
Tue, 05 May 2020 22:00 EDT

Drift estimation for stochastic reaction-diffusion systems
https://projecteuclid.org/euclid.ejs/1580202030
<strong>Gregor Pasemann</strong>, <strong>Wilhelm Stannat</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 547--579.</p><p><strong>Abstract:</strong><br/>
A parameter estimation problem for a class of semilinear stochastic evolution equations is considered. Conditions for consistency and asymptotic normality are given in terms of growth and continuity properties of the nonlinear part. Emphasis is put on the case of stochastic reaction-diffusion systems. Robustness results for statistical inference under model uncertainty are provided.
</p>
Tue, 05 May 2020 22:00 EDT

On the Letac-Massam conjecture and existence of high dimensional Bayes estimators for graphical models
https://projecteuclid.org/euclid.ejs/1580202031
<strong>Emanuel Ben-David</strong>, <strong>Bala Rajaratnam</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 580--604.</p><p><strong>Abstract:</strong><br/>
The Wishart distribution defined on the open cone of positive-definite matrices plays a central role in multivariate analysis and multivariate distribution theory. Its domain of parameters is often referred to as the Gindikin set. In recent years, a variety of useful extensions of the Wishart distribution have been proposed in the literature for the purposes of studying Markov random fields and graphical models. In particular, generalizations of the Wishart distribution, referred to as Type I and Type II (graphical) Wishart distributions and introduced by Letac and Massam in the Annals of Statistics (2007), play important roles in both frequentist and Bayesian inference for Gaussian graphical models. These distributions have been especially useful in high-dimensional settings due to the flexibility offered by their multiple-shape parameters. A conjecture of Letac and Massam concerns the domain of the multiple-shape parameters of these distributions; it also has implications for the existence of Bayes estimators corresponding to these high-dimensional priors. The conjecture, first posed in the Annals of Statistics, has now been an open problem for about 10 years. In this paper, we give a necessary condition for the Letac and Massam conjecture to hold. More precisely, we prove that if the Letac and Massam conjecture holds on a decomposable graph, then no two separators of the graph can be nested within each other. For this, we analyze Type I and Type II Wishart distributions on appropriate Markov equivalent perfect DAG models and succeed in deriving the aforementioned necessary condition. This condition in particular identifies a class of counterexamples to the conjecture.
</p>
Tue, 05 May 2020 22:00 EDT

Generalised cepstral models for the spectrum of vector time series
https://projecteuclid.org/euclid.ejs/1580202032
<strong>Maddalena Cavicchioli</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 605--631.</p><p><strong>Abstract:</strong><br/>
The paper treats the modeling of stationary multivariate stochastic processes via a frequency domain model expressed in terms of cepstrum theory. The proposed model nests the vector exponential model of [20] as a special case, and extends the generalised cepstral model of [36] to the multivariate setting, answering a question raised by the latter authors in their paper. In parallel, we extend the notion of generalised autocovariance function of [35] to vector time series. Then we derive explicit matrix formulas connecting generalised cepstral and autocovariance matrices of the process, and prove the consistency and asymptotic properties of the Whittle likelihood estimators of the model parameters. Asymptotic theory for the special case of the vector exponential model is a significant addition to the paper of [20]. We also provide mathematical machinery, based on matrix differentiation, and computational methods to derive our results, which differ significantly from those employed in the univariate case. The utility of the proposed model is illustrated through Monte Carlo simulation from a bivariate process characterized by a high dynamic range, and an empirical application on time-varying minimum variance hedge ratios through the second moments of future and spot prices in the corn commodity market.
</p>
Tue, 05 May 2020 22:00 EDT

Statistical convergence of the EM algorithm on Gaussian mixture models
https://projecteuclid.org/euclid.ejs/1580202033
<strong>Ruofei Zhao</strong>, <strong>Yuanzhi Li</strong>, <strong>Yuekai Sun</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 632--660.</p><p><strong>Abstract:</strong><br/>
We study the convergence behavior of the Expectation Maximization (EM) algorithm on Gaussian mixture models with an arbitrary number of mixture components and mixing weights. We show that as long as the means of the components are separated by at least $\Omega (\sqrt{\min \{M,d\}})$, where $M$ is the number of components and $d$ is the dimension, the EM algorithm converges locally to the global optimum of the log-likelihood. Further, we show that the convergence rate is linear and characterize the size of the basin of attraction to the global optimum.
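The EM iteration in question is short to sketch for a well-separated two-component mixture in one dimension; as a simplification of the general setting we fix unit variances and equal weights and update only the means.

```python
import numpy as np

# EM for a two-component 1d Gaussian mixture: known unit variances, equal
# weights, means updated from a separated initialization.
rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])

mu = np.array([-1.0, 1.0])                  # separated initialization
for _ in range(50):
    # E-step: responsibilities under N(mu_k, 1), computed stably in log space
    log_p = -0.5 * (x[:, None] - mu[None, :]) ** 2
    r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted means
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)

print(mu)                                   # converges near the true means (-3, 3)
```

The separation here (6 standard deviations) comfortably exceeds the $\Omega (\sqrt{\min \{M,d\}})$ threshold in the abstract, so EM converges to the global optimum.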
</p>
Tue, 05 May 2020 22:00 EDT

Nonparametric confidence intervals for conditional quantiles with large-dimensional covariates
https://projecteuclid.org/euclid.ejs/1580266939
<strong>Laurent Gardes</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 661--701.</p><p><strong>Abstract:</strong><br/>
The first part of the paper is dedicated to the construction of a nonparametric confidence interval, with nominal coverage $\gamma$, for a conditional quantile whose level depends on the sample size. When this level tends to 0 or 1 as the sample size increases, the conditional quantile is said to be extreme and is located in the tail of the conditional distribution. The proposed confidence interval is constructed by approximating the distribution of the order statistics selected with a nearest neighbor approach by a Beta distribution. We show that its coverage probability converges to the preselected probability $\gamma $ and its accuracy is illustrated in a simulation study. When the dimension of the covariate increases, the coverage probability of the confidence interval can be very different from $\gamma $. This is a well-known consequence of data sparsity, especially in the tail of the distribution. In the second part, a dimension reduction procedure is proposed in order to select more appropriate nearest neighbors in the right tail of the distribution and in turn to obtain a better coverage probability for extreme conditional quantiles. This procedure is based on the Tail Conditional Independence assumption introduced in (Gardes, Extremes, 18(3), pp. 57–95, 2018).
</p>
Tue, 05 May 2020 22:00 EDT

The limiting behavior of isotonic and convex regression estimators when the model is misspecified
https://projecteuclid.org/euclid.ejs/1588730427
<strong>Eunji Lim</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2053--2097.</p><p><strong>Abstract:</strong><br/>
We study the asymptotic behavior of the least squares estimators when the model is possibly misspecified. We consider the setting where we wish to estimate an unknown function $f_{*}:(0,1)^{d}\rightarrow \mathbb{R}$ from observations $(X,Y),(X_{1},Y_{1}),\cdots ,(X_{n},Y_{n})$; our estimator $\hat{g}_{n}$ is the minimizer of $\sum _{i=1}^{n}(Y_{i}-g(X_{i}))^{2}/n$ over $g\in \mathcal{G}$ for some set of functions $\mathcal{G}$. We provide sufficient conditions on the metric entropy of $\mathcal{G}$, under which $\hat{g}_{n}$ converges to $g_{*}$ as $n\rightarrow \infty $, where $g_{*}$ is the minimizer of $\|g-f_{*}\|\triangleq \mathbb{E}(g(X)-f_{*}(X))^{2}$ over $g\in \mathcal{G}$. As corollaries of our theorem, we establish $\|\hat{g}_{n}-g_{*}\|\rightarrow 0$ as $n\rightarrow \infty $ when $\mathcal{G}$ is the set of monotone functions or the set of convex functions. We also make a connection between the convergence rate of $\|\hat{g}_{n}-g_{*}\|$ and the metric entropy of $\mathcal{G}$. As special cases of our finding, we compute the convergence rate of $\|\hat{g}_{n}-g_{*}\|^{2}$ when $\mathcal{G}$ is the set of bounded monotone functions or the set of bounded convex functions.
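When $\mathcal{G}$ is the set of monotone functions, the least-squares fit itself is computable by the classical pool-adjacent-violators algorithm (PAVA); a compact sketch with unit weights:

```python
import numpy as np

def pava(y):
    """Nondecreasing least-squares fit to y via pool-adjacent-violators."""
    vals, counts = [], []
    for v in y:
        vals.append(float(v)); counts.append(1)
        # pool adjacent blocks that violate monotonicity, replacing them
        # by their weighted average
        while len(vals) > 1 and vals[-2] > vals[-1]:
            c = counts.pop(); v2 = vals.pop()
            vals[-1] = (vals[-1] * counts[-1] + v2 * c) / (counts[-1] + c)
            counts[-1] += c
    return np.repeat(vals, counts)

y = np.array([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
print(pava(y))   # [1.   2.5  2.5  3.75 3.75 5.  ]
```

Under misspecification, the theorem above says this fit converges not to $f_{*}$ but to its monotone $L^{2}$-projection $g_{*}$.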
</p>
Tue, 05 May 2020 22:00 EDT

Detection of sparse positive dependence
https://projecteuclid.org/euclid.ejs/1580353285
<strong>Ery Arias-Castro</strong>, <strong>Rong Huang</strong>, <strong>Nicolas Verzelen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 702--730.</p><p><strong>Abstract:</strong><br/>
In a bivariate setting, we consider the problem of detecting a sparse contamination or mixture component, where the effect manifests itself as a positive dependence between the variables, which are otherwise independent in the main component. We first look at this problem in the context of a normal mixture model. In essence, the situation reduces to a univariate setting where the effect is a decrease in variance. In particular, a higher criticism test based on the pairwise differences is shown to achieve the detection boundary defined by the (oracle) likelihood ratio test. We then turn to a Gaussian copula model where the marginal distributions are unknown. Standard invariance considerations lead us to consider rank tests. In fact, a higher criticism test based on the pairwise rank differences achieves the detection boundary in the normal mixture model, although not in the very sparse regime. We do not know of any rank test that has any power in that regime.
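The higher criticism statistic referenced here has the standard Donoho–Jin form; a stdlib sketch, applied to generic p-values rather than to the pairwise differences or rank differences used in the paper:

```python
from math import sqrt

def higher_criticism(pvals, alpha0=0.5):
    """Donoho-Jin higher criticism statistic from a list of p-values.

    Compares the empirical fraction i/n of small p-values with its
    null expectation p_(i), standardized by the binomial sd.
    """
    ps = sorted(pvals)
    n = len(ps)
    hc = float("-inf")
    for i, p in enumerate(ps[: max(1, int(alpha0 * n))], start=1):
        if 0 < p < 1:
            hc = max(hc, sqrt(n) * (i / n - p) / sqrt(p * (1 - p)))
    return hc
```

Under the null (near-uniform p-values) the statistic stays small; a handful of very small p-values, as produced by a sparse contamination, drives it up sharply.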
</p>projecteuclid.org/euclid.ejs/1580353285_20200512220226Tue, 12 May 2020 22:02 EDTProfile likelihood biclusteringhttps://projecteuclid.org/euclid.ejs/1580461237<strong>Cheryl Flynn</strong>, <strong>Patrick Perry</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 731--768.</p><p><strong>Abstract:</strong><br/>
Biclustering, the process of simultaneously clustering the rows and columns of a data matrix, is a popular and effective tool for finding structure in a high-dimensional dataset. Many biclustering procedures appear to work well in practice, but most do not have associated consistency guarantees. To address this shortcoming, we propose a new biclustering procedure based on profile likelihood. The procedure applies to a broad range of data modalities, including binary, count, and continuous observations. We prove that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity, even if the functional form of the data distribution is misspecified. The procedure requires a combinatorial search, which can be expensive in practice. Rather than performing this search directly, we propose a new heuristic optimization procedure based on the Kernighan-Lin heuristic, which has nice computational properties and performs well in simulations. We demonstrate our procedure with applications to congressional voting records and microarray analysis.
</p>projecteuclid.org/euclid.ejs/1580461237_20200512220226Tue, 12 May 2020 22:02 EDTEstimation of a semiparametric transformation model: A novel approach based on least squares minimizationhttps://projecteuclid.org/euclid.ejs/1580871775<strong>Benjamin Colling</strong>, <strong>Ingrid Van Keilegom</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 769--800.</p><p><strong>Abstract:</strong><br/>
Consider the following semiparametric transformation model $\Lambda_{\theta }(Y)=m(X)+\varepsilon $, where $X$ is a $d$-dimensional covariate, $Y$ is a univariate response variable and $\varepsilon $ is an error term with zero mean and independent of $X$. We assume that $m$ is an unknown regression function and that $\{\Lambda _{\theta }:\theta \in\Theta \}$ is a parametric family of strictly increasing functions. Our goal is to develop two new estimators of the transformation parameter $\theta $. The main idea of these two estimators is to minimize, with respect to $\theta $, the $L_{2}$-distance between the transformation $\Lambda _{\theta }$ and one of its fully nonparametric estimators. We consider in particular the nonparametric estimator based on the least-absolute deviation loss constructed in Colling and Van Keilegom (2019). We establish the consistency and the asymptotic normality of the two proposed estimators of $\theta $. We also carry out a simulation study to illustrate and compare the performance of our new parametric estimators to that of the profile likelihood estimator constructed in Linton et al. (2008).
</p>projecteuclid.org/euclid.ejs/1580871775_20200512220226Tue, 12 May 2020 22:02 EDTThe bias of isotonic regressionhttps://projecteuclid.org/euclid.ejs/1580871776<strong>Ran Dai</strong>, <strong>Hyebin Song</strong>, <strong>Rina Foygel Barber</strong>, <strong>Garvesh Raskutti</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 801--834.</p><p><strong>Abstract:</strong><br/>
We study the bias of the isotonic regression estimator. While there is extensive work characterizing the mean squared error of the isotonic regression estimator, relatively little is known about the bias. In this paper, we provide a sharp characterization, proving that the bias scales as $O(n^{-\beta /3})$ up to log factors, where $1\leq \beta \leq 2$ is the exponent corresponding to Hölder smoothness of the underlying mean. Importantly, this result only requires a strictly monotone mean and that the noise distribution has subexponential tails, without relying on symmetric noise or other restrictive assumptions.
</p>projecteuclid.org/euclid.ejs/1580871776_20200512220226Tue, 12 May 2020 22:02 EDTModal clustering asymptotics with applications to bandwidth selectionhttps://projecteuclid.org/euclid.ejs/1581130993<strong>Alessandro Casa</strong>, <strong>José E. Chacón</strong>, <strong>Giovanna Menardi</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 835--856.</p><p><strong>Abstract:</strong><br/>
Density-based clustering relies on the idea of linking groups to some specific features of the probability distribution underlying the data. The reference to a true, yet unknown, population structure allows framing the clustering problem in a standard inferential setting, where the concept of ideal population clustering is defined as the partition induced by the true density function. The nonparametric formulation of this approach, known as modal clustering, draws a correspondence between the groups and the domains of attraction of the density modes. Operationally, a nonparametric density estimate is required and a proper selection of the amount of smoothing, governing the shape of the density and hence possibly the modal structure, is crucial to identify the final partition. In this work, we address the issue of density estimation for modal clustering from an asymptotic perspective. A natural and easy-to-interpret metric to measure the distance between density-based partitions is discussed, its asymptotic approximation explored, and employed to study the problem of bandwidth selection for nonparametric modal clustering.
</p>projecteuclid.org/euclid.ejs/1581130993_20200512220226Tue, 12 May 2020 22:02 EDTOn a Metropolis–Hastings importance sampling estimatorhttps://projecteuclid.org/euclid.ejs/1581325278<strong>Daniel Rudolf</strong>, <strong>Björn Sprungk</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 857--889.</p><p><strong>Abstract:</strong><br/>
A classical approach for approximating expectations of functions w.r.t. partially known distributions is to compute the average of function values along a trajectory of a Metropolis–Hastings (MH) Markov chain. A key part of the MH algorithm is a suitable acceptance/rejection step for a proposed state, which ensures the correct stationary distribution of the resulting Markov chain. However, the rejection of proposals causes highly correlated samples. In particular, when a state is rejected it is not taken into account any further. In contrast, we consider an MH importance sampling estimator which explicitly incorporates all proposed states generated by the MH algorithm. The estimator satisfies a strong law of large numbers as well as a central limit theorem, and we additionally provide an explicit mean squared error bound. Remarkably, the asymptotic variance of the MH importance sampling estimator does not involve any correlation term, in contrast to its classical counterpart. Moreover, although the analyzed estimator uses the same amount of information as the classical MH estimator, it can outperform the latter in scenarios of moderate dimensions, as indicated by numerical experiments.
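For concreteness, the classical MH trajectory average described in the first sentence can be sketched as follows (a random-walk proposal on a standard normal target; this illustrates the baseline estimator, not the importance sampling variant proposed in the paper, and all names are illustrative):

```python
import math
import random

def mh_chain(log_target, x0, n, step=1.0, seed=0):
    """Classical Metropolis-Hastings with a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    xs, x = [], x0
    for _ in range(n):
        y = x + rng.gauss(0.0, step)  # propose a new state
        # accept with probability min(1, target(y)/target(x))
        if math.log(rng.random()) < log_target(y) - log_target(x):
            x = y  # accept; on rejection the current state is repeated
        xs.append(x)
    return xs

# estimate E[X] under N(0,1) by averaging along the trajectory
chain = mh_chain(lambda x: -0.5 * x * x, x0=3.0, n=20000)
est = sum(chain) / len(chain)
```

The repeated states appended on rejection are exactly the source of the correlation the abstract refers to; the proposed estimator instead keeps every proposed state with an importance weight.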
</p>projecteuclid.org/euclid.ejs/1581325278_20200512220226Tue, 12 May 2020 22:02 EDTReduction problems and deformation approaches to nonstationary covariance functions over sphereshttps://projecteuclid.org/euclid.ejs/1581476605<strong>Emilio Porcu</strong>, <strong>Rachid Senoussi</strong>, <strong>Enner Mendoza</strong>, <strong>Moreno Bevilacqua</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 890--916.</p><p><strong>Abstract:</strong><br/>
The paper considers reduction problems and deformation approaches for nonstationary covariance functions on the $(d-1)$-dimensional spheres, $\mathbb{S}^{d-1}$, embedded in the $d$-dimensional Euclidean space. Given a covariance function $C$ on $\mathbb{S}^{d-1}$, we seek a pair $(R,\Psi)$, for a function $R:[-1,+1]\to \mathbb{R}$ and a smooth bijection $\Psi$, such that $C$ can be reduced to a geodesically isotropic one: $C(\mathbf{x},\mathbf{y})=R(\langle \Psi (\mathbf{x}),\Psi (\mathbf{y})\rangle )$, with $\langle \cdot ,\cdot \rangle $ denoting the dot product.
The problem finds motivation in recent statistical literature devoted to the analysis of global phenomena, defined typically over the sphere of $\mathbb{R}^{3}$. The application domains considered in the manuscript make the problem mathematically challenging. We show the uniqueness of the representation in the reduction problem. Then, under some regularity assumptions, we provide an inversion formula to recover the bijection $\Psi$, when it exists, for a given $C$. We also give sufficient conditions for reducibility.
</p>projecteuclid.org/euclid.ejs/1581476605_20200512220226Tue, 12 May 2020 22:02 EDTGeneralized bounds for active subspaceshttps://projecteuclid.org/euclid.ejs/1581995159<strong>Mario Teixeira Parente</strong>, <strong>Jonas Wallin</strong>, <strong>Barbara Wohlmuth</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 917--943.</p><p><strong>Abstract:</strong><br/>
In this article, we consider scenarios in which traditional estimates for the active subspace method based on probabilistic Poincaré inequalities are not valid due to unbounded Poincaré constants. Consequently, we propose a framework that allows us to derive generalized estimates, in the sense that it enables control of the trade-off between the size of the Poincaré constant and a weaker order of the final error bound. In particular, we investigate independently exponentially distributed random variables in dimension two or larger and give explicit expressions for the corresponding Poincaré constants, showing their dependence on the dimension of the problem. Finally, we suggest possibilities for future work aimed at extending the class of distributions applicable to the active subspace method, as we regard this as an opportunity to enlarge its usability.
</p>projecteuclid.org/euclid.ejs/1581995159_20200512220226Tue, 12 May 2020 22:02 EDTOn the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensionshttps://projecteuclid.org/euclid.ejs/1581995160<strong>Karl Ewald</strong>, <strong>Ulrike Schneider</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 944--969.</p><p><strong>Abstract:</strong><br/>
We derive expressions for the finite-sample distribution of the Lasso estimator in the context of a linear regression model in low as well as in high dimensions by exploiting the structure of the optimization problem defining the estimator. In low dimensions, we assume full rank of the regressor matrix and present expressions for the cumulative distribution function as well as the densities of the absolutely continuous parts of the estimator. Our results are presented for the case of normally distributed errors, but do not hinge on this assumption and can easily be generalized. Additionally, we establish an explicit formula for the correspondence between the Lasso and the least-squares estimator. We derive analogous results for the distribution in less explicit form in high dimensions where we make no assumptions on the regressor matrix at all. In this setting, we also investigate the model selection properties of the Lasso and show that possibly only a subset of models might be selected by the estimator, completely independently of the observed response vector. Finally, we present a condition for uniqueness of the estimator that is necessary as well as sufficient.
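One well-known instance of an explicit Lasso/least-squares correspondence (in the special case of an orthonormal regressor matrix, not the general formula derived in the paper) is coordinate-wise soft-thresholding:

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam.

    With an orthonormal regressor matrix, the Lasso solution is obtained
    coordinate-wise by soft-thresholding the least-squares estimate.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

ols = [2.5, -0.3, 1.0, -4.0]
lasso = [soft_threshold(b, 1.0) for b in ols]
```

Coefficients with magnitude below the penalty level are set exactly to zero, which is the source of the model selection behavior discussed in the abstract.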
</p>projecteuclid.org/euclid.ejs/1581995160_20200512220226Tue, 12 May 2020 22:02 EDTConditional density estimation with covariate measurement errorhttps://projecteuclid.org/euclid.ejs/1582167984<strong>Xianzheng Huang</strong>, <strong>Haiming Zhou</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 970--1023.</p><p><strong>Abstract:</strong><br/>
We consider estimating the density of a response conditional on an error-prone covariate. Motivated by two existing kernel density estimators in the absence of covariate measurement error, we propose a method to correct the existing estimators for measurement error. Asymptotic properties of the resultant estimators under different types of measurement error distributions are derived. Moreover, we adjust bandwidths readily available from existing bandwidth selection methods developed for error-free data to obtain bandwidths for the new estimators. Extensive simulation studies are carried out to compare the proposed estimators with naive estimators that ignore measurement error, which also provide empirical evidence for the effectiveness of the proposed bandwidth selection methods. A real-life data example is used to illustrate implementation of these methods under practical scenarios. An R package, lpme, is developed for implementing all considered methods, which we demonstrate via an R code example in Appendix B.2.
</p>projecteuclid.org/euclid.ejs/1582167984_20200512220226Tue, 12 May 2020 22:02 EDTParametric inference for diffusions observed at stopping timeshttps://projecteuclid.org/euclid.ejs/1589335308<strong>Emmanuel Gobet</strong>, <strong>Uladzislau Stazhynski</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2098--2122.</p><p><strong>Abstract:</strong><br/>
In this paper we study the problem of parametric inference for multidimensional diffusions based on observations at random stopping times. We work in the asymptotic framework of high frequency data over a fixed horizon. Previous works on the subject (such as [10, 17, 19, 5] among others) consider only observation times that are deterministic, strongly predictable, or random but independent of the process, and do not cover our setting. Under mild assumptions we construct a consistent sequence of estimators for a large class of stopping time observation grids (studied in [20, 23]). Further, we carry out the asymptotic analysis of the estimation error and establish a Central Limit Theorem (CLT) with a mixed Gaussian limit. In addition, in the case of a 1-dimensional parameter, for any sequence of estimators verifying the CLT conditions without bias, we prove a uniform a.s. lower bound on the asymptotic variance, and show that this bound is sharp.
</p>projecteuclid.org/euclid.ejs/1589335308_20200512220226Tue, 12 May 2020 22:02 EDTOn the power of axial tests of uniformity on sphereshttps://projecteuclid.org/euclid.ejs/1589335309<strong>Christine Cutting</strong>, <strong>Davy Paindaveine</strong>, <strong>Thomas Verdebout</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2123--2154.</p><p><strong>Abstract:</strong><br/>
Testing uniformity on the $p$-dimensional unit sphere is arguably the most fundamental problem in directional statistics. In this paper, we consider this problem in the framework of axial data, that is, under the assumption that the $n$ observations at hand are randomly drawn from a distribution that charges antipodal regions equally. More precisely, we focus on axial, rotationally symmetric, alternatives and first address the problem under which the direction $\boldsymbol{\theta}$ of the corresponding symmetry axis is specified. In this setup, we obtain Le Cam optimal tests of uniformity, which are based on the sample covariance matrix (unlike their non-axial analogs, which are based on the sample average). For the more important unspecified-$\boldsymbol{\theta}$ problem, some classical tests are available in the literature, but virtually nothing is known on their non-null behavior. We therefore study the non-null behavior of the celebrated Bingham test and of other tests that exploit the single-spiked nature of the considered alternatives. We perform Monte Carlo exercises to investigate the finite-sample behavior of our tests and to show their agreement with our asymptotic results.
</p>projecteuclid.org/euclid.ejs/1589335309_20200512220226Tue, 12 May 2020 22:02 EDTProjective inference in high-dimensional problems: Prediction and feature selectionhttps://projecteuclid.org/euclid.ejs/1589335310<strong>Juho Piironen</strong>, <strong>Markus Paasiniemi</strong>, <strong>Aki Vehtari</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2155--2197.</p><p><strong>Abstract:</strong><br/>
This paper reviews predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We demonstrate that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the reference model and the operation during the latter step as predictive projection. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information, including the prior and the information coming from the left-out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The key ideas are illustrated via several experiments using simulated and real world data.
</p>projecteuclid.org/euclid.ejs/1589335310_20200512220226Tue, 12 May 2020 22:02 EDTTesting goodness of fit for point processes via topological data analysishttps://projecteuclid.org/euclid.ejs/1582534816<strong>Christophe A. N. Biscio</strong>, <strong>Nicolas Chenavier</strong>, <strong>Christian Hirsch</strong>, <strong>Anne Marie Svane</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1024--1074.</p><p><strong>Abstract:</strong><br/>
We introduce tests for the goodness of fit of point patterns via methods from topological data analysis. More precisely, the persistent Betti numbers give rise to a bivariate functional summary statistic for observed point patterns that is asymptotically Gaussian in large observation windows. We analyze the power of tests derived from this statistic on simulated point patterns and compare its performance with global envelope tests. Finally, we apply the tests to a point pattern from an application context in neuroscience. As the main methodological contribution, we derive sufficient conditions for a functional central limit theorem on bounded persistent Betti numbers of point processes with exponential decay of correlations.
</p>projecteuclid.org/euclid.ejs/1582534816_20200514040059Thu, 14 May 2020 04:00 EDTA general drift estimation procedure for stochastic differential equations with additive fractional noisehttps://projecteuclid.org/euclid.ejs/1582686016<strong>Fabien Panloup</strong>, <strong>Samy Tindel</strong>, <strong>Maylis Varvenne</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1075--1136.</p><p><strong>Abstract:</strong><br/>
In this paper we consider the drift estimation problem for a general differential equation driven by an additive multidimensional fractional Brownian motion, under ergodic assumptions on the drift coefficient. Our estimation procedure is based on the identification of the invariant measure, and we provide consistency results as well as some information about the convergence rate. We also give some examples of coefficients for which the identifiability assumption for the invariant measure is satisfied.
</p>projecteuclid.org/euclid.ejs/1582686016_20200514040059Thu, 14 May 2020 04:00 EDTSparsely observed functional time series: estimation and predictionhttps://projecteuclid.org/euclid.ejs/1582859034<strong>Tomáš Rubín</strong>, <strong>Victor M. Panaretos</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1137--1210.</p><p><strong>Abstract:</strong><br/>
Functional time series analysis, whether based on time or frequency domain methodology, has traditionally been carried out under the assumption of complete observation of the constituent series of curves, assumed stationary. Nevertheless, as is often the case with independent functional data, it may well happen that the data available to the analyst are not the actual sequence of curves, but relatively few and noisy measurements per curve, potentially at different locations in each curve’s domain. Under this sparse sampling regime, neither the established estimators of the time series’ dynamics nor their corresponding theoretical analysis will apply. The subject of this paper is to tackle the problem of estimating the dynamics and of recovering the latent process of smooth curves in the sparse regime. Assuming smoothness of the latent curves, we construct a consistent nonparametric estimator of the series’ spectral density operator and use it to develop a frequency-domain recovery approach, that predicts the latent curve at a given time by borrowing strength from the (estimated) dynamic correlations in the series across time. This new methodology is seen to comprehensively outperform a naive recovery approach that would ignore temporal dependence and use only methodology employed in the i.i.d. setting, hinging on the lag-zero covariance. Further to predicting the latent curves from their noisy point samples, the method fills in gaps in the sequence (curves nowhere sampled), denoises the data, and serves as a basis for forecasting. Means of providing corresponding confidence bands are also investigated. A simulation study interestingly suggests that sparse observation for a longer time period may provide better performance than dense observation for a shorter period, in the presence of smoothness.
The methodology is further illustrated by application to an environmental data set on fair-weather atmospheric electricity, which naturally leads to a sparse functional time series.
</p>projecteuclid.org/euclid.ejs/1582859034_20200514040059Thu, 14 May 2020 04:00 EDT$k$-means clustering of extremeshttps://projecteuclid.org/euclid.ejs/1583204487<strong>Anja Janßen</strong>, <strong>Phyllis Wan</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1211--1233.</p><p><strong>Abstract:</strong><br/>
The $k$-means clustering algorithm and its variant, the spherical $k$-means clustering, are among the most important and popular methods in unsupervised learning and pattern detection. In this paper, we explore how the spherical $k$-means algorithm can be applied in the analysis of only the extremal observations from a data set. By making use of multivariate extreme value analysis we show how it can be adapted to find “prototypes” of extremal dependence and derive a consistency result for our suggested estimator. In the special case of max-linear models we show furthermore that our procedure provides an alternative way of statistical inference for this class of models. Finally, we provide data examples which show that our method is able to find relevant patterns in extremal observations and allows us to classify extremal events.
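A minimal sketch of the underlying spherical $k$-means step, applied after restricting attention to extremal (large-norm) observations; the threshold, data, and function names are illustrative only:

```python
import math
import random

def spherical_kmeans(points, k, iters=50, seed=0):
    """Spherical k-means: cluster unit vectors by cosine similarity."""
    def normalize(v):
        r = math.sqrt(sum(x * x for x in v))
        return tuple(x / r for x in v)

    data = [normalize(p) for p in points]  # project onto the unit sphere
    rng = random.Random(seed)
    centers = rng.sample(data, k)
    for _ in range(iters):
        # assign each point to the center with largest cosine similarity
        groups = [[] for _ in range(k)]
        for v in data:
            j = max(range(k), key=lambda c: sum(a * b for a, b in zip(v, centers[c])))
            groups[j].append(v)
        # new center: normalized mean direction of each group
        centers = [
            normalize([sum(col) for col in zip(*g)]) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

raw = [(5.0, 0.1), (4.0, 0.2), (0.1, 6.0), (0.3, 5.0), (0.2, 0.1)]
extremes = [p for p in raw if math.hypot(*p) > 2.0]  # keep extremal points only
protos = spherical_kmeans(extremes, k=2)
```

On this toy data the two prototypes recovered are close to the axis directions, the two "patterns" of extremal dependence present in the sample.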
</p>projecteuclid.org/euclid.ejs/1583204487_20200514040059Thu, 14 May 2020 04:00 EDTConsistency and asymptotic normality of Latent Block Model estimatorshttps://projecteuclid.org/euclid.ejs/1585015341<strong>Vincent Brault</strong>, <strong>Christine Keribin</strong>, <strong>Mahendra Mariadassou</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1234--1268.</p><p><strong>Abstract:</strong><br/>
The Latent Block Model (LBM) is a model-based method to cluster simultaneously the $d$ columns and $n$ rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior are rather sparse and most results are limited to the binary setting. We prove here theoretical guarantees in valued settings. We show that under some mild conditions on the parameter space, and in an asymptotic regime where $\log (d)/n$ and $\log (n)/d$ tend to $0$ when $n$ and $d$ tend to infinity, (1) the maximum-likelihood estimate of the complete model (with known labels) is consistent and (2) the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models. This equivalence allows us to transfer the asymptotic consistency, and under mild conditions, asymptotic normality, to the maximum likelihood estimate under the observed model. Moreover, the variational estimator is also consistent and, under the same conditions, asymptotically normal.
</p>projecteuclid.org/euclid.ejs/1585015341_20200514040059Thu, 14 May 2020 04:00 EDTDifferential network inference via the fused D-trace loss with cross variableshttps://projecteuclid.org/euclid.ejs/1585101683<strong>Yichong Wu</strong>, <strong>Tiejun Li</strong>, <strong>Xiaoping Liu</strong>, <strong>Luonan Chen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1269--1301.</p><p><strong>Abstract:</strong><br/>
Detecting the change of biological interaction networks is of great importance in biological and medical research. We propose a simple loss function, named CrossFDTL, to identify the network change or differential network by estimating the difference between two precision matrices under a Gaussian assumption. The CrossFDTL is a natural fusion of the D-trace loss for the two networks considered, imposing the $\ell _{1}$ penalty on the differential matrix to ensure sparsity. The key point of our method is to utilize the cross variables, which correspond to the sum and difference of the two precision matrices, instead of using their original forms. Moreover, we develop an efficient minimization algorithm for the proposed loss function and rigorously prove its convergence. Numerical results show that our method outperforms existing methods in both accuracy and convergence speed on simulated and real data.
</p>projecteuclid.org/euclid.ejs/1585101683_20200514040059Thu, 14 May 2020 04:00 EDTRate optimal Chernoff bound and application to community detection in the stochastic block modelshttps://projecteuclid.org/euclid.ejs/1585101684<strong>Zhixin Zhou</strong>, <strong>Ping Li</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1302--1347.</p><p><strong>Abstract:</strong><br/>
The Chernoff coefficient is known to yield an upper bound on the Bayes error probability in classification problems. In this paper, we develop a rate optimal Chernoff bound on the Bayes error probability. The new bound is not only an upper bound but also a lower bound on the Bayes error probability up to a constant factor. Moreover, we apply this result to community detection in stochastic block models. As a clustering problem, the optimal misclassification rate of the community detection problem can be characterized by our rate optimal Chernoff bound. This is formalized by deriving a minimax error rate over a certain parameter space of stochastic block models, and then achieving this error rate by a feasible algorithm employing multiple steps of EM-type updates.
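For discrete distributions, the Chernoff coefficient and the Bayes error it bounds are both directly computable; a small sketch of the classical upper bound for equal priors (not the rate optimal refinement developed in the paper, and with illustrative function names):

```python
def chernoff_coefficient(p, q, grid=1000):
    """Chernoff coefficient: min over t in (0,1) of sum_x p(x)^t q(x)^(1-t)."""
    def coeff(t):
        return sum(pi**t * qi**(1 - t) for pi, qi in zip(p, q) if pi > 0 and qi > 0)
    # minimize over a grid of t values in the open interval (0, 1)
    return min(coeff(k / grid) for k in range(1, grid))

def bayes_error(p, q):
    """Bayes error probability for equal priors: 0.5 * sum_x min(p(x), q(x))."""
    return 0.5 * sum(min(pi, qi) for pi, qi in zip(p, q))

p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
# classical bound: bayes_error(p, q) <= 0.5 * chernoff_coefficient(p, q)
```

The bound follows from $\min(a,b)\le a^{t}b^{1-t}$ for every $t\in(0,1)$, applied term by term to the sum defining the Bayes error.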
</p>projecteuclid.org/euclid.ejs/1585101684_20200514040059Thu, 14 May 2020 04:00 EDTComputing the degrees of freedom of rank-regularized estimators and cousinshttps://projecteuclid.org/euclid.ejs/1585274581<strong>Rahul Mazumder</strong>, <strong>Haolei Weng</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1348--1385.</p><p><strong>Abstract:</strong><br/>
Estimating a low rank matrix from its linear measurements is a problem of central importance in contemporary statistical analysis. The choice of tuning parameters for estimators remains an important challenge from a theoretical and practical perspective. To this end, Stein’s Unbiased Risk Estimate (SURE) framework provides a well-grounded statistical framework for degrees of freedom estimation. In this paper, we use the SURE framework to obtain degrees of freedom estimates for a general class of spectral regularized matrix estimators—our results generalize beyond the class of estimators that have been studied thus far. To this end, we use a result due to Shapiro (2002) pertaining to the differentiability of symmetric matrix valued functions, developed in the context of semidefinite optimization algorithms. We rigorously verify the applicability of Stein’s Lemma towards the derivation of degrees of freedom estimates; and also present new techniques based on Gaussian convolution to estimate the degrees of freedom of a class of spectral estimators, for which Stein’s Lemma does not directly apply.
</p>projecteuclid.org/euclid.ejs/1585274581_20200514040059Thu, 14 May 2020 04:00 EDTA fast and consistent variable selection method for high-dimensional multivariate linear regression with a large number of explanatory variableshttps://projecteuclid.org/euclid.ejs/1585360818<strong>Ryoya Oda</strong>, <strong>Hirokazu Yanagihara</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1386--1412.</p><p><strong>Abstract:</strong><br/>
We put forward a variable selection method for selecting explanatory variables in a normality-assumed multivariate linear regression. It is cumbersome to calculate variable selection criteria for all subsets of explanatory variables when the number of explanatory variables is large. Therefore, we propose a fast and consistent variable selection method based on a generalized $C_{p}$ criterion. The consistency of the method is established under a high-dimensional asymptotic framework in which the sample size tends to infinity and the ratio of the sum of the dimensions of the response and explanatory vectors to the sample size tends to a positive constant less than one. Through numerical simulations, we show that the proposed method has a high probability of selecting the true subset of explanatory variables and is fast under a moderate sample size, even when the number of dimensions is large.
</p>projecteuclid.org/euclid.ejs/1585360818_20200514040059Thu, 14 May 2020 04:00 EDTNonconcave penalized estimation in sparse vector autoregression modelhttps://projecteuclid.org/euclid.ejs/1585728014<strong>Xuening Zhu</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1413--1448.</p><p><strong>Abstract:</strong><br/>
High-dimensional time series have received considerable attention recently; their temporal and cross-sectional dependence can be captured by the vector autoregression (VAR) model. To tackle the high dimensionality, penalization methods are widely employed. However, existing theoretical studies of penalization methods mainly focus on i.i.d. data and therefore cannot quantify the effect of the dependence level on the convergence rate. In this work, we use the spectral properties of the time series to quantify the dependence and derive a nonasymptotic upper bound for the estimation errors. By focusing on nonconcave penalization methods, we establish the oracle properties of the penalized VAR model estimation while accounting for the effects of temporal and cross-sectional dependence. Extensive numerical studies are conducted to compare the finite sample performance of different penalization functions. Lastly, an air pollution dataset from mainland China is analyzed for illustration.
</p>projecteuclid.org/euclid.ejs/1585728014_20200514040059Thu, 14 May 2020 04:00 EDTA Bayesian approach to disease clustering using restricted Chinese restaurant processeshttps://projecteuclid.org/euclid.ejs/1586397681<strong>Claudia Wehrhahn</strong>, <strong>Samuel Leonard</strong>, <strong>Abel Rodriguez</strong>, <strong>Tatiana Xifara</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1449--1478.</p><p><strong>Abstract:</strong><br/>
Identifying disease clusters (areas with an unusually high incidence of a particular disease) is a common problem in epidemiology and public health. We describe a Bayesian nonparametric mixture model for disease clustering that constrains clusters to be made of adjacent areal units. This is achieved by modifying the exchangeable partition probability function associated with the Ewens sampling distribution. We call the resulting prior the Restricted Chinese Restaurant Process, as the associated full conditional distributions resemble those associated with the standard Chinese Restaurant Process. The model is illustrated using synthetic data sets and in an application to oral cancer mortality in Germany.
</p>projecteuclid.org/euclid.ejs/1586397681_20200514040059Thu, 14 May 2020 04:00 EDTBeta-Binomial stick-breaking non-parametric priorhttps://projecteuclid.org/euclid.ejs/1586397682<strong>María F. Gil–Leyva</strong>, <strong>Ramsés H. Mena</strong>, <strong>Theodoros Nicoleris</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1479--1507.</p><p><strong>Abstract:</strong><br/>
A new class of nonparametric prior distributions, termed the Beta-Binomial stick-breaking process, is proposed. By allowing the underlying length random variables to be dependent through a Markov chain with Beta marginals, an appealing discrete random probability measure arises. The chain's dependence parameter controls the ordering of the stick-breaking weights, and thus tunes the model's label-switching ability. Moreover, by tuning this parameter, the resulting class contains the Dirichlet process and the Geometric process priors as particular cases, which is of interest for MCMC implementations.
Some properties of the model are discussed and a density estimation algorithm is proposed and tested with simulated datasets.
</p>projecteuclid.org/euclid.ejs/1586397682_20200514040059Thu, 14 May 2020 04:00 EDTEstimating piecewise monotone signalshttps://projecteuclid.org/euclid.ejs/1586397683<strong>Kentaro Minami</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1508--1576.</p><p><strong>Abstract:</strong><br/>
We study the problem of estimating piecewise monotone vectors. This problem can be seen as a generalization of the isotonic regression that allows a small number of order-violating changepoints. We focus mainly on the performance of the nearly-isotonic regression proposed by Tibshirani et al. (2011). We derive risk bounds for the nearly-isotonic regression estimators that are adaptive to piecewise monotone signals. The estimator achieves a near minimax convergence rate over certain classes of piecewise monotone signals under a weak assumption. Furthermore, we present an algorithm that can be applied to the nearly-isotonic type estimators on general weighted graphs. The simulation results suggest that the nearly-isotonic regression performs as well as the ideal estimator that knows the true positions of changepoints.
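As a point of reference, nearly-isotonic regression penalizes downward jumps by $\lambda \sum_{i}(\theta_{i}-\theta_{i+1})_{+}$ and recovers plain isotonic regression in the limit $\lambda \to \infty$; that isotonic limit is solved exactly by the classical pool-adjacent-violators algorithm (PAVA), sketched below in plain Python. This is only the baseline building block, not the authors' solution-path algorithm on general weighted graphs.

```python
def pava(y):
    # pool-adjacent-violators: least-squares fit under a nondecreasing
    # constraint; each block stores [mean, count] and violating adjacent
    # blocks are merged into their weighted average
    merged = []
    for v in y:
        merged.append([float(v), 1])
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, c2 = merged.pop()
            m1, c1 = merged.pop()
            merged.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fit = []
    for mean, cnt in merged:
        fit.extend([mean] * cnt)
    return fit

# pava([3, 1, 2, 4]) pools the violating pair (3, 1): [2.0, 2.0, 2.0, 4.0]
```

A nearly-isotonic fit with finite $\lambda$ would instead leave a small number of downward jumps unpenalized enough to survive, which is exactly what makes it adaptive to piecewise monotone signals.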
</p>projecteuclid.org/euclid.ejs/1586397683_20200514040059Thu, 14 May 2020 04:00 EDTAdaptive density estimation on bounded domains under mixing conditionshttps://projecteuclid.org/euclid.ejs/1589443236<strong>Karine Bertin</strong>, <strong>Nicolas Klutchnikoff</strong>, <strong>Jose R. Léon</strong>, <strong>Clémentine Prieur</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2198--2237.</p><p><strong>Abstract:</strong><br/>
In this article, we propose a new adaptive estimator for multivariate density functions defined on a bounded domain in the framework of multivariate mixing processes. Several procedures have been proposed in the literature to tackle the boundary bias issue encountered using classical kernel estimators. Most of them are designed to work in dimension $d=1$ or on the unit $d$-dimensional hypercube. We extend such results to more general bounded domains such as simple polygons or regular domains that satisfy a rolling condition. We introduce a specific family of kernel-type estimators devoid of boundary bias. We then propose a data-driven Goldenshluger and Lepski type procedure to jointly select a kernel and a bandwidth. We prove the optimality of our procedure in the adaptive framework, stating an oracle-type inequality. We illustrate the good behavior of our new class of estimators on simulated data. Finally, we apply our procedure to a real dataset.
</p>projecteuclid.org/euclid.ejs/1589443236_20200514040059Thu, 14 May 2020 04:00 EDTOn the predictive potential of kernel principal componentshttps://projecteuclid.org/euclid.ejs/1578020612<strong>Ben Jones</strong>, <strong>Andreas Artemiou</strong>, <strong>Bing Li</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1--23.</p><p><strong>Abstract:</strong><br/>
We give a probabilistic analysis of a phenomenon in statistics which, until recently, had not received a convincing explanation: the leading principal components tend to possess more predictive power for a response variable than lower-ranking ones, despite the procedure being unsupervised. Our result, in its most general form, shows that the phenomenon goes far beyond the context of linear regression and classical principal components: if an arbitrary distribution for the predictor $X$ and an arbitrary conditional distribution for $Y\vert X$ are chosen, then any measurable function $g(Y)$, subject to a mild condition, tends to be more correlated with the higher-ranking kernel principal components than with the lower-ranking ones. The “arbitrariness” is formulated in terms of unitary invariance, and the tendency is then explicitly quantified by exploring how unitary invariance relates to the Cauchy distribution. The most general results, for technical reasons, are shown for the case where the kernel space is finite dimensional. The occurrence of this tendency in real-world databases is also investigated, showing that our results are consistent with observation.
</p>projecteuclid.org/euclid.ejs/1578020612_20200602220240Tue, 02 Jun 2020 22:02 EDTMonotone least squares and isotonic quantileshttps://projecteuclid.org/euclid.ejs/1578020615<strong>Alexandre Mösching</strong>, <strong>Lutz Dümbgen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 24--49.</p><p><strong>Abstract:</strong><br/>
We consider bivariate observations $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ such that, conditional on the $X_{i}$, the $Y_{i}$ are independent random variables. Precisely, the conditional distribution function of $Y_{i}$ equals $F_{X_{i}}$, where $(F_{x})_{x}$ is an unknown family of distribution functions. Under the sole assumption that $x\mapsto F_{x}$ is isotonic with respect to stochastic order, one can estimate $(F_{x})_{x}$ in two ways:
(i) For any fixed $y$ one estimates the antitonic function $x\mapsto F_{x}(y)$ via nonparametric monotone least squares, replacing the responses $Y_{i}$ with the indicators $1_{[Y_{i}\le y]}$.
(ii) For any fixed $\beta \in (0,1)$ one estimates the isotonic quantile function $x\mapsto F_{x}^{-1}(\beta)$ via a nonparametric version of regression quantiles.
We show that these two approaches are closely related, with (i) being more flexible than (ii). Then, under mild regularity conditions, we establish rates of convergence for the resulting estimators $\hat{F}_{x}(y)$ and $\hat{F}_{x}^{-1}(\beta)$, uniformly over $(x,y)$ and $(x,\beta)$ in certain rectangles as well as uniformly in $y$ or $\beta$ for a fixed $x$.
</p>projecteuclid.org/euclid.ejs/1578020615_20200602220240Tue, 02 Jun 2020 22:02 EDTNon-parametric adaptive estimation of order 1 Sobol indices in stochastic models, with an application to Epidemiologyhttps://projecteuclid.org/euclid.ejs/1578042013<strong>Gwenaëlle Castellan</strong>, <strong>Anthony Cousien</strong>, <strong>Viet Chi Tran</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 50--81.</p><p><strong>Abstract:</strong><br/>
Global sensitivity analysis is a set of methods that quantify the contribution of an uncertain input parameter of a model (or a combination of parameters) to the variability of the response. We consider here the estimation of Sobol indices of order 1, which are commonly used indicators based on a decomposition of the output's variance. In a deterministic framework, when the same inputs always give the same outputs, these indices are usually estimated by replicated simulations of the model. In a stochastic framework, when the response given a set of input parameters is not unique due to randomness in the model, metamodels are often used to approximate the mean and dispersion of the response by deterministic functions. We propose a new non-parametric estimator of the Sobol indices of order 1 that does not require a metamodel. The estimator is based on warped wavelets and is adaptive in the regularity of the model. We compute the rate at which the mean square error converges to zero as the number of simulations of the model tends to infinity, and exhibit an elbow effect depending on the regularity of the model. Applications in Epidemiology are carried out to illustrate the use of the non-parametric estimators.
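To fix ideas about what an order-1 Sobol index measures, here is the classical pick-freeze estimator for a toy deterministic model (the paper's warped-wavelet estimator targets the harder stochastic case): the index of $X_{1}$ is $S_{1}=\operatorname{Var}(\mathbb{E}[Y\mid X_{1}])/\operatorname{Var}(Y)$, and the pick-freeze trick estimates the numerator as the covariance of two runs sharing $X_{1}$. The model $f$ below is a hypothetical example chosen so the true index is 0.8.

```python
import random

def f(x1, x2):
    # toy deterministic model; with x1, x2 ~ Uniform(0, 1) independent,
    # the true first-order index of x1 is (1/12) / (1.25/12) = 0.8
    return x1 + 0.5 * x2

rng = random.Random(42)
n = 100000
y, y_frozen = [], []
for _ in range(n):
    x1, x2, x2b = rng.random(), rng.random(), rng.random()
    y.append(f(x1, x2))
    y_frozen.append(f(x1, x2b))  # same x1, fresh x2: the "pick-freeze" pair

m = sum(y) / n
mf = sum(y_frozen) / n
cov = sum(a * b for a, b in zip(y, y_frozen)) / n - m * mf
var = sum(a * a for a in y) / n - m * m
s1 = cov / var  # first-order Sobol index of x1, close to 0.8
```

In the stochastic setting of the paper, repeating the model with identical inputs gives different outputs, which is what breaks this plain estimator and motivates the wavelet-based alternative.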
</p>projecteuclid.org/euclid.ejs/1578042013_20200602220240Tue, 02 Jun 2020 22:02 EDTModel-based clustering with envelopeshttps://projecteuclid.org/euclid.ejs/1578042014<strong>Wenjing Wang</strong>, <strong>Xin Zhang</strong>, <strong>Qing Mai</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 82--109.</p><p><strong>Abstract:</strong><br/>
Clustering analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models called CLEMM (short for Clustering with Envelope Mixture Models) that are based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of the envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvement of our proposed methods over classical methods such as Gaussian mixture models, K-means, and hierarchical clustering algorithms. An R package is available at https://github.com/kusakehan/CLEMM.
</p>projecteuclid.org/euclid.ejs/1578042014_20200602220240Tue, 02 Jun 2020 22:02 EDTNonparametric false discovery rate control for identifying simultaneous signalshttps://projecteuclid.org/euclid.ejs/1578366075<strong>Sihai Dave Zhao</strong>, <strong>Yet Tien Nguyen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 110--142.</p><p><strong>Abstract:</strong><br/>
It is frequently of interest to identify simultaneous signals, defined as features that exhibit statistical significance across each of several independent experiments. For example, genes that are consistently differentially expressed across experiments in different animal species can reveal evolutionarily conserved biological mechanisms. However, in some problems the test statistics corresponding to these features can have complicated or unknown null distributions. This paper proposes a novel nonparametric false discovery rate control procedure that can identify simultaneous signals even without knowing these null distributions. The method is shown, theoretically and in simulations, to asymptotically control the false discovery rate. It was also used to identify genes that were both differentially expressed and proximal to differentially accessible chromatin in the brains of mice exposed to a conspecific intruder. The proposed method is available in the R package github.com/sdzhao/ssa.
</p>projecteuclid.org/euclid.ejs/1578366075_20200602220240Tue, 02 Jun 2020 22:02 EDTEfficient estimation in expectile regression using envelope modelshttps://projecteuclid.org/euclid.ejs/1578366076<strong>Tuo Chen</strong>, <strong>Zhihua Su</strong>, <strong>Yi Yang</strong>, <strong>Shanshan Ding</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 143--173.</p><p><strong>Abstract:</strong><br/>
As a generalization of the classical linear regression, expectile regression (ER) explores the relationship between the conditional expectile of a response variable and a set of predictor variables. ER with respect to different expectile levels can provide a comprehensive picture of the conditional distribution of the response variable given the predictors. We adopt an efficient estimation method called the envelope model ([8]) in ER, and construct a novel envelope expectile regression (EER) model. Estimation of the EER parameters can be performed using the generalized method of moments (GMM). We establish the consistency and derive the asymptotic distribution of the EER estimators. In addition, we show that the EER estimators are asymptotically more efficient than the ER estimators. Numerical experiments and real data examples are provided to demonstrate the efficiency gains attained by EER compared to ER, and the efficiency gains can further lead to improvements in prediction.
</p>projecteuclid.org/euclid.ejs/1578366076_20200602220240Tue, 02 Jun 2020 22:02 EDTAsymptotic seed bias in respondent-driven samplinghttps://projecteuclid.org/euclid.ejs/1586397684<strong>Yuling Yan</strong>, <strong>Bret Hanlon</strong>, <strong>Sebastien Roch</strong>, <strong>Karl Rohe</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1577--1610.</p><p><strong>Abstract:</strong><br/>
Respondent-driven sampling (RDS) collects a sample of individuals in a networked population by incentivizing the sampled individuals to refer their contacts into the sample. This iterative process is initialized from some seed node(s). Sometimes, this selection creates a large amount of seed bias; other times, the seed bias is small. This paper gains a deeper understanding of this bias by characterizing its effect on the limiting distribution of various RDS estimators. Using classical tools and results from multi-type branching processes [12], we show that the seed bias is negligible for the Generalized Least Squares (GLS) estimator and non-negligible for both the inverse probability weighted and Volz-Heckathorn (VH) estimators. In particular, we show that (i) above a critical threshold, the VH estimator converges to a non-trivial mixture distribution, where the mixture components depend on the seed node and the distribution is possibly multi-modal, and (ii) under a certain condition on the Markov process, the GLS estimator converges to a Gaussian distribution independent of the seed node. Numerical experiments with both simulated data and empirical social networks suggest that these results hold beyond the Markov conditions of the theorems.
</p>projecteuclid.org/euclid.ejs/1586397684_20200602220240Tue, 02 Jun 2020 22:02 EDTRandom distributions via Sequential Quantile Arrayhttps://projecteuclid.org/euclid.ejs/1586397685<strong>Annalisa Fabretti</strong>, <strong>Samantha Leorato</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1611--1647.</p><p><strong>Abstract:</strong><br/>
We propose a method to generate random distributions with known quantile distribution or, more generally, with known distribution for some form of generalized quantile. The method takes inspiration from the random Sequential Barycenter Array (SBA) distributions proposed by Hill and Monticino (1998), which generate a Random Probability Measure (RPM) with known expected value. We define the Sequential Quantile Array (SQA) and show how to generate a random SQA from which we can derive RPMs. The distribution of the generated SQA-RPM can have full support, and the RPMs can be discrete, continuous, or differentiable. We also address the problem of implementing the procedure efficiently, ensuring that the approximation of the SQA-RPM by a finite number of steps stays close to the SQA-RPM obtained theoretically. Finally, we compare SQA-RPMs with similar approaches such as the Polya tree.
</p>projecteuclid.org/euclid.ejs/1586397685_20200602220240Tue, 02 Jun 2020 22:02 EDTOn change-point estimation under Sobolev sparsityhttps://projecteuclid.org/euclid.ejs/1586397686<strong>Aurélie Fischer</strong>, <strong>Dominique Picard</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1648--1689.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider the estimation of a change-point for possibly high-dimensional data in a Gaussian model, using a maximum likelihood method. We are interested in how dimension reduction can affect the performance of the method. We provide an estimator of the change-point that has a minimax rate of convergence, up to a logarithmic factor. The minimax rate is in fact composed of a fast rate, which is dimension-invariant, and a slow rate, which increases with the dimension. Moreover, it is proved that, for sparse data with Sobolev regularity, there is a bound on the separation of the regimes above which there exists an optimal choice of dimension reduction leading to the fast rate of estimation. We propose an adaptive dimension reduction procedure based on Lepski's method and show that the resulting estimator attains the fast rate of convergence. Our results are then illustrated by a simulation study. In particular, practical strategies are suggested to perform dimension reduction.
</p>projecteuclid.org/euclid.ejs/1586397686_20200602220240Tue, 02 Jun 2020 22:02 EDTA fast MCMC algorithm for the uniform sampling of binary matrices with fixed marginshttps://projecteuclid.org/euclid.ejs/1586419218<strong>Guanyang Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1690--1706.</p><p><strong>Abstract:</strong><br/>
Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology, and other fields. The well-known swap algorithm is inefficient when the matrix is large or too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm for sampling binary matrices with fixed margins uniformly. Theoretically, the Rectangle Loop algorithm dominates the swap algorithm in Peskun's order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
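The baseline swap chain that the abstract compares against can be sketched in a few lines (the Rectangle Loop algorithm itself is the paper's contribution and is not reproduced here): each step picks a random 2x2 submatrix and, if it is a checkerboard, flips it, which leaves every row and column sum unchanged.

```python
import random

def swap_step(M, rng):
    # classic swap move: pick two rows and two columns; if the induced 2x2
    # submatrix is a checkerboard ([[1,0],[0,1]] or [[0,1],[1,0]]),
    # flip its entries -- all margins are preserved
    i, j = rng.sample(range(len(M)), 2)
    k, l = rng.sample(range(len(M[0])), 2)
    if M[i][k] == M[j][l] and M[i][l] == M[j][k] and M[i][k] != M[i][l]:
        for r, c in ((i, k), (i, l), (j, k), (j, l)):
            M[r][c] = 1 - M[r][c]

rng = random.Random(0)
M = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
rows = [sum(r) for r in M]
cols = [sum(c) for c in zip(*M)]
for _ in range(1000):
    swap_step(M, rng)
# row and column sums are invariant along the whole chain
```

The inefficiency the abstract refers to is visible here: for sparse or dense matrices most proposed 2x2 submatrices are not checkerboards, so most steps are wasted.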
</p>projecteuclid.org/euclid.ejs/1586419218_20200602220240Tue, 02 Jun 2020 22:02 EDTPosterior contraction and credible sets for filaments of regression functionshttps://projecteuclid.org/euclid.ejs/1586916096<strong>Wei Li</strong>, <strong>Subhashis Ghosal</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1707--1743.</p><p><strong>Abstract:</strong><br/>
A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered an important lower-dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to filament estimation in the regression context and study the posterior contraction rates using a finite random series of B-splines basis. Compared with the kernel estimation method, this has a theoretical advantage: the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f:\mathbb{R}^{2}\mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $\alpha \geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-\alpha )/(2(1+\alpha ))}$. Second, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. We demonstrate the success of our proposed method in simulations and an application to earthquake data.
</p>projecteuclid.org/euclid.ejs/1586916096_20200602220240Tue, 02 Jun 2020 22:02 EDTSimultaneous transformation and rounding (STAR) models for integer-valued datahttps://projecteuclid.org/euclid.ejs/1586937696<strong>Daniel R. Kowal</strong>, <strong>Antonio Canale</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1744--1772.</p><p><strong>Abstract:</strong><br/>
We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The data-generating process is defined by Simultaneously Transforming and Rounding (STAR) a continuous-valued process, which produces a flexible family of integer-valued distributions capable of modeling zero-inflation, bounded or censored data, and over- or underdispersion. The transformation is modeled as unknown for greater distributional flexibility, while the rounding operation ensures a coherent integer-valued data-generating process. An efficient MCMC algorithm is developed for posterior inference and provides a mechanism for adaptation of successful Bayesian models and algorithms for continuous data to the integer-valued data setting. Using the STAR framework, we design a new Bayesian Additive Regression Tree model for integer-valued data, which demonstrates impressive predictive distribution accuracy for both synthetic data and a large healthcare utilization dataset. For interpretable regression-based inference, we develop a STAR additive model, which offers greater flexibility and scalability than existing integer-valued models. The STAR additive model is applied to study the recent decline in Amazon river dolphins.
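The STAR data-generating process can be sketched in a couple of lines per draw. In the illustrative sketch below the transformation is fixed as a log link and the latent process is a single Gaussian draw; both are assumptions made here for concreteness, whereas the paper treats the transformation as unknown and couples the latent process with regression models.

```python
import math
import random

def star_sample(mu, sigma, n, rng):
    # one STAR draw: latent Gaussian z -> transform exp(z) -> round down.
    # Any latent value below exp(0) = 1 rounds to zero, so the model
    # produces zero-inflation automatically for suitable mu.
    return [math.floor(math.exp(rng.gauss(mu, sigma))) for _ in range(n)]

rng = random.Random(1)
ys = star_sample(-0.5, 1.0, 500, rng)
# every draw is a nonnegative integer; with mu = -0.5 many are exactly zero
```

Replacing the floor with rounding to a bounded grid would instead give censored or bounded counts, which is the flexibility the abstract refers to.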
</p>projecteuclid.org/euclid.ejs/1586937696_20200602220240Tue, 02 Jun 2020 22:02 EDTBias correction in conditional multivariate extremeshttps://projecteuclid.org/euclid.ejs/1587542553<strong>Mikael Escobar-Bach</strong>, <strong>Yuri Goegebeur</strong>, <strong>Armelle Guillou</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1773--1795.</p><p><strong>Abstract:</strong><br/>
We consider bias-corrected estimation of the stable tail dependence function in the regression context. To this aim, we first estimate the bias of a smoothed estimator of the stable tail dependence function, and then we subtract it from the estimator. The weak convergence, as a stochastic process, of the resulting asymptotically unbiased estimator of the conditional stable tail dependence function, correctly normalized, is established under mild assumptions, the covariate argument being fixed. The finite sample behaviour of our asymptotically unbiased estimator is then illustrated in a simulation study and compared to two alternatives that are not bias-corrected. Finally, our methodology is applied to a dataset of air pollution measurements.
</p>projecteuclid.org/euclid.ejs/1587542553_20200602220240Tue, 02 Jun 2020 22:02 EDTExact recovery in block spin Ising models at the critical linehttps://projecteuclid.org/euclid.ejs/1587693632<strong>Matthias Löwe</strong>, <strong>Kristina Schubert</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1796--1815.</p><p><strong>Abstract:</strong><br/>
We show how to exactly reconstruct the block structure at the critical line in the so-called Ising block model. This model was recently re-introduced by Berthet, Rigollet and Srivastava in [2]. There the authors show how to exactly reconstruct blocks away from the critical line and they give an upper and a lower bound on the number of observations one needs; thereby they establish a minimax optimal rate (up to constants). Our technique relies on a combination of their methods with fluctuation results obtained in [20]. The latter are extended to the full critical regime. We find that the number of necessary observations depends on whether the interaction parameter between two blocks is positive or negative: In the first case, there are about $N\log N$ observations required to exactly recover the block structure, while in the latter case $\sqrt{N}\log N$ observations suffice.
</p>projecteuclid.org/euclid.ejs/1587693632_20200602220240Tue, 02 Jun 2020 22:02 EDTConsistent nonparametric change point detection combining CUSUM and marked empirical processeshttps://projecteuclid.org/euclid.ejs/1591149719<strong>Maria Mohr</strong>, <strong>Natalie Neumeyer</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2238--2271.</p><p><strong>Abstract:</strong><br/>
A weakly dependent time series regression model with multivariate covariates and univariate observations is considered, for which we develop a procedure to detect whether the nonparametric conditional mean function is stable in time against change point alternatives. Our proposal is based on a modified CUSUM type test procedure, which uses a sequential marked empirical process of residuals. We show weak convergence of the considered process to a centered Gaussian process under the null hypothesis of no change in the mean function and a stationarity assumption. This requires some sophisticated arguments for sequential empirical processes of weakly dependent variables. As a consequence we obtain convergence of Kolmogorov-Smirnov and Cramér-von Mises type test statistics. The proposed procedure has a very simple limiting distribution and nice consistency properties, features that related tests lack. We moreover suggest a bootstrap version of the procedure and discuss its applicability in the case of unstable variances.
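The classical CUSUM ingredient of such procedures, shown here for a simple mean shift rather than the paper's marked empirical process of residuals, is the maximal deviation of the partial sums from their expected linear growth:

```python
def cusum_stat(y):
    # max over k of |S_k - (k/n) S_n| / sqrt(n); large values indicate an
    # unstable mean, and the statistic is exactly zero for a constant series
    n = len(y)
    total = sum(y)
    s, best = 0.0, 0.0
    for k, v in enumerate(y, 1):
        s += v
        best = max(best, abs(s - k * total / n))
    return best / n ** 0.5

# a level shift mid-sample produces a large statistic
shifted = cusum_stat([0.0] * 50 + [3.0] * 50)   # -> 7.5
flat = cusum_stat([1.0] * 100)                  # -> 0.0
```

The paper's test replaces the raw observations by residual marks and handles weak dependence, but its Kolmogorov-Smirnov type statistic has the same sup-of-partial-sums structure.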
</p>projecteuclid.org/euclid.ejs/1591149719_20200602220240Tue, 02 Jun 2020 22:02 EDT