Bernoulli Articles (Project Euclid)
http://projecteuclid.org/euclid.bj
The latest articles from Bernoulli on Project Euclid, a site for mathematics and statistics resources.
Copyright 2010 Cornell University Library
Euclid-L@cornell.edu (Project Euclid Team)
Thu, 05 Aug 2010 15:41 EDT
Tue, 05 Apr 2011 09:14 EDT
A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables
http://projecteuclid.org/euclid.bj/1274821072
<strong>Michael V. Boutsikas</strong>, <strong>Eutichia Vaggelatou</strong><p><strong>Source: </strong>Bernoulli, Volume 16, Number 2, 301--330.</p><p><strong>Abstract:</strong><br/>
Let $X_{1},X_{2},\ldots,X_{n}$ be a sequence of independent or locally dependent random variables taking values in $\mathbb{Z}_{+}$. In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum $\sum_{i=1}^{n}X_{i}$ and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This “smoothness factor” is of order $O(\sigma^{-2})$, according to a heuristic argument, where $\sigma^{2}$ denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.
</p>Thu, 05 Aug 2010 15:41 EDT

On the isoperimetric constant, covariance inequalities and $L_{p}$-Poincaré inequalities in dimension one
https://projecteuclid.org/euclid.bj/1560326428
<strong>Adrien Saumard</strong>, <strong>Jon A. Wellner</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1794--1815.</p><p><strong>Abstract:</strong><br/>
First, we derive in dimension one a new covariance inequality of $L_{1}-L_{\infty}$ type that characterizes the isoperimetric constant as the best constant achieving the inequality. Second, we generalize our result to $L_{p}-L_{q}$ bounds for the covariance. Consequently, we recover Cheeger’s inequality without using the co-area formula. We also prove a generalized weighted Hardy type inequality that is needed to derive our covariance inequalities and that is of independent interest. Finally, we explore some consequences of our covariance inequalities for $L_{p}$-Poincaré inequalities and moment bounds. In particular, we obtain optimal constants in general $L_{p}$-Poincaré inequalities for measures with finite isoperimetric constant, thus generalizing in dimension one Cheeger’s inequality, which is an $L_{p}$-Poincaré inequality for $p=2$, to any real $p\geq1$.
</p>Wed, 12 Jun 2019 04:00 EDT

A one-sample test for normality with kernel methods
https://projecteuclid.org/euclid.bj/1560326429
<strong>Jérémie Kellner</strong>, <strong>Alain Celisse</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1816--1837.</p><p><strong>Abstract:</strong><br/>
We propose a new one-sample test for normality in a Reproducing Kernel Hilbert Space (RKHS). Namely, we test the null-hypothesis of belonging to a given family of Gaussian distributions. Hence, our procedure may be applied either to test data for normality or to test parameters (mean and covariance) if data are assumed Gaussian. Our test is based on the same principle as the MMD (Maximum Mean Discrepancy) which is usually used for two-sample tests such as homogeneity or independence testing. Our method makes use of a special kind of parametric bootstrap (typical of goodness-of-fit tests) which is computationally more efficient than standard parametric bootstrap. Moreover, an upper bound for the Type-II error highlights the dependence on influential quantities. Experiments illustrate the practical improvement allowed by our test in high-dimensional settings where common normality tests are known to fail. We also consider an application to covariance rank selection through a sequential procedure.
</p>Wed, 12 Jun 2019 04:00 EDT

Central limit theorem for linear spectral statistics of large dimensional separable sample covariance matrices
https://projecteuclid.org/euclid.bj/1560326430
<strong>Zhidong Bai</strong>, <strong>Huiqin Li</strong>, <strong>Guangming Pan</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1838--1869.</p><p><strong>Abstract:</strong><br/>
Suppose that $\mathbf{X}_{n}=(x_{jk})$ is an $N\times n$ matrix whose elements are independent complex variables with mean zero and variance 1. The separable sample covariance matrix is defined as $\mathbf{B}_{n}=\frac{1}{N}\mathbf{T}_{2n}^{1/2}\mathbf{X}_{n}\mathbf{T}_{1n}\mathbf{X}_{n}^{*}\mathbf{T}_{2n}^{1/2}$, where $\mathbf{T}_{1n}$ is a Hermitian matrix and $\mathbf{T}_{2n}^{1/2}$ is a Hermitian square root of the nonnegative definite Hermitian matrix $\mathbf{T}_{2n}$. Its linear spectral statistics (LSS) are shown to have Gaussian limits when $n/N$ approaches a positive constant under some conditions.
</p>Wed, 12 Jun 2019 04:00 EDT

Asymptotically efficient estimators for self-similar stationary Gaussian noises under high frequency observations
https://projecteuclid.org/euclid.bj/1560326431
<strong>Masaaki Fukasawa</strong>, <strong>Tetsuya Takabatake</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1870--1900.</p><p><strong>Abstract:</strong><br/>
This paper proposes feasible asymptotically efficient estimators for a certain class of Gaussian noises with self-similarity and stationarity properties, which includes the fractional Gaussian noises, under high frequency observations. In this setting, the optimal rate of estimation depends on whether either the Hurst parameter or the diffusion parameter is known. This is due to the singularity of the asymptotic Fisher information matrix for simultaneous estimation of the above two parameters. One of our key ideas is to extend the Whittle estimation method to the situation of high frequency observations. We show that our estimators are asymptotically efficient in Fisher’s sense. Further, by Monte Carlo experiments, we examine the finite sample performance of our estimators. Finite sample modifications of the asymptotic variances of the estimators are also given, which exhibit almost perfect fits to the numerical results.
</p>Wed, 12 Jun 2019 04:00 EDT

Sparse covariance matrix estimation in high-dimensional deconvolution
https://projecteuclid.org/euclid.bj/1560326432
<strong>Denis Belomestny</strong>, <strong>Mathias Trabs</strong>, <strong>Alexandre B. Tsybakov</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1901--1938.</p><p><strong>Abstract:</strong><br/>
We study the estimation of the covariance matrix $\Sigma$ of a $p$-dimensional normal random vector based on $n$ independent observations corrupted by additive noise. Only a general nonparametric assumption is imposed on the distribution of the noise without any sparsity constraint on its covariance matrix. In this high-dimensional semiparametric deconvolution problem, we propose spectral thresholding estimators that are adaptive to the sparsity of $\Sigma$. We establish an oracle inequality for these estimators under model misspecification and derive non-asymptotic minimax convergence rates that are shown to be logarithmic in $n/\log p$. We also discuss the estimation of low-rank matrices based on indirect observations as well as the generalization to elliptical distributions. The finite sample performance of the threshold estimators is illustrated in a numerical example.
</p>Wed, 12 Jun 2019 04:00 EDT

Hybrid regularisation and the (in)admissibility of ridge regression in infinite dimensional Hilbert spaces
https://projecteuclid.org/euclid.bj/1560326433
<strong>Anirvan Chakraborty</strong>, <strong>Victor M. Panaretos</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1939--1976.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating the slope function in a functional regression with a scalar response and a functional covariate. This central problem of functional data analysis is well known to be ill-posed, thus requiring a regularised estimation procedure. The two most commonly used approaches are based on spectral truncation or Tikhonov regularisation of the empirical covariance operator. In principle, Tikhonov regularisation is the more canonical choice. Compared to spectral truncation, it is robust to eigenvalue ties, while it attains the optimal minimax rate of convergence in the mean squared sense, and not just in a concentration probability sense. In this paper, we show that, surprisingly, one can strictly improve upon the performance of the Tikhonov estimator in finite samples by means of a linear estimator, while retaining its stability and asymptotic properties by combining it with a form of spectral truncation. Specifically, we construct an estimator that additively decomposes the functional covariate by projecting it onto two orthogonal subspaces defined via functional PCA; it then applies Tikhonov regularisation to the one component, while leaving the other component unregularised. We prove that when the covariate is Gaussian, this hybrid estimator uniformly improves upon the MSE of the Tikhonov estimator in a non-asymptotic sense, effectively rendering it inadmissible. This domination is shown to also persist under discrete observation of the covariate function. The hybrid estimator is linear, straightforward to construct in practice, and with no computational overhead relative to the standard regularisation methods. By means of simulation, it is shown to furnish sizeable gains even for modest sample sizes.
</p>Wed, 12 Jun 2019 04:00 EDT

Consistency of adaptive importance sampling and recycling schemes
https://projecteuclid.org/euclid.bj/1560326434
<strong>Jean-Michel Marin</strong>, <strong>Pierre Pudlo</strong>, <strong>Mohammed Sedki</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1977--1998.</p><p><strong>Abstract:</strong><br/>
Among Monte Carlo techniques, importance sampling requires fine tuning of a proposal distribution, which is now routinely resolved through iterative schemes. Sequential adaptive algorithms have been proposed to calibrate the sampling distribution. Cornuet et al. [Scand. J. Stat. 39 (2012) 798–812] provide a significant improvement in stability and effective sample size by introducing a recycling procedure. However, the consistency of such algorithms has rarely been tackled because of their complexity. Moreover, the recycling strategy of the AMIS estimator adds another difficulty and its consistency remains largely open. In this work, we prove the convergence of sequential adaptive sampling, with finite Monte Carlo sample size at each iteration, and consistency of recycling procedures. Contrary to Douc et al. [Ann. Statist. 35 (2007) 420–448], results are obtained here in the asymptotic regime where the number of iterations goes to infinity while the number of drawings per iteration is a fixed, but growing sequence of integers. Hence, some of the results shed new light on adaptive population Monte Carlo algorithms in that last regime and give advice on how the sample sizes should be fixed.
</p>Wed, 12 Jun 2019 04:00 EDT

On posterior consistency of tail index for Bayesian kernel mixture models
https://projecteuclid.org/euclid.bj/1560326435
<strong>Cheng Li</strong>, <strong>Lizhen Lin</strong>, <strong>David B. Dunson</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 1999--2028.</p><p><strong>Abstract:</strong><br/>
Asymptotic theory of tail index estimation has been studied extensively in the frequentist literature on extreme values, but rarely in the Bayesian context. We investigate whether popular Bayesian kernel mixture models are able to support heavy tailed distributions and consistently estimate the tail index. We show that posterior inconsistency in tail index is surprisingly common for both parametric and nonparametric mixture models. We then present a set of sufficient conditions under which posterior consistency in tail index can be achieved, and verify these conditions for Pareto mixture models under general mixing priors.
</p>Wed, 12 Jun 2019 04:00 EDT

The unusual properties of aggregated superpositions of Ornstein–Uhlenbeck type processes
https://projecteuclid.org/euclid.bj/1560326436
<strong>Danijel Grahovac</strong>, <strong>Nikolai N. Leonenko</strong>, <strong>Alla Sikorskii</strong>, <strong>Murad S. Taqqu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2029--2050.</p><p><strong>Abstract:</strong><br/>
Superpositions of Ornstein–Uhlenbeck type (supOU) processes form a rich class of stationary processes with a flexible dependence structure. The asymptotic behavior of the integrated and partial sum supOU processes can be, however, unusual. Their cumulants and moments turn out to have an unexpected rate of growth. We identify the property of fast growth of moments or cumulants as intermittency. Many proofs are given in a supplemental article (Grahovac, Leonenko, Sikorskii and Taqqu (2018)).
</p>Wed, 12 Jun 2019 04:00 EDT

Gibbs–non-Gibbs transitions in the fuzzy Potts model with a Kac-type interaction: Closing the Ising gap
https://projecteuclid.org/euclid.bj/1560326437
<strong>Florian Henning</strong>, <strong>Richard C. Kraaij</strong>, <strong>Christof Külske</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2051--2074.</p><p><strong>Abstract:</strong><br/>
We complete the investigation of the Gibbs properties of the fuzzy Potts model on the $d$-dimensional torus with Kac interaction which was started by Jahnel and one of the authors in (Sharp thresholds for Gibbs-non-Gibbs transitions in the fuzzy Potts model with a Kac-type interaction (2017)). As our main result of the present paper, we extend the previous sharpness result of mean-field bounds to cover all possible cases of fuzzy transformations, allowing also for the occurrence of Ising classes (containing precisely two spin values). The closing of this previously left open Ising-gap involves an analytical argument showing uniqueness of minimizing profiles for certain non-homogeneous conditional variational problems.
</p>Wed, 12 Jun 2019 04:00 EDT

Regularization, sparse recovery, and median-of-means tournaments
https://projecteuclid.org/euclid.bj/1560326438
<strong>Gábor Lugosi</strong>, <strong>Shahar Mendelson</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2075--2106.</p><p><strong>Abstract:</strong><br/>
We introduce a regularized risk minimization procedure for regression function estimation. The procedure is based on median-of-means tournaments, introduced by the authors in Lugosi and Mendelson (2018) and achieves near optimal accuracy and confidence under general conditions, including heavy-tailed predictor and response variables. It outperforms standard regularized empirical risk minimization procedures such as LASSO or SLOPE in heavy-tailed problems.
</p>Wed, 12 Jun 2019 04:00 EDT

Root-$n$ consistent estimation of the marginal density in semiparametric autoregressive time series models
https://projecteuclid.org/euclid.bj/1560326439
<strong>Lionel Truquet</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2107--2136.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider the problem of estimating the marginal density in some autoregressive time series models for which the conditional mean and variance have a parametric specification. Under some regularity conditions, we show that a kernel type estimate based on the residuals can be root-$n$ consistent even if the noise density is unknown. Our results substantially extend those existing in the literature. Our assumptions are carefully checked for some standard time series models such as ARMA or GARCH processes. Asymptotic expansion of our estimator is obtained by combining some martingale type arguments and a coupling method for time series which is of independent interest. We also study the uniform convergence of our estimator on compact intervals.
</p>Wed, 12 Jun 2019 04:00 EDT

Integration with respect to the non-commutative fractional Brownian motion
https://projecteuclid.org/euclid.bj/1560326440
<strong>Aurélien Deya</strong>, <strong>René Schott</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2137--2162.</p><p><strong>Abstract:</strong><br/>
We study the issue of integration with respect to the non-commutative fractional Brownian motion, that is, the analog of the standard fractional Brownian motion in a non-commutative probability setting.
When the Hurst index $H$ of the process is strictly larger than $1/2$, integration can be handled through the so-called Young procedure. The situation where $H=1/2$ corresponds to the specific free case, for which an Itô-type approach is known to be possible.
When $H<1/2$, rough-path-type techniques must come into the picture, which, from a theoretical point of view, involves the use of some a-priori-defined Lévy area process. We show that such an object can indeed be “canonically” constructed for any $H\in(\frac{1}{4},\frac{1}{2})$. Finally, when $H\leq1/4$, we exhibit a non-convergence phenomenon similar to that for the non-diagonal entries of the (classical) Lévy area above the standard fractional Brownian motion.
</p>Wed, 12 Jun 2019 04:00 EDT

Construction of marginally coupled designs by subspace theory
https://projecteuclid.org/euclid.bj/1560326441
<strong>Yuanzhen He</strong>, <strong>C. Devon Lin</strong>, <strong>Fasheng Sun</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2163--2182.</p><p><strong>Abstract:</strong><br/>
Recent research on designs for computer experiments with both qualitative and quantitative factors has advocated the use of marginally coupled designs. This paper proposes a general method of constructing such designs for which the designs for qualitative factors are multi-level orthogonal arrays and the designs for quantitative factors are Latin hypercubes with desirable space-filling properties. Two cases are introduced for which we can obtain the guaranteed low-dimensional space-filling property for quantitative factors. Theoretical results on the proposed constructions are derived. For practical use, some constructed designs for three-level qualitative factors are tabulated.
</p>Wed, 12 Jun 2019 04:00 EDT

Consistency of Bayesian nonparametric inference for discretely observed jump diffusions
https://projecteuclid.org/euclid.bj/1560326442
<strong>Jere Koskela</strong>, <strong>Dario Spanò</strong>, <strong>Paul A. Jenkins</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2183--2205.</p><p><strong>Abstract:</strong><br/>
We introduce verifiable criteria for weak posterior consistency of Bayesian nonparametric inference for jump diffusions with unit diffusion coefficient and uniformly Lipschitz drift and jump coefficients in arbitrary dimension. The criteria are expressed in terms of coefficients of the SDEs describing the process, and do not depend on intractable quantities such as transition densities. We also show that priors built from discrete nets, wavelet expansions, and Dirichlet mixture models satisfy our conditions. This generalises known results by incorporating jumps into previous work on unit diffusions with uniformly Lipschitz drift coefficients.
</p>Wed, 12 Jun 2019 04:00 EDT

On the risk of convex-constrained least squares estimators under misspecification
https://projecteuclid.org/euclid.bj/1560326443
<strong>Billy Fang</strong>, <strong>Adityanand Guntuboyina</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2206--2244.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating the mean of a noisy vector. When the mean lies in a convex constraint set, the least squares projection of the random vector onto the set is a natural estimator. Properties of the risk of this estimator, such as its asymptotic behavior as the noise tends to zero, have been well studied. We instead study the behavior of this estimator under misspecification, that is, without the assumption that the mean lies in the constraint set. For appropriately defined notions of risk in the misspecified setting, we prove a generalization of a low noise characterization of the risk due to [ Found. Comput. Math. 16 (2016) 965–1029] in the case of a polyhedral constraint set. An interesting consequence of our results is that the risk can be much smaller in the misspecified setting than in the well-specified setting. We also discuss consequences of our result for isotonic regression.
</p>Wed, 12 Jun 2019 04:00 EDT

A central limit theorem for the realised covariation of a bivariate Brownian semistationary process
https://projecteuclid.org/euclid.bj/1560326444
<strong>Andrea Granelli</strong>, <strong>Almut E.D. Veraart</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2245--2278.</p><p><strong>Abstract:</strong><br/>
This article presents a weak law of large numbers and a central limit theorem for the scaled realised covariation of a bivariate Brownian semistationary process. The novelty of our results lies in the fact that we derive the suitable asymptotic theory both in a multivariate setting and outside the classical semimartingale framework. The proofs rely heavily on recent developments in Malliavin calculus.
</p>Wed, 12 Jun 2019 04:00 EDT

The first order correction to harmonic measure for random walks of rotationally invariant step distribution
https://projecteuclid.org/euclid.bj/1560326445
<strong>Longmin Wang</strong>, <strong>KaiNan Xiang</strong>, <strong>Lang Zou</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2279--2300.</p><p><strong>Abstract:</strong><br/>
Let $D\subset\mathbb{R}^{d}\ (d\geq2)$ be an open simply-connected bounded domain with smooth boundary $\partial D$ and $\mathbf{0}=(0,\ldots,0)\in D$. Fix any rotationally invariant probability $\mu$ on the closed unit ball $\{z\in\mathbb{R}^{d}:\vert z\vert\leq1\}$ with $\mu(\{\mathbf{0}\})<1$. Let $\{S_{n}^{\mu}\}_{n=0}^{\infty}$ be the random walk with step-distribution $\mu$ starting at $\mathbf{0}$. Denote by $\omega_{\delta}(\mathbf{0},\mathrm{d}z;D)$ the discrete harmonic measure for $\{\delta S_{n}^{\mu}\}_{n=0}^{\infty}\ (\delta>0)$ exiting from $D$, which is viewed as a probability on $\partial D$ by projecting suitably the first exiting point to $\partial D$. Denote by $\omega(\mathbf{0},\mathrm{d}z;D)$ the harmonic measure for the $d$-dimensional standard Brownian motion exiting from $D$. Then in the weak convergence topology, \begin{equation*}\lim_{\delta\rightarrow0}\frac{1}{\delta}\bigl[\omega_{\delta}(\mathbf{0},\mathrm{d}z;D)-\omega(\mathbf{0},\mathrm{d}z;D)\bigr]=c_{\mu}\rho_{D}(z)\,\vert \mathrm{d}z\vert ,\end{equation*} where $\rho_{D}(\cdot)$ is a smooth function depending on $D$ but not on $\mu$, $c_{\mu}$ is a constant depending only on $\mu$, and $\vert \mathrm{d}z\vert$ is the Lebesgue measure with respect to $\partial D$. Additionally, $\rho_{D}(z)$ is determined by the following equation: for any smooth function $g$ on $\partial D$, \begin{equation*}\int_{\partial D}g(z)\rho_{D}(z)\,\vert \mathrm{d}z\vert =\int_{\partial D}\frac{\partial f}{\partial\mathbf{n}_{z}}(z)H_{D}(\mathbf{0},z)\,\vert \mathrm{d}z\vert ,\end{equation*} where $f$ is the harmonic function in $D$ with boundary values given by $g$, $H_{D}(\mathbf{0},z)$ is the Poisson kernel, and the derivative $\frac{\partial f}{\partial\mathbf{n}_{z}}$ is with respect to the inward unit normal $\mathbf{n}_{z}$ at $z\in\partial D$.
</p>Wed, 12 Jun 2019 04:00 EDT

Gromov–Hausdorff–Prokhorov convergence of vertex cut-trees of $n$-leaf Galton–Watson trees
https://projecteuclid.org/euclid.bj/1560326446
<strong>Hui He</strong>, <strong>Matthias Winkel</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2301--2329.</p><p><strong>Abstract:</strong><br/>
In this paper, we study the vertex cut-trees of Galton–Watson trees conditioned to have $n$ leaves. This notion is a slight variation of Dieuleveut’s vertex cut-tree of Galton–Watson trees conditioned to have $n$ vertices. Our main result is a joint Gromov–Hausdorff–Prokhorov convergence in the finite variance case of the Galton–Watson tree and its vertex cut-tree to Bertoin and Miermont’s joint distribution of the Brownian CRT and its cut-tree. The methods also apply to the infinite variance case, but the problem of strengthening Dieuleveut’s and Bertoin and Miermont’s Gromov–Prokhorov convergence to Gromov–Hausdorff–Prokhorov remains open for their models conditioned to have $n$ vertices.
</p>Wed, 12 Jun 2019 04:00 EDT

Bayesian mode and maximum estimation and accelerated rates of contraction
https://projecteuclid.org/euclid.bj/1560326447
<strong>William Weimin Yoo</strong>, <strong>Subhashis Ghosal</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2330--2358.</p><p><strong>Abstract:</strong><br/>
We study the problem of estimating the mode and maximum of an unknown regression function in the presence of noise. We adopt the Bayesian approach by using tensor-product B-splines and endowing the coefficients with Gaussian priors. In the usual fixed-in-advance sampling plan, we establish posterior contraction rates for mode and maximum and show that they coincide with the minimax rates for this problem. To quantify estimation uncertainty, we construct credible sets for these two quantities that have high coverage probabilities with optimal sizes. If one is allowed to collect data sequentially, we further propose a Bayesian two-stage estimation procedure, where a second stage posterior is built based on samples collected within a credible set constructed from a first stage posterior. Under appropriate conditions on the radius of this credible set, we can accelerate optimal contraction rates from the fixed-in-advance setting to the minimax sequential rates. A simulation experiment shows that our Bayesian two-stage procedure outperforms the single-stage procedure and also slightly improves upon a non-Bayesian two-stage procedure.
</p>Wed, 12 Jun 2019 04:00 EDT

Bootstrapping INAR models
https://projecteuclid.org/euclid.bj/1560326448
<strong>Carsten Jentsch</strong>, <strong>Christian H. Weiß</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 3, 2359--2408.</p><p><strong>Abstract:</strong><br/>
Integer-valued autoregressive (INAR) models form a very useful class of processes to deal with time series of counts. Statistical inference in these models is commonly based on asymptotic theory, which is available only under additional parametric conditions and further restrictions on the model order. For general INAR models, such results are not available and might be cumbersome to derive. Hence, we investigate how the INAR model structure and, in particular, its similarity to classical autoregressive (AR) processes can be exploited to develop an asymptotically valid bootstrap procedure for INAR models. Although, in a common formulation, INAR models share the autocorrelation structure with AR models, it turns out that (a) consistent estimation of the INAR coefficients is not sufficient to compute proper ‘INAR residuals’ to formulate a valid model-based bootstrap scheme, and (b) a naïve application of an AR bootstrap will generally fail. Instead, we propose a general INAR-type bootstrap procedure and discuss parametric as well as semi-parametric implementations. The latter approach is based on a joint semi-parametric estimator of the INAR coefficients and the innovations’ distribution. Under mild regularity conditions, we prove bootstrap consistency of our procedure for statistics belonging to the class of functions of generalized means. In an extensive simulation study, we provide numerical evidence of our theoretical findings and illustrate the superiority of the proposed INAR bootstrap over some obvious competitors. We illustrate our method by an application to a real data set about iceberg orders for the Lufthansa stock.
</p>Wed, 12 Jun 2019 04:00 EDT

Scaling limit of random forests with prescribed degree sequences
https://projecteuclid.org/euclid.bj/1568362031
<strong>Tao Lei</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2409--2438.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider the random plane forest uniformly drawn from all possible plane forests with a given degree sequence. Under suitable conditions on the degree sequences, we consider the limit of a sequence of such forests, as the number of vertices tends to infinity, in terms of the Gromov–Hausdorff–Prokhorov topology. This work falls into the general framework of showing convergence of random combinatorial structures to certain Gromov–Hausdorff scaling limits, described in terms of the Brownian Continuum Random Tree (BCRT), pioneered by the work of Aldous (Ann. Probab. 19 (1991) 1–28; In Stochastic Analysis (Durham, 1990) (1991) 23–70 Cambridge Univ. Press; Ann. Probab. 21 (1993) 248–289). In fact, we identify the limiting random object as a sequence of random real trees encoded by excursions of some first passage bridges reflected at minimum. We establish such convergence by studying the associated Lukasiewicz walk of the degree sequences. In particular, our work is closely related to and uses the results from the recent work of Broutin and Marckert (Random Structures Algorithms 44 (2014) 290–316) on the scaling limit of random trees with prescribed degree sequences, and the work of Addario-Berry (Random Structures Algorithms 41 (2012) 253–261) on tail bounds of the height of a random tree with prescribed degree sequence.
</p>Fri, 13 Sep 2019 04:07 EDT

Stationary distributions and convergence for Walsh diffusions
https://projecteuclid.org/euclid.bj/1568362032
<strong>Tomoyuki Ichiba</strong>, <strong>Andrey Sarantsev</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2439--2478.</p><p><strong>Abstract:</strong><br/>
A Walsh diffusion on Euclidean space moves along each ray from the origin, as a solution to a stochastic differential equation with certain drift and diffusion coefficients, as long as it stays away from the origin. As it hits the origin, it instantaneously chooses a new direction according to a given probability law, called the spinning measure. A special example is a real-valued diffusion with skew reflections at the origin. This process depends continuously (in the weak sense) on the spinning measure. We determine a stationary measure for such a process, explore long-term convergence to this distribution and establish an explicit rate of exponential convergence.
</p>projecteuclid.org/euclid.bj/1568362032_20190913040756Fri, 13 Sep 2019 04:07 EDTThe eternal multiplicative coalescent encoding via excursions of Lévy-type processeshttps://projecteuclid.org/euclid.bj/1568362033<strong>Vlada Limic</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2479--2507.</p><p><strong>Abstract:</strong><br/>
The multiplicative coalescent is a mean-field Markov process in which any pair of blocks coalesces at rate proportional to the product of their masses. In Aldous and Limic (Electron. J. Probab. 3 (1998) Paper no. 3) each extreme eternal version of the multiplicative coalescent was described in three different ways, one of which matched its (marginal) law to that of the ordered excursion lengths above past minima of a certain Lévy-type process.
Using a modification of the breadth-first-walk construction from Aldous (Ann. Probab. 25 (1997) 812–854) and Aldous and Limic (Electron. J. Probab. 3 (1998) Paper no. 3), and some new insight from the thesis by Uribe Bravo (Markovian bridges, Brownian excursions, and stochastic fragmentation and coalescence (2007) UNAM), this work settles an open problem (3) from Aldous (Ann. Probab. 25 (1997) 812–854) in the more general context of Aldous and Limic (Electron. J. Probab. 3 (1998) Paper no. 3). Informally speaking, each eternal version is entirely encoded by its Lévy-type process, and contrary to Aldous’ original intuition, the time for the multiplicative coalescent does correspond to the linear increase in the constant part of the drift of the Lévy-type process. In the “standard multiplicative coalescent” context of Aldous (Ann. Probab. 25 (1997) 812–854), this result was first announced by Armendáriz in 2001, while its first published proof is due to Broutin and Marckert (Probab. Theory Related Fields 166 (2016) 515–552), who simultaneously account for the process of excess (or surplus) edge counts.
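As a rough illustration of the merging rule described above (any pair of blocks coalesces at rate proportional to the product of their masses), the following Python sketch simulates a finite-mass version with a Gillespie scheme. It does not reproduce the eternal versions or the Lévy-type encoding studied in the paper; the function name `multiplicative_coalescent`, the proportionality constant 1 and the finite list of initial masses are assumptions made purely for illustration.

```python
import random


def multiplicative_coalescent(masses, t_max, rng):
    """Simulate a finite multiplicative coalescent up to time t_max.

    Each unordered pair of blocks (i, j) merges at rate m_i * m_j
    (proportionality constant 1 is an illustrative assumption).
    Returns the block masses at time t_max, in decreasing order.
    """
    blocks = list(masses)
    t = 0.0
    while len(blocks) > 1:
        total = sum(blocks)
        squares = sum(m * m for m in blocks)
        # Sum of m_i * m_j over unordered pairs i < j.
        total_rate = (total * total - squares) / 2.0
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Choose a pair with probability proportional to m_i * m_j:
        # draw i and j independently, size-biased, and reject i == j.
        indices = range(len(blocks))
        while True:
            i = rng.choices(indices, weights=blocks)[0]
            j = rng.choices(indices, weights=blocks)[0]
            if i != j:
                break
        merged = blocks[i] + blocks[j]
        blocks = [m for k, m in enumerate(blocks) if k not in (i, j)]
        blocks.append(merged)
    return sorted(blocks, reverse=True)
```

Calling `multiplicative_coalescent([1.0, 2.0, 0.5, 1.5], 0.3, random.Random(1))` returns the ordered block masses at time 0.3; total mass is conserved by construction.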
</p>projecteuclid.org/euclid.bj/1568362033_20190913040756Fri, 13 Sep 2019 04:07 EDTSemiparametric estimation for isotropic max-stable space-time processeshttps://projecteuclid.org/euclid.bj/1568362034<strong>Sven Buhl</strong>, <strong>Richard A. Davis</strong>, <strong>Claudia Klüppelberg</strong>, <strong>Christina Steinkohl</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2508--2537.</p><p><strong>Abstract:</strong><br/>
Regularly varying space-time processes have proved useful to study extremal dependence in space-time data. We propose a semiparametric estimation procedure based on a closed form expression of the extremogram to estimate parametric models of extremal dependence functions. We establish the asymptotic properties of the resulting parameter estimates and propose subsampling procedures to obtain asymptotically correct confidence intervals. A simulation study shows that the proposed procedure works well for moderate sample sizes and is robust to small departures from the underlying model. Finally, we apply this estimation procedure to fitting a max-stable process to radar rainfall measurements in a region in Florida. Complementary results and some proofs of key results are presented together with the simulation study in the supplement [Buhl et al. (2018)].
</p>projecteuclid.org/euclid.bj/1568362034_20190913040756Fri, 13 Sep 2019 04:07 EDTLarge ball probabilities, Gaussian comparison and anti-concentrationhttps://projecteuclid.org/euclid.bj/1568362035<strong>Friedrich Götze</strong>, <strong>Alexey Naumov</strong>, <strong>Vladimir Spokoiny</strong>, <strong>Vladimir Ulyanov</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2538--2563.</p><p><strong>Abstract:</strong><br/>
We derive tight non-asymptotic bounds for the Kolmogorov distance between the probabilities of two Gaussian elements to hit a ball in a Hilbert space. The key property of these bounds is that they are dimension-free and depend on the nuclear (Schatten-one) norm of the difference between the covariance operators of the elements and on the norm of the mean shift. The obtained bounds significantly improve the bound based on Pinsker’s inequality via the Kullback–Leibler divergence. We also establish an anti-concentration bound for a squared norm of a non-centered Gaussian element in Hilbert space. The paper presents a number of examples motivating our results and applications of the obtained bounds to statistical inference and to high-dimensional CLT.
</p>projecteuclid.org/euclid.bj/1568362035_20190913040756Fri, 13 Sep 2019 04:07 EDTLimit theorems with rate of convergence under sublinear expectationshttps://projecteuclid.org/euclid.bj/1568362036<strong>Xiao Fang</strong>, <strong>Shige Peng</strong>, <strong>Qi-Man Shao</strong>, <strong>Yongsheng Song</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2564--2596.</p><p><strong>Abstract:</strong><br/>
Under the sublinear expectation $\mathbb{E}[\cdot]:=\mathop{\mathrm{sup}}_{\theta\in\Theta}E_{\theta}[\cdot]$ for a given set of linear expectations $\{E_{\theta}:\theta\in\Theta\}$, we establish a new law of large numbers and a new central limit theorem with rate of convergence. We present some interesting special cases and discuss a related statistical inference problem. We also give an approximation and a representation of the $G$-normal distribution, which was used as the limit in Peng’s (Law of large numbers and central limit theorem under nonlinear expectations (2007) Preprint) central limit theorem, in a probability space.
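To make the definition $\mathbb{E}[\cdot]=\sup_{\theta\in\Theta}E_{\theta}[\cdot]$ concrete, here is a minimal Python sketch on a finite sample space with a finite family of laws. The three probability vectors and the name `sublinear_expectation` are hypothetical and chosen only for illustration; nothing here reflects the limit theorems of the paper.

```python
def sublinear_expectation(x, laws):
    """Upper expectation sup_theta E_theta[x] over a finite family of laws.

    x    : values X(omega) on a finite sample space
    laws : list of probability vectors, one per theta (hypothetical family)
    """
    return max(sum(p * v for p, v in zip(law, x)) for law in laws)


# Hypothetical family of three laws on a three-point sample space.
laws = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.1, 0.1, 0.8]]
x = [1.0, -2.0, 0.5]
y = [-1.0, 3.0, 0.0]
x_plus_y = [a + b for a, b in zip(x, y)]

upper = sublinear_expectation(x, laws)                # sup_theta E_theta[X]
lower = -sublinear_expectation([-v for v in x], laws)  # inf_theta E_theta[X]
```

The functional is sub-additive, so `sublinear_expectation(x_plus_y, laws)` never exceeds the sum of the two individual upper expectations, and `lower <= upper` with equality only when all laws agree on the mean of `x`.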
</p>projecteuclid.org/euclid.bj/1568362036_20190913040756Fri, 13 Sep 2019 04:07 EDTFunctional estimation and hypothesis testing in nonparametric boundary modelshttps://projecteuclid.org/euclid.bj/1568362037<strong>Markus Reiß</strong>, <strong>Martin Wahl</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2597--2619.</p><p><strong>Abstract:</strong><br/>
Consider a Poisson point process with unknown support boundary curve $g$, which forms a prototype of an irregular statistical model. We address the problem of estimating non-linear functionals of the form $\int\Phi(g(x))\,dx$. Following a nonparametric maximum-likelihood approach, we construct an estimator which is UMVU over Hölder balls and achieves the (local) minimax rate of convergence. These results hold under weak assumptions on $\Phi$ which are satisfied for $\Phi(u)=|u|^{p}$, $p\ge1$. As an application, we consider the problem of estimating the $L^{p}$-norm and derive the minimax separation rates in the corresponding nonparametric hypothesis testing problem. Structural differences to results for regular nonparametric models are discussed.
</p>projecteuclid.org/euclid.bj/1568362037_20190913040756Fri, 13 Sep 2019 04:07 EDTSharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distancehttps://projecteuclid.org/euclid.bj/1568362038<strong>Jonathan Weed</strong>, <strong>Francis Bach</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2620--2648.</p><p><strong>Abstract:</strong><br/>
The Wasserstein distance between two probability measures on a metric space is a measure of closeness with applications in statistics, probability, and machine learning. In this work, we consider the fundamental question of how quickly the empirical measure obtained from $n$ independent samples from $\mu$ approaches $\mu$ in the Wasserstein distance of any order. We prove sharp asymptotic and finite-sample results for this rate of convergence for general measures on general compact metric spaces. Our finite-sample results show the existence of multi-scale behavior, where measures can exhibit radically different rates of convergence as $n$ grows.
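For intuition only, the Wasserstein distance of order $p$ between two empirical measures on the real line with the same number of atoms can be computed by matching order statistics. The Python sketch below uses that one-dimensional special case; the paper treats general compact metric spaces, which this does not cover, and the name `wasserstein_1d` is an assumption.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """Order-p Wasserstein distance between two empirical measures on R.

    Both samples must have the same size; in one dimension the optimal
    coupling pairs the sorted samples (monotone rearrangement).
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    xs, ys = sorted(xs), sorted(ys)
    cost = sum(abs(a - b) ** p for a, b in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)
```

As a crude proxy for the rate question, one can draw two independent samples of size $n$ from the same law and watch `wasserstein_1d` shrink as $n$ grows.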
</p>projecteuclid.org/euclid.bj/1568362038_20190913040756Fri, 13 Sep 2019 04:07 EDTUniform sampling in a structured branching populationhttps://projecteuclid.org/euclid.bj/1568362039<strong>Aline Marguet</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2649--2695.</p><p><strong>Abstract:</strong><br/>
We are interested in the dynamic of a structured branching population where the trait of each individual moves according to a Markov process. The rate of division of each individual is a function of its trait and when a branching event occurs, the trait of the descendants at birth depends on the trait of the mother and on the number of descendants. In this article, we explicitly describe the penalized Markov process, named auxiliary process, corresponding to the dynamic of the trait of a “typical” individual by giving its associated infinitesimal generator. We prove a Many-to-One formula and a Many-to-One formula for forks. Furthermore, we prove that this auxiliary process characterizes exactly the process of the trait of a uniformly sampled individual in a large population approximation. We detail three examples of growth-fragmentation models: the linear growth model, the exponential growth model and the parasite infection model.
</p>projecteuclid.org/euclid.bj/1568362039_20190913040756Fri, 13 Sep 2019 04:07 EDTNonparametric Bayesian posterior contraction rates for scalar diffusions with high-frequency datahttps://projecteuclid.org/euclid.bj/1568362040<strong>Kweku Abraham</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2696--2728.</p><p><strong>Abstract:</strong><br/>
We consider inference in the scalar diffusion model $\,\mathrm{d}X_{t}=b(X_{t})\,\mathrm{d}t+\sigma(X_{t})\,\mathrm{d}W_{t}$ with discrete data $(X_{j\Delta_{n}})_{0\leq j\leq n}$, $n\to\infty$, $\Delta_{n}\to0$ and periodic coefficients. For $\sigma$ given, we prove a general theorem detailing conditions under which Bayesian posteriors will contract in $L^{2}$-distance around the true drift function $b_{0}$ at the frequentist minimax rate (up to logarithmic factors) over Besov smoothness classes. We exhibit natural nonparametric priors which satisfy our conditions. Our results show that the Bayesian method adapts both to an unknown sampling regime and to unknown smoothness.
</p>projecteuclid.org/euclid.bj/1568362040_20190913040756Fri, 13 Sep 2019 04:07 EDTA Benamou–Brenier formulation of martingale optimal transporthttps://projecteuclid.org/euclid.bj/1568362041<strong>Martin Huesmann</strong>, <strong>Dario Trevisan</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2729--2757.</p><p><strong>Abstract:</strong><br/>
We introduce a Benamou–Brenier formulation for the continuous-time martingale optimal transport problem as a weak length relaxation of its discrete-time counterpart. By the correspondence between classical martingale problems and Fokker–Planck equations, we obtain an equivalent PDE formulation for which basic properties such as existence, duality and geodesic equations can be analytically studied, yielding corresponding results for the stochastic formulation. In the one dimensional case, sufficient conditions for finiteness of the cost are also given and a link between geodesics and porous medium equations is partially investigated.
</p>projecteuclid.org/euclid.bj/1568362041_20190913040756Fri, 13 Sep 2019 04:07 EDTDickman approximation in simulation, summations and perpetuitieshttps://projecteuclid.org/euclid.bj/1568362042<strong>Chinmoy Bhattacharjee</strong>, <strong>Larry Goldstein</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2758--2792.</p><p><strong>Abstract:</strong><br/>
The generalized Dickman distribution $\mathcal{D}_{\theta}$ with parameter $\theta>0$ is the unique solution to the distributional equality $W=_{d}W^{*}$, where \begin{align}W^{*}=_{d}U^{1/\theta}(W+1),\tag{1}\end{align} with $W$ non-negative with probability one, $U\sim\mathcal{U}[0,1]$ independent of $W$, and $=_{d}$ denoting equality in distribution. These distributions appear in number theory, stochastic geometry, perpetuities and the study of algorithms. We obtain bounds in Wasserstein type distances between $\mathcal{D}_{\theta}$ and the distribution of \begin{eqnarray*}W_{n}=\frac{1}{n}\sum_{k=1}^{n}Y_{k}B_{k},\end{eqnarray*} where $B_{1},\ldots,B_{n},Y_{1},\ldots,Y_{n}$ are independent with $B_{k}$ distributed $\operatorname{Ber}(1/k)$ or $\mathcal{P}(\theta/k)$, $E[Y_{k}]=k$ and $\operatorname{Var}(Y_{k})=\sigma_{k}^{2}$, and provide an application to the minimal directed spanning tree in $\mathbb{R}^{2}$. We also provide bounds with optimal rates for the Dickman convergence of weighted sums, arising in probabilistic number theory, of the form \begin{eqnarray*}S_{n}=\frac{1}{\log(p_{n})}\sum_{k=1}^{n}X_{k}\log(p_{k}),\end{eqnarray*} where $(p_{k})_{k\ge1}$ is an enumeration of the prime numbers in increasing order and $X_{k}$ is geometric with parameter $(1-1/p_{k})$, Bernoulli with success probability $1/(1+p_{k})$ or Poisson with mean $\lambda_{k}$.
Lastly, we broaden the class of generalized Dickman distributions by studying the fixed points of the transformation \begin{eqnarray*}s(W^{*})=_{d}U^{1/\theta}s(W+1)\end{eqnarray*} generalizing (1), that allows the use of non-identity utility functions $s(\cdot)$ in Vervaat perpetuities. We obtain distributional bounds for recursive methods that can be used to simulate from this family.
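The fixed-point characterization $W=_{d}U^{1/\theta}(W+1)$ suggests a simple recursive approximation: start from $W_{0}=0$ and iterate $W_{m+1}=U_{m+1}^{1/\theta}(W_{m}+1)$ with fresh independent uniforms. The Python sketch below implements this naive iteration as an approximate sampler; it is not claimed to be the recursive scheme analysed in the paper, and the function name, the starting point and the number of iterations are assumptions.

```python
import random


def dickman_sample(theta, n_iter, rng):
    """Approximate draw from the generalized Dickman law D_theta.

    Iterates W <- U**(1/theta) * (W + 1) from W = 0 with independent
    U ~ Uniform[0, 1); n_iter controls the truncation (assumed choice).
    """
    w = 0.0
    for _ in range(n_iter):
        u = rng.random()
        w = u ** (1.0 / theta) * (w + 1.0)
    return w


rng = random.Random(0)
draws = [dickman_sample(1.0, 60, rng) for _ in range(2000)]
sample_mean = sum(draws) / len(draws)
```

A quick sanity check: taking expectations in the fixed-point equation gives $E[W]=\frac{\theta}{\theta+1}(E[W]+1)$, hence $E[W]=\theta$, so `sample_mean` should be close to the chosen `theta`.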
</p>projecteuclid.org/euclid.bj/1568362042_20190913040756Fri, 13 Sep 2019 04:07 EDTSelf-normalized Cramér type moderate deviations for martingaleshttps://projecteuclid.org/euclid.bj/1568362043<strong>Xiequan Fan</strong>, <strong>Ion Grama</strong>, <strong>Quansheng Liu</strong>, <strong>Qi-Man Shao</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2793--2823.</p><p><strong>Abstract:</strong><br/>
Let $(X_{i},\mathcal{F}_{i})_{i\geq 1}$ be a sequence of martingale differences. Set $S_{n}=\sum_{i=1}^{n}X_{i}$ and $[S]_{n}=\sum_{i=1}^{n}X_{i}^{2}$. We prove a Cramér type moderate deviation expansion for $\mathbf{P}(S_{n}/\sqrt{[S]_{n}}\geq x)$ as $n\to+\infty$. Our results partly extend the earlier work of Jing, Shao and Wang (Ann. Probab. 31 (2003) 2167–2215) for independent random variables.
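To fix the notation, the Python sketch below computes the self-normalized statistic $S_{n}/\sqrt{[S]_{n}}$ for a toy martingale difference sequence. The particular sequence (a Rademacher sign times a scale that depends only on the previous step) and both function names are illustrative assumptions, not taken from the paper.

```python
import math
import random


def self_normalized_sum(xs):
    """Return S_n / sqrt([S]_n) with S_n = sum x_i and [S]_n = sum x_i**2."""
    s = sum(xs)
    quad = sum(x * x for x in xs)
    return s / math.sqrt(quad)


def toy_martingale_differences(n, rng):
    """X_i = scale_i * eps_i, where eps_i is a fair random sign and
    scale_i depends only on the past, so E[X_i | past] = 0."""
    xs = []
    scale = 1.0
    for _ in range(n):
        x = scale * rng.choice((-1.0, 1.0))
        xs.append(x)
        # Next scale is determined by the current (already observed) step.
        scale = 1.5 if x > 0 else 1.0
    return xs


xs = toy_martingale_differences(500, random.Random(0))
t_stat = self_normalized_sum(xs)
```

By the Cauchy–Schwarz inequality the statistic always satisfies $|S_{n}|/\sqrt{[S]_{n}}\le\sqrt{n}$, which is one reason self-normalization needs no moment assumptions to be well defined.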
</p>projecteuclid.org/euclid.bj/1568362043_20190913040756Fri, 13 Sep 2019 04:07 EDTA multivariate Berry–Esseen theorem with explicit constantshttps://projecteuclid.org/euclid.bj/1568362044<strong>Martin Raič</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2824--2853.</p><p><strong>Abstract:</strong><br/>
We provide a Lyapunov type bound in the multivariate central limit theorem for sums of independent, but not necessarily identically distributed random vectors. The error in the normal approximation is estimated for certain classes of sets, which include the class of measurable convex sets. The error bound is stated with explicit constants. The result is proved by means of Stein’s method. In addition, we improve the constant in the bound of the Gaussian perimeter of convex sets.
</p>projecteuclid.org/euclid.bj/1568362044_20190913040756Fri, 13 Sep 2019 04:07 EDTHigh-dimensional Bayesian inference via the unadjusted Langevin algorithmhttps://projecteuclid.org/euclid.bj/1568362045<strong>Alain Durmus</strong>, <strong>Éric Moulines</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2854--2882.</p><p><strong>Abstract:</strong><br/>
We consider in this paper the problem of sampling a high-dimensional probability distribution $\pi$ having a density w.r.t. the Lebesgue measure on $\mathbb{R}^{d}$, known up to a normalization constant $x\mapsto\pi(x)=\mathrm{e}^{-U(x)}/\int_{\mathbb{R}^{d}}\mathrm{e}^{-U(y)}\,\mathrm{d}y$. Such a problem naturally occurs, for example, in Bayesian inference and machine learning. Under the assumption that $U$ is continuously differentiable, $\nabla U$ is globally Lipschitz and $U$ is strongly convex, we obtain non-asymptotic bounds for the convergence to stationarity in Wasserstein distance of order $2$ and total variation distance of the sampling method based on the Euler discretization of the Langevin stochastic differential equation, for both constant and decreasing step sizes. The dependence on the dimension of the state space of these bounds is explicit. The convergence of an appropriately weighted empirical measure is also investigated and bounds for the mean square error and exponential deviation inequality are reported for functions which are measurable and bounded. An illustration to Bayesian inference for binary regression is presented to support our claims.
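The Euler discretization of the Langevin equation mentioned above is the recursion $X_{k+1}=X_{k}-\gamma\nabla U(X_{k})+\sqrt{2\gamma}\,Z_{k+1}$ with i.i.d. standard Gaussian $Z_{k}$. A minimal constant-step Python sketch follows; the standard Gaussian target $U(x)=|x|^{2}/2$, the step size and the function names are assumptions chosen so that the hypotheses (strong convexity, Lipschitz gradient) hold trivially.

```python
import math
import random


def ula_chain(grad_u, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm with constant step size.

    Implements x <- x - step * grad_u(x) + sqrt(2 * step) * z,
    where z has i.i.d. standard Gaussian coordinates.
    """
    x = list(x0)
    chain = []
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        g = grad_u(x)
        x = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        chain.append(x)
    return chain


def grad_u_gaussian(x):
    """Gradient of U(x) = |x|^2 / 2 (illustrative target: standard Gaussian)."""
    return list(x)
```

For this target the chain is a Gaussian autoregression whose stationary variance per coordinate is $1/(1-\gamma/2)$ rather than $1$, a small worked instance of the discretization bias that non-asymptotic bounds of this kind quantify.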
</p>projecteuclid.org/euclid.bj/1568362045_20190913040756Fri, 13 Sep 2019 04:07 EDTA supermartingale approach to Gaussian process based sequential design of experimentshttps://projecteuclid.org/euclid.bj/1568362046<strong>Julien Bect</strong>, <strong>François Bachoc</strong>, <strong>David Ginsbourger</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2883--2919.</p><p><strong>Abstract:</strong><br/>
Gaussian process (GP) models have become a well-established framework for the adaptive design of costly experiments, and notably of computer experiments. GP-based sequential designs have been found practically efficient for various objectives, such as global optimization (estimating the global maximum or maximizer(s) of a function), reliability analysis (estimating a probability of failure) or the estimation of level sets and excursion sets. In this paper, we study the consistency of an important class of sequential designs, known as stepwise uncertainty reduction (SUR) strategies. Our approach relies on the key observation that the sequence of residual uncertainty measures, in SUR strategies, is generally a supermartingale with respect to the filtration generated by the observations. This observation enables us to establish generic consistency results for a broad class of SUR strategies. The consistency of several popular sequential design strategies is then obtained by means of this general result. Notably, we establish the consistency of two SUR strategies proposed by Bect, Ginsbourger, Li, Picheny and Vazquez (Stat. Comput. 22 (2012) 773–793) – to the best of our knowledge, these are the first proofs of consistency for GP-based sequential design algorithms dedicated to the estimation of excursion sets and their measure. We also establish a new, more general proof of consistency for the expected improvement algorithm for global optimization which, unlike previous results in the literature, applies to any GP with continuous sample paths.
</p>projecteuclid.org/euclid.bj/1568362046_20190913040756Fri, 13 Sep 2019 04:07 EDTOn rate of convergence in non-central limit theoremshttps://projecteuclid.org/euclid.bj/1568362047<strong>Vo Anh</strong>, <strong>Nikolai Leonenko</strong>, <strong>Andriy Olenko</strong>, <strong>Volodymyr Vaskovych</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2920--2948.</p><p><strong>Abstract:</strong><br/>
The main result of this paper is the rate of convergence to Hermite-type distributions in non-central limit theorems. To the best of our knowledge, this is the first result in the literature on rates of convergence of functionals of random fields to Hermite-type distributions with ranks greater than 2. The results were obtained under rather general assumptions on the spectral densities of random fields. These assumptions are even weaker than in the known convergence results for the case of Rosenblatt distributions. Additionally, Lévy concentration functions for Hermite-type distributions were investigated.
</p>projecteuclid.org/euclid.bj/1568362047_20190913040756Fri, 13 Sep 2019 04:07 EDTOn logarithmically optimal exact simulation of max-stable and related random fields on a compact sethttps://projecteuclid.org/euclid.bj/1568362048<strong>Zhipeng Liu</strong>, <strong>Jose H. Blanchet</strong>, <strong>A.B. Dieker</strong>, <strong>Thomas Mikosch</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2949--2981.</p><p><strong>Abstract:</strong><br/>
We consider the random field \begin{equation*}M(t)=\mathop{\mathrm{sup}}_{n\geq1}\{-\log A_{n}+X_{n}(t)\},\qquad t\in T,\end{equation*} for a set $T\subset\mathbb{R}^{m}$, where $(X_{n})$ is an i.i.d. sequence of centered Gaussian random fields on $T$ and $0<A_{1}<A_{2}<\cdots$ are the arrivals of a general renewal process on $(0,\infty)$, independent of $(X_{n})$. In particular, a large class of max-stable random fields with Gumbel marginals have such a representation. Assume that one needs $c(d)=c(\{t_{1},\ldots,t_{d}\})$ function evaluations to sample $X_{n}$ at $d$ locations $t_{1},\ldots,t_{d}\in T$. We provide an algorithm which samples $M(t_{1}),\ldots,M(t_{d})$ with complexity $O(c(d)^{1+o(1)})$ as measured in the $L_{p}$ norm sense for any $p\ge1$. Moreover, if $X_{n}$ has an a.s. converging series representation, then $M$ can be a.s. approximated with error $\delta$ uniformly over $T$ and with complexity $O(1/(\delta\log(1/\delta))^{1/\alpha})$, where $\alpha$ relates to the Hölder continuity exponent of the process $X_{n}$ (so, if $X_{n}$ is Brownian motion, $\alpha=1/2$).
</p>projecteuclid.org/euclid.bj/1568362048_20190913040756Fri, 13 Sep 2019 04:07 EDTUniform rates of the Glivenko–Cantelli convergence and their use in approximating Bayesian inferenceshttps://projecteuclid.org/euclid.bj/1568362049<strong>Emanuele Dolera</strong>, <strong>Eugenio Regazzini</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 2982--3015.</p><p><strong>Abstract:</strong><br/>
This paper deals with suitable quantifications in approximating a probability measure by an “empirical” random probability measure $\hat{\mathfrak{p}}_{n}$, depending on the first $n$ terms of a sequence $\{\tilde{\xi}_{i}\}_{i\geq1}$ of random elements. Section 2 studies the range of oscillation near zero of the Wasserstein distance $\mathrm{d}^{(p)}_{[\mathbb{S}]}$ between $\mathfrak{p}_{0}$ and $\hat{\mathfrak{p}}_{n}$, assuming the $\tilde{\xi}_{i}$’s i.i.d. from $\mathfrak{p}_{0}$. In Theorem 2.1 $\mathfrak{p}_{0}$ can be fixed in the space of all probability measures on $(\mathbb{R}^{d},\mathscr{B}(\mathbb{R}^{d}))$ and $\hat{\mathfrak{p}}_{n}$ coincides with the empirical measure $\tilde{\mathfrak{e}}_{n}:=\frac{1}{n}\sum_{i=1}^{n}\delta_{\tilde{\xi}_{i}}$. In Theorem 2.2 (Theorem 2.3, respectively), $\mathfrak{p}_{0}$ is a $d$-dimensional Gaussian distribution (an element of a distinguished statistical exponential family, respectively) and $\hat{\mathfrak{p}}_{n}$ is another $d$-dimensional Gaussian distribution with estimated mean and covariance matrix (another element of the same family with an estimated parameter, respectively). These new results improve on allied recent works by providing also uniform bounds with respect to $n$, meaning the finiteness of the $p$-moment of $\mathop{\mathrm{sup}}_{n\geq1}b_{n}\mathrm{d}^{(p)}_{[\mathbb{S}]}(\mathfrak{p}_{0},\hat{\mathfrak{p}}_{n})$ is proved for some diverging sequence $b_{n}$ of positive numbers. In Section 3, assuming the $\tilde{\xi}_{i}$’s exchangeable, one studies the range of oscillation near zero of the Wasserstein distance between the conditional distribution – also called posterior – of the directing measure of the sequence, given $\tilde{\xi}_{1},\ldots,\tilde{\xi}_{n}$, and the point mass at $\hat{\mathfrak{p}}_{n}$. Similarly, a bound for the approximation of predictive distributions is given. 
Finally, Theorems 3.3 to 3.5 reconsider Theorems 2.1 to 2.3, respectively, from a Bayesian perspective.
</p>projecteuclid.org/euclid.bj/1568362049_20190913040756Fri, 13 Sep 2019 04:07 EDTLocalized Gaussian width of $M$-convex hulls with applications to Lasso and convex aggregationhttps://projecteuclid.org/euclid.bj/1568362050<strong>Pierre C. Bellec</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3016--3040.</p><p><strong>Abstract:</strong><br/>
Upper and lower bounds are derived for the Gaussian mean width of a convex hull of $M$ points intersected with a Euclidean ball of a given radius. The upper bound holds for any collection of extreme points bounded in Euclidean norm. The upper bound and the lower bound match up to a multiplicative constant whenever the extreme points satisfy a one sided Restricted Isometry Property.
An appealing aspect of the upper bound is that no assumption on the covariance structure of the extreme points is needed. This aspect is especially useful to study regression problems with anisotropic design distributions. We provide applications of this bound to the Lasso estimator in fixed-design regression, the Empirical Risk Minimizer in the anisotropic persistence problem, and the convex aggregation problem in density estimation.
</p>projecteuclid.org/euclid.bj/1568362050_20190913040756Fri, 13 Sep 2019 04:07 EDTSignal detection via Phi-divergences for general mixtureshttps://projecteuclid.org/euclid.bj/1568362051<strong>Marc Ditzhaus</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3041--3068.</p><p><strong>Abstract:</strong><br/>
The family of goodness-of-fit tests based on $\Phi$-divergences is known to be optimal for detecting signals hidden in high-dimensional noise data when the heterogeneous normal mixture model is underlying. This test family includes Tukey’s popular higher criticism test and the famous Berk–Jones test. In this paper we address the open question whether the tests’ optimality is still present beyond the prime normal mixture model. On the one hand, we transfer the known optimality of the higher criticism test for different models, for example, for the heteroscedastic normal, general Gaussian and exponential-$\chi^{2}$-mixture models, to the whole test family. On the other hand, we discuss the optimality for new model classes based on exponential families including the scale exponential, the scale Fréchet and the location Gumbel models. For all these examples we apply a general machinery which might be used to show the tests’ optimality for further models/model classes in future.
</p>projecteuclid.org/euclid.bj/1568362051_20190913040756Fri, 13 Sep 2019 04:07 EDTSecond order Lyapunov exponents for parabolic and hyperbolic Anderson modelshttps://projecteuclid.org/euclid.bj/1568362052<strong>Raluca M. Balan</strong>, <strong>Jian Song</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3069--3089.</p><p><strong>Abstract:</strong><br/>
In this article, we consider the hyperbolic and parabolic Anderson models in arbitrary space dimension $d$, with constant initial condition, driven by a Gaussian noise which is white in time. We consider two spatial covariance structures: (i) the Fourier transform of the spectral measure of the noise is a non-negative locally-integrable function; (ii) $d=1$ and the noise is a fractional Brownian motion in space with index $1/4<H<1/2$. In both cases, we show that there is a striking similarity between the Laplace transforms of the second moment of the solutions to these two models. Building on this connection and the recent powerful results of [Ann. Inst. Henri Poincaré Probab. Stat. 53 (2017) 1305–1340] for the parabolic model, we compute the second order (upper) Lyapunov exponent for the hyperbolic model. In case (i), when the spatial covariance of the noise is given by the Riesz kernel, we present a unified method for calculating the second order Lyapunov exponents for the two models.
</p>projecteuclid.org/euclid.bj/1568362052_20190913040757Fri, 13 Sep 2019 04:07 EDT$\Phi$-entropy inequalities and asymmetric covariance estimates for convex measureshttps://projecteuclid.org/euclid.bj/1568362053<strong>Van Hoang Nguyen</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3090--3108.</p><p><strong>Abstract:</strong><br/>
In this paper, we use the semi-group method and an adaptation of the $L^{2}$-method of Hörmander to establish some $\Phi$-entropy inequalities and asymmetric covariance estimates for the strictly convex measures in $\mathbb{R}^{n}$. These inequalities extend the ones for the strictly log-concave measures to the more general setting of convex measures. The $\Phi$-entropy inequalities turn out to be sharp in the special case of Cauchy measures. Finally, we show that similar inequalities for log-concave measures can be obtained from our results in the limiting case.
</p>projecteuclid.org/euclid.bj/1568362053_20190913040757Fri, 13 Sep 2019 04:07 EDTOn the geometric ergodicity of Hamiltonian Monte Carlohttps://projecteuclid.org/euclid.bj/1568362054<strong>Samuel Livingstone</strong>, <strong>Michael Betancourt</strong>, <strong>Simon Byrne</strong>, <strong>Mark Girolami</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3109--3138.</p><p><strong>Abstract:</strong><br/>
We establish general conditions under which Markov chains produced by the Hamiltonian Monte Carlo method will and will not be geometrically ergodic. We consider implementations with both position-independent and position-dependent integration times. In the former case, we find that the conditions for geometric ergodicity are essentially a gradient of the log-density which asymptotically points towards the centre of the space and grows no faster than linearly. In an idealised scenario in which the integration time is allowed to change in different regions of the space, we show that geometric ergodicity can be recovered for a much broader class of tail behaviours, leading to some guidelines for the choice of this free parameter in practice.
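For readers unfamiliar with the method, here is a minimal one-dimensional Python sketch of Hamiltonian Monte Carlo with a position-independent integration time (fixed step size and number of leapfrog steps). The standard Gaussian target, whose log-density gradient points towards the centre and grows linearly, as well as the tuning values and function names, are assumptions for illustration only.

```python
import math
import random


def leapfrog(x, p, grad_u, eps, n_leap):
    """Leapfrog integrator for H(x, p) = U(x) + p**2 / 2 (requires n_leap >= 1)."""
    p = p - 0.5 * eps * grad_u(x)          # initial half step in momentum
    for i in range(n_leap):
        x = x + eps * p                    # full step in position
        if i < n_leap - 1:
            p = p - eps * grad_u(x)        # full step in momentum
    p = p - 0.5 * eps * grad_u(x)          # final half step in momentum
    return x, p


def hmc_chain(u, grad_u, x0, eps, n_leap, n_samples, rng):
    """HMC with Gaussian momentum refreshment and Metropolis correction."""
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)
        h_old = u(x) + 0.5 * p * p
        x_new, p_new = leapfrog(x, p, grad_u, eps, n_leap)
        h_new = u(x_new) + 0.5 * p_new * p_new
        # Accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


def u_gauss(x):
    """Potential of the standard Gaussian target (illustrative choice)."""
    return 0.5 * x * x


def grad_u_gauss(x):
    return x
```

A call such as `hmc_chain(u_gauss, grad_u_gauss, 5.0, 0.2, 10, 5000, random.Random(0))` returns the sampled positions; a position-dependent integration time would correspond to letting `n_leap` or `eps` vary with `x`, which this sketch does not attempt.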
</p>projecteuclid.org/euclid.bj/1568362054_20190913040757Fri, 13 Sep 2019 04:07 EDTGaussian fluctuations for high-dimensional random projections of $\ell_{p}^{n}$-ballshttps://projecteuclid.org/euclid.bj/1568362055<strong>David Alonso-Gutiérrez</strong>, <strong>Joscha Prochno</strong>, <strong>Christoph Thäle</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3139--3174.</p><p><strong>Abstract:</strong><br/>
In this paper, we study high-dimensional random projections of $\ell_{p}^{n}$-balls. More precisely, for any $n\in\mathbb{N}$ let $E_{n}$ be a random subspace of dimension $k_{n}\in\{1,\ldots,n\}$ and $X_{n}$ be a random point in the unit ball of $\ell_{p}^{n}$. Our work provides a description of the Gaussian fluctuations of the Euclidean norm $\|P_{E_{n}}X_{n}\|_{2}$ of random orthogonal projections of $X_{n}$ onto $E_{n}$. In particular, under the condition that $k_{n}\to\infty$ it is shown that these random variables satisfy a central limit theorem, as the space dimension $n$ tends to infinity. Moreover, if $k_{n}\to\infty$ fast enough, we provide a Berry–Esseen bound on the rate of convergence in the central limit theorem. At the end, we provide a discussion of the large deviations counterpart to our central limit theorem.
</p>projecteuclid.org/euclid.bj/1568362055_20190913040757Fri, 13 Sep 2019 04:07 EDTOn Lasso refitting strategieshttps://projecteuclid.org/euclid.bj/1568362056<strong>Evgenii Chzhen</strong>, <strong>Mohamed Hebiri</strong>, <strong>Joseph Salmon</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3175--3200.</p><p><strong>Abstract:</strong><br/>
A well-known drawback of $\ell_{1}$-penalized estimators is the systematic shrinkage of the large coefficients towards zero. A simple remedy is to treat Lasso as a model-selection procedure and to perform a second refitting step on the selected support. In this work, we formalize the notion of refitting and provide oracle bounds for arbitrary refitting procedures of the Lasso solution. One of the most widely used refitting techniques, which is based on Least-Squares, may raise a problem of interpretability, since the signs of the refitted estimator might be flipped with respect to the original estimator. This problem arises from the fact that the Least-Squares refitting considers only the support of the Lasso solution, ignoring any information about signs or amplitudes. To this end, we define a sign consistent refitting as an arbitrary refitting procedure preserving the signs of the first step Lasso solution, and provide oracle inequalities for such estimators. Finally, we consider special refitting strategies: Bregman Lasso and Boosted Lasso. Bregman Lasso has the useful property of converging to the Sign-Least-Squares refitting (Least-Squares with sign constraints), which provides greater interpretability. We additionally study the Bregman Lasso refitting in the case of orthogonal design, providing simple intuition behind the proposed method. Boosted Lasso, in contrast, considers information about magnitudes of the first Lasso step and allows us to develop better oracle rates for prediction. Finally, we conduct an extensive numerical study to show advantages of one approach over others in different synthetic and semi-real scenarios.
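The shrinkage and its removal by refitting are easiest to see under an orthonormal design. Assuming $X^{\top}X=I$ and the objective $\frac{1}{2}\|y-X\beta\|_{2}^{2}+\lambda\|\beta\|_{1}$ (this scaling is an assumption), the Lasso solution is the soft-thresholding of $z=X^{\top}y$, and Least-Squares refitting on the selected support simply restores $z_{j}$ on that support. The Python sketch below shows only this special case; the function names are hypothetical, and it does not implement Bregman Lasso or Boosted Lasso.

```python
def soft_threshold(z, lam):
    """Lasso solution under an orthonormal design: shrink each z_j by lam."""
    out = []
    for v in z:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


def least_squares_refit(z, lasso_coef):
    """Least-Squares refit on the Lasso support (orthonormal design only):
    keep the unshrunk z_j where the Lasso coefficient is non-zero."""
    return [v if b != 0.0 else 0.0 for v, b in zip(z, lasso_coef)]


z = [3.0, -0.5, -2.0]            # z = X^T y, illustrative values
lasso = soft_threshold(z, 1.0)    # shrunk coefficients
refit = least_squares_refit(z, lasso)
```

In this orthonormal case the refitted coefficients always keep the signs of the Lasso solution; the sign flips discussed above can only occur when the columns of the design are correlated.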
</p>projecteuclid.org/euclid.bj/1568362056_20190913040757Fri, 13 Sep 2019 04:07 EDTCorrigendum: Analysis of the forward search using some new results for martingales and empirical processeshttps://projecteuclid.org/euclid.bj/1568362057<strong>Vanessa Berenguer-Rico</strong>, <strong>Søren Johansen</strong>, <strong>Bent Nielsen</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4A, 3201--3201.</p>projecteuclid.org/euclid.bj/1568362057_20190913040757Fri, 13 Sep 2019 04:07 EDTFunctional CLT for martingale-like nonstationary dependent structureshttps://projecteuclid.org/euclid.bj/1569398764<strong>Florence Merlevède</strong>, <strong>Magda Peligrad</strong>, <strong>Sergey Utev</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3203--3233.</p><p><strong>Abstract:</strong><br/>
In this paper, we develop non-stationary martingale techniques for dependent data. We stress the non-stationary version of the projective Maxwell–Woodroofe condition, which is essential for obtaining maximal inequalities and a functional central limit theorem for the following examples: non-stationary $\rho$-mixing sequences, functions of linear processes with non-stationary innovations, locally stationary processes, the quenched version of the functional central limit theorem for a stationary sequence, and evolutions in random media such as a process sampled by a shifted Markov chain.
</p>projecteuclid.org/euclid.bj/1569398764_20190925040636Wed, 25 Sep 2019 04:06 EDTRate of convergence to equilibrium for discrete-time stochastic dynamics with memoryhttps://projecteuclid.org/euclid.bj/1569398765<strong>Maylis Varvenne</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3234--3275.</p><p><strong>Abstract:</strong><br/>
The main objective of the paper is to study the long-time behavior of general discrete dynamics driven by an ergodic stationary Gaussian noise. In our main result, we prove existence and uniqueness of the invariant distribution and exhibit some upper bounds on the rate of convergence to equilibrium in terms of the asymptotic behavior of the covariance function of the Gaussian noise (or, equivalently, of its moving average representation). Then, we apply our general results to fractional dynamics (including the Euler Scheme associated to fractional driven Stochastic Differential Equations). When the Hurst parameter $H$ belongs to $(0,1/2)$ we retrieve, with a slightly more explicit approach due to the discrete-time setting, the rate exhibited by Hairer in a continuous-time setting (Ann. Probab. 33 (2005) 703–758). In this fractional setting, we also emphasize the significant dependence of the rate of convergence to equilibrium on the local behaviour of the covariance function of the Gaussian noise.
</p>projecteuclid.org/euclid.bj/1569398765_20190925040636Wed, 25 Sep 2019 04:06 EDTLeast squares estimation in the monotone single index modelhttps://projecteuclid.org/euclid.bj/1569398766<strong>Fadoua Balabdaoui</strong>, <strong>Cécile Durot</strong>, <strong>Hanna Jankowski</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3276--3310.</p><p><strong>Abstract:</strong><br/>
We study the monotone single index model where a real response variable $Y$ is linked to a $d$-dimensional covariate $X$ through the relationship $E[Y|X]=\Psi_{0}(\alpha^{T}_{0}X)$, almost surely. Both the ridge function, $\Psi_{0}$, and the index parameter, $\alpha_{0}$, are unknown and the ridge function is assumed to be monotone. Under some appropriate conditions, we show that the rate of convergence in the $L_{2}$-norm for the least squares estimator of the bundled function $\Psi_{0}({\alpha}^{T}_{0}\cdot)$ is $n^{1/3}$. A similar result is established for the isolated ridge function, and the index is shown to converge at least at the rate $n^{1/3}$. Since the least squares estimator of the index is computationally intensive, we also consider alternative estimators of the index $\alpha_{0}$ from earlier literature. Moreover, we show that if the rate of convergence of such an alternative estimator is at least $n^{1/3}$, then the corresponding least-squares type estimators (obtained via a “plug-in” approach) of both the bundled and isolated ridge functions still converge at the rate $n^{1/3}$.
</p>projecteuclid.org/euclid.bj/1569398766_20190925040636Wed, 25 Sep 2019 04:06 EDTAdaptively weighted group Lasso for semiparametric quantile regression modelshttps://projecteuclid.org/euclid.bj/1569398767<strong>Toshio Honda</strong>, <strong>Ching-Kang Ing</strong>, <strong>Wei-Ying Wu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3311--3338.</p><p><strong>Abstract:</strong><br/>
We propose an adaptively weighted group Lasso procedure for simultaneous variable selection and structure identification for varying coefficient quantile regression models and additive quantile regression models with ultra-high dimensional covariates. Under a strong sparsity condition, we establish selection consistency of the proposed Lasso procedure when the weights therein satisfy a set of general conditions. This consistency result, however, relies on a suitable choice of the tuning parameter for the Lasso penalty, which can be hard to make in practice. To alleviate this difficulty, we suggest a BIC-type criterion, which we call the high-dimensional information criterion (HDIC), and show that the proposed Lasso procedure with the tuning parameter determined by HDIC still achieves selection consistency. Our simulation studies strongly support our theoretical findings.
</p>projecteuclid.org/euclid.bj/1569398767_20190925040636Wed, 25 Sep 2019 04:06 EDTNetworks of reinforced stochastic processes: Asymptotics for the empirical meanshttps://projecteuclid.org/euclid.bj/1569398768<strong>Giacomo Aletti</strong>, <strong>Irene Crimaldi</strong>, <strong>Andrea Ghiglietti</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3339--3378.</p><p><strong>Abstract:</strong><br/>
This work deals with systems of interacting reinforced stochastic processes, where each process $X^{j}=(X_{n,j})_{n}$ is located at a vertex $j$ of a finite weighted directed graph, and it can be interpreted as the sequence of “actions” adopted by an agent $j$ of the network. The interaction among the evolving dynamics of these processes depends on the weighted adjacency matrix $W$ associated to the underlying graph: indeed, the probability that an agent $j$ chooses a certain action depends on its personal “inclination” $Z_{n,j}$ and on the inclinations $Z_{n,h}$, with $h\neq j$, of the other agents according to the elements of $W$.
Asymptotic results for the stochastic processes of the personal inclinations $Z^{j}=(Z_{n,j})_{n}$ have been the subject of studies in recent papers (e.g., Aletti, Crimaldi and Ghiglietti [Ann. Appl. Probab. 27 (2017) 3787–3844], Crimaldi et al. [Synchronization and functional central limit theorems for interacting reinforced random walks (2019)]), while the asymptotic behavior of quantities based on the stochastic processes $X^{j}$ of the actions has not yet been studied. In this paper, we fill this gap by characterizing the asymptotic behavior of the empirical means $N_{n,j}=\sum_{k=1}^{n}X_{k,j}/n$, proving their almost sure synchronization and some central limit theorems in the sense of stable convergence. Moreover, we discuss some statistical applications of these convergence results concerning confidence intervals for the random limit toward which all the processes of the system almost surely converge and tools to make inference on the matrix $W$.
</p>projecteuclid.org/euclid.bj/1569398768_20190925040636Wed, 25 Sep 2019 04:06 EDTLimiting saddlepoint relative errors in large deviation regions under purely Tauberian conditionshttps://projecteuclid.org/euclid.bj/1569398769<strong>Ronald W. Butler</strong>, <strong>Andrew T.A. Wood</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3379--3399.</p><p><strong>Abstract:</strong><br/>
Most theoretical results on the relative errors of saddlepoint approximations in the extreme tails have involved placing conditions on the density/mass function. Checking the validity of such conditions is problematic when density/mass functions are intractable, as is typically the case in important practical applications involving convolved, compound, and first-passage distributions as well as for moment generating functions (MGFs) that are regularly varying. In this paper, we present novel conditions which ensure the existence of positive finite limiting relative errors for saddlepoint density/mass function and survival function approximations. These conditions, which are rather weak, are expressed entirely in terms of the MGF, hence the description purely Tauberian. We focus mainly on the cases in which there are positive and negative gamma distributional limits (the only other non-degenerate possibility being a Gaussian limit) and we show how to check the new conditions in important classes of models in these two settings.
</p>projecteuclid.org/euclid.bj/1569398769_20190925040636Wed, 25 Sep 2019 04:06 EDTRate of divergence of the nonparametric likelihood ratio test for Gaussian mixtureshttps://projecteuclid.org/euclid.bj/1569398770<strong>Wenhua Jiang</strong>, <strong>Cun-Hui Zhang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3400--3420.</p><p><strong>Abstract:</strong><br/>
We study a nonparametric likelihood ratio test (NPLRT) for Gaussian mixtures. It is based on the nonparametric maximum likelihood estimator in the context of demixing. The test concerns whether a random sample is from the standard normal distribution. We consider mixing distributions of unbounded support for the alternative hypothesis. We prove that the divergence rate of the NPLRT under the null is bounded by $\log n$, provided that the support range of the mixing distribution increases no faster than $(\log n/\log 9)^{1/2}$. We prove that the rate of $\sqrt{\log n}$ is a lower bound for the divergence rate if the support range increases no slower than the order of $\sqrt{\log n}$. Implications of the upper bound for the rate of divergence are discussed.
</p>projecteuclid.org/euclid.bj/1569398770_20190925040636Wed, 25 Sep 2019 04:06 EDTConcentration of weakly dependent Banach-valued sums and applications to statistical learning methodshttps://projecteuclid.org/euclid.bj/1569398772<strong>Gilles Blanchard</strong>, <strong>Oleksandr Zadorozhnyi</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3421--3458.</p><p><strong>Abstract:</strong><br/>
We obtain a Bernstein-type inequality for sums of Banach-valued random variables satisfying a weak dependence assumption of general type and under certain smoothness assumptions on the underlying Banach norm. We use this inequality to investigate, in the asymptotic regime, the error upper bounds for the broad family of spectral regularization methods for reproducing kernel decision rules, when trained on a sample coming from a $\tau$-mixing process.
</p>projecteuclid.org/euclid.bj/1569398772_20190925040636Wed, 25 Sep 2019 04:06 EDTNonparametric empirical Bayes improvement of shrinkage estimators with applications to time serieshttps://projecteuclid.org/euclid.bj/1569398773<strong>Eitan Greenshtein</strong>, <strong>Ariel Mantzura</strong>, <strong>Ya’acov Ritov</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3459--3478.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating a vector ${\boldsymbol{\mu}}=(\mu_{1},\dots,\mu_{n})$ under a squared loss, based on independent observations $Y_{i}\sim N(\mu_{i},1)$, $i=1,\dots,n$, and possibly extra structural assumptions. We argue that many estimators are asymptotically equal to $\hat{\mu}_{i}=\alpha\tilde{\mu}_{i}+(1-\alpha)Y_{i}+\xi_{i}=\tilde{\mu}_{i}+(1-\alpha)(Y_{i}-\tilde{\mu}_{i})+\xi_{i}$, where $\alpha\in[0,1]$ and $\tilde{\mu}_{i}$ may depend on the data, but is not a function of $Y_{i}$, and $\sum\xi_{i}^{2}=o_{p}(n)$.
We consider the optimal estimator of the form $\tilde{\mu}_{i}+g(Y_{i}-\tilde{\mu}_{i})$ for a general, possibly random, function $g$, and approximate it using nonparametric empirical Bayes ideas and techniques. We consider both the retrospective and the sequential estimation problems. We elaborate and demonstrate our results on the case where $\hat{\mu}_{i}$ are Kalman filter estimators. Simulations and a real data analysis are also provided.
</p>projecteuclid.org/euclid.bj/1569398773_20190925040636Wed, 25 Sep 2019 04:06 EDTTwo-sided infinite-bin models and analyticity for Barak–Erdős graphshttps://projecteuclid.org/euclid.bj/1569398774<strong>Bastien Mallein</strong>, <strong>Sanjay Ramassamy</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3479--3495.</p><p><strong>Abstract:</strong><br/>
In this article, we prove that for any probability distribution $\mu $ on $\mathbb{N}$ one can construct a two-sided stationary version of the infinite-bin model – an interacting particle system introduced by Foss and Konstantopoulos – with move distribution $\mu $. Using this result, we obtain a new formula for the speed of the front of infinite-bin models, as a series of positive terms. This implies that the growth rate $C(p)$ of the longest path in a Barak–Erdős graph of parameter $p$ is analytic on $(0,1]$.
</p>projecteuclid.org/euclid.bj/1569398774_20190925040636Wed, 25 Sep 2019 04:06 EDTMoving block and tapered block bootstrap for functional time series with an application to the $K$-sample mean problemhttps://projecteuclid.org/euclid.bj/1569398775<strong>Dimitrios Pilavakis</strong>, <strong>Efstathios Paparoditis</strong>, <strong>Theofanis Sapatinas</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3496--3526.</p><p><strong>Abstract:</strong><br/>
We consider infinite-dimensional Hilbert space-valued random variables that are assumed to be temporally dependent in a broad sense. We prove a central limit theorem for the moving block bootstrap and for the tapered block bootstrap, and show that these block bootstrap procedures also provide consistent estimators of the long run covariance operator. Furthermore, we consider block bootstrap-based procedures for fully functional testing of the equality of mean functions between several independent functional time series. We establish validity of the block bootstrap methods in approximating the distribution of the statistic of interest under the null and show consistency of the block bootstrap-based tests under the alternative. The finite sample behaviour of the procedures is investigated by means of simulations. An application to a real-life dataset is also discussed.
</p>projecteuclid.org/euclid.bj/1569398775_20190925040636Wed, 25 Sep 2019 04:06 EDTBernstein-type exponential inequalities in survey sampling: Conditional Poisson sampling schemeshttps://projecteuclid.org/euclid.bj/1569398776<strong>Patrice Bertail</strong>, <strong>Stephan Clémençon</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3527--3554.</p><p><strong>Abstract:</strong><br/>
This paper is devoted to establishing exponential bounds for the probabilities of deviation of a sample sum from its expectation, when the variables involved in the summation are obtained by sampling in a finite population according to a rejective scheme, generalizing simple random sampling without replacement, and by using an appropriate normalization. In contrast to Poisson sampling, classical deviation inequalities in the i.i.d. setting do not straightforwardly apply to sample sums related to rejective schemes, due to the inherent dependence structure of the sampled points. We show here how to overcome this difficulty, by combining the formulation of rejective sampling as Poisson sampling conditioned upon the sample size with the Esscher transformation. In particular, the Bennett/Bernstein type bounds thus established highlight the effect of the asymptotic variance of the (properly standardized) sample weighted sum and are shown to be much more accurate than those based on the negative association property shared by the terms involved in the summation. Beyond its interest in itself, such a result for rejective sampling is crucial, insofar as it permits one to obtain tail bounds for many other sampling schemes, namely those that can be accurately approximated by rejective plans in the sense of the total variation distance.
</p>projecteuclid.org/euclid.bj/1569398776_20190925040636Wed, 25 Sep 2019 04:06 EDTAsymptotic equivalence of fixed-size and varying-size determinantal point processeshttps://projecteuclid.org/euclid.bj/1569398777<strong>Simon Barthelmé</strong>, <strong>Pierre-Olivier Amblard</strong>, <strong>Nicolas Tremblay</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3555--3589.</p><p><strong>Abstract:</strong><br/>
Determinantal Point Processes (DPPs) are popular models for point processes with repulsion. They appear in numerous contexts, from physics to graph theory, and display appealing theoretical properties. On the more practical side of things, since DPPs tend to select sets of points that are some distance apart (repulsion), they have been advocated as a way of producing random subsets with high diversity. DPPs come in two variants: fixed-size and varying-size. A sample from a varying-size DPP is a subset of random cardinality, while in fixed-size “$k$-DPPs” the cardinality is fixed. The latter makes more sense in many applications, but unfortunately their computational properties are less attractive, since, among other things, inclusion probabilities are harder to compute. In this work, we show that as the size of the ground set grows, $k$-DPPs and DPPs become equivalent, in the sense that fixed-order inclusion probabilities converge. As a by-product, we obtain saddlepoint formulas for inclusion probabilities in $k$-DPPs. These turn out to be extremely accurate, and suffer less from numerical difficulties than exact methods do. Our results also suggest that $k$-DPPs and DPPs have equivalent maximum likelihood estimators. Finally, we obtain results on asymptotic approximations of elementary symmetric polynomials which may be of independent interest.
</p>projecteuclid.org/euclid.bj/1569398777_20190925040636Wed, 25 Sep 2019 04:06 EDTThe eigenstructure of the sample covariance matrices of high-dimensional stochastic volatility models with heavy tailshttps://projecteuclid.org/euclid.bj/1569398778<strong>Johannes Heiny</strong>, <strong>Thomas Mikosch</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3590--3622.</p><p><strong>Abstract:</strong><br/>
We consider a $p$-dimensional time series where the dimension $p$ increases with the sample size $n$. The resulting data matrix $\mathbf{X}$ follows a stochastic volatility model: each entry consists of a positive random volatility term multiplied by an independent noise term. The volatility multipliers introduce dependence in each row and across the rows. We study the asymptotic behavior of the eigenvalues and eigenvectors of the sample covariance matrix $\mathbf{X}\mathbf{X}'$ under a regular variation assumption on the noise. In particular, we prove Poisson convergence for the point process of the centered and normalized eigenvalues and derive limit theory for functionals acting on them, such as the trace. We prove related results for stochastic volatility models with additional linear dependence structure and for stochastic volatility models where the time-varying volatility terms are extinguished with high probability when $n$ increases. We provide explicit approximations of the eigenvectors which are of a strikingly simple structure. The main tools for proving these results are large deviation theorems for heavy-tailed time series, advocating a unified approach to the study of the eigenstructure of heavy-tailed random matrices.
</p>projecteuclid.org/euclid.bj/1569398778_20190925040636Wed, 25 Sep 2019 04:06 EDTGaps and interleaving of point processes in sampling from a residual allocation modelhttps://projecteuclid.org/euclid.bj/1569398779<strong>Jim Pitman</strong>, <strong>Yuri Yakubovich</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3623--3651.</p><p><strong>Abstract:</strong><br/>
This article presents a limit theorem for the gaps $\widehat{G}_{i:n}:=X_{n-i+1:n}-X_{n-i:n}$ between order statistics $X_{1:n}\le\cdots\le X_{n:n}$ of a sample of size $n$ from a random discrete distribution on the positive integers $(P_{1},P_{2},\ldots)$ governed by a residual allocation model (also called a Bernoulli sieve) $P_{j}:=H_{j}\prod_{i=1}^{j-1}(1-H_{i})$ for a sequence of independent random hazard variables $H_{i}$ which are identically distributed according to some distribution of $H\in(0,1)$ such that $-\log(1-H)$ has a non-lattice distribution with finite mean $\mu_{\log}$. As $n\to\infty$ the finite dimensional distributions of the gaps $\widehat{G}_{i:n}$ converge to those of limiting gaps $G_{i}$ which are the numbers of points in a stationary renewal process with i.i.d. spacings $-\log(1-H_{j})$ between times $T_{i-1}$ and $T_{i}$ of births in a Yule process, that is $T_{i}:=\sum_{k=1}^{i}\varepsilon_{k}/k$ for a sequence of i.i.d. exponential variables $\varepsilon_{k}$ with mean 1. A consequence is that the mean of $\widehat{G}_{i:n}$ converges to the mean of $G_{i}$, which is $1/(i\mu_{\log})$. This limit theorem simplifies and extends a result of Gnedin, Iksanov and Roesler for the Bernoulli sieve.
</p>projecteuclid.org/euclid.bj/1569398779_20190925040636Wed, 25 Sep 2019 04:06 EDTHarmonic measure for biased random walk in a supercritical Galton–Watson treehttps://projecteuclid.org/euclid.bj/1569398780<strong>Shen Lin</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3652--3672.</p><p><strong>Abstract:</strong><br/>
We consider random walks $\lambda $-biased towards the root on a Galton–Watson tree, whose offspring distribution $(p_{k})_{k\geq 1}$ is non-degenerate and has finite mean $m>1$. In the transient regime $0<\lambda <m$, the loop-erased trajectory of the biased random walk defines the $\lambda $-harmonic ray, whose law is the $\lambda $-harmonic measure on the boundary of the Galton–Watson tree. We answer a question of Lyons, Pemantle and Peres (In Classical and Modern Branching Processes (Minneapolis, MN, 1994) (1997) 223–237 Springer) by showing that the $\lambda $-harmonic measure has a.s. strictly larger Hausdorff dimension than the visibility measure, which is the harmonic measure corresponding to the simple forward random walk. We also prove that the average number of children of the vertices along the $\lambda $-harmonic ray is a.s. bounded below by $m$ and bounded above by $m^{-1}\sum k^{2}p_{k}$. Moreover, at least for $0<\lambda \leq 1$, the average number of children of the vertices along the $\lambda $-harmonic ray is a.s. strictly larger than that of the $\lambda $-biased random walk trajectory. We observe that the latter is not monotone in the bias parameter $\lambda $.
</p>projecteuclid.org/euclid.bj/1569398780_20190925040636Wed, 25 Sep 2019 04:06 EDTIntegral expression for the stationary distribution of reflected Brownian motion in a wedgehttps://projecteuclid.org/euclid.bj/1569398781<strong>Sandro Franceschi</strong>, <strong>Kilian Raschel</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3673--3713.</p><p><strong>Abstract:</strong><br/>
For Brownian motion in a (two-dimensional) wedge with negative drift and oblique reflection on the axes, we derive an explicit formula for the Laplace transform of its stationary distribution (when it exists), in terms of Cauchy integrals and generalized Chebyshev polynomials. To that purpose, we solve a Carleman-type boundary value problem on a hyperbola, satisfied by the Laplace transforms of the boundary stationary distribution.
</p>projecteuclid.org/euclid.bj/1569398781_20190925040636Wed, 25 Sep 2019 04:06 EDTEquivalence of some subcritical properties in continuum percolationhttps://projecteuclid.org/euclid.bj/1569398782<strong>Jean-Baptiste Gouéré</strong>, <strong>Marie Théret</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3714--3733.</p><p><strong>Abstract:</strong><br/>
We consider the Boolean model on $\mathbb{R}^{d}$. We prove some equivalences between subcritical percolation properties. Let us introduce some notation to state one of these equivalences. Let $C$ denote the connected component of the origin in the Boolean model. Let $|C|$ denote its volume. Let $\ell$ denote the maximal length of a chain of random balls from the origin. Under optimal integrability conditions on the radii, we prove that $\mathbb{E}(|C|)$ is finite if and only if there exist $A,B>0$ such that $\mathbb{P}(\ell\ge n)\le Ae^{-Bn}$ for all $n\ge1$.
</p>projecteuclid.org/euclid.bj/1569398782_20190925040636Wed, 25 Sep 2019 04:06 EDTEstimating the input of a Lévy-driven queue by Poisson sampling of the workload processhttps://projecteuclid.org/euclid.bj/1569398783<strong>Liron Ravner</strong>, <strong>Onno Boxma</strong>, <strong>Michel Mandjes</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3734--3761.</p><p><strong>Abstract:</strong><br/>
This paper aims at semi-parametrically estimating the input process to a Lévy-driven queue by sampling the workload process at Poisson times. We construct a method-of-moments based estimator for the Lévy process’ characteristic exponent. This method exploits the known distribution of the workload sampled at an exponential time, thus taking into account the dependence between subsequent samples. Verifiable conditions for consistency and asymptotic normality are provided, along with explicit expressions for the asymptotic variance. The method requires an intermediate estimation step of estimating a constant (related to both the input distribution and the sampling rate); this constant also features in the asymptotic analysis. For subordinator Lévy input, a partial MLE is constructed for the intermediate step and we show that it satisfies the consistency and asymptotic normality conditions. For general spectrally-positive Lévy input a biased estimator is proposed that only uses workload observations above some threshold; the bias can be made arbitrarily small by appropriately choosing the threshold.
</p>projecteuclid.org/euclid.bj/1569398783_20190925040636Wed, 25 Sep 2019 04:06 EDTEstimation of fully nonparametric transformation modelshttps://projecteuclid.org/euclid.bj/1569398784<strong>Benjamin Colling</strong>, <strong>Ingrid Van Keilegom</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3762--3795.</p><p><strong>Abstract:</strong><br/>
Consider the following nonparametric transformation model $\Lambda (Y)=m(X)+\varepsilon $, where $X$ is a $d$-dimensional covariate, $Y$ is a continuous univariate dependent variable and $\varepsilon $ is an error term that has zero mean and is independent of $X$. We assume that the unknown transformation $\Lambda $ is strictly increasing and that $m$ is an unknown regression function. Our goal is to develop two new nonparametric estimators of the transformation $\Lambda $, the first one based on the least squares loss and the second one based on the least absolute deviation loss, and to compare their performance with that of the estimators developed by Chiappori, Komunjer and Kristensen (J. Econometrics 188 (2015) 22–39). Our proposed estimators are based on an estimator of the conditional distribution of $U$ given $X$, where $U$ is an appropriate transformation of $Y$ that is uniformly distributed. The main motivation for working with $U$ instead of $Y$ is that, in transformation models, the response $Y$ is often skewed with very long tails, and so kernel smoothing based on $Y$ does not work well. Hence, we expect to obtain better estimators if we pre-transform $Y$ before applying kernel smoothing. We establish the asymptotic normality of the two proposed estimators. We also carry out a simulation study to illustrate the performance of our estimators, to compare these new estimators with those of Chiappori, Komunjer and Kristensen (J. Econometrics 188 (2015) 22–39), and to see which estimators perform best under which model conditions.
</p>projecteuclid.org/euclid.bj/1569398784_20190925040636Wed, 25 Sep 2019 04:06 EDTLong-time heat kernel estimates and upper rate functions of Brownian motion type for symmetric jump processeshttps://projecteuclid.org/euclid.bj/1569398785<strong>Yuichi Shiozawa</strong>, <strong>Jian Wang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3796--3831.</p><p><strong>Abstract:</strong><br/>
Let $X$ be a symmetric jump process on $\mathbb{R}^{d}$ such that the corresponding jumping kernel $J(x,y)$ satisfies
\[J(x,y)\le\frac{c}{|x-y|^{d+2}\log^{1+\varepsilon}(e+|x-y|)}\] for all $x,y\in\mathbb{R}^{d}$ with $|x-y|\ge1$ and some constants $c,\varepsilon>0$. Under additional mild assumptions on $J(x,y)$ for $|x-y|<1$, we show that $C\sqrt{r\log\log r}$ with some constant $C>0$ is an upper rate function of the process $X$, which enjoys the same form as that for Brownian motions. The approach is based on heat kernel estimates of large time for the process $X$. As a by-product, we also obtain two-sided heat kernel estimates of large time for symmetric jump processes whose jumping kernels are comparable to
\[\frac{1}{|x-y|^{d+2+\varepsilon}}\] for all $x,y\in\mathbb{R}^{d}$ with $|x-y|\ge1$ and some constant $\varepsilon>0$.
</p>projecteuclid.org/euclid.bj/1569398785_20190925040636Wed, 25 Sep 2019 04:06 EDTConsistent estimation of the spectrum of trace class Data Augmentation algorithmshttps://projecteuclid.org/euclid.bj/1569398786<strong>Saptarshi Chakraborty</strong>, <strong>Kshitij Khare</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3832--3863.</p><p><strong>Abstract:</strong><br/>
Markov chain Monte Carlo is widely used in a variety of scientific applications to generate approximate samples from intractable distributions. A thorough understanding of the convergence and mixing properties of these Markov chains can be obtained by studying the spectrum of the associated Markov operator. While several methods to bound/estimate the second largest eigenvalue are available in the literature, very few general techniques for consistent estimation of the entire spectrum have been proposed. Existing methods for this purpose require the Markov transition density to be available in closed form, which is often not true in practice, especially in modern statistical applications. In this paper, we propose a novel method to consistently estimate the entire spectrum of a general class of Markov chains arising from a popular and widely used statistical approach known as Data Augmentation. The transition densities of these Markov chains can often only be expressed as intractable integrals. We illustrate the applicability of our method using real and simulated data.
</p>projecteuclid.org/euclid.bj/1569398786_20190925040636Wed, 25 Sep 2019 04:06 EDTPrincipal components analysis of regularly varying functionshttps://projecteuclid.org/euclid.bj/1569398787<strong>Piotr Kokoszka</strong>, <strong>Stilian Stoev</strong>, <strong>Qian Xiong</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3864--3882.</p><p><strong>Abstract:</strong><br/>
The paper is concerned with asymptotic properties of the principal components analysis of functional data. The currently available results assume the existence of the fourth moment. We develop analogous results in a setting which does not require this assumption. Instead, we assume that the observed functions are regularly varying. We derive the asymptotic distribution of the sample covariance operator and of the sample functional principal components. We obtain a number of results on the convergence of moments and almost sure convergence. We apply the new theory to establish the consistency of the regression operator in a functional linear model.
</p>projecteuclid.org/euclid.bj/1569398787_20190925040636Wed, 25 Sep 2019 04:06 EDTStructured matrix estimation and completionhttps://projecteuclid.org/euclid.bj/1569398788<strong>Olga Klopp</strong>, <strong>Yu Lu</strong>, <strong>Alexandre B. Tsybakov</strong>, <strong>Harrison H. Zhou</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3883--3911.</p><p><strong>Abstract:</strong><br/>
We study the problem of matrix estimation and matrix completion under a general framework. This framework includes several important models as special cases, such as the Gaussian mixture model, the mixed membership model, the bi-clustering model and dictionary learning. We establish the optimal convergence rates in a minimax sense for estimation of the signal matrix under the Frobenius norm and under the spectral norm. As a consequence of our general result, we obtain minimax optimal rates of convergence for various special models.
</p>projecteuclid.org/euclid.bj/1569398788_20190925040636Wed, 25 Sep 2019 04:06 EDTRademacher complexity for Markov chains: Applications to kernel smoothing and Metropolis–Hastingshttps://projecteuclid.org/euclid.bj/1569398789<strong>Patrice Bertail</strong>, <strong>François Portier</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3912--3938.</p><p><strong>Abstract:</strong><br/>
The concept of Rademacher complexity for independent sequences of random variables is extended to Markov chains. The proposed notion of “regenerative block Rademacher complexity” (of a class of functions) follows from renewal theory and allows one to control the expected values of suprema (over the class of functions) of empirical processes based on Harris Markov chains as well as the excess probability. For classes of Vapnik–Chervonenkis type, bounds on the “regenerative block Rademacher complexity” are established. These bounds depend essentially on the sample size and the probability tails of the regeneration times. The proposed approach is employed to obtain convergence rates for the kernel density estimator of the stationary measure and to derive concentration inequalities for the Metropolis–Hastings algorithm.
</p>projecteuclid.org/euclid.bj/1569398789_20190925040636Wed, 25 Sep 2019 04:06 EDTInverse exponential decay: Stochastic fixed point equation and ARMA modelshttps://projecteuclid.org/euclid.bj/1569398790<strong>Krzysztof Burdzy</strong>, <strong>Bartosz Kołodziejek</strong>, <strong>Tvrtko Tadić</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3939--3977.</p><p><strong>Abstract:</strong><br/>
We study solutions to the stochastic fixed point equation $X\stackrel{d}{=}AX+B$ when the coefficients are nonnegative and $B$ is an “inverse exponential decay” ($\operatorname{IED}$) random variable. We provide theorems on the left tail of $X$ which complement well-known tail results of Kesten and Goldie. We generalize our results to ARMA processes with nonnegative coefficients whose noise terms are from the $\operatorname{IED}$ class. We describe the lower envelope for these ARMA processes.
</p>projecteuclid.org/euclid.bj/1569398790_20190925040636Wed, 25 Sep 2019 04:06 EDTWeighted Poincaré inequalities, concentration inequalities and tail bounds related to Stein kernels in dimension onehttps://projecteuclid.org/euclid.bj/1569398791<strong>Adrien Saumard</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 4B, 3978--4006.</p><p><strong>Abstract:</strong><br/>
We investigate links between the so-called Stein’s density approach in dimension one and some functional and concentration inequalities. We show that measures having a finite first moment and a density with connected support satisfy a weighted Poincaré inequality with the weight being the Stein kernel, which indeed exists and is unique in this case. Furthermore, we prove weighted log-Sobolev and asymmetric Brascamp–Lieb type inequalities related to Stein kernels. We also show that existence of a uniformly bounded Stein kernel is sufficient to ensure a positive Cheeger isoperimetric constant. Then we derive new concentration inequalities. In particular, we prove generalized Mills’ type inequalities when a Stein kernel is uniformly bounded and sub-gamma concentration for Lipschitz functions of a variable with a sub-linear Stein kernel. More generally, when some exponential moments are finite, the Laplace transform of the random variable of interest is shown to be bounded from above by the Laplace transform of the Stein kernel. Along the way, we prove a general lemma for bounding the Laplace transform of a random variable, which may be of independent interest. We also provide density and tail formulas as well as tail bounds, generalizing previous results that were obtained in the context of Malliavin calculus.
</p>projecteuclid.org/euclid.bj/1569398791_20190925040636Wed, 25 Sep 2019 04:06 EDT