## Electronic Journal of Statistics

### Median confidence regions in a nonparametric model

#### Abstract

The nonparametric measurement error model (NMEM) postulates that $X_{i}=\Delta +\epsilon _{i}$, $i=1,2,\ldots ,n$, with $\Delta \in \Re$ and $\epsilon _{i}$, $i=1,2,\ldots ,n$, IID from $F(\cdot )\in\mathfrak{F}_{c,0}$, where $\mathfrak{F}_{c,0}$ is the class of all continuous distributions with median $0$, so that $\Delta$ is the median parameter of $X$. This paper deals with the problem of constructing a confidence region (CR) for $\Delta$ under the NMEM. Aside from the NMEM, the problem setting also arises in a variety of situations, including inference about the median lifetime of a complex system in engineering, reliability, biomedical, and public health settings, as well as in economics, for example when dealing with household income. Current methods of constructing CRs for $\Delta$ are discussed, including the $T$-statistic based CR and the Wilcoxon signed-rank statistic based CR, arguably the two default methods in applied work when a confidence interval for the center of a distribution is desired. A ‘bottom-to-top’ approach for constructing CRs is implemented, which starts by imposing reasonable invariance or equivariance conditions on the desired CRs and then optimizes with respect to their mean contents on subclasses of $\mathfrak{F}_{c,0}$. This contrasts with the usual approach of using a pivotal quantity constructed from test statistics and/or estimators and then ‘pivoting’ to obtain the CR. The methods are illustrated on a real car mileage data set and on Proschan’s famous air-conditioning data set. Simulation studies were performed to compare the performance of the different CR methods. Results of these studies indicate that the sign-statistic based CR and the optimal CR focused on symmetric distributions satisfy the confidence level requirement, although they tended to have larger contents, whereas three of the bootstrap-based CR procedures and one of the newly developed adaptive CRs tended to be slightly more liberal but had smaller contents. A critical recommendation for practitioners is that, under the NMEM, the $T$-statistic based and Wilcoxon signed-rank statistic based CRs should not be used, since they have either severely degraded coverage probabilities or inflated contents under some of the error distributions allowed by the NMEM.
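
To make the sign-statistic based CR mentioned above concrete, the following is a minimal sketch of the textbook order-statistic interval for a median. It is not taken from the paper and may differ from the authors' construction (for instance, it uses no randomization). It relies on the standard fact that, for a continuous distribution, $P(X_{(k)}\le \Delta \le X_{(n-k+1)}) = 1-2P(B\le k-1)$ with $B\sim \mathrm{Binomial}(n,1/2)$. The function names and the sample values are hypothetical and chosen only for illustration.

```python
from math import comb


def binom_cdf_half(k, n):
    """Return P(B <= k) for B ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, j) for j in range(k + 1)) / 2 ** n


def sign_median_interval(x, alpha=0.05):
    """Order-statistic (sign-test) confidence interval for the median.

    Returns (lower, upper, coverage), where the interval is
    [X_(k), X_(n-k+1)] for the largest k whose exact coverage
    1 - 2 * P(B <= k - 1) is at least 1 - alpha.
    Assumes the data are IID from a continuous distribution.
    """
    xs = sorted(x)
    n = len(xs)
    best = None
    for k in range(1, n // 2 + 1):
        coverage = 1.0 - 2.0 * binom_cdf_half(k - 1, n)
        if coverage >= 1.0 - alpha:
            best = (k, coverage)  # coverage shrinks as k grows
        else:
            break
    if best is None:
        # Even [X_(1), X_(n)] falls short of the requested level.
        raise ValueError("sample too small for the requested confidence level")
    k, coverage = best
    return xs[k - 1], xs[n - k], coverage


if __name__ == "__main__":
    # Hypothetical illustrative sample, not data from the paper.
    sample = [7, 3, 10, 1, 5, 9, 2, 8, 4, 6]
    print(sign_median_interval(sample, alpha=0.05))
```

For $n=10$ and $\alpha =0.05$ this selects $k=2$, giving the interval $[X_{(2)},X_{(9)}]$ with exact coverage $1002/1024\approx 0.9785$. Because the binomial distribution is discrete, the attained level generally exceeds the nominal one, which is consistent with the abstract's remark that the sign-statistic based CR meets the confidence level requirement at the cost of larger contents.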

#### Article information

Source
Electron. J. Statist., Volume 13, Number 2 (2019), 2348–2390.

Dates
First available in Project Euclid: 18 July 2019

https://projecteuclid.org/euclid.ejs/1563436820

Digital Object Identifier
doi:10.1214/19-EJS1577

Subjects
Primary: 62G15: Tolerance and confidence regions
Secondary: 62G09, 62G35

#### Citation

Peña, Edsel A.; Kim, Taeho. Median confidence regions in a nonparametric model. Electron. J. Statist. 13 (2019), no. 2, 2348–2390. doi:10.1214/19-EJS1577. https://projecteuclid.org/euclid.ejs/1563436820

#### References

• [1] Richard E. Barlow and Frank Proschan. Statistical theory of reliability and life testing. Holt, Rinehart and Winston, Inc., New York-Montreal, Que.-London, 1975. Probability models, International Series in Decision Processes, Series in Quantitative Methods for Decision Making.
• [2] Robert Bartels. The rank version of von Neumann’s Ratio Test for Randomness. J. Amer. Statist. Assoc., 77(377):40–46, 1982.
• [3] George Casella and Roger L. Berger. Statistical inference. The Wadsworth & Brooks/Cole Statistics/Probability Series. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1990.
• [4] A. C. Davison and D. V. Hinkley. Bootstrap methods and their application, volume 1 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 1997. With 1 IBM-PC floppy disk (3.5 inch; HD).
• [5] Thomas J. DiCiccio and Bradley Efron. Bootstrap confidence intervals. Statist. Sci., 11(3):189–228, 1996. With comments and a rejoinder by the authors.
• [6] Bradley Efron and Trevor Hastie. Computer age statistical inference, volume 5 of Institute of Mathematical Statistics (IMS) Monographs. Cambridge University Press, New York, 2016. Algorithms, evidence, and data science.
• [7] F. Galton. The most suitable proportion between the value of first and second prizes. Biometrika, 1:385–395, 1902.
• [8] Joshua D. Habiger and Edsel A. Peña. Randomised $P$-values and nonparametric procedures in multiple testing. J. Nonparametr. Stat., 23(3):583–604, 2011.
• [9] Myles Hollander, Douglas A. Wolfe, and Eric Chicken. Nonparametric statistical methods. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, third edition, 2014.
• [10] Peter M. Hooper. Invariant confidence sets with smallest expected measure. Ann. Statist., 10(4):1283–1294, 1982.
• [11] Peter M. Hooper. Sufficiency and invariance in confidence set estimation. Ann. Statist., 10(2):549–555, 1982.
• [12] Peter M. Hooper. Invariant prediction regions with smallest expected measure. J. Multivariate Anal., 18(1):117–126, 1986.
• [13] M. C. Jones and Arthur Pewsey. Sinh-arcsinh distributions. Biometrika, 96(4):761–780, 2009.
• [14] E. L. Lehmann and Joseph P. Romano. Testing statistical hypotheses. Springer Texts in Statistics. Springer, New York, third edition, 2005.
• [15] C. C. Malesios and S. Psarakis. Comparison of the $h$-index for different fields of research using bootstrap methodology. Qual. Quant., 48:521–545, 2014.
• [16] Edsel A. Peña and Elizabeth H. Slate. Global validation of linear model assumptions. J. Amer. Statist. Assoc., 101(473):341–354, 2006.
• [17] F. Proschan. Theoretical explanation of observing decreasing failure rate. Technometrics, 5:375–383, 1963.
• [18] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2016.
• [19] Ronald H. Randles and Douglas A. Wolfe. Introduction to the theory of nonparametric statistics. John Wiley & Sons, New York-Chichester-Brisbane, 1979. Wiley Series in Probability and Mathematical Statistics.
• [20] William R. Thompson. On confidence ranges for the median and other expectation distributions for populations of unknown distribution form. Ann. Math. Statist., 7(3):122–128, 1936.