The Annals of Statistics

On the bias in estimating genetic length and other quantities in simplex constrained models

Arthur Cohen, J.H.B. Kemperman, and Harold Sackrowitz


The genetic distance between two loci on a chromosome is defined as the mean number of crossovers between the loci. The parameters of the crossover distribution are constrained by the parameters of the distribution of chiasmata. Ott (1996) derived the maximum likelihood estimator (MLE) of the parameters of the crossover distribution and the MLE of the mean. We demonstrate that the MLE of the mean is pointwise less than or equal to the empirical mean number of crossovers. It follows that the MLE is negatively biased. For small sample sizes the bias can be nonnegligible. We recommend reduced bias estimators.

Generalizations to many other problems involving linear constraints on parameters are made. Included in the generalizations are a variety of problems involving simplex constraints as studied recently by Liu (2000).
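The inequality stated in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes the standard no-chromatid-interference model, in which the number of crossovers given c chiasmata is Binomial(c, 1/2), so the crossover distribution p is a linear image p = Aq of a chiasma distribution q on the simplex. It fits q with a plain EM iteration for mixture weights, which is a stand-in technique and not necessarily the algorithm used by Ott (1996) or by the authors. The function names `crossover_matrix`, `mix`, `constrained_mle`, and `mean`, and the toy counts, are hypothetical.

```python
from math import comb


def crossover_matrix(m):
    """A[k][c] = P(k crossovers | c chiasmata) = C(c, k) / 2**c, zero if k > c.

    Assumption: no chromatid interference, so each chiasma shows up as a
    crossover on a given gamete independently with probability 1/2.
    """
    return [[comb(c, k) / 2 ** c if k <= c else 0.0 for c in range(m + 1)]
            for k in range(m + 1)]


def mix(A, q):
    """Crossover distribution p = A q implied by the chiasma distribution q."""
    return [sum(a * qc for a, qc in zip(row, q)) for row in A]


def constrained_mle(counts, iters=2000):
    """EM for the mixture weights q; returns (q_hat, p_hat).

    counts[k] is the number of gametes observed with k crossovers.
    The largest possible number of chiasmata is taken to be len(counts) - 1.
    """
    m = len(counts) - 1
    total = sum(counts)
    f = [x / total for x in counts]          # empirical crossover frequencies
    A = crossover_matrix(m)
    q = [1.0 / (m + 1)] * (m + 1)            # start in the simplex interior
    for _ in range(iters):
        p = mix(A, q)
        # Standard EM update for mixture proportions; it keeps sum(q) == 1.
        q = [q[c] * sum(f[k] * A[k][c] / p[k]
                        for k in range(m + 1) if f[k] > 0)
             for c in range(m + 1)]
    return q, mix(A, q)


def mean(dist):
    """Mean of a distribution on 0, 1, ..., len(dist) - 1."""
    return sum(k * pk for k, pk in enumerate(dist))


# Toy data (made up for illustration): 10 gametes with 0 crossovers,
# none with 1, and 10 with 2.
counts = [10, 0, 10]
q_hat, p_hat = constrained_mle(counts)
empirical_mean = mean([x / sum(counts) for x in counts])
mle_mean = mean(p_hat)
print("empirical mean:", empirical_mean)
print("constrained MLE of the mean:", mle_mean)
```

For these toy counts the empirical frequencies (1/2, 0, 1/2) are not of the form Aq for any q in the simplex, so the constrained fit must move away from them. Solving the two-parameter problem by hand gives q = (1/3, 0, 2/3) and a fitted mean of 2/3, below the empirical mean of 1, which is consistent with the pointwise inequality described above.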

Article information

Ann. Statist., Volume 30, Number 1 (2002), 202-219.

First available in Project Euclid: 5 March 2002

Primary: 92D10: Genetics {For genetic algebras, see 17D92}; 62F10: Point estimation

Keywords: crossovers; chiasma; maximum likelihood estimation; order-restricted inference; nonlinear programming


Cohen, Arthur; Kemperman, J.H.B.; Sackrowitz, Harold. On the bias in estimating genetic length and other quantities in simplex constrained models. Ann. Statist. 30 (2002), no. 1, 202-219. doi:10.1214/aos/1015362190.


  • ANDERSON, T. W. (1971). The Statistical Analysis of Time Series. Wiley, New York.
  • COHEN, A., KEMPERMAN, J. H. B. and SACKROWITZ, H. B. (1994). Unbiased testing in exponential family regression. Ann. Statist. 22 1931-1946.
  • LEE, C. C. (1988). Quadratic loss of order restricted estimators for treatment means with a control. Ann. Statist. 16 751-758.
  • LIU, C. (2000). Estimation of discrete distributions with a class of simplex constraints. J. Amer. Statist. Assoc. 95 109-120.
  • MATHER, K. (1933). The relation between chiasmata and crossing-over in diploid and triploid Drosophila melanogaster. J. Genetics 27 243-259.
  • MATHER, K. (1938). Crossing-over. Biol. Rev. Cambridge Philos. Soc. 13 252-292.
  • OTT, J. (1996). Estimating crossover frequencies and testing for numerical interference with highly polymorphic markers. In Genetic Mapping and DNA Sequencing (T. Speed and M. S. Waterman, eds.) 49-63. Springer, New York.
  • ROBERTSON, T., WRIGHT, F. T. and DYKSTRA, R. L. (1988). Order Restricted Statistical Inference. Wiley, New York.
  • YU, K. and FEINGOLD, E. (2001). Estimating the frequency distribution of crossovers during meiosis from recombination data. Biometrics 57 427-434.
  • ZANGWILL, W. I. and MOND, B. (1969). Nonlinear Programming: A Unified Approach. Prentice-Hall, Englewood Cliffs, NJ.