The Annals of Statistics

Parametric deconvolution of positive spike trains

Lei Li and Terence P. Speed

Abstract

This paper describes a parametric deconvolution method (PDPS) appropriate for a particular class of signals which we call spike-convolution models. These models arise when a sparse spike train (Dirac deltas according to our mathematical treatment) is convolved with a fixed point-spread function, and additive noise or measurement error is superimposed. We view deconvolution as an estimation problem, regarding the locations and heights of the underlying spikes, as well as the baseline and the measurement error variance, as unknown parameters. Our estimation scheme consists of two parts: model fitting and model selection. To fit a spike-convolution model of a specific order, we estimate peak locations by trigonometric moments, and heights and the baseline by least squares. The model selection procedure has two stages. Its first stage is so designed that we expect a model of a somewhat larger order than the truth to be selected. In the second stage, the final model is obtained using backwards deletion. This results in not only an estimate of the model order, but also an estimate of peak locations and heights with much smaller bias and variation than that found in a direct trigonometric moment estimate. A more efficient maximum likelihood estimate can be calculated from these estimates using a Gauss–Newton algorithm. We also present some relevant results concerning the spectral structure of Toeplitz matrices which play a key role in the estimation. Finally, we illustrate the behavior of these estimates using simulated and real DNA sequencing data.
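As a rough illustration of the spike-convolution model and of the least-squares step mentioned in the abstract, the following Python sketch simulates a signal and recovers heights and baseline for fixed spike locations. It is not the authors' PDPS implementation: the Gaussian point-spread function, its width, the sampling grid, and all function names are assumptions made for the example, and the spike locations are taken as known rather than estimated by trigonometric moments.

```python
import numpy as np

def gaussian_psf(t, width):
    """Hypothetical point-spread function: an unnormalized Gaussian bump."""
    return np.exp(-0.5 * (t / width) ** 2)

def simulate(t, locations, heights, baseline, width, noise_sd, rng):
    """Spike train convolved with the PSF, plus a constant baseline and noise."""
    signal = baseline + sum(h * gaussian_psf(t - loc, width)
                            for loc, h in zip(locations, heights))
    return signal + noise_sd * rng.standard_normal(t.size)

def fit_heights_and_baseline(t, y, locations, width):
    """Least-squares estimate of heights and baseline for fixed locations."""
    # One design column per spike (the shifted PSF) and one for the baseline.
    columns = [gaussian_psf(t - loc, width) for loc in locations]
    design = np.column_stack(columns + [np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

# Assumed toy setup: three positive spikes, two of them partly overlapping.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 501)
true_locations = [2.0, 5.0, 5.8]
true_heights = [1.0, 0.6, 0.9]
y = simulate(t, true_locations, true_heights, baseline=0.2,
             width=0.3, noise_sd=0.02, rng=rng)

heights, baseline = fit_heights_and_baseline(t, y, true_locations, width=0.3)
print("estimated heights:", heights)
print("estimated baseline:", baseline)
```

Because the model is linear in the heights and the baseline once the locations are fixed, ordinary least squares suffices for this step; the nonlinear part of the problem lies in the locations and the model order, which this sketch does not address.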

Article information

Source
Ann. Statist., Volume 28, Number 5 (2000), 1279-1301.

Dates
First available in Project Euclid: 12 March 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1015957394

Digital Object Identifier
doi:10.1214/aos/1015957394

Mathematical Reviews number (MathSciNet)
MR1805784

Zentralblatt MATH identifier
1105.62382

Subjects
Primary: 62F10: Point estimation
Secondary: 62F12: Asymptotic properties of estimators; 86A22: Inverse problems [See also 35R30]

Keywords
Deconvolution; spike train; model selection; DNA sequencing; Toeplitz matrix

Citation

Li, Lei; Speed, Terence P. Parametric deconvolution of positive spike trains. Ann. Statist. 28 (2000), no. 5, 1279--1301. doi:10.1214/aos/1015957394. https://projecteuclid.org/euclid.aos/1015957394

