The Annals of Applied Statistics

Probit models for capture–recapture data subject to imperfect detection, individual heterogeneity and misidentification

Brett T. McClintock, Larissa L. Bailey, Brian P. Dreher, and William A. Link

Full-text: Open access


As noninvasive sampling techniques for animal populations have become more popular, there has been increasing interest in the development of capture–recapture models that can accommodate both imperfect detection and misidentification of individuals (e.g., due to genotyping error). However, current methods do not allow for individual variation in parameters, such as detection or survival probability. Here we develop misidentification models for capture–recapture data that can simultaneously account for temporal variation, behavioral effects and individual heterogeneity in parameters. To facilitate Bayesian inference using our approach, we extend standard probit regression techniques to latent multinomial models where the dimension and zeros of the response cannot be observed. We also present a novel Metropolis–Hastings within Gibbs algorithm for fitting these models using Markov chain Monte Carlo. Using closed population abundance models for illustration, we revisit a DNA capture–recapture population study of black bears in Michigan, USA, and find evidence of misidentification due to genotyping error, as well as temporal, behavioral and individual variation in detection probability. We also estimate a salamander population of known size from laboratory experiments evaluating the effectiveness of a marking technique commonly used for amphibians and fish. Our model reliably estimated the size of this population and provided evidence of individual heterogeneity in misidentification probability that is attributable to variable mark quality. Our approach is more computationally demanding than previously proposed methods, but it provides the flexibility necessary for a much broader suite of models to be explored while properly accounting for uncertainty introduced by misidentification and imperfect detection. In the absence of misidentification, our probit formulation also provides a convenient and efficient Gibbs sampler for Bayesian analysis of traditional closed population capture–recapture data.
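
As a rough illustration of why a probit formulation leads to a convenient Gibbs sampler, the following is a minimal sketch of the standard data-augmentation scheme for probit regression (Albert and Chib, 1993), reduced to a single constant detection probability p = Φ(β). It is not the authors' algorithm: it assumes the set of individuals is known, ignores misidentification and unknown population size, and all names (`gibbs_probit_intercept`, `prior_var`, and so on) are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_truncated_normal(mean, positive, rng):
    """Draw from N(mean, 1) truncated to (0, inf) if positive, else to (-inf, 0]."""
    p0 = STD_NORMAL.cdf(-mean)  # P(z <= 0) when z ~ N(mean, 1)
    u = rng.uniform(p0, 1.0) if positive else rng.uniform(0.0, p0)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside its open domain
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_probit_intercept(y, n_iter=2000, burn_in=500, prior_var=100.0, seed=1):
    """Gibbs sampler for y_i ~ Bernoulli(Phi(beta)) with prior beta ~ N(0, prior_var).

    Returns post-burn-in draws of the detection probability p = Phi(beta).
    """
    rng = random.Random(seed)
    n = len(y)
    beta = 0.0
    # With unit-variance latent variables, beta | z is normal with this variance.
    post_var = 1.0 / (n + 1.0 / prior_var)
    draws = []
    for it in range(n_iter):
        # Step 1: latent z_i | beta, y_i is a truncated normal (sign fixed by y_i).
        z_sum = sum(sample_truncated_normal(beta, yi == 1, rng) for yi in y)
        # Step 2: beta | z is a conjugate normal update.
        beta = rng.gauss(post_var * z_sum, post_var ** 0.5)
        if it >= burn_in:
            draws.append(STD_NORMAL.cdf(beta))
    return draws


if __name__ == "__main__":
    # Hypothetical data: 30 detections in 100 individual-by-occasion trials.
    detections = [1] * 30 + [0] * 70
    p_draws = gibbs_probit_intercept(detections)
    print("posterior mean of p:", sum(p_draws) / len(p_draws))
```

Both full conditionals are standard distributions, so no Metropolis–Hastings step is needed in this simplified setting. Covariates or individual random effects would replace the scalar β with a regression on the latent scale.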

Article information

Ann. Appl. Stat., Volume 8, Number 4 (2014), 2461–2484.

First available in Project Euclid: 19 December 2014

Keywords

Data augmentation; individual heterogeneity; latent multinomial; mark-recapture; missing data; population size; probit regression; record linkage


McClintock, Brett T.; Bailey, Larissa L.; Dreher, Brian P.; Link, William A. Probit models for capture–recapture data subject to imperfect detection, individual heterogeneity and misidentification. Ann. Appl. Stat. 8 (2014), no. 4, 2461–2484. doi:10.1214/14-AOAS783.

