The Annals of Applied Statistics

A random-effects hurdle model for predicting bycatch of endangered marine species

E. Cantoni, J. Mills Flemming, and A. H. Welsh



Understanding and reducing the incidence of accidental bycatch, particularly for vulnerable species such as sharks, is a major challenge for contemporary fisheries management worldwide. Bycatch data, most often collected by at-sea observers during fishing trips, are clustered by trip and/or vessel and typically involve a large number of zero counts and very few positive counts. Though hurdle models are very popular for count data with excess zeros, models for clustered forms have received far less attention. Here we present a novel random-effects hurdle model for bycatch data that makes available accurate estimates of bycatch probabilities as well as other cluster-specific targets. These are essential for informing conservation and management decisions as well as for identifying bycatch hotspots, often considered the first step in attempting to protect endangered marine species. We validate our methodology through simulation and use it to analyze bycatch data on critically endangered hammerhead sharks from the U.S. National Marine Fisheries Service Pelagic Observer Program.
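The abstract does not spell out the model's form, so the following is only a minimal sketch of the kind of data it describes: counts clustered by trip, mostly zero, with occasional positive values. It assumes a logistic "hurdle" for whether any bycatch occurs and a zero-truncated Poisson for the positive counts, each with its own normally distributed trip-level random intercept. The function names, parameter names (`alpha`, `beta`, `sd_u`, `sd_v`) and distributional choices are illustrative assumptions, not the authors' specification.

```python
import math
import random


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_zero_truncated_poisson(lam, rng):
    """Draw from Poisson(lam) conditioned on being >= 1 (inverse-CDF method)."""
    u = rng.random()
    k = 1
    norm = -math.expm1(-lam)            # 1 - exp(-lam) = P(Y > 0)
    pmf = lam * math.exp(-lam) / norm   # P(Y = 1 | Y > 0)
    cdf = pmf
    # Walk up the support until the cumulative probability covers u.
    while u > cdf and pmf > 0.0:
        k += 1
        pmf *= lam / k                  # recurrence p_k = p_{k-1} * lam / k
        cdf += pmf
    return k


def simulate_trips(n_trips, sets_per_trip, alpha, beta, sd_u, sd_v, seed=0):
    """Simulate clustered hurdle counts (hypothetical parameterisation).

    alpha, sd_u: intercept and random-effect SD for the zero/positive part.
    beta,  sd_v: intercept and random-effect SD for the positive-count part.
    """
    rng = random.Random(seed)
    data = []
    for trip in range(n_trips):
        u = rng.gauss(0.0, sd_u)        # trip effect on P(count > 0)
        v = rng.gauss(0.0, sd_v)        # trip effect on size of positive counts
        p_pos = expit(alpha + u)
        lam = math.exp(beta + v)
        counts = [
            sample_zero_truncated_poisson(lam, rng) if rng.random() < p_pos else 0
            for _ in range(sets_per_trip)
        ]
        data.append({"trip": trip, "p_positive": p_pos, "counts": counts})
    return data


trips = simulate_trips(n_trips=200, sets_per_trip=10,
                       alpha=-3.0, beta=0.0, sd_u=1.0, sd_v=0.5, seed=1)
all_counts = [c for t in trips for c in t["counts"]]
zero_share = sum(c == 0 for c in all_counts) / len(all_counts)
```

In this sketch the trip-specific value `p_positive` plays the role of a cluster-specific bycatch probability; with a strongly negative `alpha`, most simulated counts are zero, mimicking the excess zeros typical of observer data.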

Article information

Ann. Appl. Stat., Volume 11, Number 4 (2017), 2178-2199.

Received: June 2015
Revised: June 2017
First available in Project Euclid: 28 December 2017


Keywords: Bycatch; clustered count data; excess of zeros; random-effects; hurdle models; prediction


Cantoni, E.; Mills Flemming, J.; Welsh, A. H. A random-effects hurdle model for predicting bycatch of endangered marine species. Ann. Appl. Stat. 11 (2017), no. 4, 2178–2199. doi:10.1214/17-AOAS1074.




Supplemental materials

  • Supplementary Material for the paper “A random-effects hurdle model for predicting bycatch of endangered marine species”. The supplementary file contains four sections. The first gives a general formulation of the random-effects hurdle model. The second presents a result about maximum likelihood estimation of the model. The third introduces a fast bootstrap estimator and establishes its asymptotic distribution. The fourth gives additional simulation results, as discussed in the paper.