The Annals of Applied Statistics

Predicting Melbourne ambulance demand using kernel warping

Zhengyi Zhou and David S. Matteson



Predicting ambulance demand accurately at fine resolutions in space and time is critical for ambulance fleet management and dynamic deployment. Typical challenges include data sparsity at high resolutions and the need to respect complex urban spatial domains. To provide spatial density predictions for ambulance demand in Melbourne, Australia, as it varies over hourly intervals, we propose a predictive spatio-temporal kernel warping method. To predict for each hour, we build a kernel density estimator on a sparse set of the most similar data from relevant past time periods (labeled data), but warp these kernels to a larger set of past data regardless of time period (point cloud). The point cloud represents the spatial structure and geographical characteristics of Melbourne, including complex boundaries, road networks and neighborhoods. Borrowing from manifold learning, kernel warping is performed through a graph Laplacian of the point cloud and can be interpreted as a regularization toward, and a prior imposed for, spatial features. Kernel bandwidth and degree of warping are efficiently estimated via cross-validation, and can be made time- and/or location-specific. Our proposed model gives significantly more accurate predictions than a current industry practice, an unwarped kernel density estimator and a time-varying Gaussian mixture model.
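As a rough illustration of the warping step described above, the following sketch deforms a base kernel toward a point cloud through a graph Laplacian. It is not the authors' implementation. It assumes an isotropic Gaussian base kernel, an unnormalized Laplacian of a symmetrized k-nearest-neighbor graph, and the point-cloud kernel deformation commonly used in manifold regularization, k~(x, z) = k(x, z) - k_x^T (I + M K)^{-1} M k_z with M = warp * L. All function and parameter names (`gaussian_kernel`, `graph_laplacian`, `warped_kernel`, `warped_density`, `bandwidth`, `warp`, `n_neighbors`, `cell_area`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Isotropic Gaussian kernel matrix between rows of a (n, 2) and b (m, 2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def graph_laplacian(cloud, n_neighbors):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    n = cloud.shape[0]
    d2 = ((cloud[:, None, :] - cloud[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    w = np.maximum(w, w.T)  # connect i and j if either lists the other
    return np.diag(w.sum(axis=1)) - w


def warped_kernel(x, z, cloud, laplacian, bandwidth, warp):
    """Base kernel minus a correction that pulls it toward the point cloud.

    Assumed form: k~(x, z) = k(x, z) - k_x^T (I + M K)^{-1} M k_z,
    where K is the base Gram matrix on the cloud and M = warp * L.
    """
    k_xz = gaussian_kernel(x, z, bandwidth)
    k_cx = gaussian_kernel(cloud, x, bandwidth)
    k_cz = gaussian_kernel(cloud, z, bandwidth)
    gram = gaussian_kernel(cloud, cloud, bandwidth)
    m = warp * laplacian
    solved = np.linalg.solve(np.eye(cloud.shape[0]) + m @ gram, m @ k_cz)
    return k_xz - k_cx.T @ solved


def warped_density(grid, labeled, weights, cloud, laplacian,
                   bandwidth, warp, cell_area):
    """Weighted sum of warped kernels at labeled points, evaluated on a grid.

    Warped kernels do not integrate to one, so the result is renormalized
    numerically over the grid. Values are not clipped; handling of any
    negative values is left out of this sketch.
    """
    k = warped_kernel(grid, labeled, cloud, laplacian, bandwidth, warp)
    raw = k @ weights
    return raw / (raw.sum() * cell_area)
```

Setting `warp` to zero recovers the unwarped kernel density estimator, so the same code covers both ends of the comparison. In practice `bandwidth` and `warp` would be chosen by cross-validation, as the abstract describes; that selection step is not shown here.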

Article information

Ann. Appl. Stat., Volume 10, Number 4 (2016), 1977-1996.

Received: July 2015
Revised: April 2016
First available in Project Euclid: 5 January 2017


Keywords

Emergency medical service; kernel density estimation; manifold learning; graph Laplacian


Zhou, Zhengyi; Matteson, David S. Predicting Melbourne ambulance demand using kernel warping. Ann. Appl. Stat. 10 (2016), no. 4, 1977--1996. doi:10.1214/16-AOAS961.


