The Annals of Applied Statistics

A statistical modeling approach for air quality data based on physical dispersion processes and its application to ozone modeling

Xiao Liu, Kyongmin Yeo, Youngdeok Hwang, Jitendra Singh, and Jayant Kalagnanam

Full-text: Open access


Abstract

For many complex environmental processes such as air pollution, the underlying physical mechanism provides valuable insight for statistical modeling. In this paper, we propose a statistical air quality model motivated by a commonly used physical dispersion model, the scalar transport equation. The emission of a pollutant is modeled by covariates such as land use, traffic patterns and meteorological conditions, while the transport and decay of the pollutant are modeled through a convolution approach that takes into account the dynamic wind field. This approach naturally yields a nonstationary random field with a space–time nonseparable and anisotropic covariance structure. Because of the extremely complex interactions between the pollutant and environmental conditions, the space–time covariance structure of pollutant concentration data is often dynamic and is difficult to specify or envision directly. The relationship between the proposed spatial-temporal model and the physical model is also shown, and the approach is applied to hourly ozone concentration data in Singapore.
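For orientation, a generic textbook form of the scalar transport (advection-diffusion) equation is sketched below. The notation is an assumption made here for illustration and is not taken from the paper: $c$ is the pollutant concentration, $\mathbf{v}$ the wind field, $\mathbf{D}$ a diffusivity tensor, $\lambda$ a decay rate and $Q$ the emission source. The second display covers only the simplified case of a constant wind and isotropic diffusion on an unbounded $d$-dimensional domain, where the solution is a convolution of the source with a Gaussian kernel that drifts with the wind. The paper itself works with a dynamic wind field, so this is a simplification rather than the authors' formulation.

```latex
% Generic scalar transport equation (notation assumed for illustration)
\frac{\partial c(\mathbf{s},t)}{\partial t}
  + \mathbf{v}(\mathbf{s},t)\cdot\nabla c(\mathbf{s},t)
  = \nabla\cdot\bigl(\mathbf{D}\,\nabla c(\mathbf{s},t)\bigr)
  - \lambda\, c(\mathbf{s},t) + Q(\mathbf{s},t)

% Simplified case: constant wind \mathbf{v}, \mathbf{D} = D\,\mathbf{I}, unbounded domain
c(\mathbf{s},t)
  = \int_{-\infty}^{t}\!\int_{\mathbb{R}^d}
    G\bigl(\mathbf{s}-\mathbf{s}'-\mathbf{v}\,(t-t'),\, t-t'\bigr)\,
    Q(\mathbf{s}',t')\,\mathrm{d}\mathbf{s}'\,\mathrm{d}t',
\qquad
G(\mathbf{u},\tau)
  = \frac{e^{-\lambda\tau}}{(4\pi D\tau)^{d/2}}
    \exp\!\left(-\frac{\lVert\mathbf{u}\rVert^{2}}{4D\tau}\right)
```

Because the kernel's spatial argument is shifted by $\mathbf{v}\,(t-t')$, space and time cannot be factored apart and the direction of the wind is singled out, which gives some intuition for why a model built this way has a nonseparable and anisotropic covariance structure.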

Article information

Ann. Appl. Stat., Volume 10, Number 2 (2016), 756-785.

Received: December 2014
Revised: January 2016
First available in Project Euclid: 22 July 2016

Digital Object Identifier: doi:10.1214/15-AOAS901

Keywords: Spatial-temporal modeling; air quality model; partial differential equation; space–time nonseparable and anisotropic random field


Citation

Liu, Xiao; Yeo, Kyongmin; Hwang, Youngdeok; Singh, Jitendra; Kalagnanam, Jayant. A statistical modeling approach for air quality data based on physical dispersion processes and its application to ozone modeling. Ann. Appl. Stat. 10 (2016), no. 2, 756–785. doi:10.1214/15-AOAS901.




Supplemental materials

  • A simulation study and some useful animations. Because the finite-sample properties of the estimators presented in Section 4.2 are generally unknown, a Monte Carlo simulation study is performed to investigate their statistical properties, such as unbiasedness. Animations are also provided to illustrate the proposed modeling approach.
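The general idea of a Monte Carlo check for unbiasedness can be sketched as follows. This is a minimal illustration, not the simulation design of the supplement: the estimator here is an ordinary least-squares slope in a toy regression, standing in for the paper's estimators, and every name (`monte_carlo_bias`, `simulate_regression`, `ols_slope`, `TRUE_SLOPE`) is hypothetical.

```python
import numpy as np


def monte_carlo_bias(estimator, simulate, true_value, n_rep=2000, seed=0):
    """Estimate the bias of `estimator` by repeated simulation.

    Returns the estimated bias and its Monte Carlo standard error.
    """
    rng = np.random.default_rng(seed)
    # Draw a fresh data set for each replicate and apply the estimator to it.
    estimates = np.array([estimator(*simulate(rng)) for _ in range(n_rep)])
    bias = estimates.mean() - true_value
    mc_se = estimates.std(ddof=1) / np.sqrt(n_rep)
    return bias, mc_se


# Toy stand-in (an assumption): linear regression with a known slope.
TRUE_SLOPE = 2.0


def simulate_regression(rng, n=50):
    """Simulate one data set y = 1 + TRUE_SLOPE * x + Gaussian noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = 1.0 + TRUE_SLOPE * x + rng.normal(0.0, 0.5, size=n)
    return x, y


def ols_slope(x, y):
    """Ordinary least-squares slope estimate."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


bias, mc_se = monte_carlo_bias(ols_slope, simulate_regression, TRUE_SLOPE)
print(f"estimated bias = {bias:.4f} (Monte Carlo s.e. {mc_se:.4f})")
```

An estimated bias that is small relative to its Monte Carlo standard error is consistent with unbiasedness at that sample size. Repeating the check for several values of `n` shows how any finite-sample bias changes as the sample grows.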