The Annals of Applied Statistics

Causal inference in transportation safety studies: Comparison of potential outcomes and causal diagrams

Vishesh Karwa, Aleksandra B. Slavković, and Eric T. Donnell



The research questions that motivate transportation safety studies are causal in nature. Safety researchers typically use observational data to answer such questions, but often without appropriate causal inference methodology. The field of causal inference presents several modeling frameworks for probing empirical data to assess causal relations. This paper explores the applicability of two such modeling frameworks, Causal Diagrams and Potential Outcomes, to a specific transportation safety problem: estimating the causal effect of pavement marking retroreflectivity on the safety of a road segment. More specifically, we compare the results of three implementations of these frameworks on a real data set: Inverse Propensity Score Weighting with regression adjustment and Propensity Score Matching with regression adjustment versus a Causal Bayesian Network. Increased pavement marking retroreflectivity was generally found to reduce the probability of target nighttime crashes. However, the magnitudes of the estimated causal effects are sensitive to the method used and to violations of the underlying assumptions.
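For readers unfamiliar with inverse propensity score weighting, the following is a minimal sketch of the general idea on simulated data. It is not the authors' implementation: it omits the regression adjustment used in the paper, and every name and number in it (the binary confounder `x`, the treatment and crash probabilities, the functions `simulate`, `naive_difference`, `propensity_by_stratum`, and `ipw_difference`) is a hypothetical choice made only for illustration.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, t, y) rows with a binary confounder x (all values hypothetical).

    x: confounder (e.g. a high-traffic segment), t: treatment
    (e.g. high retroreflectivity), y: outcome (e.g. a nighttime crash).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        # Treatment assignment depends on the confounder.
        p_treat = 0.8 if x == 1 else 0.2
        t = 1 if rng.random() < p_treat else 0
        # The simulated causal effect of treatment on crash risk is -0.05.
        p_crash = 0.10 + 0.20 * x - 0.05 * t
        y = 1 if rng.random() < p_crash else 0
        rows.append((x, t, y))
    return rows


def naive_difference(rows):
    """Unadjusted difference in crash rates between treated and untreated rows."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def propensity_by_stratum(rows):
    """Estimate P(t = 1 | x) as the treated fraction within each stratum of x."""
    counts = {}
    for x, t, _ in rows:
        n_x, n_treated = counts.get(x, (0, 0))
        counts[x] = (n_x + 1, n_treated + t)
    return {x: n_treated / n_x for x, (n_x, n_treated) in counts.items()}


def ipw_difference(rows):
    """Normalized inverse-propensity-weighted difference in crash rates.

    Assumes every stratum contains both treated and untreated rows
    (positivity); otherwise a weight would require division by zero.
    """
    e = propensity_by_stratum(rows)
    w1 = wy1 = w0 = wy0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / e[x]
            w1 += w
            wy1 += w * y
        else:
            w = 1.0 / (1.0 - e[x])
            w0 += w
            wy0 += w * y
    return wy1 / w1 - wy0 / w0


if __name__ == "__main__":
    rows = simulate(50_000, seed=1)
    print("naive difference:", naive_difference(rows))
    print("IPW difference:  ", ipw_difference(rows))
```

In this simulation the treated segments are mostly the high-risk ones, so the unadjusted comparison tends to make the treatment look harmful, while weighting each row by the inverse of its estimated probability of receiving the treatment it actually received should bring the estimate close to the simulated effect of -0.05.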

Article information

Ann. Appl. Stat., Volume 5, Number 2B (2011), 1428-1455.

First available in Project Euclid: 13 July 2011

Keywords

Causal inference; potential outcomes; causal Bayesian networks; observational studies; transportation safety; nighttime crash data


Karwa, Vishesh; Slavković, Aleksandra B.; Donnell, Eric T. Causal inference in transportation safety studies: Comparison of potential outcomes and causal diagrams. Ann. Appl. Stat. 5 (2011), no. 2B, 1428–1455. doi:10.1214/10-AOAS440.




Supplemental materials

  • Supplementary material: Supplement to “Causal inference in transportation safety studies: Comparison of potential outcomes and causal diagrams”. This document contains additional details about the Matching and Inverse Propensity Score estimators and the top ten graphs recovered by the graph learning algorithm.