Open Access
Sequential modeling, monitoring, and forecasting of streaming web traffic data
Kaoru Irie, Chris Glynn, Tevfik Aktekin
Ann. Appl. Stat. 16(1): 300-325 (March 2022). DOI: 10.1214/21-AOAS1505

Abstract

In this paper, we introduce strategies for modeling, monitoring, and forecasting sequential web traffic data using flows from the Fox News website. In our analysis, we consider a family of Poisson-gamma state space (PGSS) models that can accurately quantify the uncertainty exhibited by web traffic data, can provide fast sequential monitoring and prediction mechanisms for high-frequency time intervals, and are computationally feasible when structural breaks are present. To this end, we extend the family of PGSS models to include the state-augmented (sa-)PGSS model, whose state evolution structure is flexible and responsive to sudden changes. Such adaptability is achieved by augmenting the state vector of the PGSS model with an additional state variable for a time-varying discount factor. We develop an efficient particle-based estimation procedure that is suitable for sequential analysis, allowing us to estimate dynamic state variables and static parameters via closed-form conditional sufficient statistics. We compare the performance of the PGSS family of models against viable alternatives from the literature and argue that, especially in the presence of structural breaks, our proposed approach yields superior sequential model fit and predictive performance while preserving computational feasibility. We provide additional insights by designing a simulation study that mimics potential web traffic data patterns.
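To make the role of the discount factor concrete, the following is a minimal sketch of the conjugate filtering recursion commonly used for discounted Poisson-gamma models, assuming a single fixed discount factor. It is not the paper's sa-PGSS procedure, which treats the discount factor as a time-varying state and relies on particle-based estimation. The function name `pgss_filter`, the parameter names, and the prior values are hypothetical choices made for illustration.

```python
def pgss_filter(counts, gamma=0.9, a0=1.0, b0=1.0):
    """Filter a count series with a fixed-discount Poisson-gamma recursion.

    Assumed model (illustrative, not taken from the paper):
      y_t | theta_t ~ Poisson(theta_t)
      theta_t | data up to t-1 ~ Gamma(shape=gamma * a, rate=gamma * b)
    Returns a list of (one-step forecast mean, posterior mean) per time step.
    """
    a, b = a0, b0
    results = []
    for y in counts:
        # Discount step: keep the mean a / b but inflate the variance,
        # so that older observations carry less weight.
        a_prior, b_prior = gamma * a, gamma * b
        forecast_mean = a_prior / b_prior
        # Conjugate Poisson-gamma update with the new count y.
        a, b = a_prior + y, b_prior + 1.0
        results.append((forecast_mean, a / b))
    return results


if __name__ == "__main__":
    # Hypothetical counts with a jump halfway through the series.
    series = [10, 12, 11, 50, 55, 52]
    for gamma in (0.5, 0.95):
        means = [round(post, 1) for _, post in pgss_filter(series, gamma)]
        print(f"gamma={gamma}: posterior means {means}")
```

With a smaller discount factor the posterior mean moves toward new observations more quickly, at the cost of noisier estimates in stable periods. That trade-off is one reason a time-varying discount factor can be attractive when structural breaks occur.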

Funding Statement

The first author was supported by the Japan Society for the Promotion of Science (JSPS KAKENHI) grant number 17K17659.

Acknowledgments

The authors would like to thank the Editor, the Associate Editor, and the anonymous referees for helpful comments.

Citation


Kaoru Irie, Chris Glynn, Tevfik Aktekin. "Sequential modeling, monitoring, and forecasting of streaming web traffic data." Ann. Appl. Stat. 16(1): 300-325, March 2022. https://doi.org/10.1214/21-AOAS1505

Information

Received: 1 March 2019; Revised: 1 April 2021; Published: March 2022
First available in Project Euclid: 28 March 2022

MathSciNet: MR4400511
zbMATH: 1498.62310
Digital Object Identifier: 10.1214/21-AOAS1505

Keywords: count data, high frequency, Poisson-gamma, sequential Monte Carlo, Web traffic

Rights: Copyright © 2022 Institute of Mathematical Statistics
