We propose a novel and efficient method, which we call TopRank, for detecting change-points in high-dimensional data. This problem is of growing concern to the network security community, since network anomalies such as Denial of Service (DoS) attacks lead to changes in Internet traffic. Our method consists of a data reduction stage based on record filtering, followed by a nonparametric change-point detection test based on U-statistics. With this approach, we can process massive data streams and perform anomaly detection and localization on the fly. We show how it applies to real Internet traffic data provided by France-Télécom (a French Internet service provider) in the framework of the ANR-RNRT OSCAR project. The approach is attractive because it has a low computational load and is able to detect and localize several types of network anomalies. We also assess the performance of the TopRank algorithm on synthetic data and compare it with alternative approaches based on random aggregation.
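To make the two-stage structure concrete, the following is a minimal Python sketch under stated assumptions, not the TopRank implementation itself. The abstract does not specify the filtering rule or the test statistic, so the sketch assumes that filtering keeps the keys (for example, destination addresses) ranked among the `m` most frequent in at least one time window, and it swaps in a plain Mann-Whitney-type sign statistic as one simple example of a U-statistic. The function names `top_m_series` and `rank_changepoint` and the parameter `m` are hypothetical.

```python
from collections import Counter


def top_m_series(windows, m):
    """Record filtering (assumed rule): keep only the keys that are among
    the m most frequent in at least one window, and return the per-window
    count series of each kept key."""
    counts = [Counter(w) for w in windows]
    kept = set()
    for c in counts:
        kept.update(key for key, _ in c.most_common(m))
    return {key: [c.get(key, 0) for c in counts] for key in kept}


def rank_changepoint(series):
    """Scan every split point t and compute the sign statistic
    U_t = sum over i < t <= j of sign(x_j - x_i), normalized by the
    number of pairs. Returns (t, |U_t| / pairs) for the split with the
    largest normalized value, or (None, 0.0) if the series is constant."""
    n = len(series)
    best_t, best_stat = None, 0.0
    for t in range(1, n):
        before, after = series[:t], series[t:]
        u = sum((b > a) - (b < a) for a in before for b in after)
        stat = abs(u) / (t * (n - t))
        if stat > best_stat:  # strict: keep the earliest maximizer
            best_t, best_stat = t, stat
    return best_t, best_stat
```

In use, each filtered series would be passed to `rank_changepoint`, and a key would be flagged when its statistic exceeds a threshold. Calibrating that threshold requires the limiting distribution of the actual test, which this sketch does not attempt to reproduce. The quadratic scan is kept for readability rather than speed.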