A clustering-based method for outlier detection under concept drift

  • Mahjabeen Tahir, Department of Computer Science and Information Technology, University Putra Malaysia (UPM), Serdang, 43400, Selangor, Malaysia
  • Azizol Abdullah, Department of Computer Science and Information Technology, University Putra Malaysia (UPM), Serdang, 43400, Selangor, Malaysia
  • Nur Izura Udzir, Department of Computer Science and Information Technology, University Putra Malaysia (UPM), Serdang, 43400, Selangor, Malaysia
  • Khairul Azhar Kasmiran, Department of Computer Science and Information Technology, University Putra Malaysia (UPM), Serdang, 43400, Selangor, Malaysia

Abstract

Network security threats persist, motivating the exploration of alternative detection approaches. Anomaly-based strategies, unlike traditional signature-based methods, are gaining popularity for their effectiveness in detecting new attacks. However, accurately defining normal network behavior becomes increasingly difficult as the data distribution changes over time. This study introduces a two-step process for recognizing evolving anomalies in streaming network data. First, clusters are updated incrementally as new data arrive (the updating phase). Second, anomalies are identified by distinguishing outer and inner outliers using minimum and maximum density thresholds. A buffer temporarily stores incoming data to prevent normal network samples from being misclassified as anomalies. The method is implemented in Python 3 and evaluated in terms of detection rate, false positive rate, and accuracy on two popular streaming datasets (NSL-KDD and UNSW-NB15). The algorithm achieves a detection rate of 99.12% on UNSW-NB15 and a false positive rate of 7.9% on NSL-KDD. The proposed approach, CADSD (Cluster-based Anomaly Detection with Streaming Data), operates in real time without pre-training. However, it assumes that the majority of the data consists of normal instances, so its effectiveness may diminish during sudden spikes in attack data. Nonetheless, the method shows the potential to enhance network security by promptly identifying emerging anomalies in real-time streaming data. The buffer, which guards against misidentifying normal network samples as anomalies, is the main novel element of the approach.
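
To make the idea of incremental cluster updates combined with a buffer more concrete, the following Python sketch shows one possible shape such a detector could take. It is not the CADSD algorithm: the class names, the `radius`, `min_density`, and `buffer_size` parameters, and the flush rule are all hypothetical choices made for illustration. For brevity it uses a single density threshold rather than the minimum and maximum thresholds mentioned above, and it does not attempt to reproduce the distinction between inner and outer outliers.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical micro-cluster summarised by a centroid and a point count."""
    centroid: list
    count: int = 1

    def absorb(self, point):
        # Incremental mean update, so no past points need to be stored.
        self.count += 1
        self.centroid = [c + (p - c) / self.count
                         for c, p in zip(self.centroid, point)]


class StreamingDetector:
    """Illustrative sketch only; all parameters are assumptions."""

    def __init__(self, radius, min_density, buffer_size):
        self.radius = radius            # max distance to join a cluster
        self.min_density = min_density  # neighbours (incl. self) needed to count as dense
        self.buffer_size = buffer_size  # unmatched points held before a decision
        self.clusters = []
        self.buffer = []

    def _nearest(self, point):
        best, best_dist = None, math.inf
        for cluster in self.clusters:
            d = math.dist(cluster.centroid, point)
            if d < best_dist:
                best, best_dist = cluster, d
        return best, best_dist

    def process(self, point):
        """Handle one sample; return the points flagged as anomalies, if any."""
        cluster, dist = self._nearest(point)
        if cluster is not None and dist <= self.radius:
            cluster.absorb(point)       # updating phase
            return []
        # Unmatched samples wait in the buffer instead of being flagged at once.
        self.buffer.append(point)
        if len(self.buffer) < self.buffer_size:
            return []
        return self._flush()

    def _flush(self):
        pending, self.buffer = self.buffer, []
        anomalies = []
        for p in pending:
            neighbours = [q for q in pending if math.dist(p, q) <= self.radius]
            if len(neighbours) >= self.min_density:
                # Dense region in the buffer: treat as drift, not as an attack.
                cluster, dist = self._nearest(p)
                if cluster is not None and dist <= self.radius:
                    cluster.absorb(p)
                else:
                    self.clusters.append(Cluster(list(p)))
            else:
                anomalies.append(p)     # sparse point: flagged as an anomaly
        return anomalies
```

In this sketch, calling `process` on each arriving sample either updates an existing cluster, defers the decision by buffering, or returns the buffered points that remained isolated. Deferring the decision is what lets a newly emerging group of normal samples form a cluster before any of its members is flagged.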

Published
Jul 1, 2024
How to Cite
TAHIR, Mahjabeen et al. A clustering-based method for outlier detection under concept drift. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 43, n. 3, p. 205-218, july 2024. ISSN 2413-7219. Available at: <https://publications.muet.edu.pk/index.php/muetrj/article/view/3269>. Date accessed: 16 july 2024. doi: http://dx.doi.org/10.22581/muet1982.3269.
Section
Articles
This is an open access article published by Mehran University of Engineering and Technology, Jamshoro, under the CC BY 4.0 International License.