Machine Learning Classification of Port Scanning and DDoS Attacks: A Comparative Analysis

Muhammad Aamir; Syed Sajjad Hussain Rizvi; Manzoor Ahmed Hashmani; Muhammad Zubair; Jawwad Ahmed

doi:10.22581/muet1982.2101.19

Muhammad Aamir, Shaheed Zulfikar Ali Bhutto Institute of Science & Technology (SZABIST), Karachi, Pakistan.
Syed Sajjad Hussain Rizvi, Shaheed Zulfikar Ali Bhutto Institute of Science & Technology (SZABIST), Karachi, Pakistan.
Manzoor Ahmed Hashmani, Department of Computer and Information Sciences, Centre for Research in Data Science (CERDAS), Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia.
Muhammad Zubair, Iqra University, Karachi, Pakistan.
Jawwad Ahmed, Usman Institute of Technology, Karachi, Pakistan.

Abstract

Cyber security is one of the major concerns of today's connected world. On every platform of today's communication technology, whether wired, wireless, local, or remote access, hackers attempt to corrupt system functionality, circumvent security measures, and steal sensitive information. Among the many techniques hackers use, port scanning and Distributed Denial of Service (DDoS) attacks are very common. In this paper, machine learning is applied to the classification of port scanning and DDoS attacks in a mix of normal and attack traffic. Different machine learning algorithms are trained and tested on a recently published benchmark dataset (CICIDS2017) to identify the best-performing algorithms on data that contains more recent vectors of port scanning and DDoS attacks. The classification results show that all variants of discriminant analysis and Support Vector Machine (SVM) provide good testing accuracy, i.e. more than 90%. According to a subjective rating criterion described in this paper, 9 of the 22 algorithms evaluated receive the highest rating (good), as they provide more than 85% classification (testing) accuracy. The comparative analysis is further extended to observe the training performance of the machine learning models through k-fold cross validation, Area Under Curve (AUC) analysis of the Receiver Operating Characteristic (ROC) curves, and dimensionality reduction using Principal Component Analysis (PCA). To the best of our knowledge, a comprehensive comparison of machine learning algorithms on the CICIDS2017 dataset for port scanning and DDoS attacks, considering such recent attack features, has been lacking.
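
As a rough illustration of the evaluation procedure named in the abstract (k-fold cross validation with accuracy and ROC AUC per fold), the following minimal sketch may help. It is not the authors' pipeline: a simple nearest-centroid rule stands in for the algorithms compared in the paper, the two-feature data are synthetic rather than drawn from CICIDS2017, and all function names (k_fold_indices, roc_auc, cross_validate) are hypothetical helpers introduced here for illustration only.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Tied pairs count as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    """Per-feature mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cross_validate(X, y, k=5, seed=0):
    """Return per-fold (accuracy, AUC) for a nearest-centroid stand-in classifier."""
    results = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        c0 = centroid([X[i] for i in train if y[i] == 0])
        c1 = centroid([X[i] for i in train if y[i] == 1])
        # A score above 0 means the sample lies closer to the attack centroid.
        scores = [sq_dist(X[i], c0) - sq_dist(X[i], c1) for i in held_out]
        truth = [y[i] for i in held_out]
        acc = sum((s > 0) == (t == 1) for s, t in zip(scores, truth)) / len(truth)
        results.append((acc, roc_auc(truth, scores)))
    return results

# Synthetic two-feature "flows": label 0 = normal, 1 = attack (made-up data).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)] + \
    [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

for fold, (acc, auc) in enumerate(cross_validate(X, y), start=1):
    print(f"fold {fold}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

Each fold is held out once while the remaining folds are used to fit the classifier, so every sample contributes to exactly one test score. Reporting both accuracy and AUC per fold shows how much the estimate varies with the split, which is the purpose of cross validation in a comparison of this kind.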