False Positive Identification in Intrusion Detection Using XAI
Keywords:
Intrusion detection, machine learning, explainability, XAI, false positive rate
Abstract
As sensitive data is increasingly accessed over the Internet, intrusion detection has become an essential security measure. Advances in Artificial Intelligence over recent decades, notably in Machine Learning, combined with the availability of network traffic datasets, have opened a vast field for research and development in anomaly-based Intrusion Detection Systems. Published studies on the subject nonetheless agree that this type of detection is more prone to false positives. To mitigate this problem, we propose a method for identifying false positives that is more effective than relying on the algorithm’s confidence alone. We hypothesize that the relevance the algorithm assigns to certain attributes may be related to whether a detection is true or false. The method therefore obtains these feature relevances through eXplainable Artificial Intelligence (XAI) and combines them with a confidence measure to identify detections that are more likely to be false. On the LYCOS-IDS2017 dataset, the method eliminates more than 65% of all false positives while losing only 0.38% of true positives. By contrast, using a confidence measure alone eliminates only about 50% of false positives and loses 0.42% of true positives.
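The idea of combining per-feature relevance with a confidence measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method evaluated in the paper: the detector is a toy linear scorer, its relevance values (weight times deviation from a baseline) stand in for XAI attributions, and a nearest-centroid rule stands in for whatever second-stage decision the authors use. The names `alert_features`, `centroid`, and `likely_false`, and all numeric values, are hypothetical.

```python
import math


def sigmoid(z):
    """Map a raw detector score to a confidence in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def alert_features(weights, x, baseline):
    """Return [confidence, relevance_1, ..., relevance_n] for one alert.

    Assumption: a toy linear detector, where the relevance of feature j
    is w_j * (x_j - baseline_j) and the score is the sum of relevances.
    """
    rel = [w * (xi - b) for w, xi, b in zip(weights, x, baseline)]
    confidence = sigmoid(sum(rel))
    return [confidence] + rel


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def likely_false(features, tp_centroid, fp_centroid):
    """Flag an alert whose features lie closer to past false positives."""
    return math.dist(features, fp_centroid) < math.dist(features, tp_centroid)


# Hypothetical detector parameters and labeled validation alerts.
WEIGHTS = [2.0, -1.0]
BASELINE = [0.0, 0.0]
true_alerts = [[1.0, 0.0], [1.0, 0.1]]      # confirmed attacks
false_alerts = [[0.2, -1.6], [0.1, -1.8]]   # confirmed false positives

tp_centroid = centroid([alert_features(WEIGHTS, x, BASELINE) for x in true_alerts])
fp_centroid = centroid([alert_features(WEIGHTS, x, BASELINE) for x in false_alerts])

# New alerts: both receive nearly the same confidence, but the relevance
# is concentrated on different features.
new_alerts = [[0.15, -1.7], [1.0, 0.05]]
flags = [
    likely_false(alert_features(WEIGHTS, x, BASELINE), tp_centroid, fp_centroid)
    for x in new_alerts
]
print(flags)
```

In this toy data all alerts have nearly identical confidence, so a confidence threshold alone could not separate them, whereas the relevance pattern does. A real system would replace the linear relevance with attributions from an XAI method and the centroid rule with a properly validated classifier.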