An Intrusion Detection System for Web-Based Attacks Using IBM Watson
Keywords: Web applications, cyber attacks, intrusion attack, machine learning, classification, IBM Watson
The internet and web applications have been growing steadily, and the number of cyber attacks has grown with them. These attacks are carried out through requests, and each request can be considered either normal or abnormal (an attack request). Intrusion detection can therefore be treated as a classification problem. Machine learning algorithms are used to train models that classify these requests in order to increase the security of web systems. The data used for training and testing in this work come from the CSIC 2010 dataset. The J48, Naive Bayes, OneR, Random Forest and IBM Watson LGBM algorithms were tested. The metrics used were t-rate, precision, recall and F-measure. The results showed that the algorithm used by the Watson tool (LGBM) performed best on all metrics when compared with the other algorithms from the literature.
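To make the evaluation concrete, the following minimal sketch shows how precision, recall and F-measure could be computed for a binary normal/attack classifier. It is an illustration under stated assumptions, not the evaluation code used in this work: the function name `binary_metrics`, the label strings and the six example labels are hypothetical, and the t-rate metric is not covered.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                   positive: str = "attack") -> Dict[str, float]:
    """Precision, recall and F-measure with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical labels for six requests (illustrative only, not CSIC 2010 data).
y_true = ["attack", "attack", "attack", "normal", "normal", "normal"]
y_pred = ["attack", "attack", "normal", "attack", "normal", "normal"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy example one attack is missed and one normal request is flagged, so precision and recall for the attack class are both 2/3, and the F-measure, being their harmonic mean, is also 2/3.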