Detection of Obfuscation Malware: A Federated Transfer Learning-based Approach with Hybrid Neural Networks
Keywords:
Federated learning, Transfer learning, Federated transfer learning, malware, cyberattacks, neural network, TensorFlow.
Abstract
The rising incidence of cyberattacks, particularly those that exploit vulnerabilities through complex mechanisms such as malware obfuscation, has driven the adoption of Machine Learning (ML) techniques in cybersecurity. This study investigates the application of Federated Learning (FL), a decentralized approach that preserves data privacy and avoids the need to transfer large volumes of data. Two labeled datasets were used, CIC-MalMem-2022 and the Malware Detection Dataset, along with two FL frameworks, Flower and TensorFlow Federated. A decentralized model based on a Linear Neural Network (LNN) trained with federated averaging (FedAvg) was compared with a centralized model based on a Recurrent Neural Network (RNN) on supervised binary classification of malware. The results show high accuracy across all analyzed scenarios. Notable among them is centralized training on the CIC-MalMem-2022 dataset, which achieved an accuracy of 0.99, a precision of 1.0, and a recall of 0.99. These results point to the potential of FL in cybersecurity.
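As a rough illustration of the federated averaging (FedAvg) step named in the abstract, the sketch below combines client parameter vectors into a global vector, weighting each client by its number of local samples. It is a minimal example under stated assumptions, not the authors' implementation and not the API of Flower or TensorFlow Federated: the function name `fedavg`, the flat-list representation of model parameters, and the toy values are all hypothetical.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Hypothetical helper for illustration: each client's model is assumed
    to be flattened into a list of floats of the same length.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")

    # Global parameter i is the sample-weighted mean of the clients' values.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Toy example: two clients, the second holding three times as much data.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]
```

In a real federated round, the server would broadcast the averaged parameters back to the clients, which would then continue local training from them; only parameters, not raw data, leave each client.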