Expert Selection for Wordlist-Based DGA Detection

Reynier Leyva La O; Carlos A. Catania; Rodrigo Gonzalez

Authors

Reynier Leyva La O, GridTICs, Facultad Regional Mendoza, Universidad Tecnológica Nacional, Mendoza, Argentina, and also with the National Scientific and Technical Research Council (CONICET), Godoy Cruz 2290, C1425FQB, CABA, Argentina. ORCID: https://orcid.org/0000-0003-3975-1437
Carlos A. Catania, LABSIN, Facultad de Ingeniería, Universidad Nacional de Cuyo, Mendoza, Argentina, and also with the National Scientific and Technical Research Council (CONICET), Godoy Cruz 2290, C1425FQB, CABA, Argentina. ORCID: https://orcid.org/0000-0002-1749-310X
Rodrigo Gonzalez, GridTICs, Facultad Regional Mendoza, Universidad Tecnológica Nacional, Mendoza, Argentina. ORCID: https://orcid.org/0000-0002-1939-0534

Keywords:

Domain Generation Algorithms, Expert Selection, Cybersecurity, Real-time Detection, Model Comparison

Abstract

Domain Generation Algorithms (DGAs) have evolved beyond traditional pseudorandom patterns: wordlist-based variants generate linguistically coherent domains that evade conventional detection methods. Previous research has focused primarily on generalist detection across multiple DGA types, and systematic expert model selection specifically targeting wordlist-based variants remains largely unexplored. This work addresses expert model selection for wordlist-based DGA detection, where an expert model is a specialized architecture trained exclusively on a specific DGA category. We conduct a systematic evaluation of seven candidate models spanning transformer, convolutional neural network (CNN), and traditional machine learning approaches. The models were trained on a balanced dataset of 160,000 domains covering eight wordlist-based DGA families and evaluated with a two-phase protocol that measures both performance on the training families and generalization to previously unseen variants. Our comparative analysis identifies fine-tuned ModernBERT as the best-performing expert model, achieving an F1-score of 86.7% on known families while maintaining 80.9% on unknown families, with a 26 ms inference time on NVIDIA Tesla T4 GPUs, which corresponds to approximately 38 domains per second. The study shows that domain-specific expert training significantly outperforms generalist approaches trained on diverse DGA families, with F1-score improvements of 9.4% on familiar variants and 30.2% on unseen families. This gain suggests that focused expertise learns transferable linguistic patterns rather than memorizing the characteristics of specific families.
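
The throughput figure quoted in the abstract follows directly from the reported latency, and the F1-score is the harmonic mean of precision and recall. The sketch below reproduces that arithmetic only. The function names are hypothetical, and it assumes that domains are processed one at a time (no batching) and that F1 is computed in the standard binary form; the abstract does not state either detail.

```python
def throughput_per_second(latency_ms: float) -> float:
    """Sequential throughput implied by a per-item latency in milliseconds.

    Assumption: one domain is classified at a time, with no batching.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Standard binary F1-score from true-positive, false-positive,
    and false-negative counts: 2*TP / (2*TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # no positives predicted or present
    return 2 * tp / denominator


if __name__ == "__main__":
    # 26 ms per domain is the latency reported in the abstract.
    print(f"{throughput_per_second(26.0):.1f} domains/s")
    # Illustrative counts only; these are not results from the paper.
    print(f"F1 = {f1_from_counts(tp=8, fp=2, fn=2):.2f}")
```

With a latency of 26 ms this gives roughly 38.5 domains per second, consistent with the "approximately 38" stated above; batched inference on the same GPU could yield a different figure.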


Author Biographies

Reynier Leyva La O, GridTICs, Facultad Regional Mendoza, Universidad Tecnológica Nacional, Mendoza, Argentina, and also with the National Scientific and Technical Research Council (CONICET), Godoy Cruz 2290, C1425FQB, CABA, Argentina

Reynier Leyva La O received a degree in Automation Engineering from the Universidad de Oriente, Santiago de Cuba, Cuba, in 2015. He is currently pursuing a Ph.D. degree in Computer Science at the Universidad Nacional del Centro de la Provincia de Buenos Aires (UNICEN), Argentina, under a research fellowship from the National Scientific and Technical Research Council (CONICET). He is affiliated with GridTICs, Facultad Regional Mendoza, Universidad Tecnológica Nacional (UTN), Argentina. His research interests include the application of artificial intelligence to cybersecurity.

Carlos A. Catania, LABSIN, Facultad de Ingeniería, Universidad Nacional de Cuyo, Mendoza, Argentina, and also with the National Scientific and Technical Research Council (CONICET), Godoy Cruz 2290, C1425FQB, CABA, Argentina

Carlos A. Catania has been a Professor in the Computer Science Department at the Universidad Nacional de Cuyo, Mendoza, Argentina, since 2008. He holds a Ph.D. degree in Computer Science from the Universidad Nacional del Centro de la Provincia de Buenos Aires and a Master’s degree in Data Networking from the Universidad de Mendoza, Argentina. He is currently the Head of the Intelligent Systems Laboratory (LABSIN) at the Faculty of Engineering, Universidad Nacional de Cuyo. He has nearly 20 years of experience in the development of machine learning applications in various areas of industry and science, such as computer security, medicine, animal behavior, navigation systems, and the wine industry. He has authored or coauthored more than 40 publications in national and international conferences and journals.

Rodrigo Gonzalez, GridTICs, Facultad Regional Mendoza, Universidad Tecnológica Nacional, Mendoza, Argentina

Rodrigo Gonzalez received the B.Sc. degree in Electronics in 2005 and the Ph.D. degree in Control Systems Engineering in 2015. He is currently a full-time Adjunct Professor at the Universidad Tecnológica Nacional (UTN), Argentina. His research interests include RAG system optimization and the exploration of hybrid neural networks for domain name generation.

