Network Optimization based on Genetic Algorithm for High-Level Data Classification
Keywords: Complex Networks, Genetic Algorithms, Network Optimization, Graph Optimization, Particle Swarm Optimization, High-Level Data Classification
High-level data classification techniques consider not only physical aspects of the data, such as space, distance, proximity, and distribution, but also their functional, topological, and structural aspects. These techniques are commonly organized in two major steps: constructing a network from the feature-vector data and uncovering its underlying patterns using complex network properties. In the network construction step, heuristics based on k-nearest neighbors strategies have been widely adopted, while several complex network measures (e.g., PageRank) have been modeled to learn high-level patterns of the input data. Because the two steps are directly related, i.e., the network configuration directly impacts the results obtained by the classifier, in this paper we develop a genetic algorithm (GA) to optimize the network construction step. Specifically, we hypothesize that the salient features of GAs, such as their robust search mechanism and binary representation, may provide a more powerful network representation in the context of high-level classification based on importance characterization. Extensive experiments with real data sets demonstrate that the networks provided by our GA strategy achieve higher predictive accuracy than those of a widely adopted method based on the nearest neighbors heuristic, and competitive results against state-of-the-art ones.
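The two steps named in the abstract can be illustrated with a minimal sketch: a k-nearest neighbors network built from feature vectors, followed by PageRank computed on that network by power iteration. This is not the paper's GA-based construction or its classifier; the function names `knn_graph` and `pagerank`, the symmetrization rule, the damping factor of 0.85, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 adjacency matrix linking each point to its k nearest neighbors.

    Hypothetical helper: an edge is kept if either endpoint selects the other.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all feature vectors.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=float)
    for i in range(n):
        A[i, nearest[i]] = 1.0
    return np.maximum(A, A.T)  # symmetrize

def pagerank(A, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank scores by power iteration on adjacency matrix A."""
    n = A.shape[0]
    out_deg = A.sum(axis=1, keepdims=True)
    # Row-stochastic transition matrix; nodes without edges jump uniformly.
    P = np.where(out_deg > 0, A / np.maximum(out_deg, 1.0), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_new = (1.0 - damping) / n + damping * (r @ P)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r

# Toy data (assumed): two small groups of points in the plane.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.3], [5.2, 4.9]])
A = knn_graph(X, k=2)
scores = pagerank(A)
print(scores)
```

In this sketch the network is fixed by the choice of k. The approach described in the abstract instead searches over network configurations with a GA, since the configuration directly affects what a measure such as PageRank reports.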