Text representations for lyric-based identification of musical subgenres

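Authors

Fabrício Almeida do Carmo, Jorge Luiz Figueira da Silva Junior, Rafael Geraldeli Rossi, Fábio Manoel França Lobato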

Keywords:

Music Classification, Text Representations, Bag-of-words, Word Embeddings, Neural Networks, Deep Learning

Abstract

Advances in data mining techniques and computational tools have boosted the music market with applications focused on user experience. These techniques explore musical data for patterns and trends that can guide business strategies. One of the key steps in these applications is the vector representation of the original text. This work investigates text representation techniques applied to the classification of musical subgenres, a gap in the music information retrieval literature. The difficulty of the problem lies in identifying the separation boundary between subclasses of the same genre, since they share many features. To this end, exhaustive experiments were carried out to find the best combination of classifier and text representation model. The results showed that an enriched Bag-of-Words (BoW) representation combined with the SVM and Logistic Regression algorithms outperformed embedding models and deep neural networks. These conclusions could guide future studies on classifying texts whose separability surfaces are subtle and challenging.
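As a rough illustration of the kind of pipeline the abstract describes, the sketch below pairs a Bag-of-Words representation with a logistic regression classifier using only the Python standard library. It is not the authors' implementation: the abstract does not say how the BoW was enriched, so adding word bigrams to the unigram counts is an assumption, and the lyrics, the labels, and the function names (`bow_features`, `train_logreg`, `predict`) are hypothetical and invented for this example.

```python
import math
from collections import Counter


def bow_features(text, ngram_max=2):
    """Count word n-grams (n = 1..ngram_max) in a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, ngram_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def score(weights, bias, feats):
    """Linear score of a sparse feature vector; unseen features contribute 0."""
    return bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_logreg(samples, epochs=200, lr=0.5):
    """Fit binary logistic regression with per-sample gradient updates."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in samples:
            # Clamp the score so math.exp cannot overflow.
            z = max(-30.0, min(30.0, score(weights, bias, feats)))
            error = label - 1.0 / (1.0 + math.exp(-z))
            bias += lr * error
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) + lr * error * v
    return weights, bias


def predict(weights, bias, text):
    """Return 1 if the linear score is positive, otherwise 0."""
    return 1 if score(weights, bias, bow_features(text)) > 0 else 0


# Hypothetical toy lyrics (assumption): 0 and 1 stand for two made-up subgenres.
TRAIN = [
    ("my truck on the dirt road tonight", 0),
    ("dirt road and an old guitar", 0),
    ("dancing all night at the city club", 1),
    ("city lights and a party all night", 1),
]

samples = [(bow_features(text), label) for text, label in TRAIN]
weights, bias = train_logreg(samples)

print(predict(weights, bias, "an old truck on a dirt road"))
print(predict(weights, bias, "party at the club all night"))
```

A sparse dictionary of n-gram counts keeps the example short. In practice, a library such as scikit-learn would typically replace both the vectorizer and the classifier, and an SVM could be swapped in for the logistic regression.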

Author Biographies

Fabrício Almeida do Carmo, State University of Maranhão

Received the B.S. degree in Computer Science from the Federal University of Western Pará (UFOPA), Pará, Brazil, in 2018. He is currently pursuing a Master's degree in Computer and Systems Engineering at the State University of Maranhão (UEMA), Maranhão, Brazil. His current research interests include machine learning, deep learning, and natural language processing.

Jorge Luiz Figueira da Silva Junior, Federal University of Western Pará

Holds a degree in Computer Science from the Institute of Engineering and Geosciences (IEG) of the Federal University of Western Pará (UFOPA), in Santarém/PA (2021). Since 2018 he has been working on text mining projects, specifically in the areas of data collection, pre-processing, topic extraction, and supervised learning for text classification.

Rafael Geraldeli Rossi, iFood Brazil

Received the B.S. degree in Information Systems and the M.S. and Ph.D. degrees in Computer Science and Computational Mathematics from the University of São Paulo, Brazil. He is currently a Data Scientist at iFood, the biggest FoodTech company in Brazil. His research interests include machine learning, text mining, and graph-based methods.

Fábio Manoel França Lobato, Federal University of Western Pará

Is a Lecturer of Computing at the Federal University of Western Pará and leads the Applied Computing Research Group. He has a Productivity Grant in Technological Development and Innovative Extension from the National Council for Scientific and Technological Development (CNPq). His research interests are data science, social media analytics, and electronic markets.

Published

2023-06-20

How to Cite

Almeida do Carmo, F., Figueira da Silva Junior, J. L., Geraldeli Rossi, R., & França Lobato, F. M. (2023). Text representations for lyric-based identification of musical subgenres. IEEE Latin America Transactions, 21(6), 737–744. Retrieved from https://latamt.ieeer9.org/index.php/transactions/article/view/7664