Towards the Categorization of Brazilian Financial Market Headlines
Keywords: Machine Learning, Text Categorization, Financial Market
Financial market news portals are valuable sources of information because they strongly influence investors' decision-making. Given the vast amount of text these portals produce, several studies have sought to understand how such texts vary and to automate the categorization of short texts. However, extracting the information that influences investors' decisions is not a trivial task: news portals use heterogeneous, domain-specific language for each piece of content, which makes it difficult to derive a standard document format. This work proposes GOOSE, a solution for the cateGOrizatiOn of Short texts derived from multiple sources of information, to portray the current situation of the financial market. GOOSE relies on a Bidirectional Long Short-Term Memory (Bi-LSTM) network and GloVe embeddings to increase the reliability of short-text classification. It collects data from news portals, represents the texts with the word-embedding mechanism, and feeds the result to the Bi-LSTM, which classifies financial market news texts. The results show that GOOSE categorizes texts with an accuracy of 84% and demonstrate the feasibility of using it to extract information from financial market news portals.
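The flow described above (headline, word embeddings, Bi-LSTM, category) can be illustrated with a minimal sketch. It is not the GOOSE implementation: it runs a forward pass only, uses untrained random weights, and substitutes a random embedding table for pretrained GloVe vectors. The class name `TinyBiLSTMClassifier`, the layer sizes, the three output categories, and the sample vocabulary are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def init_lstm(rng, input_dim, hidden):
    """Random (untrained) parameters for one LSTM direction."""
    W = rng.normal(0.0, 0.1, (4 * hidden, input_dim))  # input weights
    U = rng.normal(0.0, 0.1, (4 * hidden, hidden))     # recurrent weights
    b = np.zeros(4 * hidden)
    return W, U, b


def lstm_pass(inputs, W, U, b):
    """Run one LSTM direction over inputs of shape (T, d); return the last hidden state."""
    hidden = b.shape[0] // 4
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in inputs:
        z = W @ x + U @ h + b        # stacked gate pre-activations
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


class TinyBiLSTMClassifier:
    """Hypothetical stand-in for an embedding + Bi-LSTM headline classifier."""

    def __init__(self, vocab, embed_dim=8, hidden=6, n_classes=3, seed=0):
        rng = np.random.default_rng(seed)
        self.index = {w: k + 1 for k, w in enumerate(vocab)}  # row 0 = unknown word
        # Assumption: random vectors stand in for pretrained GloVe embeddings.
        self.embeddings = rng.normal(0.0, 0.1, (len(vocab) + 1, embed_dim))
        self.fwd = init_lstm(rng, embed_dim, hidden)
        self.bwd = init_lstm(rng, embed_dim, hidden)
        self.W_out = rng.normal(0.0, 0.1, (n_classes, 2 * hidden))
        self.b_out = np.zeros(n_classes)

    def predict_proba(self, headline):
        ids = [self.index.get(tok, 0) for tok in headline.lower().split()]
        x = self.embeddings[ids]             # (T, embed_dim)
        h_fwd = lstm_pass(x, *self.fwd)      # reads left to right
        h_bwd = lstm_pass(x[::-1], *self.bwd)  # reads right to left
        features = np.concatenate([h_fwd, h_bwd])
        return softmax(self.W_out @ features + self.b_out)


# Hypothetical vocabulary and headline, for illustration only.
model = TinyBiLSTMClassifier(["ibovespa", "sobe", "dolar", "cai"])
probs = model.predict_proba("Ibovespa sobe")
print(probs)  # one probability per (hypothetical) category
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how the two reading directions are concatenated before the output layer. A practical system would load pretrained embeddings and train the network on labeled headlines.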