Sentiment Analysis Applied to News from the Brazilian Stock Market
Keywords:
Sentiment Analysis, Text Mining, Stock Market
Abstract
Investment in the stock market has grown in Brazil in recent years, especially in terms of the number of individual investors. According to data from April 2020, the Brazilian stock market reached the historic mark of 2.38 million active investors. In this scenario, there is an increasing need to study the Brazilian financial market in order to better understand its fluctuations. Recent work in the literature indicates that a company’s stock value can be influenced by published news. This work therefore contributes to automatic sentiment analysis of news written in Portuguese and related to the Brazilian stock market. To this end, we evaluated three sentiment analysis strategies: two based on machine learning, using the Naive Bayes classifier and a Multilayer Perceptron neural network, and a third based on the lexical approach. We also proposed two dictionaries focused on the financial domain and adapted to Portuguese. Our results show that both the Naive Bayes classifier and the Multilayer Perceptron outperform the best lexical approach. Among the lexical approaches, the best accuracy was achieved with the adapted dictionary proposed here.
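To make the contrast between the lexical and machine learning strategies concrete, the following is a minimal sketch rather than the authors' implementation. The word lists, function names, and training headlines are hypothetical and chosen only for illustration; they are not the dictionaries or the dataset proposed in this work. The Multilayer Perceptron is omitted for brevity.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon (assumption): NOT the dictionaries proposed in the paper.
POSITIVE = {"lucro", "alta", "crescimento", "ganho", "recorde"}
NEGATIVE = {"prejuízo", "queda", "perda", "crise", "dívida"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (accents are kept)."""
    return re.findall(r"\w+", text.lower())


def lexicon_sentiment(text):
    """Lexical approach: count positive minus negative dictionary hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # log prior of the class
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][tok]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical labelled headlines (assumption), used only to exercise the code.
TRAIN_TEXTS = [
    "lucro recorde no trimestre",
    "ações em alta com crescimento",
    "queda nas ações após prejuízo",
    "crise aumenta dívida da empresa",
]
TRAIN_LABELS = ["positive", "positive", "negative", "negative"]

if __name__ == "__main__":
    model = NaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)
    headline = "empresa tem prejuízo e queda"
    print("lexicon:", lexicon_sentiment(headline))
    print("naive bayes:", model.predict(headline))
```

The lexical function depends entirely on dictionary coverage, which is why a domain-adapted dictionary matters, whereas the classifier learns word weights from labelled examples.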