Brazilian scientific productivity from a gender perspective during the Covid-19 pandemic: classification and analysis via machine learning

Authors

Keywords:

COVID-19, scientific production, gender classification, machine learning

Abstract

The COVID-19 pandemic and the resulting need for social distancing have affected scientific research activities in general. This paper analyzes the impact of COVID-19 on Brazilian scientific research by examining the number of complete manuscripts published from 2018 to 2021, taking the researcher's gender into account. A crawler was implemented to extract the names of Brazilian researchers from the articles, and machine learning models (SVM, BiLSTM, and CNN) were applied to classify the authors' gender. Some models predict gender correctly in more than 95% of cases. In addition, we found a 37.47% drop in articles published by Brazilian researchers in 2021. Under most of the machine learning models applied, the drop in publications was greater for female researchers, which is consistent with differences in the distribution of household activities and family care between the two genders.
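As a rough illustration of the kind of comparison the abstract reports, the sketch below computes the percentage drop in publication counts per predicted gender between two years. It is not the authors' code: the function names `relative_drop` and `drop_by_gender`, the choice of 2020 as the baseline year, and the toy records are all assumptions made for illustration, and the numbers are not the paper's data.

```python
from collections import Counter


def relative_drop(before: int, after: int) -> float:
    """Percentage decrease from `before` to `after` (negative if the count grew)."""
    if before <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (before - after) / before


def drop_by_gender(records, base_year, target_year):
    """Compute the percentage drop per gender label.

    `records` is an iterable of (year, predicted_gender) pairs,
    one pair per authorship found by a crawler and labeled by a classifier.
    """
    counts = Counter(records)  # counts[(year, gender)] -> number of authorships
    genders = sorted({gender for _, gender in counts})
    return {
        gender: relative_drop(counts[(base_year, gender)],
                              counts[(target_year, gender)])
        for gender in genders
    }


# Hypothetical toy data, NOT the counts reported in the paper.
records = (
    [(2020, "F")] * 40 + [(2021, "F")] * 22
    + [(2020, "M")] * 50 + [(2021, "M")] * 35
)

drops = drop_by_gender(records, base_year=2020, target_year=2021)
print(drops)  # {'F': 45.0, 'M': 30.0}
```

Because the gender labels come from a classifier, a comparison like this depends on the model used, which is why the abstract qualifies its finding as holding for most of the models applied.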

Author Biographies

Rosana Cibely B. Rego, Universidade Federal Rural do Semi-Árido (UFERSA), Mossoró - RN, 59625-900

Professor at the Department of Engineering and Technology at the Universidade Federal Rural do Semi-Árido. PhD in Electrical and Computer Engineering from the Federal University of Rio Grande do Norte (2022), with research in Intelligent Control, Neural Control, Neural Networks, and Machine Learning. Certified in artificial intelligence by the Huawei ICT Academy. She has programming experience in C/C++, Java, Python, Fortran, and MATLAB/Scilab.

Gabriel da Silva Nascimento, Federal Institute of Education, Science and Technology of Paraíba, João Pessoa - PB, 58015-020

Undergraduate student in Computer Engineering at the Federal Institute of Education, Science and Technology of Paraíba. Technical degree in Telecommunications from the Escola Técnica Redentorista in Paraíba. Programming experience in Python and C/C++. Works on optimization software development with Python 3. Research interests include artificial intelligence, embedded systems, and software development.

Davi Emmanuel de Lima Rodrigues, Universidade Federal Rural do Semi-Árido (UFERSA), Mossoró - RN, 59625-900

Undergraduate student in Science and Technology at the Federal Rural University of the Semi-Arid. Works on the research project PEH30001-2021, Scientific Productivity from a gender perspective during the Covid-19 pandemic (UFERSA). Programming experience in Python, C, Java, and JavaScript. Research interests include artificial intelligence, web development, and mobile development.

Samara Martins Nascimento, Universidade Federal Rural do Semi-Árido (UFERSA), Mossoró - RN, 59625-900

PhD in Computer Science from the Federal University of Ceará. Adjunct Professor at the Federal Rural University of the Semi-Arid (UFERSA). She is one of the leaders of the research groups Laboratory of Software Innovations (LIS) and Laboratory of Computational Intelligence (CiLab). Her main areas of interest are Databases, Big Data, Data Streams, NoSQL Databases, Data Warehouses, Data Management, Systems Analysis, Software Quality, and Software Metrics.

Verônica Maria L. Silva, Universidade Federal da Paraíba, PB, 58051-900

Graduated in Computer Engineering from the Federal University of Ceará (2011). PhD in Electrical Engineering from the Federal University of Campina Grande (UFCG), 2019. She has been a professor at the Federal Rural University of the Semi-Arid (UFERSA) since 2015. Her research interests include digital systems, analog-to-digital converters, analog-to-information converters, embedded systems, and artificial intelligence.

Published

2022-09-04

How to Cite

Rego, R. C. B., Nascimento, G. da S., Rodrigues, D. E. de L., Nascimento, S. M., & Silva, V. M. L. (2022). Brazilian scientific productivity from a gender perspective during the Covid-19 pandemic: classification and analysis via machine learning. IEEE Latin America Transactions, 21(2), 302–309. Retrieved from https://latamt.ieeer9.org/index.php/transactions/article/view/6910