Brazilian scientific productivity from a gender perspective during the COVID-19 pandemic: classification and analysis via machine learning
Keywords: COVID-19, scientific production, gender classification, machine learning
Scientific research activities have generally been affected by the COVID-19 pandemic and the need for social distancing. This paper analyzes the impact of COVID-19 on Brazilian scientific research by examining the number of complete manuscripts published from 2018 to 2021, broken down by researcher gender. A crawler extracts the names of Brazilian researchers from the articles, and machine learning models (SVM, BiLSTM, and CNN) classify the authors' gender. Some models predict gender correctly in more than 95% of cases. In addition, we found a 37.47% drop in articles published by Brazilian researchers in 2021. Under most of the machine learning models applied, the drop in publications was greater for female researchers, which is consistent with differences in the distribution of household activities and family care between the two genders.
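As a minimal sketch of the kind of comparison described above, the following Python snippet computes the relative drop in publications per predicted gender between two years. The records, the gender labels "F" and "M", and the helper names `percent_drop` and `counts_by_year_and_gender` are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
from collections import Counter


def percent_drop(before: int, after: int) -> float:
    """Relative drop from `before` to `after`, in percent (positive means fewer papers)."""
    if before <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (before - after) / before


def counts_by_year_and_gender(records):
    """Count publications per (year, predicted gender) pair."""
    return Counter((year, gender) for year, gender in records)


# Hypothetical toy records: (publication year, gender predicted by a classifier).
# These counts are invented for illustration and are NOT the paper's results.
records = (
    [(2020, "F")] * 40
    + [(2020, "M")] * 60
    + [(2021, "F")] * 22
    + [(2021, "M")] * 39
)

counts = counts_by_year_and_gender(records)

# Drop per predicted gender between the two years.
drops = {
    gender: percent_drop(counts[(2020, gender)], counts[(2021, gender)])
    for gender in ("F", "M")
}

# Overall drop, ignoring gender.
total_2020 = sum(n for (year, _), n in counts.items() if year == 2020)
total_2021 = sum(n for (year, _), n in counts.items() if year == 2021)
total_drop = percent_drop(total_2020, total_2021)

for gender, drop in drops.items():
    print(f"{gender}: {drop:.2f}% drop")
print(f"overall: {total_drop:.2f}% drop")
```

Because each classifier may label some names differently, repeating this computation once per model would give one pair of per-gender drops per model, which is how a statement such as "a greater drop for females in most models" could be checked.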