Towards a Set of Heuristics for Evaluating Chatbots
Keywords: Chatbots, Evaluation, Heuristics, Usability
Chatbots are artificial intelligence tools that interact with people in different contexts. A chatbot can streamline daily processes, serve customers 24 hours a day, or provide information about classes, among other uses. New development technologies have made creating a chatbot an increasingly fast and straightforward process, bringing this kind of application to people who had never considered using one before. However, this speed of development can lead to problems, many of them caused by the lack of usability evaluations. Heuristic usability evaluations are user interface reviews carried out by experts and are an essential part of any assessment process. To date, there are no heuristics to evaluate the usability of chatbots. Therefore, this work proposes five usability heuristics for chatbots, derived from experience developing this type of application and from a broad review of the state of the art. The set of heuristics was tested in a case study in which five experts evaluated an education-oriented chatbot. The results revealed that, although the proposed heuristics need refinement, they are an excellent first step in broadening the horizon of usability evaluations in chatbots.