A Dental Chatbot Based on IndoBERT with Next Sentence Prediction and Intent Classification

Nadhief Athallah Isya; Rasim Rasim; Ani Anisyah

doi:10.47709/brilliance.v5i2.6620

Authors

Nadhief Athallah Isya, Universitas Pendidikan Indonesia, Indonesia
Rasim, Universitas Pendidikan Indonesia, Indonesia
Ani Anisyah, Universitas Pendidikan Indonesia, Indonesia

Keywords:

Dental Health Chatbot, IndoBERT, Intent Classification, Natural Language Processing, Next Sentence Prediction

Abstract

Public awareness of the importance of dental health remains low in Indonesia. The problem is compounded by limited access to consultation services that are easy, fast, affordable, and available at any time, so many dental diseases go undetected at an early stage. People also tend to delay dental check-ups because of time constraints and the distance to healthcare facilities, and many avoid consulting a dentist altogether. To address this problem, this research developed a dental health chatbot based on Natural Language Processing (NLP) using IndoBERT. The model was pretrained with the Masked Language Model (MLM) objective and fine-tuned on Next Sentence Prediction (NSP) and intent classification tasks. The dataset was compiled from Indonesian-language dental health articles, symptom–disease sentence pairs, and follow-up questions, all validated by certified dentists. The system was implemented as a web application with React JS for the frontend and Express JS and MySQL for the backend, and it communicates with the NLP model through a Flask API. Evaluation results show that the chatbot can provide relevant dental health information: it offers lightweight consultations that support early symptom detection, answers follow-up questions, and generates digital medical records. Expert validation yielded an average rating of “Good” across clarity, relevance, medical accuracy, and completeness, with Likert scale scores ranging from 3.53 to 3.67. The chatbot is expected to serve as an accessible 24-hour online dental health information service that increases public knowledge and awareness.
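The two-task design described in the abstract, intent classification followed by sentence-pair scoring, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the intent labels, the `KNOWLEDGE` sentences, and all function names are hypothetical, and both model calls are replaced by simple stand-ins (a punctuation rule instead of the fine-tuned intent classifier, Jaccard token overlap instead of an IndoBERT NSP head). Only the control flow is meant to carry over: classify the message, then rank candidate sentences by how well they pair with it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical intent labels; the abstract does not list the real label set.
INTENT_CONSULTATION = "consultation"
INTENT_FOLLOW_UP = "follow_up"

# Placeholder candidate sentences per intent (illustrative only, not clinical data).
KNOWLEDGE: Dict[str, List[str]] = {
    INTENT_CONSULTATION: [
        "sharp pain when chewing sweet food",
        "gums bleed when brushing",
    ],
    INTENT_FOLLOW_UP: [
        "how often to brush",
        "when to replace a toothbrush",
    ],
}

# A pair scorer takes (user message, candidate sentence) and returns a score.
PairScorer = Callable[[str, str], float]


def token_overlap_score(sentence_a: str, sentence_b: str) -> float:
    """Stand-in for an NSP head: Jaccard overlap of lowercase tokens."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def classify_intent(message: str) -> str:
    """Stand-in for the intent classifier: questions count as follow-ups."""
    if message.strip().endswith("?"):
        return INTENT_FOLLOW_UP
    return INTENT_CONSULTATION


def rank_candidates(
    message: str, candidates: List[str], scorer: PairScorer
) -> List[Tuple[str, float]]:
    """Score every (message, candidate) pair and sort best-first."""
    scored = [(candidate, scorer(message, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


@dataclass
class Reply:
    intent: str
    best_match: str
    score: float


def respond(message: str, scorer: PairScorer = token_overlap_score) -> Reply:
    """Route the message by intent, then pick the best-scoring candidate."""
    intent = classify_intent(message)
    best_match, score = rank_candidates(message, KNOWLEDGE[intent], scorer)[0]
    return Reply(intent=intent, best_match=best_match, score=score)
```

Because `respond` accepts any `PairScorer`, a function that wraps a fine-tuned model's pair probability could be passed in place of `token_overlap_score` without changing the routing logic; in the architecture the abstract describes, such a function would sit behind the Flask API.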

Published

2025-08-02

How to Cite

Isya, N. A., Rasim, R., & Anisyah, A. (2025). A Dental Chatbot Based on IndoBERT with Next Sentence Prediction and Intent Classification. Brilliance: Research of Artificial Intelligence, 5(2), 663–673. https://doi.org/10.47709/brilliance.v5i2.6620