Topic Modeling in Conversational Dialogs for Naming Intent Labels Using LDA

Pemodelan Topik pada Dialog Percakapan untuk Penamaan Label Intent Menggunakan LDA

  • Laksma Wiramurti Narendra, Institut Sains Terapan dan Teknologi Surabaya

Abstract

Chatbot research has grown in recent years alongside advances in Machine Learning (ML) and Artificial Intelligence (AI). Natural Language Processing (NLP), as a part of ML, is used by chatbots mainly for Natural Language Understanding (NLU) tasks. Chatbots rely on intent classification to understand the purpose of the messages that users send. For a chatbot to work well within its domain, mapping intents onto the model's training data is a problem in its own right for researchers, because intent-labeled datasets for training chatbot models in Indonesian are still rarely available. In this study, intent names for chatbot training data are derived using Latent Dirichlet Allocation (LDA). The question dataset was taken from the complaint log of a prepaid phone credit distributor in Indonesia and comprises 143,520 messages from 2015 to 2019. Topic modeling with LDA mapped 8 topics, which can then be used to name intents when training a chatbot model.
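To illustrate the idea of deriving intent names from LDA topics, the following is a minimal sketch, not the author's implementation. The abstract does not state which library, preprocessing, or hyperparameters were used, so this example assumes a small pure-Python collapsed Gibbs sampler, invented English toy messages, and 2 topics instead of the 8 reported. The names `fit_lda`, `top_words`, and `messages` are hypothetical.

```python
import random

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word

def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words of each topic."""
    result = []
    for row in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda v: -row[v])
        result.append([vocab[v] for v in ranked[:n]])
    return result

# Invented toy messages (assumption): already lowercased and tokenized.
messages = [
    ["credit", "not", "received", "after", "payment"],
    ["payment", "done", "credit", "not", "received"],
    ["credit", "still", "not", "received"],
    ["deposit", "balance", "wrong", "please", "check"],
    ["please", "check", "deposit", "balance"],
    ["deposit", "balance", "not", "updated"],
]

vocab, doc_topic, topic_word = fit_lda(messages, num_topics=2)
for topic_index, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {topic_index}: {', '.join(words)}")
```

A person would then read the top words of each topic and choose an intent label for it by hand, for example a label about undelivered credit for a topic dominated by "credit" and "received". The topics found depend on the random seed and on the data, so no particular output is implied here.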

Published
2022-02-15
How to Cite
NARENDRA, Laksma Wiramurti. Topic Modeling in Conversational Dialogs for Naming Intent Labels Using LDA. Jurnal Sistem Telekomunikasi Elektronika Sistem Kontrol Power Sistem dan Komputer, [S.l.], v. 2, n. 1, p. 65-74, feb. 2022. ISSN 2776-6195. Available at: <https://ejournal.uniska-kediri.ac.id/index.php/JTECS/article/view/1820>. Date accessed: 03 feb. 2025. doi: https://doi.org/10.32503/jtecs.v2i1.1820.
Section
Computer
