Deep Question Classifier and Explainable AI for Detection of Potential Problems in Training Data

Please use this identifier to cite or link to this item: http://ithesis-ir.su.ac.th/dspace/handle/123456789/4927

Title:	Deep Question Classifier and Explainable AI for Detection of Potential Problems in Training Data
Authors:	Aphinan PEERACHAIDACHO (อภินันท์ พีรชัยเดโช); Tasanawan Soonklang (ทัศนวรรณ ศูนย์กลาง), Silpakorn University, soonklang_t@su.ac.th
Keywords:	deep learning; natural language processing; interpretability
Issue Date:	24
Publisher:	Silpakorn University
Abstract:	Natural language processing now has diverse applications, each with its own limitations. A common problem in specialized tasks is limited data, which makes it necessary to find models and datasets suited to that constraint. Event platform service providers often face a database holding many unorganized questions that are repetitive and uncategorized. This research develops a deep learning model that classifies event-related questions using a hybrid CNN-BiLSTM neural network. Experimental results show that the proposed model clearly outperforms the other models compared. The research also proposes a method that uses explainable AI to detect potential problems in the training dataset. The method explains the model's predictions, helping researchers understand the model's behavior and use the problems found to improve the quality of the training dataset. As dataset quality improves, so does the model's predictive performance.
URI:	http://ithesis-ir.su.ac.th/dspace/handle/123456789/4927
Appears in Collections:	Science

Files in This Item:

File	Description	Size	Format
61318305.pdf		3.43 MB	Adobe PDF
