Optimizing Question-Answering Framework Through Integration of Text Summarization Model and Third-Generation Generative Pre-Trained Transformer

Ervin Gubin Moung, Toh Sin Tong, Maisarah Mohd Sufian, Valentino Liaw, Ali Farzamnia, Farashazillah Yahya

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This research project addresses the growing demand for efficient data access amid the surge in digital information. Conventional keyword-based search engines have limitations, which motivates the exploration of more advanced natural language processing (NLP) approaches. The study introduces an algorithm that autonomously extracts data from summary reports using NLP and information retrieval, and serves as a question-answering (QA) API. The models were evaluated using Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-1, ROUGE-2, and ROUGE-L scores: PEGASUS achieved the highest average ROUGE score (0.432) with a single sample, while BART attained the highest multi-sample score (0.302) with 1000 samples. The research also examined hyperparameter choices in pre-trained models, specifically the impact of batch size on completion time and the relationship between maximum sequence length and ROUGE scores. The study enhances QA systems for efficient information retrieval, with practical applications in sectors such as legal analysis, healthcare, and business intelligence, and offers insights for future advancements in NLP-driven information extraction. The refined methodologies and performance metrics point to a promising avenue for changing how organizations handle large-scale data, in terms of both computational efficiency and user experience.
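For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed for one candidate summary against one reference. It is not the authors' evaluation code: the whitespace tokenization, the use of the F1 variant, the function names, and the reading of "average ROUGE score" as the unweighted mean of the three scores are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 from an overlap count and the candidate/reference totals."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 using clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def average_rouge(candidate, reference):
    """Unweighted mean of ROUGE-1, ROUGE-2 and ROUGE-L (an assumed definition)."""
    scores = [
        rouge_n(candidate, reference, 1),
        rouge_n(candidate, reference, 2),
        rouge_l(candidate, reference),
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    cand = "the cat sat on the mat"
    ref = "the cat is on the mat"
    print(rouge_n(cand, ref, 1), rouge_n(cand, ref, 2), rouge_l(cand, ref))
    print(average_rouge(cand, ref))
```

Established ROUGE packages apply their own tokenization and optional stemming, so their scores can differ from this sketch on the same text.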

Original language: English
Title of host publication: 2024 14th International Conference on Computer and Knowledge Engineering, ICCKE 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 296-301
Number of pages: 6
ISBN (Electronic): 9798331511272
ISBN (Print): 9798331511289
Publication status: Published - 18 Feb 2025
Event: 14th International Conference on Computer and Knowledge Engineering - Mashhad, Iran, Islamic Republic of
Duration: 19 Nov 2024 to 20 Nov 2024
Conference number: 14

Publication series

Name: International Conference on Computer and Knowledge Engineering, ICCKE
Publisher: IEEE
ISSN (Print): 2375-1304
ISSN (Electronic): 2643-279X

Conference

Conference: 14th International Conference on Computer and Knowledge Engineering
Abbreviated title: ICCKE 2024
Country/Territory: Iran, Islamic Republic of
City: Mashhad
Period: 19/11/24 to 20/11/24
