Development of BAPOLAIC: AI chatbot for optical character recognition based-document extraction and voice assistant
Abstract
Conventional chatbots often lack integrated functionalities for complex academic tasks, such as multi-format document handling and multimodal interaction. This paper presents the design, implementation, and performance evaluation of BAPOLAIC, a web-based, multimodal AI assistant developed to address this gap. The system architecture integrates optical character recognition (OCR), a dual-strategy natural language processing (NLP) module, and voice assistance, all orchestrated by the Gemini API. Quantitative evaluation confirmed high performance: the OCR module achieved a 98.69% average accuracy, and the retrieval-based NLP path correctly handled 90% of test queries. Furthermore, the API integration demonstrated exceptional efficiency with a median latency as low as 0.06 ms. Task-based evaluations validated BAPOLAIC's effectiveness in performing intelligent functions like summarization and content-based Q&A, with a superior capacity for handling up to 10 consecutive documents. The results validate BAPOLAIC as a successful proof-of-concept for a specialized academic tool, providing a framework for integrating multiple AI technologies to enhance educational productivity.
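The abstract reports an average OCR accuracy and a median API latency but does not say how either figure was computed. The sketch below shows one common way to obtain such numbers, assuming character-level accuracy is defined as one minus the edit distance divided by the reference length, and that latencies are collected as a list of millisecond timings. The function names (`edit_distance`, `character_accuracy`, `average_accuracy`, `median_latency_ms`) are hypothetical and are not taken from the BAPOLAIC implementation.

```python
from statistics import median


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: (1 - errors / reference length) * 100, clamped at zero."""
    if not reference:
        return 100.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference)) * 100.0


def average_accuracy(pairs) -> float:
    """Mean character accuracy over (ground truth, OCR output) pairs."""
    scores = [character_accuracy(ref, hyp) for ref, hyp in pairs]
    return sum(scores) / len(scores)


def median_latency_ms(latencies_ms) -> float:
    """Median of per-request latencies, given in milliseconds."""
    return median(latencies_ms)
```

Under these assumptions, a per-document score would come from comparing each OCR output against a manually transcribed ground truth, and the reported average would be the mean of those scores across the test documents.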
Keywords
AI chatbot; Gemini API; Natural language processing; Optical character recognition; Voice assistant
DOI: http://doi.org/10.11591/ijece.v16i2.pp1002-1009
Copyright (c) 2026 Rival Fahreji, Ryan Satria Wijaya

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES).