Malay Named Entity Recognition System Using Machine Learning for Tourism in Malaysia
DOI:
https://doi.org/10.32890/jdsd2024.2.2.4
Abstract
Analysing unstructured textual data has become increasingly common due to its rich informational value across various fields. Named Entity Recognition (NER) is crucial for identifying entities in open-domain text documents. Current NER techniques often rely on manually labelled documents, and manual labelling is time-consuming and prone to inaccuracies. While tools such as spaCy and Polyglot have been used, more research is needed on applying machine learning techniques to this problem. This work addressed this gap by developing a Malay language NER system using machine learning. The system used available Malay corpus resources to identify, learn, tag, and store entities from Malay texts. It was designed to handle structured and unstructured data, extracting names of people, places, organisations, and other entities. The system was developed as a web-based application that employed advanced machine learning models, specifically BERT and ALXLNET, to process and analyse data. Its web interface allowed users to input Malay text and receive recognised entities as output. Performance was assessed using standard evaluation metrics. Respondents showed good agreement regarding usability, perception, and feedback on the specific pages, with the lowest mean score being 76.67%. Regarding system functionalities, there is room for refinement to ensure more accurate and reliable output. This work advanced natural language processing capabilities in Malay by creating a user-friendly, efficient tool for NER.
License
Copyright (c) 2024 Juhaida Abu Bakar

This work is licensed under a Creative Commons Attribution 4.0 International License.







