Malay Named Entity Recognition System Using Machine Learning for Tourism in Malaysia

Authors

  • Juhaida Abu Bakar
  • Muhammad Asyraf Ariffin
  • Nor Hazlyna Harun
  • Ruziana Mohamad Rasli
  • Nurul Huda Mohamad Saad
  • Lisnawita

DOI:

https://doi.org/10.32890/jdsd2024.2.2.4

Abstract

Analysing unstructured textual data has become increasingly common because of its rich informational value across many fields. Named Entity Recognition (NER) is crucial for identifying entities in open-domain text documents. Current NER techniques often rely on manually labelled documents, which are time-consuming to produce and prone to inaccuracies. While tools such as spaCy and Polyglot have been used, more research is needed on applying machine learning techniques to this problem. This work addressed that gap by developing a Malay-language NER system using machine learning. The system used available Malay corpus resources to identify, learn, tag, and store entities from Malay texts. It was designed to handle both structured and unstructured data, extracting names of people, places, organisations, and other entities. The system was developed as a web-based application whose interface allowed users to input Malay text and receive the recognised entities as output. It employed advanced machine learning models, specifically BERT and ALXLNET, to process and analyse the data. Performance was assessed using standard evaluation metrics. Respondents showed good agreement regarding usability, perception, and feedback on the specific pages, with the lowest mean score being 76.67%. Regarding system functionalities, there is room for refinement to ensure more accurate and reliable output. This work advanced natural language processing capabilities in Malay by creating a user-friendly, efficient NER tool.
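The abstract does not say which evaluation metrics were used. As an illustration only, the sketch below assumes entity-level precision, recall, and F1 computed over BIO-tagged tokens, which is a common way to score NER output. The label names (PER, LOC), the example sentence, and the helper functions `bio_to_spans` and `precision_recall_f1` are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity label)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of labelled entity spans.

    A stray I- tag with no matching preceding tag is treated leniently
    as the start of a new entity (an assumption of this sketch).
    """
    spans: Set[Span] = set()
    start, label = None, None
    # The trailing "O" sentinel closes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def precision_recall_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted spans against gold spans using exact matching."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: "Siti melawat Menara Kuala Lumpur di Kuala Lumpur"
gold_tags = ["B-PER", "O", "B-LOC", "I-LOC", "I-LOC", "O", "B-LOC", "I-LOC"]
# The prediction cuts the first place name short, so that span does not match.
pred_tags = ["B-PER", "O", "B-LOC", "I-LOC", "O", "O", "B-LOC", "I-LOC"]

precision, recall, f1 = precision_recall_f1(
    bio_to_spans(gold_tags), bio_to_spans(pred_tags)
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact span matching is strict: a prediction that recovers only part of a place name counts as both a false positive and a false negative. A partial-match scheme would be more forgiving, but it is not shown here.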

Published

20-10-2024

Section

Articles

How to Cite

Malay Named Entity Recognition System Using Machine Learning for Tourism in Malaysia. (2024). Journal of Digital System Development, 2(2), 46-63. https://doi.org/10.32890/jdsd2024.2.2.4
