Emotion recognition Indonesian language from Twitter using IndoBERT and Bi-LSTM

Stephen William, Kenny -, Andry Chowanda

Abstract


Emotion recognition is a challenging task in Natural Language Processing. Nevertheless, far less research has been done on local languages such as Bahasa Indonesia than on English, largely because few resources are publicly available. We therefore propose a new labelled dataset scraped from a social media platform (Twitter), annotated with Ekman’s six basic emotions (Happy, Fear, Anger, Disgust, Sadness, and Surprise) plus one neutral label. We implemented several models fine-tuned from IndoBERT, as well as Bidirectional Long Short-Term Memory (Bi-LSTM) architectures, to build the emotion recognition system. The models were trained on the collected dataset (7,629 tweets). The fine-tuned model achieved an accuracy of 92.3%, outperforming both the baseline model and the Bi-LSTM, which achieved accuracies of 90.7% and 84.0%, respectively.


Published: 2024-07-04

How to Cite this Article:

Stephen William, Kenny -, Andry Chowanda, Emotion recognition Indonesian language from Twitter using IndoBERT and Bi-LSTM, Commun. Math. Biol. Neurosci., 2024 (2024), Article ID 71

Copyright © 2024 Stephen William, Kenny -, Andry Chowanda. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commun. Math. Biol. Neurosci.

ISSN 2052-2541
