Sentiment analysis of Indonesian hotel reviews: from classical machine learning to deep learning

(1) * Retno Kusumaningrum (Universitas Diponegoro, Indonesia)
(2) Iffa Zainan Nisa (Universitas Diponegoro, Indonesia)
(3) Rizka Putri Nawangsari (Universitas Diponegoro, Indonesia)
(4) Adi Wibowo (Universitas Diponegoro, Indonesia)
*corresponding author


The Internet now hosts a large number of hotel reviews that must be evaluated before the data can be turned into actionable information. Deep learning has excellent capabilities for recognizing this type of data, and advances in deep learning have produced many algorithms that can be used in sentiment analysis tasks. In this study, we compare the performance of classical machine learning algorithms, namely logistic regression (LR), naïve Bayes (NB), and support vector machine (SVM), using the Word2Vec model against a deep learning algorithm, a convolutional neural network (CNN), for classifying hotel reviews from the Traveloka website into positive or negative classes. Hyperparameter tuning is applied to both learning methods to determine the parameters that produce the best model. The Word2Vec model uses the skip-gram architecture, hierarchical softmax as the evaluation method, and a vector dimension of 100. The highest average accuracy, 98.08%, was obtained by the CNN with a dropout of 0.2, Tanh as the convolution activation, softmax as the output activation, and Adam as the optimizer. The findings demonstrate that integrating the Word2Vec model with the CNN model yields significantly better accuracy than the classical machine learning methods.
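The hyperparameter tuning mentioned in the abstract can be illustrated with a small grid search that keeps the configuration with the highest average accuracy. This is only a sketch under stated assumptions, not the authors' procedure: the abstract reports just the winning values (dropout 0.2, Tanh, softmax, Adam), so the other candidate values, the fold count, and the names `SEARCH_SPACE`, `iter_configs`, `grid_search`, and `evaluate_fold` are hypothetical.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, Iterator, Optional, Sequence, Tuple

Config = Dict[str, object]

# Hypothetical search space. Only dropout=0.2, tanh, softmax, and adam are
# reported in the abstract; the remaining candidates are illustrative.
SEARCH_SPACE: Dict[str, Sequence[object]] = {
    "dropout": [0.2, 0.5],
    "conv_activation": ["relu", "tanh"],
    "output_activation": ["sigmoid", "softmax"],
    "optimizer": ["sgd", "adam"],
}


def iter_configs(space: Dict[str, Sequence[object]]) -> Iterator[Config]:
    """Yield every combination of candidate values as a dict."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def grid_search(
    space: Dict[str, Sequence[object]],
    evaluate_fold: Callable[[Config, int], float],
    n_folds: int = 5,  # assumed fold count, not taken from the abstract
) -> Tuple[Optional[Config], float]:
    """Return the configuration with the highest mean accuracy over folds.

    `evaluate_fold(config, fold)` is a caller-supplied function that trains
    a model with `config` and returns its accuracy on the given fold.
    """
    best_config: Optional[Config] = None
    best_score = float("-inf")
    for config in iter_configs(space):
        score = mean(evaluate_fold(config, fold) for fold in range(n_folds))
        if score > best_score:  # ties keep the earlier configuration
            best_config, best_score = config, score
    return best_config, best_score
```

In practice, `evaluate_fold` would train the Word2Vec-based classifier on the training part of a fold and return its accuracy on the held-out part. The search loop itself does not depend on the model, so the same loop could serve both the classical classifiers and the CNN.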


sentiment analysis; word2vec; convolutional neural network; classical machine learning; hotel reviews







This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by Informatics Department - Universitas Ahmad Dahlan, and ASCEE Computer Society
Published by Universitas Ahmad Dahlan