Temperament detection based on Twitter data: classical machine learning versus deep learning

Annisa Ulizulfa; Retno Kusumaningrum; Khadijah Khadijah; Rismiyati Rismiyati

doi:10.26555/ijain.v8i1.692



(1) Annisa Ulizulfa (Department of Informatics, Universitas Diponegoro, Indonesia)
(2) * Retno Kusumaningrum (Department of Informatics, Universitas Diponegoro, Indonesia)
(3) Khadijah Khadijah (Department of Informatics, Universitas Diponegoro, Indonesia)
(4) Rismiyati Rismiyati (Department of Informatics, Universitas Diponegoro, Indonesia)
* corresponding author

Abstract

Deep learning has shown promising results in various text-based classification tasks. However, its performance depends on the amount of available data: deep learning algorithms tend to perform poorly when data are scarce and well when data are plentiful. Classical machine learning algorithms commonly work well with small datasets, but their performance reaches a plateau and does not improve further as the sample size grows. Therefore, this study compared the performance of classical machine learning and deep learning methods for detecting temperament from Indonesian Twitter data. The proposed Indonesian Linguistic Inquiry and Word Count (LIWC) was employed to analyze the context of the tweets. The classical machine learning methods implemented were support vector machine (SVM) and K-nearest neighbor (KNN), whereas the deep learning method was a convolutional neural network (CNN) with three different architectures. Both kinds of methods were implemented as direct multiclass classification and as one-versus-all (OVA) multiclass classification. The highest average F-measure was 58.73%, obtained by CNN OVA with a pool size of 3, a dropout value of 0.7, and a learning rate of 0.0007.
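To make the OVA setting and the reported metric concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each class has its own "class versus rest" binary model producing a score, that the OVA prediction is the class with the highest score, and that "average f-measure" means the unweighted mean of per-class F-measures (it could equally refer to an average over validation folds). The class names follow Keirsey's four temperaments and, like all scores and the function names `ova_predict`, `f_measure`, and `average_f_measure`, are hypothetical placeholders.

```python
# Illustrative sketch only; not the authors' code.
# Class names, scores, and function names are hypothetical.

def ova_predict(scores):
    """Return the class whose 'class vs. rest' binary model scored highest."""
    return max(scores, key=scores.get)


def f_measure(y_true, y_pred, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_measure(y_true, y_pred, labels):
    """Unweighted (macro) mean of the per-class F-measures."""
    return sum(f_measure(y_true, y_pred, c) for c in labels) / len(labels)


# Assumed label set (Keirsey's four temperaments).
LABELS = ["artisan", "guardian", "idealist", "rational"]

# Made-up outputs of four binary models for four users.
ova_scores = [
    {"artisan": 0.7, "guardian": 0.1, "idealist": 0.4, "rational": 0.2},
    {"artisan": 0.3, "guardian": 0.6, "idealist": 0.2, "rational": 0.5},
    {"artisan": 0.2, "guardian": 0.3, "idealist": 0.1, "rational": 0.8},
    {"artisan": 0.1, "guardian": 0.2, "idealist": 0.3, "rational": 0.9},
]
y_true = ["artisan", "guardian", "idealist", "rational"]
y_pred = [ova_predict(s) for s in ova_scores]

macro_f = average_f_measure(y_true, y_pred, LABELS)
print(y_pred, round(macro_f, 4))
```

In this toy run the third user is misclassified, so one class contributes an F-measure of zero and pulls the average down. This shows how a macro average penalizes a model that ignores a class.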

Keywords

Temperament detection; Twitter user; Support vector machine; K-nearest neighbour; Convolutional neural network

DOI

https://doi.org/10.26555/ijain.v8i1.692



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
andri.pranolo.id@ieee.org (publication issues)
