Hierarchical multi-label news article classification with distributed semantic model based features

Ivana Clairine Irsan; Masayu Leylia Khodra

doi:10.26555/ijain.v5i1.168


(1)* Ivana Clairine Irsan (Institut Teknologi Bandung, Indonesia)
(2) Masayu Leylia Khodra (Institut Teknologi Bandung, Indonesia)
* Corresponding author

Abstract

Automatic news categorization is essential for handling the classification of multi-label news articles in online portals. This research employs several potential methods to improve the performance of a hierarchical multi-label classifier for Indonesian news articles. The first method uses a Convolutional Neural Network (CNN) to build the top-level classifier. The second method aims to improve classification performance by averaging the word vectors obtained from a distributed semantic model. The third method combines lexical and semantic approaches to extract document features by multiplying word term frequency (lexical) with the word vector average (semantic). The model built with Calibrated Label Ranking as the multi-label classification method and trained with the Naïve Bayes algorithm achieved the best F1-measure, 0.7531. This classifier also used the product of word term frequency and the average of word vectors as features. This configuration improved multi-label classification performance by 4.25% compared to the baseline. The distributed semantic model that gave the best performance in this experiment was a 300-dimension word2vec model trained on Wikipedia articles. The performance of the multi-label classification model is also influenced by the news release date: a gap between the time periods of the training and testing data also decreases model performance.
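The two semantic feature schemes mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this is one plausible reading rather than the authors' implementation: the plain feature averages the vectors of all in-vocabulary tokens, and the combined feature multiplies each distinct word's vector by its term frequency before averaging over distinct words. The function names, the toy two-dimensional embeddings, and the normalization by distinct word count are all assumptions made for illustration.

```python
from collections import Counter


def average_vector(tokens, embeddings, dim):
    """Plain average of the vectors of all in-vocabulary tokens."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def tf_weighted_vector(tokens, embeddings, dim):
    """Assumed combination: scale each distinct word's vector by its term
    frequency, sum the scaled vectors, and divide by the number of
    distinct in-vocabulary words."""
    counts = Counter(t for t in tokens if t in embeddings)
    if not counts:
        return [0.0] * dim
    total = [0.0] * dim
    for word, tf in counts.items():
        for i in range(dim):
            total[i] += tf * embeddings[word][i]
    return [x / len(counts) for x in total]


# Hypothetical toy embeddings; a real setup would load 300-dimension
# word2vec vectors instead.
toy_embeddings = {"bola": [1.0, 0.0], "gol": [0.0, 1.0]}
toy_tokens = ["bola", "bola", "gol", "xyz"]  # "xyz" is out of vocabulary

plain = average_vector(toy_tokens, toy_embeddings, 2)
combined = tf_weighted_vector(toy_tokens, toy_embeddings, 2)
```

In this toy case the repeated word "bola" pulls the combined feature further toward its own vector than the plain average does, which is the intended effect of bringing term frequency into the semantic feature.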

Keywords

Multi-label classification; Hierarchical multi-label classification; CNN; Word embedding; News



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
andri.pranolo.id@ieee.org (publication issues)
