Hierarchical multi-label news article classification with distributed semantic model based features

(1) * Ivana Clairine Irsan (Institut Teknologi Bandung, Indonesia)
(2) Masayu Leylia Khodra (Institut Teknologi Bandung, Indonesia)
*corresponding author


Automatic news categorization is essential for classifying multi-label news articles in online portals. This research evaluates three candidate methods for improving the performance of a hierarchical multi-label classifier for Indonesian news articles. The first method uses a Convolutional Neural Network (CNN) to build the top-level classifier. The second method represents each document by the average of its word vectors, obtained from a distributed semantic model. The third method combines lexical and semantic features by multiplying word term frequency (lexical) by the word vector average (semantic). The best model, built with Calibrated Label Ranking as the multi-label classification method, trained with the Naïve Bayes algorithm, and using the product of word term frequency and the word vector average as features, achieved an F1-measure of 0.7531. This configuration improved multi-label classification performance by 4.25% compared to the baseline. The best-performing distributed semantic model in this experiment was a 300-dimensional word2vec model trained on Wikipedia articles. Model performance is also influenced by the news release date: a gap between the periods covered by the training and testing data decreases performance.
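The second and third feature extraction methods can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so this is only one plausible reading: the semantic feature is taken as the mean of the vectors of the distinct in-vocabulary words, and the combined feature scales each word vector by its raw term frequency before averaging. The function names, the toy two-dimensional embeddings, and the handling of out-of-vocabulary words are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def average_document_vector(tokens, embeddings):
    """Semantic feature: mean of the vectors of distinct in-vocabulary words.

    `embeddings` is a hypothetical dict mapping a word to a list of floats.
    Out-of-vocabulary words are skipped (an assumption).
    """
    dim = len(next(iter(embeddings.values())))
    words = {t for t in tokens if t in embeddings}
    if not words:
        return [0.0] * dim
    return [sum(embeddings[w][i] for w in words) / len(words) for i in range(dim)]


def tf_weighted_document_vector(tokens, embeddings):
    """Lexical + semantic feature: each word vector is multiplied by the
    word's raw term frequency, then averaged over distinct words.

    This is one possible interpretation of "term frequency multiplied by
    word vector average"; the paper's exact formula may differ.
    """
    dim = len(next(iter(embeddings.values())))
    counts = Counter(t for t in tokens if t in embeddings)
    if not counts:
        return [0.0] * dim
    return [
        sum(tf * embeddings[w][i] for w, tf in counts.items()) / len(counts)
        for i in range(dim)
    ]


# Toy embeddings; a real setup would load 300-dimensional word2vec vectors.
toy_embeddings = {"bola": [1.0, 0.0], "gol": [0.0, 1.0]}
toy_tokens = ["bola", "bola", "gol", "xyz"]  # "xyz" is out of vocabulary

plain = average_document_vector(toy_tokens, toy_embeddings)
weighted = tf_weighted_document_vector(toy_tokens, toy_embeddings)
print(plain, weighted)
```

In this toy example the repeated word "bola" pulls the weighted vector toward its own direction, which is the intended effect of bringing term frequency into the semantic representation.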


Multi-label classification; Hierarchical multi-label classification; CNN; Word embedding; News






This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
   andri.pranolo.id@ieee.org (publication issues)
