Incremental multiclass open-set audio recognition

(1) * Hitham Jleed (University of Ottawa, School of Electrical Engineering and Computer Science, Canada)
(2) Martin Bouchard (University of Ottawa, School of Electrical Engineering and Computer Science, Canada)
*corresponding author


Incremental learning aims to learn new classes as they emerge while maintaining performance on previously known classes. It uses incoming data to update the existing models. Open-set recognition, in turn, requires the ability to recognize examples from known classes and to reject examples from new or unknown classes. Combining the two raises two main challenges. First, new class discovery: the algorithm must not only recognize known classes but also detect unknown ones. Second, model extension: once new classes are identified, the model must be updated. To address these challenges, we introduce incremental open-set multiclass support vector machine algorithms that can classify examples from seen and unseen classes, using incremental learning to extend the current model with new classes without retraining the entire system. Comprehensive evaluations are carried out on both open-set recognition and incremental learning. For open-set recognition, we adopt the openness test, which examines effectiveness under a varying number of known and unknown labels. For incremental learning, we adapt the model to detect a single novel class in each incremental phase and update the model with the unknown classes. Experimental results show promising performance for the proposed methods compared with representative previous methods.
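The workflow described in the abstract (recognize known classes, reject unknown ones, then extend the model with a newly discovered class) can be illustrated with a minimal sketch. It is not the authors' method: a nearest-class-mean classifier with a distance threshold stands in for the SVM models, and the class name `IncrementalOpenSetNCM`, the parameter `reject_distance`, and the toy feature values are all hypothetical. The `openness` helper follows a definition widely used in the open-set literature; the abstract does not state which variant the paper adopts, so treat it as an assumption.

```python
import math


def openness(n_train, n_target, n_test):
    """Openness of an evaluation setup: 0.0 is closed set, larger is more open.

    Assumption: the commonly used definition
    1 - sqrt(2 * |train classes| / (|target classes| + |test classes|)).
    """
    return 1.0 - math.sqrt(2.0 * n_train / (n_target + n_test))


class IncrementalOpenSetNCM:
    """Nearest-class-mean classifier with rejection.

    Hypothetical stand-in for the paper's SVM models, for illustration only.
    """

    def __init__(self, reject_distance):
        self.reject_distance = reject_distance  # farther than this means "unknown"
        self.centroids = {}                     # label -> mean feature vector

    def add_class(self, label, examples):
        """Model extension: add one class; existing centroids are left untouched."""
        dim = len(examples[0])
        self.centroids[label] = [
            sum(x[i] for x in examples) / len(examples) for i in range(dim)
        ]

    def predict(self, x):
        """Return the nearest known label, or None if x is far from every class."""
        best_label, best_dist = None, float("inf")
        for label, centroid in self.centroids.items():
            d = math.dist(x, centroid)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label if best_dist <= self.reject_distance else None


# Toy 2-D "audio features" (made-up numbers, for illustration only).
model = IncrementalOpenSetNCM(reject_distance=2.0)
model.add_class("speech", [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
model.add_class("music", [[10.0, 0.0], [11.0, 0.0], [10.0, 1.0], [11.0, 1.0]])

# New class discovery: samples from an unseen class are rejected (None).
novel = [[0.0, 10.0], [1.0, 10.0], [0.0, 11.0], [1.0, 11.0]]
before = [model.predict(x) for x in novel]

# Incremental phase: the rejected samples are labelled and added as one new class.
model.add_class("siren", novel)
after = [model.predict(x) for x in novel]
```

In this sketch, adding a class only computes one new centroid, which mirrors the idea of extending the model without retraining everything. A linear one-vs-rest SVM would need an extra mechanism to bound its acceptance region, since a plain decision score can remain high far away from the training data.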


Keywords: Incremental Learning; Open-Set Recognition; Support Vector Machine; Audio Recognition











This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan