A data mining approach for classification of traffic violations types

(1) Nor Aqilah Othman (Faculty of Computer Science & Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia)
(2) Cik Feresa Mohd Foozy (Faculty of Computer Science & Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia)
(3) Aida Mustapha (Faculty of Computer Science & Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia)
(4) * Salama A Mostafa (Faculty of Computer Science & Information Technology, Universiti Tun Hussein Onn Malaysia, Malaysia)
(5) Shamala Palaniappan (Faculty Science Computer and Mathematics, Universiti Teknologi MARA (UiTM), Malaysia)
(6) Shafiza Ariffin Kashinath (Sena Traffic Systems Sdn. Bhd, Kuala Lumpur, Malaysia)
*corresponding author


A traffic summons, also known as a traffic ticket, is a notice issued by a law enforcement official to a motorist, that is, a person who drives a car, lorry, or bus, or who rides a motorcycle. This study reports a comparative experiment on the performance of three classification algorithms (Naïve Bayes, Gradient Boosted Trees, and Deep Learning) in classifying traffic violation types. The performance of the three classification models developed in this work is measured and compared. The results show that Gradient Boosted Trees and Deep Learning achieve the best accuracy and recall but low precision. Naïve Bayes, on the other hand, has high recall, since it is a picky classifier that performs well only on a dataset that is high in precision. The results of this paper could serve as a baseline for investigations related to the classification of traffic violation types. They are also helpful for authorities who want to plan ways to reduce traffic violations among road users by studying the most common traffic violation types in an area, whether a citation, a warning, or an ESERO (Electronic Safety Equipment Repair Order).
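The three measures named in the abstract (accuracy, precision, and recall) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `evaluate` helper, the lowercase label names, and the toy label lists are assumptions made up for illustration, and the numbers bear no relation to the results of the paper.

```python
from collections import Counter

# Hypothetical label names for the three violation types named in the abstract.
LABELS = ("citation", "warning", "esero")


def evaluate(y_true, y_pred, labels=LABELS):
    """Return overall accuracy plus per-class precision and recall."""
    # Count every (actual, predicted) pair; missing pairs count as zero.
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)
    report = {}
    for c in labels:
        true_positives = pairs[(c, c)]
        predicted_as_c = sum(pairs[(t, c)] for t in labels)
        actually_c = sum(pairs[(c, p)] for p in labels)
        report[c] = {
            # Precision: share of records predicted as c that really are c.
            "precision": true_positives / predicted_as_c if predicted_as_c else 0.0,
            # Recall: share of records that really are c that were found.
            "recall": true_positives / actually_c if actually_c else 0.0,
        }
    return accuracy, report


# Toy, made-up labels (assumption for illustration only).
y_true = ["citation", "citation", "warning", "warning", "warning", "esero"]
y_pred = ["citation", "warning", "warning", "warning", "citation", "esero"]

accuracy, report = evaluate(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}")
for label, scores in report.items():
    print(f"{label}: precision = {scores['precision']:.3f}, recall = {scores['recall']:.3f}")
```

Reporting precision and recall per class, rather than accuracy alone, makes it visible when a model scores well overall yet mislabels one violation type, which is the kind of trade-off the abstract describes.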





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by Informatics Department - Universitas Ahmad Dahlan, and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: ijain@uad.ac.id (paper handling issues)
    info@ijain.org, andri.pranolo.id@ieee.org (publication issues)
