(2) * Aji Prasetya Wibawa
(3) Anik Nur Handayani
(4) Andrew Nafalski
*corresponding author
Abstract
Missing values remain a persistent challenge in time-series data, particularly within large-scale monitoring systems where reliable forecasting and evaluation are essential. Incomplete records often arise from irregular reporting, infrastructure limitations, or system failures, leading to biased analyses and inaccurate predictions. Traditional imputation methods, such as mean, median, and mode substitution, are computationally efficient but oversimplify temporal structures. More advanced approaches, including Multiple Imputation by Chained Equations (MICE) and K-Nearest Neighbors (KNN), offer improvements yet remain sensitive to data distribution and model configuration. To address this gap, this study introduces Sherwood Duel Optimization (SDO), a socio-inspired framework that reconceptualizes imputation as a deterministic duel-based optimization problem. In its fixed form, SDO generates multiple candidate imputations and selects the most robust replacement value using a composite multi-metric scoring mechanism that integrates forecasting accuracy and explanatory power. The framework was evaluated on multivariate educational time-series data, further validated across heterogeneous SDG-related domains, and compared against classical and advanced baselines across three forecasting models. Experimental results demonstrate that SDO consistently outperforms existing methods, reducing forecasting error (MAPE) by more than 40%, achieving the lowest RMSE, and producing R² values exceeding 0.95. Statistical testing confirms that these improvements are significant across experimental configurations. These findings highlight the potential of SDO as a reliable, interpretable, and computationally efficient optimization-based imputation framework. By strengthening data reliability at the reconstruction stage, SDO enhances the credibility of downstream forecasting and decision-making in institutional and sustainability-oriented monitoring systems.
Keywords: Missing Data; Imputation; Time-Series Forecasting; Static Sherwood Duel Optimization; Socio-Inspired Algorithm
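The abstract describes the selection step only at a high level: several candidate imputations are generated, and a composite score combining forecasting accuracy and explanatory power decides which one is kept. The following Python sketch illustrates that general idea and should not be read as the authors' implementation. Everything in it is an assumption made for illustration: the four candidate fillers, the naive one-step forecast used as a stand-in for the study's forecasting models, the unweighted score `R² - MAPE - RMSE/spread`, the round-robin duel with a fixed tie-break, and all function names.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def fill_constant(series, value):
    """Replace every gap (None) with one constant value."""
    return [value if x is None else x for x in series]


def fill_forward(series):
    """Carry the last observation forward; a leading gap takes the first observation."""
    last = next(x for x in series if x is not None)
    filled = []
    for x in series:
        if x is not None:
            last = x
        filled.append(last)
    return filled


def fill_linear(series):
    """Interpolate linearly between the nearest observed neighbours."""
    known = [i for i, x in enumerate(series) if x is not None]
    filled = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def composite_score(series, filled):
    """Score one candidate; the metric weighting here is hypothetical."""
    # Naive forecast: predict x[t] as filled[t-1], scored only where x[t] was observed.
    pairs = [(series[t], filled[t - 1]) for t in range(1, len(series))
             if series[t] is not None]
    actual = [a for a, _ in pairs]
    errors = [a - p for a, p in pairs]
    mape = mean(abs(e / a) for e, a in zip(errors, actual))  # assumes nonzero actuals
    rmse = sqrt(mean(e * e for e in errors))
    centre = mean(actual)
    # Assumes the observed targets are not all equal (nonzero variance and spread).
    r2 = 1 - sum(e * e for e in errors) / sum((a - centre) ** 2 for a in actual)
    spread = max(actual) - min(actual)
    return r2 - mape - rmse / spread  # higher is better


def duel_select(series):
    """Run deterministic pairwise duels and return (winner name, filled series)."""
    observed = [x for x in series if x is not None]
    candidates = {
        "mean": fill_constant(series, mean(observed)),
        "median": fill_constant(series, median(observed)),
        "ffill": fill_forward(series),
        "linear": fill_linear(series),
    }
    scores = {name: composite_score(series, filled)
              for name, filled in candidates.items()}
    wins = dict.fromkeys(candidates, 0)
    for a, b in combinations(candidates, 2):
        wins[a if scores[a] >= scores[b] else b] += 1  # ties go to the earlier candidate
    winner = max(wins, key=wins.get)
    return winner, candidates[winner]


winner, filled = duel_select([10, 12, None, 16, 18, None, 22, 24])
print(winner, filled)
```

Because every candidate is scored on the same observed targets and ties are broken by a fixed order, repeated runs on the same series always return the same winner, which matches the deterministic framing in the abstract. With a single scalar score, the round-robin duel reduces to picking the highest-scoring candidate; it is written out pairwise here only to mirror the duel metaphor.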
DOI: https://doi.org/10.26555/ijain.v12i1.2396

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
andri.pranolo.id@ieee.org (publication issues)