(2) Kamarularifin Abd Jalil
(3) Alya Geogiana Buja
(4) Abdulraqeb Alhammadi
*corresponding author
Abstract
Clean-label poisoning attacks pose a stealthy and potent threat to deep neural networks (DNNs), particularly when models rely on publicly available or outsourced training data. Among these attacks, the Bullseye Polytope method is highly transferable and can evade state-of-the-art defenses such as deep k-NN. To counter this threat, we propose Poison Image Traceback via Feature Clustering (PIFC-CLD), a novel forensic approach that leverages Euclidean norm distances to detect and trace clean-label attacks in DNNs. PIFC-CLD exploits the geometric consistency of feature representations to identify the poisoned samples responsible for model misclassifications. Unlike traditional majority-vote-based defenses, PIFC-CLD performs clustering in feature space and flags poisoned samples based on their Euclidean proximity to misclassified targets. We evaluate our approach under Bullseye Polytope attack scenarios using the CIFAR-10 dataset and WideResNet architectures. PIFC-CLD achieves 99% precision, 95% recall, and a 96% F1 score at k = 25 and ε = 0.2, demonstrating robust performance against Bullseye Polytope attacks. Furthermore, our algorithm exhibits strong resilience to parameter variations while minimizing false positives and preserving model integrity. This work bridges the gap between digital forensics and adversarial machine learning, offering a lightweight, model-agnostic, and interpretable solution for secure model training in adversarial environments.
Keywords: Bullseye polytope attacks; Clean-label attacks; Adversarial attacks; Digital forensics; Euclidean norm similarity
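To make the traceback step concrete, the following is a minimal Python sketch of Euclidean-norm proximity screening in feature space. It is an illustration under stated assumptions, not the authors' released implementation: the penultimate-layer feature matrix, the helper name trace_poison_candidates, and the synthetic data are hypothetical, while the defaults k = 25 and eps = 0.2 mirror the settings reported in the abstract.

# Minimal sketch of Euclidean-norm-based poison traceback in feature space.
# Assumptions (not from the paper): `features` holds penultimate-layer
# embeddings of the training set, and `target_feat` is the embedding of a
# misclassified target image.
import numpy as np

def trace_poison_candidates(features, target_feat, k=25, eps=0.2):
    """Return indices of training samples suspected of poisoning the target.

    features    : (N, D) array of training-set feature embeddings
    target_feat : (D,) feature embedding of the misclassified target
    k           : number of nearest neighbours to inspect
    eps         : Euclidean-distance threshold for flagging a neighbour
    """
    # Euclidean (L2) distance from every training sample to the target.
    dists = np.linalg.norm(features - target_feat, axis=1)
    # Keep only the k nearest neighbours of the target in feature space...
    nearest = np.argsort(dists)[:k]
    # ...and flag those closer than the eps threshold as poison candidates.
    return nearest[dists[nearest] < eps]

# Hypothetical usage with random embeddings standing in for real features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 128)).astype(np.float32)
target = feats[42] + 0.01 * rng.normal(size=128)  # a point near sample 42
print(trace_poison_candidates(feats, target))     # expected: [42]

Run as-is, the example flags sample 42, whose embedding the synthetic target was placed next to; with real model features, the flagged indices would point to the training images geometrically closest to the misclassified target.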
DOI: https://doi.org/10.26555/ijain.v12i1.2206

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
andri.pranolo.id@ieee.org (publication issues)