Enhancement of images compression using channel attention and post-filtering based on deep autoencoder

(1) * Andri Agustav Wirabudi (Department of Intelligence Media Engineering, Hanbat National University School of Applied Science, Korea, Republic of)
(2) Nurwan Reza Fachrurrozi (School of Applied Science, Telkom University, Indonesia)
(3) Pietra Dorand (School of Applied Science, Telkom University, Indonesia)
(4) Muhamad Royhan (School of Applied Science, Telkom University, Indonesia)
*corresponding author

Abstract


Image compression is a crucial research topic in today's information age, driven by the need to balance compression efficiency against the quality of the reconstructed image. Most current image compression methods are built on deep-learning autoencoders. However, these methods are limited because they consider only the residual values of the processed images, so they reach their compression efficiency at the cost of less satisfying reconstructions. To address this issue, we introduce an Attention Block mechanism to further improve coding efficiency. Additionally, we introduce a post-filtering method to enhance the final reconstructed images. Experiments on two datasets, CLIC for training and KODAK for testing, show that the proposed method outperforms several previous methods. With a coding efficiency improvement of -28.16%, an average PSNR improvement of 34%, and an MS-SSIM improvement of 8%, the proposed model significantly enhances rate-distortion (RD) performance compared to previous approaches.
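For readers unfamiliar with the PSNR figure reported above, the following is a minimal sketch of how PSNR is conventionally computed from the mean squared error between an original and a reconstructed image. It is not the authors' evaluation code: the function names `mse` and `psnr`, the flat pixel-list input, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    squared = ((a - b) ** 2 for a, b in zip(original, reconstructed))
    return sum(squared) / len(original)


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB.

    max_value is the largest possible pixel value (255 is assumed here,
    i.e. 8-bit images). Identical inputs give infinite PSNR.
    """
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    # Hypothetical 4-pixel example: every pixel is off by exactly 1.
    original = [10, 20, 30, 40]
    reconstructed = [11, 21, 31, 41]
    print(f"PSNR: {psnr(original, reconstructed):.2f} dB")
```

Higher PSNR means the reconstruction is closer to the original. Because the scale is logarithmic, a percentage change in PSNR is not a linear change in error.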

Keywords


Channel attention; Autoencoder; Post-filtering; Deep learning

   

DOI

https://doi.org/10.26555/ijain.v10i3.1499
      






Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571  (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
   andri.pranolo.id@ieee.org (publication issues)
