Deep Generative AI Based on Denoising Diffusion Probabilistic Models for Applications in Image Processing

Emili Silva Bezerra; Quefren Oliveira Leher; Uendel Diego da Silva Alves; Thuanne Paixão; Ana Beatriz Alvarez

doi:10.26439/interfases2024.n020.7389

Authors

Emili Silva Bezerra, PAVIC Laboratory, University of Acre (UFAC), Brazil, https://orcid.org/0000-0003-4519-8332
Quefren Oliveira Leher, PAVIC Laboratory, University of Acre (UFAC), Brazil, https://orcid.org/0009-0005-6678-1131
Uendel Diego da Silva Alves, PAVIC Laboratory, University of Acre (UFAC), Brazil, https://orcid.org/0009-0009-3357-4979
Thuanne Paixão, PAVIC Laboratory, University of Acre (UFAC), Brazil, https://orcid.org/0000-0002-5563-8971
Ana Beatriz Alvarez, PAVIC Laboratory, University of Acre (UFAC), Brazil, https://orcid.org/0000-0003-3403-8261

Keywords:

machine learning, reconstruction, segmentation, face and gesture recognition, remote sensing

Abstract

Denoising diffusion probabilistic models (DDPMs) have demonstrated significant potential in addressing complex image processing challenges. This paper explores the application of DDPMs in three areas: reconstruction of remote sensing imagery affected by cloud cover, reconstruction of facial images with occluded areas, and segmentation of bodies of water in remote sensing imagery. Inpainting involves filling in missing regions of an image, and DDPMs act as data generators capable of synthesizing information that aligns coherently with the context of the original data. Inspired by the inpainting technique, the RePaint approach was adapted and applied to the reconstruction tasks. The WaterSegDiff approach, which uses a diffusion model as a backbone, was employed for the segmentation task. To illustrate the model’s behavior and provide examples of the tasks, experiments were carried out with both qualitative and quantitative evaluations. The qualitative results show the model’s ability to generate data for reconstruction and segmentation. Quantitatively, metrics such as MSE, PSNR, SSIM, IoU, PA and F1 score highlight the model’s proficient performance in image processing tasks. In this scenario, DDPMs have proved to be a promising tool for high-quality data reconstruction, enabling the hallucination of image regions with high visual coherence and facilitating applications in areas such as environmental monitoring, facial recognition, and water resource mapping.
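
As an illustration of the quantitative evaluation named in the abstract, the following is a minimal sketch of how MSE and PSNR (reconstruction) and IoU, PA and F1 score (binary segmentation) are commonly computed. It is not the authors' evaluation code: the function names, the 8-bit intensity range (`max_value=255.0`) and the convention of returning 1.0 when both masks are empty are assumptions made for this example. SSIM is omitted because it requires a windowed implementation.

```python
import numpy as np


def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))


def segmentation_scores(true_mask, pred_mask):
    """IoU, pixel accuracy (PA) and F1 score for binary masks (e.g. water vs. background)."""
    t = np.asarray(true_mask, dtype=bool)
    p = np.asarray(pred_mask, dtype=bool)
    tp = int(np.sum(t & p))    # foreground predicted as foreground
    fp = int(np.sum(~t & p))   # background predicted as foreground
    fn = int(np.sum(t & ~p))   # foreground predicted as background
    tn = int(np.sum(~t & ~p))  # background predicted as background
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    pa = (tp + tn) / t.size
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return {"iou": float(iou), "pa": float(pa), "f1": float(f1)}


if __name__ == "__main__":
    # Toy data, not results from the paper.
    truth = np.array([[1, 1], [0, 0]])
    pred = np.array([[1, 0], [1, 0]])
    print(segmentation_scores(truth, pred))
    print(psnr(np.zeros((2, 2)), np.full((2, 2), 2.0)))
```

Lower MSE and higher PSNR indicate a reconstruction closer to the reference image, while IoU, PA and F1 score all lie in [0, 1] and increase as the predicted mask agrees more with the ground truth.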
