Deep Generative AI Based on Denoising Diffusion Probabilistic Models for Applications in Image Processing

Authors

DOI:

https://doi.org/10.26439/interfases2024.n020.7389

Keywords:

machine learning, reconstruction, segmentation, face and gesture recognition, remote sensing

Abstract

Denoising Diffusion Probabilistic Models (DDPMs) have shown significant potential for solving complex image processing problems. This study explores the use of DDPMs in three distinct applications: reconstruction of remote sensing images in cloud-covered areas, reconstruction of facial images with occluded regions, and segmentation of water bodies from remote sensing images. Inpainting is a technique that fills in missing regions of an image, while DDPMs act as data generators capable of synthesizing information consistent with the context of the original data. Accordingly, the RePaint approach, which is inspired by inpainting, was adapted and applied to the reconstruction tasks. For the segmentation task, the WaterSegDiff technique was used, which also employs a diffusion model as its backbone. To illustrate the behavior of the model and exemplify the tasks, experiments were carried out and their performance was evaluated both qualitatively and quantitatively. The qualitative results demonstrate the model's ability to generate data for reconstruction and segmentation. Quantitatively, the MSE, PSNR, SSIM, IoU, PA, and F1-Score metrics indicate excellent performance of the models on image processing tasks. In this setting, DDPMs proved to be a promising tool for high-quality data reconstruction, enabling the hallucination of images with high visual coherence, with applications in areas such as environmental monitoring, facial recognition, and water resource mapping, among others.
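The quantitative metrics named in the abstract have standard definitions. As a minimal sketch (not the authors' evaluation code), the dependency-free Python below computes MSE and PSNR for reconstruction, and IoU, pixel accuracy (PA), and F1-Score for binary segmentation masks. The function names, the flat-list image representation, the 8-bit peak value of 255, and the convention of returning 1.0 when a denominator is empty are all assumptions made for illustration. SSIM is omitted because it requires windowed local statistics.

```python
import math
from typing import Sequence, Tuple


def mse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (infinite for identical inputs).

    max_value=255 assumes 8-bit pixel intensities.
    """
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def confusion_counts(truth: Sequence[int],
                     pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary masks where 1 marks the target class."""
    if len(truth) != len(pred) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def iou(truth: Sequence[int], pred: Sequence[int]) -> float:
    """Intersection over union: tp / (tp + fp + fn)."""
    tp, fp, fn, _ = confusion_counts(truth, pred)
    union = tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return tp / union if union else 1.0


def pixel_accuracy(truth: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of pixels labeled correctly: (tp + tn) / total."""
    tp, fp, fn, tn = confusion_counts(truth, pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(truth: Sequence[int], pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall: 2*tp / (2*tp + fp + fn)."""
    tp, fp, fn, _ = confusion_counts(truth, pred)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return 2 * tp / denom if denom else 1.0


if __name__ == "__main__":
    # Toy reconstruction: four pixel intensities, two of them off by 5.
    print(mse([0, 255, 128, 64], [0, 250, 128, 69]))
    print(psnr([0, 255, 128, 64], [0, 250, 128, 69]))
    # Toy segmentation: six pixels flattened from a binary water mask.
    truth_mask = [1, 1, 0, 0, 1, 0]
    pred_mask = [1, 0, 0, 0, 1, 1]
    print(iou(truth_mask, pred_mask),
          pixel_accuracy(truth_mask, pred_mask),
          f1_score(truth_mask, pred_mask))
```

Flattening a 2-D image or mask row by row before calling these functions gives the same values as a per-pixel computation, since none of the metrics shown depends on pixel position.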

Published

2024-12-26

Issue

Section

Research articles

How to Cite

Deep Generative AI Based on Denoising Diffusion Probabilistic Models for Applications in Image Processing. (2024). Interfases, 020, 71-93. https://doi.org/10.26439/interfases2024.n020.7389