Challenges of deep learning in computer vision
DOI: https://doi.org/10.26439/ciis2022.6070
Keywords: computer vision, deep learning
Abstract
Computer vision is a field of study within artificial intelligence that focuses on developing computational techniques to perceive the world through visual data, such as images or video. Deep learning has proven effective at analyzing and interpreting visual data. Nevertheless, it faces many challenges across the computer vision tasks to which it is applied. This panel brings together deep learning experts, who will share the applications of deep learning in their research fields and the computer vision challenges that remain to be overcome.