Underwater Plastic Waste Detection with YOLO and Vision Transformer Models
DOI:
https://doi.org/10.26439/interfases2025.n021.7868
Keywords:
object detection, deep learning, plastic waste, object detection model, underwater images
Abstract
This study addresses the global issue of marine pollution, with a particular focus on plastic bag contamination, by applying real-time object detection based on deep learning. The YOLOv8, YOLO-NAS, and RT-DETR models were compared to assess their effectiveness in detecting plastic waste in underwater environments. The methodology comprised data preprocessing, model implementation, and training through transfer learning. The models were evaluated in a simulated video environment, and the results were then compared in depth. Performance was assessed using mean average precision (mAP), recall, and inference time. On the validation dataset, the YOLOv8 model achieved an mAP50 of 0.921, a recall of 0.829, and an inference time of 14.1 milliseconds. The YOLO-NAS model reached a lower mAP50 of 0.813 but a higher recall of 0.903, with an inference time of 17.8 milliseconds. The RT-DETR model obtained an mAP50 of 0.887, a recall of 0.819, and an inference time of 15.9 milliseconds. Notably, despite not having the highest mAP, the RT-DETR model demonstrated superior detection performance when deployed in real-world underwater conditions, highlighting its robustness and potential for practical environmental monitoring.
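The metrics named in the abstract rest on matching predicted boxes to ground-truth boxes by intersection over union (IoU); the "50" in mAP50 refers to an IoU threshold of 0.5. The following is a minimal Python sketch of that matching step and of the resulting precision and recall for a single image. It is an illustration only, not the evaluation code used in the study: the box format `(x_min, y_min, x_max, y_max)`, the greedy one-to-one matching, and the function names `iou` and `precision_recall` are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Greedy one-to-one matching at an IoU threshold.

    `preds` is assumed to be sorted by descending confidence, so that
    higher-confidence predictions claim ground-truth boxes first.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical boxes, chosen only to exercise the functions.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
    print(precision_recall(preds, truths))
```

Under these assumptions, recall is the fraction of ground-truth objects that some prediction covers at IoU of at least 0.5, which is why a model can report a higher recall alongside a lower mAP50, as YOLO-NAS does in the figures above. A full mAP computation would additionally sweep the confidence threshold and average precision over the resulting recall levels.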
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under an Attribution 4.0 International (CC BY 4.0) License, which allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).
Last updated 03/05/21
