Dynamic Malware Analysis Using Machine Learning-Based Detection Algorithms

Erly Galia  Villarroel Enriquez; Juan Gutiérrez-Cárdenas

doi:10.26439/interfases2024.n19.7097

Authors

Erly Galia Villarroel Enriquez Facultad de Ingeniería, Universidad de Lima https://orcid.org/0000-0001-8566-0494
Juan Gutiérrez-Cárdenas Facultad de Ingeniería, Universidad de Lima https://orcid.org/0000-0003-2566-4690

DOI:

https://doi.org/10.26439/interfases2024.n19.7097

Keywords:

malware, machine learning, detección

Abstract

Con la creciente popularidad del uso de teléfonos celulares, el riesgo de infecciones por malware en dichos dispositivos ha aumentado, lo que genera pérdidas financieras tanto para individuos como para organizaciones. Las investigaciones actuales se centran en la aplicación del aprendizaje automático para la detección y clasificación de estos programas malignos. Debido a esto el presente trabajo utiliza la frecuencia de llamadas al sistema para detectar y clasificar malware utilizando los algoritmos XGBoost, LightGBM y random forest. Los resultados más altos se obtuvieron con el algoritmo de LightGBM, logrando un 94.1% de precisión y 93.9% tanto para exactitud, recall y f1-score, lo que demuestra la efectividad tanto del uso del aprendizaje automático como del uso de comportamientos dinámicos del malware para la mitigación de amenazas de seguridad en dispositivos móviles.

Downloads

Download data is not yet available.

Author Biographies

Erly Galia Villarroel Enriquez, Facultad de Ingeniería, Universidad de Lima

Es licenciada en Ingeniería de Sistemas por la Universidad de Lima, obtenida en 2023. En 2022, participó en el Congreso Internacional de Ingeniería de Sistemas celebrado en el campus de la Universidad de Lima, donde presentó un afiche que resumía una revisión de literatura sobre ofuscación de malware. Sus intereses de investigación se centran principalmente en las aplicaciones del aprendizaje automático y el aprendizaje profundo para abordar los desafíos actuales.
Juan Gutiérrez-Cárdenas, Facultad de Ingeniería, Universidad de Lima

Obtuvo su doctorado en Informática en la subdisciplina de Bioinformática en la Universidad de Sudáfrica. También tiene una maestría en Informática-Bioinformática de la Universidad de Helsinki y un Diplomado Superior en Informática de la Universidad de Witwatersrand. Es profesor a tiempo parcial en la Universidad de Lima en el área de Ingeniería de Software. Ha sido un activo defensor del desarrollo de habilidades de investigación en estudiantes de pregrado, lo que ha dado lugar a una variedad de publicaciones revisadas por pares realizadas por estudiantes en la carrera en la que actualmente trabaja. También se desempeña como evaluador de programas ABET y es miembro de la Sociedad Peruana de Informática. Entre sus intereses de investigación se encuentran los métodos de aprendizaje automático aplicados a la bioinformática, el procesamiento de lenguaje natural y la educación en ciencias de la computación.

References

Aebersold, S., Kryszczuk, K., Paganoni, S., Tellenbach, B., & Trowbridge, T. (2016). Detecting obfuscated JavaScripts using machine learning. ICIMP 2016 the Eleventh International Conference on Internet Monitoring and Protection: May 22-26, 2016, Valencia, Spain, 1, 11–17. https://doi.org/10.21256/zhaw-3848

Ashik, M., Jyothish, A., Anandaram, S., Vinod, P., Mercaldo, F., Martinelli, F., & Santone, A. (2021). Detection of malicious software by analyzing distinct artifacts using machine learning and deep learning algorithms. Electronics, 10(14), 1694. https://doi.org/10.3390/electronics10141694

Aslan, Ö. A., & Samet, R. (2020). A comprehensive review on malware detection approaches. IEEE Access, 8, 6249-6271. https://doi.org/10.1109/ACCESS.2019.2963724

Bhatia, T., & Kaushal, R. (2017). Malware detection in android based on dynamic analysis. 2017 International Conference on Cyber Security and Protection Of Digital Services (Cyber Security), London, UK, 1-6. https://doi.org/10.1109/CyberSecPODS.2017.8074847

Chen, S., Xue, M., Fan, L., Hao, S., Xu, L., Zhu, H., & Li, B. (2018). Automated poisoning attacks and defenses in malware detection systems: An adversarial machine learning approach. Computers and Security, 73, 326-344. https://doi.org/10.1016/j.cose.2017.11.007

Chen, Y. C., Chen, H. Y., Takahashi, T., Sun, B., & Lin, T. N. (2021). Impact of code deobfuscation and feature interaction in android malware detection. IEEE Access, 9, 123208-123219. https://doi.org/10.1109/ACCESS.2021.3110408

Chen, Z., & Ren, X. (2023). An efficient boosting-based windows malware family classification system using multi-features fusion. Applied Sciences, 13(6), 4060. https://doi.org/10.3390/app13064060

Choudhary, S., & Sharma, A. (2020, February). Malware detection & classification using machine learning. 2020 International Conference on Emerging Trends in Communication, Control and Computing (ICONC3), Lakshmangarh, India, 1-4. https://doi.org/10.1109/ICONC345789.2020.9117547

Dhamija, H., & Dhamija, A. K. (2021). Malware detection using machine learning classification algorithms. International Journal of Computational Intelligence Research (IJCIR), 17(1), 1-7. https://www.ripublication.com/ijcir21/ijcirv17n1_01.pdf

Duo, W., Zhou, M., & Abusorrah, A. (2022). A survey of cyber attacks on cyber physical systems: Recent advances and challenges. IEEE/CAA Journal of Automatica Sinica, 9(5), 784-800. https://doi.org/10.1109/JAS.2022.105548

Feng, P., Ma, J., Sun, C., Xu, X., & Ma, Y. (2018). A novel dynamic android malware detection system with ensemble learning. IEEE Access, 6, 30996-31011. https://doi.org/10.1109/ACCESS.2018.2844349

Fortinet. (2022, February 8). América Latina sufrió más de 289 mil millones de intentos de ciberataques en 2021 [Press release]. https://www.fortinet.com/lat/corporate/about-us/newsroom/press-releases/2022/fortiguard-labs-reporte-ciberataques-america-latina-2021

Gao, Y., Hasegawa, H., Yamaguchi, Y., & Shimada, H. (2022). Malware detection using LightGBM with a custom logistic loss function. IEEE Access, 10, 47792-47804. https://doi.org/10.1109/ACCESS.2022.3171912

Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T. Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA. https://proceedings.neurips.cc/paper_files/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf

Kim, S., Hong, S., Oh, J., & Lee, H. (2018, June). Obfuscated VBA macro detection using machine learning. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Luxembourg, Luxembourg, 490-501. https://doi.org/10.1109/DSN.2018.00057

Kshirsagar, D., & Agrawal, P. (2022). A study of feature selection methods for android malware detection. Journal of Information and Optimization Sciences, 43(8), 2111-2120. https://doi.org/10.1080/02522667.2022.2133218

Kumar, R., & Geetha, S. (2020). Malware classification using XGboost-Gradient boosted decision tree. Advances in Science, Technology and Engineering Systems Journal, 5(5), 536–549. https://doi.org/10.25046/aj050566

Mahdavifar, S., Kadir, A. F. A., Fatemi, R., Alhadidi, D., & Ghorbani, A. A. (2020). Dynamic android malware category classification using semi-supervised deep learning. 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Calgary, AB, Canada, 515-522. https://doi.org/10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00094

Mahindru, A., & Sangal, A. L. (2021). MLDroid—Framework for Android malware detection using machine learning techniques. Neural Computing and Applications, 33, 5183-5240. https://doi.org/10.1007/s00521-020-05309-4

Liu, K., Xu, S., Xu, G., Zhang, M., Sun, D., & Liu, H. (2020). A review of android malware detection approaches based on machine learning. IEEE Access, 8, 124579-124607. https://doi.org/10.1109/ACCESS.2020.3006143

Louk, M. H. L., & Tama, B. A. (2022). Tree-based classifier ensembles for PE malware analysis: A performance revisit. e, 15 (9), 332. https://doi.org/10.3390/a15090332

Onoja, M., Jegede, A., Blamah, N., Abimbola, O. V., & Omotehinwa, T. O. (2022). EEMDS: Efficient and effective malware detection system with hybrid model based on xceptioncnn and lightgbm algorithm. Journal of Computing and Social Informatics, 1(2), 42-57. https://doi.org/10.33736/jcsi.4739.2022

Palša, J., Ádám, N., Hurtuk, J., Chovancová, E., Madoš, B., Chovanec, M., & Kocan, S. (2022). MLMD—A malware-detecting antivirus tool based on the XGBoost machine learning algorithm. Applied Sciences, 12(13), 6672. https://doi.org/10.3390/app12136672

Şahın, D. Ö., Akleylek, S., & Kiliç, E. (2022). LinRegDroid: Detection of Android malware using multiple linear regression models-based classifiers. IEEE Access, 10, 14246–14259. https://doi.org/10.1109/ACCESS.2022.3146363

Sönmez, Y., Salman, M., & Dener, M. (2021). Performance analysis of machine learning algorithms for malware detection by using CICMalDroid2020 dataset. Düzce University Journal of Science and Technology, 9(6), 280-288. https://doi.org/10.29130/dubited.1018223

Surendran, R., & Thomas, T. (2022). Detection of malware applications from centrality measures of syscall graph. Concurrency and Computation: Practice and Experience, 34(10). https://doi.org/10.1002/cpe.6835

Surendran, R., Thomas, T., & Emmanuel, S. (2020). On existence of common malicious system call codes in android malware families. IEEE Transactions on Reliability, 70(1), 248-260. https://doi.org/10.1109/TR.2020.2982537

Urooj, B., Shah, M. A., Maple, C., Abbasi, M. K., & Riasat, S. (2022). Malware detection: a framework for reverse engineered android applications through machine learning algorithms. IEEE Access, 10, 89031-89050. https://doi.org/10.1109/ACCESS.2022.3149053

Wu, B., Chen, S., Gao, C., Fan, L., Liu, Y., Wen, W., & Lyu, M. R. (2021). Why an android app is classified as malware: Toward malware classification interpretation. ACM Transactions on Software Engineering and Methodology (TOSEM), 30(2), 1-29. https://doi.org/10.1145/3423096

Dynamic Malware Analysis Using Machine Learning-Based Detection Algorithms

Authors

DOI:

Keywords:

Abstract

Downloads

Author Biographies

References

Downloads

Published

Issue

Section

License

How to Cite

Most read articles by the same author(s)

Language

Make a Submission

estadisticas

redesociales

indexacion

etica

normas

equipo

proceso

tutoriales

Latest publications