A comparison of machine learning techniques for detection of phishing websites

Authors

  • Andrés Eduardo Moncada Vargas, Universidad de Lima

DOI:

https://doi.org/10.26439/interfases2020.n013.4886

Keywords:

Anti-Phishing, Machine Learning, Cybersecurity, Phishing Warning, Phishing, Cyberattack

Abstract

Phishing is the theft of personal data through fake websites. Victims are directed to a fake website, where they are asked to enter their data to validate their identity. The theft takes place at that moment: the entered data are stored by the attacker, who either sells them or uses them to access websites and commit fraud. For this work, we reviewed different methods for detecting phishing websites with machine learning techniques. The purpose of this work is to compare the machine learning techniques that have proven most effective at detecting phishing websites. The results show that tree-based classifiers such as Decision Tree and Random Forest achieved the highest accuracy and efficacy rates, with values between 97% and 99%, in detecting these types of websites.
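
The kind of comparison described in the abstract can be illustrated with a minimal sketch: several classifiers are scored on the same labelled set of websites and ranked by accuracy. Everything in the block below is an assumption made for illustration. The two rule-based classifiers are simple stand-ins, not the Decision Tree or Random Forest models evaluated in the paper, and the features (URL length, number of dots, presence of an "@" symbol), the threshold, and the six toy samples are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical feature vector: (url_length, num_dots, has_at_symbol).
Features = Sequence[float]
# A classifier maps a feature vector to a label: 1 = phishing, 0 = legitimate.
Classifier = Callable[[Features], int]


def accuracy(clf: Classifier, X: Sequence[Features], y: Sequence[int]) -> float:
    """Fraction of samples for which the classifier predicts the true label."""
    correct = sum(1 for features, label in zip(X, y) if clf(features) == label)
    return correct / len(y)


def compare(
    classifiers: Dict[str, Classifier],
    X: Sequence[Features],
    y: Sequence[int],
) -> List[Tuple[str, float]]:
    """Score every classifier on the same data and rank by accuracy, best first."""
    scores = [(name, accuracy(clf, X, y)) for name, clf in classifiers.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Two illustrative stand-in classifiers (assumed rules, not from the paper).
def length_rule(x: Features) -> int:
    # Flag URLs longer than an assumed threshold of 54 characters.
    return 1 if x[0] > 54 else 0


def at_symbol_rule(x: Features) -> int:
    # Flag URLs that contain an "@" symbol.
    return 1 if x[2] == 1 else 0


# Toy evaluation set, invented for this sketch.
X_toy = [(72, 4, 1), (23, 1, 0), (65, 5, 0), (30, 2, 0), (18, 1, 0), (90, 3, 1)]
y_toy = [1, 0, 1, 0, 0, 1]

ranking = compare(
    {"length_rule": length_rule, "at_symbol_rule": at_symbol_rule},
    X_toy,
    y_toy,
)

for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

In practice the stand-in rules would be replaced by trained models and the toy samples by a held-out portion of a real phishing dataset, so that every technique is measured on the same unseen data.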

Published

2020-12-22

Section

Research papers

How to Cite

A comparison of machine learning techniques for detection of phishing websites. (2020). Interfases, 13(013), 77-103. https://doi.org/10.26439/interfases2020.n013.4886