Big data in the retail world: customer segmentation and recommender system in a European supermarket chain
DOI:
https://doi.org/10.26439/ing.ind2022.n.5808
Keywords:
retail, segmentation, recommender system, big data, machine learning
Abstract
This paper presents the concepts and techniques used in a big data project for a European supermarket company: a customer segmentation proposal based on the k-means algorithm and a recommender system built with the LightFM library. The main conclusions include the importance of adequately defining the problem to be solved, the correct use of the big data infrastructure, the relevance of exploratory analysis and pre-processing of the dataset, and the use of the Team Data Science Process (TDSP) methodology, which is oriented to big data projects.
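The project's own code is not reproduced on this page. As a rough illustration of the segmentation step named in the abstract, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The customer features (store visits per month, average basket value in euros), the sample values, and the function name `kmeans` are all hypothetical and are not taken from the article.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


# Hypothetical customers: (visits per month, average basket in euros).
customers = [
    (2, 15.0), (3, 18.0), (2, 20.0),     # occasional, small baskets
    (12, 80.0), (11, 95.0), (13, 88.0),  # frequent, large baskets
]

centroids, labels = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

In a real setting the features would normally be standardized before clustering, since k-means relies on Euclidean distance and a feature with a larger numeric range (here, basket value) would otherwise dominate the result.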
