Big data in the retail world: customer segmentation and recommender system in a European supermarket chain

Authors

DOI:

https://doi.org/10.26439/ing.ind2022.n.5808

Keywords:

retail, segmentation, recommender system, big data, machine learning

Abstract

This study presents the concepts and techniques used in a big data project for a European supermarket company, which comprised a customer segmentation proposal based on the k-means algorithm and a recommender system built with the LightFM library. The main conclusions highlight the importance of adequately defining the problem to be solved, using the big data infrastructure correctly, and carrying out exploratory analysis and pre-processing of the dataset, as well as the value of the Team Data Science Process (TDSP) methodology, which is oriented to big data projects.
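As a rough illustration of the segmentation technique named in the abstract, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the authors' implementation: the two customer features (store visits per month and average basket value in euros), the sample values, and the function name `kmeans` are assumptions chosen only for this example. A real project would probably scale the features first and run on a distributed framework.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (visits per month, average basket in euros).
customers = [
    (2, 15.0), (3, 18.0), (2, 20.0),     # occasional, small baskets
    (12, 80.0), (14, 95.0), (13, 88.0),  # frequent, large baskets
]

labels, centroids = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

On this toy data the two groups are well separated, so the occasional shoppers and the frequent shoppers end up in different segments. Which segment receives index 0 or 1 depends on the random initialisation.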

Author Biography

  • César Rogelio Cam Gensollen, Universidad de Lima, Facultad de Ingeniería y Arquitectura, Lima, Perú

    Doctoral candidate in Business Administration at Universidad ESAN. He holds a Master in Big Data Engineer from the Universidad de Barcelona, a Master's degree in Research in Administration Sciences from Universidad ESAN, and an MBA from the Universidad Peruana de Ciencias Aplicadas, and is an industrial engineer from the Universidad de Lima. He also completed the diploma program in Applied Statistics at the Pontificia Universidad Católica del Perú. He holds specializations in Commercial Management and Leadership from the PAD of the Universidad de Piura, as well as a specialization diploma in Big Data & Analytics from the Universidad Nacional de Ingeniería. He has complemented his academic training with Scrum Master and ITIL 4 certifications. He has more than thirty years of professional experience and has been a director of major companies in the B2B sector.

Published

2022-04-22

Issue

Section

Articles

How to Cite

Big data in the retail world: customer segmentation and recommender system in a European supermarket chain. (2022). Ingeniería Industrial, 189-216. https://doi.org/10.26439/ing.ind2022.n.5808