Big data en el mundo del retail: segmentación de clientes y sistema de recomendación en una cadena de supermercados de Europa
Resumen
En esta investigación se presentan los conceptos y técnicas utilizados en un proyecto de big data para una compañía europea de supermercados. Se propuso la segmentación de clientes, utilizando el algoritmo k-medias, y un sistema de recomendación a través de la librería LightFM de Python. Entre las principales conclusiones, se puede indicar la importancia de definir adecuadamente el problema por resolver, el uso correcto de la infraestructura de big data, y la relevancia del análisis exploratorio del conjunto de datos y su preprocesamiento, así como la aplicación de la metodología de proyectos TDSP (Team Data Science Process), orientada a los proyectos de big data.
Descargas
Citas
Aryuni, M., Didik Madyatmadja, E., & Miranda, E. (2018). Customer segmentation in XYZ Bank using k-means and k-medoids clustering [Presentación de paper]. Proceedings of 2018 International Conference on Information Management and Technology, ICIMTech 2018. Jakarta, Indonesia. https://doi.org/10.1109/ICIMTech.2018.8528086
Cam, C., Hidalgo, G., Huérfano, C., & Medina, J. (2020). Memoria trabajo final. Máster en Big Data Engineer. Universidad de Barcelona.
Chen, D., Sain, S. L., & Guo, K. (2012). Data mining for the online retail industry: a case study of RFM model-based customer segmentation using data mining. Journal of Database Marketing & Customer Strategy Management, 19(3), 197-208. https://doi.org/10.1057/dbm.2012.17
Chen, H., Chiang, R. H. L., & Storey, V. C. (2012). Business intelligence and analytics: from big data to big impact. MIS Quarterly, 36(4), 1165-1188. https://doi.org/10.2307/41703503
Christodoulou, P., Christodoulou, K., & Andreou, A. S. (2017). A real-time targeted recommender system for supermarkets. En Proceedings of the 19th International Conference on Enterprise Information Systems. Volumen 2: ICEIS 2017 (pp. 703-712). https://doi.org/10.5220/0006309907030712
Doğan, O., Ayçin, E., & Bulut, Z. A. (2018). Customer segmentation by using RFM model and clustering methods: a case study in retail industry. International Journal of Contemporary Economics and Administrative Sciences, 8(1), 1-19. http://www.ijceas.com/index.php/ijceas/article/view/174
Falk, K. (2019). Practical recommender systems. Manning.
Fang, Y., Xiao, X., Wang, X., & Lan, H. (2018). Customized bundle recommendation by association rules of product categories for online supermarkets. En 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC) (pp. 472-475). https://doi.org/10.1109/DSC.2018.00076
Gulabani, S. (2017). Practical Amazon EC2, SQS, Kinesis, and S3: A hands-on to AWS. Apress. https://doi.org/10.1007/978-1-4842-2841-8
Kansal, T., Bahuguna, S., Singh, V., & Choudhury, T. (2018). Customer segmentation using k-means clustering. En Proceedings of the International Conference on Computational Techniques, Electronics and Mechanical Systems, CTEMS 2018 (pp. 135-139). https://doi.org/10.1109/CTEMS.2018.8769171
Kumar, V., & Reinartz, W. (2018). Customer relationship management: concept, strategy, and tools (3.a ed.). Springer. https://doi.org/10.1108/IJBM-11-2014-0160
Lycett, M. (2013). “Datafication”: making sense of (big) data in a complex world. European Journal of Information Systems, 22(4), 381-386. https://doi.org/10.1057/ejis.2013.10
Microsoft. (2021, 11 de diciembre). What is team data science process? https://docs.microsoft.com/en-us/azure/architecture/data-science-process/overview
Pascal, C., Ozuomba, S., & Kalu, C. (2015). Application of k-means algorithm for efficient customer segmentation: a strategy for targeted customer services. International Journal of Advanced Research in Artificial Intelligence, 4(10), 40-44. https://doi.org/10.14569/ijarai.2015.041007
Pérez, C. (2013). Análisis multivariante de datos. Aplicaciones con IBM SPSS, SAS y STATGRAPHICS (1.a ed.). Garceta.
Schermann, M., Hemsen, H., Buchmüller, C., Bitter, T., Krcmar, H., Markl, V., & Hoeren, T. (2014). An interdisciplinary opportunity for information systems research. Business and Information Systems Engineering, 6(5), 261-266. https://doi.org/10.1007/s12599-014-0345-1
Singh, P. (2019). Machine learning with PySpark. Apress. https://doi.org/10.1007/978-1-4842-4131-8
Witten, I. H., Eibe, F., & Hall, M. A. (2017). Data mining: practical machine learning tools and techniques. Morgan Kaufmann.