Data-Driven Customer Retention for SMEs: Predicting
Repeat Purchase and Customer Value
Ilknur Ozturk1,∗
1Faculty of Economics, Administrative and Social Sciences, Nisantasi University, Istanbul, Turkey
Email: ilknur.ozturk@nisantasi.edu.tr
Abstract
Customer retention is strategically important for small and medium-sized enterprises (SMEs) because their
resources are limited and indiscriminate customer acquisition and retention campaigns are economically
inefficient. However, the descriptive reporting used by many SMEs lacks the advantages of transaction-driven
analytics that allows differentiating between high-value and low-yield customer relationships. This paper creates
a replicable customer-analytics pipeline for SME-type retail environments, using publicly available transactional data.
In contrast to macro-level forecasting research, the paper integrates customer value segmentation with future-oriented
repeat-purchase prediction and explicitly translates the results into retention actions. The customer-level
features were derived from the public Online Retail dataset and were based on invoices, quantities, prices,
product variety, and return behavior. Monthly observation windows were transformed into a 90-day repeat-purchase
prediction problem. Three predictive models (logistic regression, random forest, and gradient boosting) were compared after
customer segmentation based on recency, frequency, and monetary behavior. The findings indicate that the random forest
model achieved the highest discrimination (ROC-AUC = 0.750; PR-AUC = 0.821), followed closely by logistic regression,
which performed only slightly worse but is more interpretable. Segment analysis also showed a highly concentrated
revenue base: Champions made up 27.5 percent of customers but accounted for 67.2 percent of recent revenue and had an
81.0 percent repeat-purchase rate. The paper provides a submission-ready, transparently reproducible, and managerially
understandable design that is particularly applicable to SMEs that want low-cost retention analytics, customer ranking,
and allocation of marketing resources.
Keywords: Customer retention; SMEs; Repeat purchase; Customer value; Business data analytics; Retail analytics
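To illustrate how customer-level recency, frequency, and monetary features and a 90-day repeat-purchase label of the kind described in the abstract might be constructed, the following is a minimal sketch and not the paper's implementation. The column names (`CustomerID`, `InvoiceNo`, `InvoiceDate`, `Quantity`, `UnitPrice`) are assumed to follow the public Online Retail file; the function name `build_customer_table`, the cutoff date, and the toy transactions are hypothetical and serve only to make the example runnable.

```python
import pandas as pd


def build_customer_table(tx, cutoff, horizon_days=90):
    """Build RFM features from transactions before `cutoff` and a binary
    label for whether the customer purchases again within `horizon_days`.

    Hypothetical helper; column names are assumptions, not taken from the paper.
    """
    tx = tx.copy()
    # Line revenue; negative quantities (returns) reduce the monetary total.
    tx["Revenue"] = tx["Quantity"] * tx["UnitPrice"]

    # Observation window: everything strictly before the cutoff.
    past = tx[tx["InvoiceDate"] < cutoff]

    # Prediction window: positive-quantity purchases in [cutoff, cutoff + horizon).
    horizon_end = cutoff + pd.Timedelta(days=horizon_days)
    future = tx[
        (tx["InvoiceDate"] >= cutoff)
        & (tx["InvoiceDate"] < horizon_end)
        & (tx["Quantity"] > 0)
    ]

    feats = past.groupby("CustomerID").agg(
        last_purchase=("InvoiceDate", "max"),
        frequency=("InvoiceNo", "nunique"),
        monetary=("Revenue", "sum"),
    )
    # Recency: whole days between the last observed purchase and the cutoff.
    feats["recency_days"] = (cutoff - feats["last_purchase"]).dt.days
    feats = feats.drop(columns="last_purchase")

    # Label: 1 if the customer appears in the prediction window, else 0.
    buyers = future["CustomerID"].unique()
    feats["repeat_90d"] = feats.index.isin(buyers).astype(int)
    return feats


# Toy transactions (hypothetical values, for illustration only).
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 1, 3],
    "InvoiceNo": ["A1", "A2", "B1", "C1", "A3", "C2"],
    "InvoiceDate": pd.to_datetime([
        "2011-01-10", "2011-02-20", "2011-01-15", "2011-03-01",
        "2011-04-15", "2011-08-01",
    ]),
    "Quantity": [2, 1, 5, 3, 4, 1],
    "UnitPrice": [10.0, 20.0, 3.0, 4.0, 5.0, 8.0],
})

cutoff = pd.Timestamp("2011-04-01")
table = build_customer_table(tx, cutoff)
print(table)
```

Repeating this construction at successive monthly cutoffs would yield one row per customer per observation window, which is one plausible reading of the monthly-window design; the resulting table could then be passed to any standard classifier.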