Customer segmentation and association-rule analytics
This project combines RFM customer segmentation with market basket analysis to connect unsupervised learning outputs to practical customer and product decisions.
Challenge
- Retail transaction data needs cleaning before customer behavior becomes meaningful.
- Customer segmentation and basket analysis answer different business questions.
- Analytical outputs need interpretation that maps clusters and rules to practical actions.
System architecture
Data and inputs
The project uses the UCI Online Retail dataset: 541,909 raw rows covering 25,900 invoices, 4,372 customers, and 4,070 unique products.
Technical approach
- Clean missing customer IDs, cancellations, invalid quantities, and invalid prices.
- Create Recency, Frequency, and Monetary features and compare clustering approaches.
- Build a transaction matrix from the France orders and run Apriori association-rule mining on it.
- Interpret segments and rules as retention, reactivation, cross-selling, and loyalty opportunities.
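The cleaning and RFM steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's actual code: the row layout, the sample values, the snapshot date, and the names `is_valid` and `rfm` are all hypothetical. The cancellation check relies on the UCI dataset's convention that cancelled invoice numbers start with "C".

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (invoice_no, customer_id, quantity, unit_price, invoice_date)
rows = [
    ("536365", "17850", 6, 2.55, date(2011, 11, 1)),
    ("536366", "17850", 2, 3.39, date(2011, 12, 1)),
    ("536367", "13047", 8, 1.25, date(2011, 6, 15)),
    ("C536368", "13047", -8, 1.25, date(2011, 6, 16)),  # cancellation
    ("536369", None, 1, 4.95, date(2011, 12, 5)),       # missing customer ID
    ("536370", "12583", 3, 0.0, date(2011, 12, 6)),     # invalid price
]

def is_valid(row):
    """Keep rows with a customer ID, a non-cancelled invoice, and positive quantity and price."""
    invoice_no, customer_id, quantity, unit_price, _ = row
    return (
        customer_id is not None
        and not invoice_no.startswith("C")
        and quantity > 0
        and unit_price > 0
    )

def rfm(rows, snapshot):
    """Return {customer_id: (recency_days, invoice_count, total_spend)} for valid rows."""
    invoices = defaultdict(set)
    spend = defaultdict(float)
    last_purchase = {}
    for invoice_no, customer_id, quantity, unit_price, day in filter(is_valid, rows):
        invoices[customer_id].add(invoice_no)
        spend[customer_id] += quantity * unit_price
        last_purchase[customer_id] = max(day, last_purchase.get(customer_id, day))
    return {
        c: ((snapshot - last_purchase[c]).days, len(invoices[c]), round(spend[c], 2))
        for c in last_purchase
    }

# Assumed snapshot date: any day after the last transaction in the data.
table = rfm(rows, snapshot=date(2011, 12, 10))
```

Frequency is counted per distinct invoice rather than per row, so an order with many line items still counts as one purchase. The three features sit on very different scales, so they would normally be scaled before clustering.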
Evaluation and results
Headline figures: 541,909 raw rows, 3 customer segments, and 23 final association rules.
- K-Means produced three customer segments with a silhouette score of 0.4599.
- The final configuration labels these segments Regular, Dormant/At-Risk, and VIP/High-Value.
- Market basket analysis produced 23 final rules after support, confidence, and lift filtering.
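To make the support, confidence, and lift filtering concrete, the sketch below scores single-item rules over a handful of hypothetical baskets. It enumerates every item pair by brute force, so it is not Apriori itself (there is no candidate pruning), and the baskets, thresholds, and function names are assumptions rather than the project's settings.

```python
from itertools import combinations

# Hypothetical baskets: each set holds the products on one invoice.
baskets = [
    {"plate", "cup", "napkin"},
    {"plate", "cup"},
    {"plate", "napkin"},
    {"cup", "napkin"},
    {"plate", "cup", "napkin"},
    {"lunch box"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def pair_rules(baskets, min_support, min_confidence, min_lift):
    """Score every rule lhs -> rhs between two single items and keep those passing all thresholds."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, baskets)
            if s < min_support:
                continue
            confidence = s / support({lhs}, baskets)     # P(rhs | lhs)
            lift = confidence / support({rhs}, baskets)  # > 1 means positive association
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((lhs, rhs, s, confidence, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)

# Assumed thresholds, chosen only for this toy data.
rules = pair_rules(baskets, min_support=0.3, min_confidence=0.7, min_lift=1.0)
```

Each threshold removes a different kind of weak rule: support drops rare combinations, confidence drops unreliable ones, and lift drops rules that only reflect how popular the consequent already is.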
Implementation and code
Implementation focus
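The implementation follows a four-stage workflow: data preparation (cleaning the raw transactions), modeling (RFM clustering and Apriori rule mining), evaluation (silhouette score plus support, confidence, and lift thresholds), and interpretation of the resulting segments and rules as business actions.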
Source code
The source code is available for readers who want to inspect the implementation details or extend the experiment.
Scope and responsible use
The project is a focused modeling and evaluation study. Broader use should be supported by validation on additional data, robustness checks, monitoring, and domain-specific evaluation.
Future development
- Compare additional clustering methods and stability checks.
- Add cohort analysis and customer lifetime value features.
- Turn rules into ranked recommendation candidates with clearer business constraints.
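For the clustering comparison in the first item, one option is to score every candidate labeling with the same silhouette metric reported in the results. The sketch below is a dependency-free version using Euclidean distance. The points stand in for hypothetical scaled RFM vectors, and the function and variable names are assumptions, not the project's code.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical scaled (recency, frequency, monetary) vectors.
points = [(0.1, 0.9, 0.8), (0.2, 0.8, 0.9), (0.9, 0.1, 0.1), (0.8, 0.2, 0.2)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the two visible groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
```

Scores near 1 indicate compact, well-separated clusters, and negative scores indicate points that sit closer to another cluster than to their own. A stability check could repeat the comparison on resampled customers and look at how much the score and the assignments change.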
Technical contribution
The project connects unsupervised learning with business interpretation across customer behavior and product relationships.