# RFM (Recency, Frequency, Monetary) Customer Segmentation in Python

The below script shows how we may handle RFM segmentation with Python. RFM stands for Recency, Frequency and Monetary:

• Recency: How many days since they last purchased
• Frequency: Total purchase count by customer
• Monetary: Total spend by customer

This method enables us to group our customers & target specific groups with advertisements or deals. The way we approach the task is:

• Calculate the quartiles for recency, frequency and monetary
• Segment those into quartiles, where quartile 2 and 3 are joined to become a ‘moderately engaged’ quartile
• Create an RFM score from the quartiles. So, if my recency is 1, frequency is 2 and monetary is 1, my score will be 121
• Using the scores, bucket the customers into specific groups
```import pandas as pd
path = 'Desktop/customer_features.csv'
#Calculate the percentile for recency, frequency and monetary
df['recencypct'] = df.recency.rank(pct=True)
df['freqencypct'] = df.frequency.rank(pct=True)
df['monencypct'] = df.monetary.rank(pct=True)
#bucket recency into quartiles
df.loc[df.recencypct <= 0.25, 'recencyqtl'] = 1
df.loc[df.recencypct > 0.75, 'recencyqtl'] = 3
df.loc[(df.recencypct > 0.25) & (df.recencypct <=0.75), 'recencyqtl'] = 2
#bucket frequency into quartiles
df.loc[df.freqencypct <= 0.25, 'frequencyqtl'] = 1
df.loc[df.freqencypct > 0.75, 'frequencyqtl'] = 3
df.loc[(df.freqencypct > 0.25) & (df.freqencypct <=0.75), 'frequencyqtl'] = 2
#bucket monetary into quartiles
df.loc[df.monencypct <= 0.25, 'monetaryqtl'] = 1
df.loc[df.monencypct > 0.75, 'monetaryqtl'] = 3
df.loc[(df.monencypct > 0.25) & (df.monencypct <=0.75), 'monetaryqtl'] = 2
#slimline dataframe and drop decimal points
df2 = df[['id', 'recencyqtl', 'monetaryqtl', 'frequencyqtl']].astype(int)
#calculate the recency, frequency, monetary score (RFM) and then count the number of customers in each group
df2['score'] = df2["recencyqtl"].map(str) + df2["frequencyqtl"].map(str) + df2["monetaryqtl"].map(str)
df3 = df2[['id', 'score']].groupby(['score']).agg(['count'])
#We have a possible 720 groups if we have 3 number scores. So, we need to reduce this
#if frequency and recency are low then we will say they're disengaged customers
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl == 1)& (df2.monetaryqtl == 1), 'Segment'] = 'Disengaged'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 2), 'Segment'] = 'Disengaged Moderate Spenders'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 1), 'Segment'] = 'Moderately Engaged Low Spenders'
df2.loc[(df2.recencyqtl == 1) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged Moderate Spenders'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 1), 'Segment'] = 'Recent Low Spenders'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==1)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 1), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 2) & (df2.frequencyqtl ==2)& (df2.monetaryqtl == 2), 'Segment'] = 'Moderately Engaged'
df2.loc[(df2.recencyqtl == 3) | (df2.frequencyqtl ==3) | (df2.monetaryqtl == 3), 'Segment'] = 'Engaged'
df2.loc[(df2.Segment.isnull()), 'Segment'] = 'Other Segment'
#Now, let's count how many customers are in each of our new segments.
df4 = df2[['id', 'Segment']].groupby(['Segment']).agg(['count'])
#This is the output - a much more manageable 7 segments
count
Segment
Disengaged 21
Disengaged Moderate Spenders 35
Engaged 575
Moderately Engaged 245
Moderately Engaged Low Spenders 31
Moderately Engaged Moderate Spenders 65
Recent Low Spenders 28```

The below provides an example where we’ve used seaborn to plot the output:

```#import the required libraries
# - matplotlib is a charting library
# - Seaborn builds on top of Matplotlib and introduces additional plot types. It also makes your traditional Matplotlib plots look a bit prettier.
# - Numpy is numerical Python
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
#define dataframe and percentiles
df_plot = df[['imsi','audio','video']]
df_plot = df_plot.fillna(0)
df_plot['audiopercentile'] = df_plot.audio.rank(pct=True)
df_plot['videopercentile'] = df_plot.video.rank(pct=True)
#Add segment names based on percentile criteria
df_plot.loc[(df_plot.audiopercentile >= 0.75) & (df_plot.videopercentile >=0.75), 'Segment'] = 'Top 25% Both'
df_plot.loc[(df_plot.audiopercentile >= 0.75) & (df_plot.videopercentile <0.75), 'Segment'] = 'Top 25% Audio'
df_plot.loc[(df_plot.audiopercentile < 0.75) & (df_plot.videopercentile >= 0.75), 'Segment'] = 'Top 25% Video'
df_plot.loc[(df_plot.audiopercentile >= 0.5) & (df_plot.videopercentile <0.75), 'Segment'] = 'Top 50% Both'
df_plot.loc[(df_plot.audiopercentile >= 0.5) & (df_plot.audiopercentile < 0.75) & (df_plot.videopercentile <0.5), 'Segment'] = 'Top 50% Audio'
df_plot.loc[(df_plot.videopercentile >= 0.5) & (df_plot.videopercentile < 0.75) & (df_plot.audiopercentile <0.5), 'Segment'] = 'Top 50% Video'
df_plot.loc[(df_plot.videopercentile < 0.5), 'Segment'] = 'Bottom 50% Video'
df_plot.loc[(df_plot.audiopercentile < 0.5), 'Segment'] = 'Bottom 50% Audio'
df_plot.loc[(df_plot.audiopercentile >= 0.99), 'Segment'] = 'Top 1% Audio'
df_plot.loc[(df_plot.videopercentile >= 0.99), 'Segment'] = 'Top 1% Video'
#Do a count and group by to summarise how many users fall into each segment
df4 = df_plot[['Segment', 'imsi']].groupby(['Segment']).agg(['count'])
#The below statement, will enable us to visualise matplotlib charts, even in ipython
#Using matplotlib backend: MacOSX
%pylab
#plot
sns.scatterplot(df_plot["video"], df_plot["audio"], hue=df_plot["Segment"])```