In a subscription based revenue model, the rate at which customers cancel (churn) is hugely important to understanding future revenue. Customer churn and profit margin are actually how we estimate Customer Lifetime Value.

\[ CLTV = \frac{1}{Churn\ Rate} \cdot ARPA \cdot Gross\ Margin \]

Survival Curves are a great way of both estimating customer churn and visualizing customer lifetime. In this example we’ll be using a fictional dataset from a telecom provider (e.g. AT&T, T-Mobile, etc…), based off of a real scenario I encountered in 2018 determining that customers from different marketing channels can make or break the profitability of a promotional pricing offer.

Setup

The premise was to investigate churn rates among our customer base and provide insight into why customers might be cancelling sooner than usual. In 2018, we offered a six months of free service promotion to attract new customers. As with most promos, they typically don’t bring in the highest quality customers, but the hope is that the cost of the promo is outweighed by the net revenue from the newly gained subscribers.

In this example, I segmented 2018 recruits into three cohorts:

  1. Subscribers who came from standard marketing channels (Website, Paid Ads, etc..) with no promotion
  2. Subscribers who came from standard marketing channels with the “Six Months of Free Service” promo
  3. Subscribers who came from a promo aggregator site “SketchyDeals.com” with the promo

Normally, this would be a decently complicated SQL query, but for this example we’ll be using a fictional dataset summarized below.

##        id                          ref        activation_date
##  Min.   :   1   No Promo             :673   7/25/2018 :  13  
##  1st Qu.: 497   Promo (Sketchy Deals):651   8/11/2018 :  13  
##  Median : 993   Promo (Standard)     :661   2/23/2018 :  12  
##  Mean   : 997                               4/16/2018 :  12  
##  3rd Qu.:1498                               10/13/2018:  11  
##  Max.   :2000                               11/3/2018 :  11  
##                                             (Other)   :1913  
##   deactivation_date       status          days      
##  6/24/2019 :1000    ACTIVATED:1000   Min.   :  1.0  
##  2/21/2019 :   8    CANCELLED: 985   1st Qu.:184.0  
##  5/8/2019  :   7                     Median :242.0  
##  1/29/2019 :   6                     Mean   :264.9  
##  10/6/2018 :   6                     3rd Qu.:362.0  
##  11/14/2018:   6                     Max.   :539.0  
##  (Other)   : 952

Survival Curves

Using the Survival package, we can approximate the percent remaining of each cohort at any point in time. The trick here is to have enough data and enough time elapsed to make a smooth curve or else you’ll end up with a step function. With 1,985 data points going back to the beginning of 2018, we should be fine. In the chart below, you’ll see the three cohorts color coded.

Subscribers from the standard “No Promo” channel have the highest survival rate after one year, but had we run this analysis after only a few months, we might have been tempted to think “No Promo” was the worst performing cohort. However, as soon as the six months of free service is up, customers start cancelling, especially from the Sketchy Deals cohort. However, we’ll want to use a statistical test to verify the trends we’re seeing aren’t just due to random chance.

library(survival)
sd$survival <- Surv(sd$days, sd$status == "CANCELLED")
fit <- survfit(survival ~ 1 + ref, data = sd)
plot(fit, mark.time = FALSE, ylim=c(0,1), xlim = c(0,400), col=c("blue","red","green"), lwd=1, xlab = 'Days since Subscribing', ylab = 'Percent Remaining',xaxs="i",yaxs="i")
axis(side =4)
legend(5, .35, c('No Promotion', 'Promo (Sketchy Deals)', 'Promo (Standard)'), col=c("blue","red","green"), lwd=1.5, bty = 'n', ncol = 1)
title(main = 'Six Months Free Service Promo Survival Curve')
abline(v=184, col="black", lty = 3, lwd = 2)
legend(6, .15, 't = 6 Months', lty=3, lwd=2, col='black', bty = 'n', ncol = 1)

Cox Regression and Interpretation

Although three factors is pretty easy to interpret in a graph, if we were exploring several other variables such as gender and age, we might want to let a computer do the heavy lifting in interpreting the findings. Below we’ll use a Cox Proportional Hazards Regression Model, designed for the effects of categorical variables on time dependent data. Without getting too into the weeds (more info avialable here), Cox Regression models the “hazard rate” of each group over time, allowing us to compare the chrun likelihood of a subscriber in each cohort.

# Subset data to compare control (No Promo) vs each test factor
sd_standard <- subset(sd, sd$ref != "Promo (Sketchy Deals)")
sd_standard$ref <- factor(sd_standard$ref)
sd_sketchy <- subset(sd, sd$ref != "Promo (Standard)")
sd_sketchy$ref <- factor(sd_sketchy$ref)

# Cox Regression for No Promo vs Standard Promo
coxph(survival ~ ref, data = sd_standard)
## Call:
## coxph(formula = survival ~ ref, data = sd_standard)
## 
##                        coef exp(coef) se(coef)    z        p
## refPromo (Standard) 0.34811   1.41639  0.09114 3.82 0.000134
## 
## Likelihood ratio test=14.79  on 1 df, p=0.0001203
## n= 1334, number of events= 495

First you’ll notice that the Cox Regression coefficient is significant at p < 0.001, suggesting the survival rate differences between getting the Standard Promotion vs No Promo are not due to chance. Furthermore, we can interpret the exp(coef)=1.41639 to mean that on average the Standard Promotion cohort is 1.4 times as likely to churn.

# Cox Regression for No Promo vs Standard Promo
coxph(survival ~ ref, data = sd_sketchy)
## Call:
## coxph(formula = survival ~ ref, data = sd_sketchy)
## 
##                            coef exp(coef) se(coef)     z      p
## refPromo (Sketchy Deals) 1.1163    3.0535   0.0835 13.37 <2e-16
## 
## Likelihood ratio test=198.1  on 1 df, p=< 2.2e-16
## n= 1324, number of events= 698

Similarly in the output above we see the Sketchy Deals cohort is 3 times as likely to churn as the No Promo group. This is great data for an analyst, but how do we wrap this up into a business impact?

Business Impact

Let’s first calculate average monthly churn rates for each cohort, since subscribers of this service are billed monthly. We’ll do this by dividing \(1-survival\ rate\) at point \(t\) by the number of month periods.

# Average Number of Days per Month in 2018
avg_days = 30.42
sf <- data.frame(fit$surv)
sf_1 <- sf[1:390,]
sf_2 <- sf[391:709,]
sf_3 <- sf[710:1064,]
np_churn = (1-sf_1[avg_days*10])/10
p1_churn = (1-sf_2[avg_days*10])/10
p2_churn = (1-sf_3[avg_days*10])/10

Now we’ll make some financial assumptions. Let’s assume we charge our customers $30/month for cell phone service and our gross margin is around 33%, meaning that every customer brings us ~$10 profit per month. Then we can calculate customer lifetime value as: \[ CLTV = \frac{1}{Churn\ Rate} \cdot ARPA \cdot Gross\ Margin = \frac{1}{3.55\%} \cdot 30 * 33\% = 281.70 \]

Each non-promotional customer is therefore expected to bring in around $282 over the course of their liftime. However, for promotional customers from standard marketing channels, that number drops down to $201 and even further to $121 for those originating from the SketchyDeals.com.

Conclusion

Survival Analysis has helped us uncover the hidden costs of running a promotion: increased churn. The customer lifetime value from each cohort as follows:

So if marketing and promotion costs per acquired customer end up adding up to higher than our CLTV, we would probably recommend to stop offering the promotion. In this particular example, marketing aside, we know the operating costs are $20 per month and the opportunity cost is $10 per month, meaning the promotion costs up to $180 in total depending on how canniablistic the promotion is to our current customer funnel. At $180 we would determine that the promotional customers acquired through standard marketing channels are slightly profitable, but those acquired from SketchyDeals.com are not.