Are there any examples of using this beyond the documentation?
Hi Martin, this is effectively a way to forecast by cohort while saving space in the model by not applying the Cohort Category itself. The outputs are the same, but you lose the ability to track individual cohorts and you have to treat every cohort the same. Hope that helps you understand the formula!
Here is an example of how the function operates:
This function forecasts the total (aggregated) value of existing cohorts without needing a cohort dimension in the model.
It has the following parameters:
- Data at last time step (Array)
- Retention Rate (Array)
- Last data date (date or timestep index)
- timeStep helper variable
- Additional Churn (Array, Optional)
Example
Let’s say we have historical data for the number of Customers by cohort:
We transpose the last column to a row (the number of customers at the last timestep)
We also have an assumption for the retention rate:
(50% of new customers stay through their 1st month, 70% of the remaining customers stay through their 2nd month, and so on)
Now we can use the flat_cohort_forecast function to forecast the total number of customers of the existing cohorts in future months:
The function calculates the total number of remaining customers of the cohorts Jun-21 to Oct-21:
How the function works
It applies the corresponding retention rates to the cohorts:
- In Nov, one retention rate is applied to each cohort:
  - Oct cohort: The 112 customers of the Oct cohort are in their 1st month, so the number is multiplied by the retention rate of the 1st month (50%)
  - Sep cohort: The 83 customers of the Sep cohort are in their 2nd month, so the number is multiplied by the retention rate of the 2nd month (70%)
  - Aug cohort: The 61 customers of the Aug cohort are in their 3rd month, so the number is multiplied by the retention rate of the 3rd month (80%)
  - Result: 112 * 50% + 83 * 70% + 61 * 80% = 162.9
- In Dec, two retention rates have to be applied to each cohort:
  - Oct cohort: 112 * 50% = 56 customers remain after Nov and are now in their 2nd month, so this number is multiplied by the retention rate of the 2nd month (70%)
  - Sep cohort: 83 * 70% = 58.1 customers remain after Nov and are now in their 3rd month, so this number is multiplied by the retention rate of the 3rd month (80%)
  - Aug cohort: 61 * 80% = 48.8 customers remain after Nov and are now in their 4th month, so this number is multiplied by the retention rate of the 4th month (90%)
  - Result: 112 * 50% * 70% + 83 * 70% * 80% + 61 * 80% * 90% = 129.6
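The arithmetic above can be reproduced with a short Python sketch. This is not the implementation of flat_cohort_forecast; the function name forecast_totals and its arguments are hypothetical, it uses only the three cohorts quoted in the walkthrough (Oct, Sep, Aug), and it assumes the last retention rate is held constant once a cohort is older than the retention array.

```python
def forecast_totals(last_values, retention, steps):
    """Forecast the aggregated size of existing cohorts (illustrative only).

    last_values: cohort sizes at the last data date, newest cohort first.
    retention:   retention[k] is the rate applied in a cohort's (k+1)-th month.
    steps:       number of future time steps to forecast.
    """
    remaining = list(last_values)
    totals = []
    for step in range(steps):
        for i in range(len(remaining)):
            age = i + step  # the newest cohort starts at age 0 (its 1st month)
            # Assumption: past the end of the array, hold the last rate.
            rate = retention[min(age, len(retention) - 1)]
            remaining[i] *= rate
        totals.append(sum(remaining))
    return totals


# Oct, Sep and Aug cohorts as quoted in the walkthrough.
last_values = [112, 83, 61]
retention = [0.5, 0.7, 0.8, 0.9]

nov, dec = forecast_totals(last_values, retention, steps=2)
print(round(nov, 2), round(dec, 2))  # 162.9 129.6
```

Each cohort keeps only one running number, and the cohort's age is derived from its position plus the forecast step, which is why no cohort dimension is needed to produce the total.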
Additional Churn
Let’s say you plan to increase your prices in Nov 21 and expect this to cause an additional churn of 10%. To include this in your forecast, just create a new variable for the additional churn and use it as the 5th argument in the flat_cohort_forecast function:
Now all cohorts will be multiplied by 100% - 10% = 90% in Nov. This cannot be included in the Retention Rate variable because different cohorts are at different ages when the additional churn is applied (in Nov, the Oct cohort is in its 1st month while the Sep cohort is in its 2nd month).
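As a rough illustration of why the additional churn is indexed by calendar month rather than by cohort age, here is a minimal Python sketch. The function forecast_with_extra_churn is hypothetical and not the actual flat_cohort_forecast implementation. It assumes the extra churn is multiplied in alongside the retention rate for that month, that customers lost this way stay lost in later months, and that the last retention rate is held once a cohort outgrows the array. The cohort sizes and rates are the ones quoted earlier for Oct, Sep and Aug.

```python
def forecast_with_extra_churn(last_values, retention, extra_churn):
    """Forecast aggregated cohort totals with calendar-based extra churn.

    last_values: cohort sizes at the last data date, newest cohort first.
    retention:   retention[k] is the rate applied in a cohort's (k+1)-th month.
    extra_churn: one additional churn rate per forecast step (calendar month).
    """
    remaining = list(last_values)
    totals = []
    for step, churn in enumerate(extra_churn):
        for i in range(len(remaining)):
            age = i + step  # retention depends on the cohort's age...
            rate = retention[min(age, len(retention) - 1)]
            # ...while the extra churn depends only on the calendar step.
            remaining[i] *= rate * (1.0 - churn)
        totals.append(sum(remaining))
    return totals


last_values = [112, 83, 61]       # Oct, Sep, Aug cohorts
retention = [0.5, 0.7, 0.8, 0.9]
extra_churn = [0.10, 0.0]         # 10% in Nov, none in Dec

nov, dec = forecast_with_extra_churn(last_values, retention, extra_churn)
print(round(nov, 2), round(dec, 2))  # 146.61 116.64
```

Under these assumptions the Nov total is 162.9 * 90% = 146.61, and the Dec total is also 10% lower than without the price increase (116.64 instead of 129.6), because the customers lost in Nov do not come back.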