How do I use the 'spread' function in Causal?

Spread is a powerful and flexible Causal function. It allows you to spread a value, or a set of values, over a time period. You can think of it as similar to SUMPRODUCT in Excel, but more intuitive (and of course, more powerful) :zap:

Spread has two arguments: “x”, the value or set of values you’re spreading, and “y”, the schedule of amounts you’re spreading x by.

For example, suppose all new customers pay a $100 sign-up fee in their first month, and from the second month onwards they pay $10 per month for their subscription. You could calculate your monthly revenue using spread.

  • In this instance, your “x” would be your New Customers each month, and the “y” amount you’re spreading it by would be that payment structure.
  • In Causal, this would look as follows.

How does it actually work? What’s the underlying calculation? :thinking:

  • In essence: we are spreading all new customers to-date, by the payment structure over time.

  • In the first month, Oct 2021, there are 100 new customers who all pay their $100 sign-up fee. $100 x 100 customers = $10k. Simple.

  • In the second month, Nov 2021, there are 200 new customers who pay their $100 sign-up fee, and 100 existing customers who pay their $10 per month ongoing subscription. $100 x 200 customers = $20k, $10 x 100 customers = $1k. $20k + $1k = $21k.

  • In the third month, Dec 2021, there are 300 new customers who pay their $100 sign-up fee, and now 300 existing customers (100 from Oct, 200 from Nov) who pay $10 per month for their ongoing subscription. $100 x 300 customers = $30k, $10 x 300 customers = $3k. $30k + $3k = $33k.

  • And so on…
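The month-by-month arithmetic above can be sketched in plain Python. This is only an illustration of the calculation as described in this post, not Causal's actual implementation; the function name `spread` and the list-based inputs are assumptions made for the example.

```python
def spread(x, y):
    """Spread x over time by y: result[t] = sum of x[k] * y[t - k] for k = 0..t."""
    result = []
    for t in range(len(x)):
        total = 0
        for k in range(t + 1):
            age = t - k  # timesteps since the values in x[k] arrived
            if age < len(y):  # beyond the end of y, nothing more is added
                total += x[k] * y[age]
        result.append(total)
    return result


new_customers = [100, 200, 300]  # Oct, Nov, Dec 2021
payment = [100, 10, 10]          # $ per customer at 0, 1, 2 months after sign-up

revenue = spread(new_customers, payment)
print(revenue)  # [10000, 21000, 33000]
```

The three results match the $10k, $21k and $33k worked out above.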

It is important to note that the Spread function expects both arguments (x and y) to be spans. Spans are a range/set of values.

  • To make both arguments spans, you’ll need to change the timestep that each argument references, from ‘Current’ (the default), to ‘0:t’. 0 represents the 1st timestep and t represents the current timestep. For more info on spans and time in general, see here

    • Note: ‘x’, the value/set of values, aligns with the months in the model. However, ‘y’, the amount you’re spreading ‘x’ by, aligns with timesteps rather than calendar months: instead of October 2021 it is timestep 0, so every new month of customers starts its payments at timestep 0. Said differently, for the payment variable in the Causal screenshot above, you can ignore the months across the top and think of them as 0, 1, 2, etc.
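To make the alignment in the note concrete, here is a small Python sketch that lays out revenue per cohort. It assumes the same example numbers as the post; the function name `cohort_table` is hypothetical and is not a Causal feature. Each row is one month's new customers, and the payment schedule restarts at timestep 0 for every row.

```python
def cohort_table(new_customers, payment):
    """Rows are sign-up cohorts, columns are model months."""
    months = len(new_customers)
    table = []
    for start, count in enumerate(new_customers):
        row = []
        for month in range(months):
            age = month - start  # timestep of the payment schedule for this cohort
            if 0 <= age < len(payment):
                row.append(count * payment[age])
            else:
                row.append(0)  # cohort has not signed up yet, or schedule has ended
        table.append(row)
    return table


table = cohort_table([100, 200, 300], [100, 10, 10])
for row in table:
    print(row)
# [10000, 1000, 1000]   Oct cohort
# [0, 20000, 2000]      Nov cohort
# [0, 0, 30000]         Dec cohort

totals = [sum(column) for column in zip(*table)]
print(totals)  # [10000, 21000, 33000]
```

Summing each column gives the monthly totals, which is the series a spread over these inputs would produce under the calculation described in this post.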

Other common use cases of Spread are:
:dollar: Sales cycles (where ‘x’ is leads generated, and ‘y’ is the % that close over time)
:chart_with_downwards_trend: Retention curves (where ‘x’ is new users, and ‘y’ is the % that retain over time).
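As a rough illustration of the retention-curve case, the Python sketch below applies the same calculation to made-up numbers. The user counts, the retention percentages and the `spread` helper are all assumptions for the example, not values or code from Causal.

```python
def spread(x, y):
    """result[t] = sum of x[k] * y[t - k] for every k up to t that y covers."""
    return [
        sum(x[k] * y[t - k] for k in range(t + 1) if t - k < len(y))
        for t in range(len(x))
    ]


new_users = [100, 200, 300]    # hypothetical sign-ups per month
retention_pct = [100, 80, 60]  # hypothetical % still active 0, 1, 2 months after sign-up

# Work in whole percentages, then convert to user counts at the end.
active_users = [value / 100 for value in spread(new_users, retention_pct)]
print(active_users)  # [100.0, 280.0, 520.0]
```

In the third month, for instance, 60% of the first cohort, 80% of the second and 100% of the third are active: 60 + 160 + 300 = 520.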

If you’ve used spread for any other use cases, comment on this post to share them :rocket: And of course, if you have any questions, reach out to us, either via this post or on our live chat. Happy modelling! :metal:


Ramping Usage Rates. Suppose you sell a device to a client and charge by usage. You have created a “learning curve” showing the average usage of the device month by month once it is installed. By using Spread you can apply the learning curve to each installed device as follows: Spread(Learning Curve 0:t, New Installations 0:t). The result is the time series of usage for all the devices, taking into account the learning curve at each installation. This could be done in Excel, but it would be ugly.


In your example, suppose the revenue stream from the customer was $100 the first month, then $10 for 12 months and then 0 thereafter. You could use a variable with 100, 10, 10, … 0, 0, … and designate the span as “0:t”. Could you also designate it as “0:12” to pick up just the positive values, or “0:20” to make sure you got them all? Would this reduce the overhead of the spread calculation as opposed to using “0:t”? The customer count variable would still be designated “0:t”.