Forecasting Time Series data with Prophet – Part 3

This is the third in a series of posts about using Prophet to forecast time series data. The other parts can be found here:

In those previous posts, I looked at forecasting monthly sales data 24 months into the future. In this post, I want to look at using the ‘holiday’ construct found within the Prophet library to try to better forecast around specific events. If we look at our sales data (you can find it here), there’s an obvious pattern each December. That pattern could exist for a variety of reasons, but let’s assume that it’s due to a promotion that is run every December. You can see the pattern in the chart below.

Sales Data – Note the spike every December

Prophet allows you to build a ‘holiday’ dataframe and use that data in your modeling. For the purposes of this example, I’ll build my Prophet holiday dataframe in the following manner:

import pandas as pd

# One row per promotion; Prophet expects the columns 'holiday' and 'ds'
promotions = pd.DataFrame({
  'holiday': 'december_promotion',
  'ds': pd.to_datetime(['2009-12-01', '2010-12-01', '2011-12-01', '2012-12-01',
                        '2013-12-01', '2014-12-01', '2015-12-01']),
  'lower_window': 0,
  'upper_window': 0,
})

This promotions dataframe consists of promotion dates for December in 2009 through 2015. The lower_window and upper_window values, which Prophet measures in days, are set to zero to indicate that we don’t want Prophet to extend the promotion effect to any dates other than the ones listed.
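
If typing each date out by hand feels error-prone, the same dataframe can be generated from a range of years. This is only a sketch: the helper name build_promotions and its parameters are my own invention for illustration and are not part of Prophet. It assumes, as above, that each promotion is dated the first of the month.

```python
import pandas as pd

def build_promotions(first_year, last_year, month=12, name='december_promotion'):
    """Hypothetical helper: one promotion row per year, dated the 1st of `month`."""
    dates = pd.to_datetime([f'{year}-{month:02d}-01'
                            for year in range(first_year, last_year + 1)])
    return pd.DataFrame({
        'holiday': name,
        'ds': dates,
        'lower_window': 0,   # days before each date to include (none here)
        'upper_window': 0,   # days after each date to include (none here)
    })

generated_promotions = build_promotions(2009, 2015)
print(generated_promotions)
```

The result should have the same seven rows as the hand-written version, so either one can be passed to Prophet as the holidays argument.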

Now that I have my promotions dataframe ready to go, I’ll run through the modeling quickly (you can check out the Jupyter notebook for more details):

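import numpy as np
import pandas as pd
from prophet import Prophet  # releases before 1.0: from fbprophet import Prophet

# Load the monthly sales data and rename the columns to the ds/y names Prophet expects
sales_df = pd.read_csv('../examples/retail_sales.csv', index_col='date', parse_dates=True)
df = sales_df.reset_index()
df = df.rename(columns={'date': 'ds', 'sales': 'y'})
df['y'] = np.log(df['y'])  # model the log of sales
# Fit the model with the promotions dataframe passed in as holidays
model = Prophet(holidays=promotions)
model.fit(df);
# 'MS' (month start) assumes the data is dated on the 1st of each month, like promotions
future = model.make_future_dataframe(periods=24, freq='MS')
forecast = model.predict(future)
model.plot(forecast);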

With these steps, we’ve loaded the data, set it up the way Prophet expects, run our model with the promotions data and plotted the forecast, which looks like the following:

Sales Data Modeled with Holidays
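
One thing to keep in mind when reading that chart: because the model was fit on np.log of the sales values, the forecast is on the log scale too. Here is a minimal sketch of converting it back to sales units. The function name to_original_scale is something I made up for illustration; yhat, yhat_lower and yhat_upper are the columns that Prophet’s predict returns. I’m using a small stand-in dataframe with invented values so the snippet runs without fitting a model.

```python
import numpy as np
import pandas as pd

def to_original_scale(forecast, columns=('yhat', 'yhat_lower', 'yhat_upper')):
    """Hypothetical helper: undo the np.log transform on the forecast columns."""
    result = forecast.copy()  # leave the log-scale forecast untouched
    for col in columns:
        result[col] = np.exp(result[col])
    return result

# Stand-in for the dataframe returned by model.predict(future); values are made up.
demo_forecast = pd.DataFrame({
    'ds': pd.to_datetime(['2015-10-01', '2015-11-01', '2015-12-01']),
    'yhat': np.log([100.0, 110.0, 150.0]),
    'yhat_lower': np.log([90.0, 100.0, 135.0]),
    'yhat_upper': np.log([110.0, 120.0, 165.0]),
})

demo_original = to_original_scale(demo_forecast)
print(demo_original)
```

With a real model, the same call would be to_original_scale(forecast), using the forecast dataframe from the modeling step.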

Given that we have so little data, I doubt the use of holidays will make much difference in the forecasts, but it’s a good example to use. We can check the difference by re-running the Prophet forecast without holidays and comparing the two models: the average difference between them is ~0.06%, which isn’t terribly large but is still worth investigating. The Jupyter notebook that accompanies this post goes into much more detail on this aspect (as well as the overall analysis).
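
For anyone who wants to try that comparison, here is a rough sketch of one way to measure the gap between two forecasts. I’m assuming a mean absolute percentage difference on the yhat column; the notebook may well compute its figure differently. The function name mean_pct_difference is hypothetical, and the two small dataframes are made-up stand-ins for real Prophet output.

```python
import pandas as pd

def mean_pct_difference(forecast_a, forecast_b, column='yhat'):
    """Hypothetical helper: mean absolute % difference of `column`, aligned on ds."""
    merged = forecast_a[['ds', column]].merge(
        forecast_b[['ds', column]], on='ds', suffixes=('_a', '_b'))
    diff = (merged[f'{column}_a'] - merged[f'{column}_b']).abs()
    return (diff / merged[f'{column}_b'].abs()).mean() * 100

# Stand-ins for two forecast dataframes; the values are invented for illustration.
dates = pd.to_datetime(['2015-10-01', '2015-11-01', '2015-12-01'])
with_holidays = pd.DataFrame({'ds': dates, 'yhat': [12.90, 12.95, 13.10]})
without_holidays = pd.DataFrame({'ds': dates, 'yhat': [12.90, 12.95, 13.00]})

pct = mean_pct_difference(with_holidays, without_holidays)
print(f'Average difference: {pct:.2f}%')
```

With real models, forecast_a would come from the model built with holidays=promotions and forecast_b from a plain Prophet() model fit on the same df, both predicting over the same future dataframe.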

Note: You can find the full code for this post in a Jupyter notebook here:

Eric D. Brown, D.Sc. has a doctorate in Information Systems with a specialization in Data Sciences, Decision Support and Knowledge Management. He writes about utilizing Python for data analytics at pythondata.com and the crossroads of technology and strategy at ericbrown.com.

The post Forecasting Time Series data with Prophet – Part 3 appeared first on Python Data.