Prophet
Prophet is a time series forecasting technique developed in-house at Facebook. It is based on an additive model comprising a non-linear trend (piecewise linear or logistic), seasonality, and holiday effects. It also supports additional regressors. The additive model has the following components:
(1) g(t) : Trend. Prophet offers two trend models, linear and logistic, selected with the growth parameter.
Logistic Growth (Saturating Growth) : Prophet uses the logistic growth model for trends that saturate. To use this model, the cap parameter must be set; it provides the upper bound of the growth model. Prophet also adds changepoints to the logistic growth model, in the same way as described for linear growth below.
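As a minimal sketch of what saturating growth means, the basic logistic curve can be written as g(t) = C / (1 + exp(-k (t - m))), where C is the cap. The function name and the rate k and offset m below are illustrative assumptions rather than Kats or Prophet identifiers, and the sketch ignores changepoints.

```python
import math


def logistic_trend(t: float, cap: float, k: float, m: float) -> float:
    """Basic logistic growth curve: cap / (1 + exp(-k * (t - m))).

    cap is the saturation level, k the growth rate and m the time offset
    (the midpoint of the curve). All names here are illustrative.
    """
    return cap / (1.0 + math.exp(-k * (t - m)))


# The curve starts near 0, passes cap / 2 at t = m, and flattens out below cap.
values = [logistic_trend(t, cap=100.0, k=0.5, m=10.0) for t in range(0, 41, 10)]
```

The trend never exceeds cap, which is why the cap must be supplied before a logistic model can be fitted.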
Linear Growth : Prophet uses a piecewise linear model for linear growth. Trend changes are captured by changepoints at which the growth rate is allowed to change. Prophet uses a Bayesian model with a Laplace prior on the rate changes to control how flexible the trend is at the changepoints. The following parameters can be set in Kats:
- changepoints : List of dates at which to include potential changepoints. If not specified, potential changepoints are selected automatically.
- n_changepoints : The number of potential changepoints to include. Only used if changepoints is not specified, in which case n_changepoints changepoints are placed uniformly over the first changepoint_range proportion of the history.
- changepoint_range : Proportion of the history in which trend changepoints are estimated. Defaults to 0.8 (80%).
- changepoint_prior_scale : Controls the flexibility of the automatic changepoint selection. Larger values allow more changepoints to take effect, giving a more flexible trend. This is usually the most impactful parameter when forecasting with Prophet.
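The interaction between n_changepoints, changepoint_range and the piecewise linear trend can be sketched as follows. This is a simplified illustration in plain Python, assuming changepoints are given as integer positions in the history; the function names and the k, m and deltas arguments are hypothetical and are not part of the Kats API.

```python
from typing import List, Sequence


def uniform_changepoints(n_obs: int, n_changepoints: int = 25,
                         changepoint_range: float = 0.8) -> List[int]:
    """Spread changepoint positions evenly over the first
    changepoint_range proportion of the history (position 0 is excluded)."""
    hist_size = int(n_obs * changepoint_range)
    n = min(n_changepoints, max(hist_size - 1, 0))
    if n == 0:
        return []
    # Evenly spaced positions in [0, hist_size - 1], dropping position 0.
    step = (hist_size - 1) / n
    return [round(step * i) for i in range(1, n + 1)]


def piecewise_linear_trend(t: float, k: float, m: float,
                           changepoints: Sequence[float],
                           deltas: Sequence[float]) -> float:
    """Continuous piecewise linear trend.

    The slope starts at k and changes by deltas[j] at changepoints[j];
    the offset is adjusted at each changepoint so the segments join up.
    """
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            offset -= s * delta  # keeps the trend continuous at s
    return slope * t + offset


# 100 observations, 4 changepoints in the first 80% of the history.
cps = uniform_changepoints(n_obs=100, n_changepoints=4, changepoint_range=0.8)
```

In this picture, changepoint_prior_scale governs how large the entries of deltas are allowed to become: a small scale shrinks most rate changes towards zero, while a large scale lets the slope change freely.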
(2) s(t) : Seasonality. Prophet uses a Fourier series to estimate seasonality. The following parameters can be set in Kats:
- yearly_seasonality : Yearly seasonality (seasonality across the months of the year). If set to "auto", Prophet decides automatically whether to include it. A boolean includes or excludes it explicitly, and an integer sets the number of Fourier terms to generate.
- weekly_seasonality : Same as yearly_seasonality, but for weekly seasonality (seasonality across the days of the week).
- daily_seasonality : Same as yearly_seasonality and weekly_seasonality, but for daily seasonality (seasonality across the hours and minutes of the day).
- seasonality_mode : Whether seasonality is "additive" or "multiplicative".
- seasonality_prior_scale : Controls the strength of the seasonality model. Larger values allow the model to fit larger seasonal fluctuations, while smaller values dampen the seasonality.
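The role of the number of Fourier terms can be illustrated with a small sketch that builds the sine and cosine features for one seasonal period and combines them with coefficients. The function names and the coefficient values are assumptions made for illustration; in Prophet the coefficients are fitted, not supplied by hand.

```python
import math
from typing import List, Sequence


def fourier_features(t: float, period: float, order: int) -> List[float]:
    """Return [sin(2*pi*1*t/P), cos(2*pi*1*t/P), ..., sin(2*pi*N*t/P), cos(2*pi*N*t/P)]
    for period P and N = order Fourier terms."""
    feats: List[float] = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def seasonal_component(t: float, period: float, coeffs: Sequence[float]) -> float:
    """Seasonal effect s(t) as a weighted sum of Fourier features.
    coeffs holds one (sin, cos) coefficient pair per Fourier term."""
    order = len(coeffs) // 2
    feats = fourier_features(t, period, order)
    return sum(c * x for c, x in zip(coeffs, feats))


# Weekly seasonality (period of 7 days) with 2 Fourier terms and made-up coefficients.
example_coeffs = [0.5, -0.2, 0.1, 0.3]
tuesday_effect = seasonal_component(2.0, 7.0, example_coeffs)
```

More Fourier terms let the seasonal curve change more quickly within a period, at the cost of a higher risk of overfitting. Prophet's documented defaults are 10 terms for yearly and 3 terms for weekly seasonality.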
(3) h(t) : Holidays. Incorporating holidays can substantially improve forecast quality, because holiday effects are captured by neither trend nor seasonality yet can distort the estimation of both. The parameters used for holidays in Kats are:
- holidays : A set of holidays passed to Prophet as a pandas DataFrame with columns holiday (string) and ds (date type), and optionally columns lower_window and upper_window, which specify a range of days around the date to be treated as part of the holiday. An optional prior_scale column sets the prior scale for that holiday.
- holidays_prior_scale : Controls the strength of the holiday component, unless overridden by the prior_scale column of the holidays parameter.
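To make the effect of lower_window and upper_window concrete, the sketch below expands each holiday into the dates it covers. The rows use the column names described above with illustrative values, and expand_holiday is a hypothetical helper, not a Kats or Prophet function.

```python
from datetime import date, timedelta
from typing import List


def expand_holiday(ds: date, lower_window: int = 0, upper_window: int = 0) -> List[date]:
    """Dates treated as part of the holiday: ds + lower_window .. ds + upper_window days.

    lower_window is zero or negative (days before the holiday),
    upper_window is zero or positive (days after it).
    """
    return [ds + timedelta(days=offset) for offset in range(lower_window, upper_window + 1)]


# Rows in the shape of the holidays table (illustrative values only).
holiday_rows = [
    {"holiday": "christmas", "ds": date(2020, 12, 25), "lower_window": -1, "upper_window": 1},
    {"holiday": "new_year", "ds": date(2021, 1, 1), "lower_window": 0, "upper_window": 0},
]

affected = {
    row["holiday"]: expand_holiday(row["ds"], row["lower_window"], row["upper_window"])
    for row in holiday_rows
}
```

Rows like these can be converted with pandas.DataFrame(holiday_rows) before being passed as the holidays parameter.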
The model is therefore represented as: y(t) = g(t) + s(t) + h(t) + e(t), where e(t) is an error term covering changes not accommodated by the other components.
Forecasting Uncertainty
Prophet models uncertainty in the trend by extending the generative trend model forward. The scale of the prior for future changepoints is estimated from the variance of the rate changes inferred from the data. Future changepoints are randomly sampled, and the resulting trend variation is reflected in the uncertainty intervals. The following parameters tune the uncertainty intervals in Kats:
- interval_width : A float between 0 and 1 that determines the width of the uncertainty intervals provided for the forecast. If mcmc_samples = 0, only uncertainty in the trend (together with observation noise) is included; if it is greater than 0, full Bayesian inference is performed to compute uncertainty in all model parameters, including seasonality.
- uncertainty_samples : Number of simulated draws used to estimate the uncertainty intervals. If set to 0, uncertainty estimation is disabled.
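The idea of sampling future changepoints can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Prophet's implementation: it covers trend uncertainty only, assumes a fixed per-step changepoint probability, and draws rate changes from a Laplace distribution whose scale is the mean absolute historical rate change. All function and argument names other than interval_width and uncertainty_samples are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def laplace(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    if scale <= 0.0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def simulate_trend_intervals(last_value: float, rate: float,
                             past_deltas: Sequence[float], horizon: int,
                             changepoint_prob: float,
                             interval_width: float = 0.8,
                             uncertainty_samples: int = 1000,
                             seed: int = 0) -> List[Tuple[float, float]]:
    """Simulate future trend paths and return (lower, upper) bounds per step."""
    rng = random.Random(seed)
    # Scale of future rate changes, inferred from the historical ones.
    scale = sum(abs(d) for d in past_deltas) / len(past_deltas) if past_deltas else 0.0
    paths = []
    for _sample in range(uncertainty_samples):
        value, slope, path = last_value, rate, []
        for _step in range(horizon):
            if rng.random() < changepoint_prob:
                slope += laplace(scale, rng)  # a sampled future changepoint
            value += slope
            path.append(value)
        paths.append(path)
    # Central interval: e.g. interval_width = 0.8 uses the 10% and 90% quantiles.
    lo_q = (1.0 - interval_width) / 2.0
    hi_q = 1.0 - lo_q
    bounds = []
    for step in range(horizon):
        draws = sorted(p[step] for p in paths)
        lower = draws[int(lo_q * (len(draws) - 1))]
        upper = draws[int(hi_q * (len(draws) - 1))]
        bounds.append((lower, upper))
    return bounds


bounds = simulate_trend_intervals(100.0, 1.0, [0.5, -0.3, 0.2],
                                  horizon=30, changepoint_prob=0.2)
```

Because rate changes accumulate along each simulated path, the intervals widen with the forecast horizon, and a more flexible historical trend (larger past rate changes) produces wider intervals.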
Bayesian Inference
Prophet uses Stan's L-BFGS to find a maximum a posteriori estimate of the model specified by the various parameters. Alternatively, the parameter mcmc_samples can be set to a value greater than 0 to perform full posterior inference, so that model parameter uncertainty is included in the forecast uncertainty. This increases compute time but typically gives more accurate uncertainty bounds.
API
# Parameter class
class ProphetParams(Params):
    def __init__(
        self,
        growth="linear",
        changepoints=None,
        n_changepoints=25,
        changepoint_range=0.8,
        yearly_seasonality="auto",
        weekly_seasonality="auto",
        daily_seasonality="auto",
        holidays=None,
        seasonality_mode="additive",
        seasonality_prior_scale=10.0,
        holidays_prior_scale=10.0,
        changepoint_prior_scale=0.05,
        mcmc_samples=0,
        interval_width=0.80,
        uncertainty_samples=1000,
        cap=None,
    ):
        ...
Parameters:
growth : "linear" or "logistic"
changepoints : List of predefined changepoints (dates). None by default
n_changepoints : Number of potential changepoints
changepoint_range : Proportion of the initial history in which to look for changepoints
yearly_seasonality : "auto", True or False, or the number of Fourier terms for yearly seasonality
weekly_seasonality : Same as yearly_seasonality, for weekly seasonality
daily_seasonality : Same as yearly_seasonality, for daily seasonality
holidays : DataFrame of holidays to include in the forecast
seasonality_mode : "additive" or "multiplicative"
seasonality_prior_scale : Prior scale controlling the strength of the seasonality component
holidays_prior_scale : Prior scale controlling the strength of the holiday component
changepoint_prior_scale : Prior scale controlling the flexibility of the changepoints
mcmc_samples : > 0 for full posterior inference (1000 recommended)
interval_width : Probability width of the uncertainty interval
uncertainty_samples : Number of simulated draws used to estimate uncertainty intervals
cap : Upper limit for logistic growth
Methods
fit(): # fit the Prophet model with the given parameters
predict(steps, freq): # forecast the next steps periods at frequency freq
plot(): # plot the time series data with the confidence interval (if it exists)
For logistic growth, the predict function sets the cap from the params.
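Since logistic growth only works when cap is set, it can help to check this before building the parameters. The helper below is a hypothetical convenience function, not part of Kats; it only assembles and validates keyword arguments whose names match the ProphetParams signature shown above.

```python
from typing import Any, Dict, Optional


def prophet_param_kwargs(growth: str = "linear", cap: Optional[float] = None,
                         **extra: Any) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for ProphetParams and
    check that cap is provided whenever logistic growth is requested."""
    if growth not in ("linear", "logistic"):
        raise ValueError("growth must be 'linear' or 'logistic'")
    if growth == "logistic" and cap is None:
        raise ValueError("cap must be set when growth='logistic'")
    kwargs: Dict[str, Any] = {"growth": growth, **extra}
    if cap is not None:
        kwargs["cap"] = cap
    return kwargs


# Illustrative values; the cap should reflect the known saturation level of the series.
kwargs = prophet_param_kwargs(growth="logistic", cap=800.0, changepoint_prior_scale=0.05)
```

With Kats available, the resulting dictionary could be passed on as ProphetParams(**kwargs).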
Example
We use the air passenger data as an example for Prophet:
import pandas as pd
from infrastrategy.kats.consts import TimeSeriesData
from infrastrategy.kats.models.prophet import ProphetModel, ProphetParams
# read the data and rename its two columns as required by the TimeSeriesData structure
data = pd.read_csv("air_passengers.csv")
data.columns = ["time", "y"]
TSdata = TimeSeriesData(data)
# create ProphetParams, specifying the initial parameter values
params = ProphetParams(growth='linear', seasonality_mode='multiplicative')
# create ProphetModel with given data and params
m = ProphetModel(data=TSdata, params=params)
# call fit method to fit model
m.fit()
# call predict method to predict the next 30 steps
m.predict(steps=30, freq='MS')