ARIMA
ARIMA (Auto Regressive Integrated Moving Average) is a classical statistical model for time series data. Its name reflects its three main components:
- AR, Auto Regressive, means the variable of interest (time series) is regressed on its own lagged values
- MA, Moving Average, means the regression error is a linear combination of error terms whose values occurred contemporaneously and at various times in the past
- I, Integrated, means the data values have been replaced with the difference between each value and the previous value (differencing, which may be applied more than once)
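The "Integrated" step is the easiest of the three to show in isolation. The following is a minimal, standard-library sketch of repeated first-order differencing; the helper name `difference` is hypothetical and is not part of Kats or statsmodels.

```python
from typing import List


def difference(series: List[float], d: int = 1) -> List[float]:
    """Apply first-order differencing d times (hypothetical helper)."""
    out = list(series)
    for _ in range(d):
        # Replace each value with its change from the previous value.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


# A series with a curved upward trend.
trend = [1.0, 3.0, 6.0, 10.0, 15.0]

once = difference(trend, d=1)   # still trending upward
twice = difference(trend, d=2)  # constant, so the trend is removed
print(once, twice)
```

Each differencing pass shortens the series by one observation. For this toy series, two passes are needed before the values stop trending.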
We use the implementation in statsmodels and rewrite the API to match the Kats development style.
API
# Parameter class
class ARIMAParams(p, d, q, exog=None, dates=None, freq=None)
Parameters:
p: the order of AR terms
d: the number of times the series is differenced to make it stationary
q: the order of MA terms
exog: optional, exogenous variables
dates: optional, pandas-compatible datetime object
freq: optional, frequency of a given time series
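For reference, the roles of p, d and q can be read off the textbook ARIMA(p, d, q) equation. The notation below is an assumption, since this page defines no symbols: $y_t$ is the series, $L$ the lag operator ($L y_t = y_{t-1}$), $\phi_i$ the AR coefficients, $\theta_j$ the MA coefficients, $c$ a constant and $\varepsilon_t$ the error term. Sign conventions for the MA part vary between references.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i L^{i}\right) (1 - L)^{d} \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j L^{j}\right) \varepsilon_t
```

Here p is the number of lagged values of the differenced series, d is the exponent on the differencing operator $(1 - L)$, and q is the number of lagged error terms.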
# Model class
class ARIMAModel(data, params)
Methods
fit(): # fit ARIMA model with given parameters
predict(steps, freq): # forecast the next `steps` time points at the given frequency
plot(): # plot the time series data with the confidence interval (if one exists)
Example
We use the air passenger data as an example for the ARIMA model.
import pandas as pd
from infrastrategy.kats.consts import TimeSeriesData
from infrastrategy.kats.models.arima import ARIMAModel, ARIMAParams
# read the data and rename its two columns to the names required by TimeSeriesData
data = pd.read_csv("../data/example_air_passengers.csv")
data.columns = ["time", "y"]
TSdata = TimeSeriesData(data)
# create ARIMAParams with the initial parameter values
params = ARIMAParams(p=1, d=1, q=1)
# create ARIMAModel with given data and params
m = ARIMAModel(data=TSdata, params=params)
# call fit method to fit model
m.fit()
# call predict method to predict the next 30 steps
m.predict(steps=30, freq="MS")
# visualize the results
m.plot()
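In the `predict` call above, `freq="MS"` is the pandas alias for month-start frequency, so the 30 forecast steps fall on the first day of each of the next 30 months. The sketch below shows which dates that horizon covers, using only the standard library. It assumes the last observation is dated 1960-12-01 (an assumption about the example file), and `next_month_starts` is a hypothetical helper, not a Kats function.

```python
from datetime import date
from typing import List


def next_month_starts(last: date, steps: int) -> List[date]:
    """Return the first day of each of the `steps` months after `last`."""
    out = []
    year, month = last.year, last.month
    for _ in range(steps):
        # Advance one calendar month, rolling over at year end.
        month += 1
        if month > 12:
            month = 1
            year += 1
        out.append(date(year, month, 1))
    return out


# Assumed last observation date; adjust to match the actual data.
horizon = next_month_starts(date(1960, 12, 1), steps=30)
print(horizon[0], horizon[-1])
```

If the data were sampled at another frequency, the `freq` argument would need to change to match it.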