Time Series Decomposition
Time series decomposition is a classical technique that splits a time series into trend, seasonality and residual components. If we denote the time series as Y, and the trend, seasonality and residual components as T, S and R respectively, we can define two types of decomposition:
- Additive : Y[t] = T[t] + S[t] + R[t]
- Multiplicative : Y[t] = T[t] x S[t] x R[t]
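To make the two definitions concrete, here is a minimal NumPy sketch that builds one additive and one multiplicative series from synthetic components. All values (the trend slope, the period of 12, the noise levels) are illustrative assumptions and not taken from any real data set. It also checks that taking logarithms turns a multiplicative series with positive components into an additive one.

```python
import numpy as np

# Synthetic components (illustrative values only)
t = np.arange(48)
trend = 100.0 + 2.0 * t                                   # T[t]
seasonal_add = 10.0 * np.sin(2 * np.pi * t / 12)          # S[t], additive scale
seasonal_mul = 1.0 + 0.1 * np.sin(2 * np.pi * t / 12)     # S[t], multiplicative scale

rng = np.random.default_rng(0)
resid_add = rng.normal(0.0, 1.0, size=t.size)             # R[t], centered on 0
resid_mul = np.exp(rng.normal(0.0, 0.01, size=t.size))    # R[t], positive, centered on 1

# Additive: Y[t] = T[t] + S[t] + R[t]
y_add = trend + seasonal_add + resid_add

# Multiplicative: Y[t] = T[t] x S[t] x R[t]
y_mul = trend * seasonal_mul * resid_mul

# With strictly positive components, log(Y) is an additive decomposition
log_sum = np.log(trend) + np.log(seasonal_mul) + np.log(resid_mul)
log_identity_holds = bool(np.allclose(np.log(y_mul), log_sum))
```

In the additive case the seasonal swing keeps the same size as the level grows, while in the multiplicative case it scales with the trend.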
In Kats, we implement two time series decomposition methods, both of which are also available in statsmodels:
- seasonal_decompose : This method uses moving averages to estimate the trend and then uses the period of the series to estimate the seasonality. Only the decomposition type (additive or multiplicative) can be chosen here; no other parameters can be set for now.
- STL : This algorithm uses LOESS (locally estimated scatterplot smoothing) to extract estimates of the three components. More details of the algorithm can be found in this paper: *STL: A Seasonal-Trend Decomposition Procedure Based on Loess*, Cleveland et al., Journal of Official Statistics, Vol. 6, No. 1, 1990, pp. 3-73. In Kats, all of the STL parameters provided by statsmodels are supported and can be passed through kwargs. They are listed below:
- period : Periodicity of the sequence. If None, the algorithm will try to determine the period.
- seasonal : Length of the seasonal smoother. Should be an odd integer; defaults to 7.
- trend : Length of the trend smoother. Should be an odd integer.
- low_pass : Length of the low-pass filter. Must be an odd integer >= 3.
- seasonal_deg : Degree of seasonal LOESS. 0 (constant) or 1 (constant and trend).
- trend_deg : Degree of trend LOESS. 0 (constant) or 1 (constant and trend).
- low_pass_deg : Degree of low pass LOESS. 0 (constant) or 1 (constant and trend).
- robust : Flag indicating whether to use a weighted version that is robust to some forms of outliers.
- seasonal_jump : Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every seasonal_jump points and values between the two are linearly interpolated. Higher values reduce estimation time.
- trend_jump : Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every trend_jump points and values between the two are linearly interpolated. Higher values reduce estimation time.
- low_pass_jump : Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every low_pass_jump points and values between the two are linearly interpolated. Higher values reduce estimation time.
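The moving-average idea behind seasonal_decompose can be sketched in plain NumPy as follows. This is an illustration of the classical additive procedure under simplifying assumptions, not the Kats or statsmodels implementation: the function name `classical_additive_decompose` is hypothetical, the period must be supplied by the caller, and the edges of the trend are left as NaN.

```python
import numpy as np

def classical_additive_decompose(y, period):
    """Hypothetical helper: moving-average trend plus per-phase seasonal means."""
    y = np.asarray(y, dtype=float)
    n = y.size

    # Centered moving-average weights; an even period uses a 2 x period
    # filter (half weights at both ends) so the window stays symmetric.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2

    # Trend: moving average, undefined (NaN) near both edges
    trend = np.full(n, np.nan)
    trend[half:n - half] = np.convolve(y, weights, mode="valid")

    # Seasonality: average the detrended values at each position in the
    # cycle, then center the result so it sums to zero over one period.
    detrended = y - trend
    phase_means = np.array(
        [np.nanmean(detrended[p::period]) for p in range(period)]
    )
    phase_means -= phase_means.mean()
    seasonal = np.tile(phase_means, n // period + 1)[:n]

    # Residual: whatever trend and seasonality do not explain
    resid = y - trend - seasonal
    return trend, seasonal, resid

# Usage on a noise-free synthetic series with period 12
t = np.arange(48)
true_trend = 100.0 + 2.0 * t
true_seasonal = 10.0 * np.sin(2 * np.pi * t / 12)
trend, seasonal, resid = classical_additive_decompose(true_trend + true_seasonal, 12)
```

STL replaces both the moving average and the per-phase means with LOESS fits, which is why it exposes the smoother lengths, degrees and jump parameters listed above.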
API
# Time Series Decomposition class
class TimeSeriesDecomposition(data, decomposition, method, **kwargs)
def __init__(
self, data: TimeSeriesData, decomposition="additive", method="STL", **kwargs
)
data : TimeSeriesData object representing the time series
decomposition : 'additive' or 'multiplicative' decomposition
method : 'STL' or 'seasonal_decompose'
kwargs : other keyword parameters for the chosen method (for STL, the parameters listed above)
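Because the STL parameters are forwarded through kwargs, it can help to check them against the constraints listed above before constructing the class. The sketch below is a hypothetical helper (`check_stl_kwargs` is not part of Kats or statsmodels) that only encodes the rules stated in this document; the underlying library may enforce further rules.

```python
def check_stl_kwargs(**kwargs):
    """Hypothetical helper: return a list of problems found in STL kwargs."""
    problems = []

    def is_int(value):
        # bool is a subclass of int, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)

    # Smoother lengths should be odd integers; low_pass must also be >= 3
    for name in ("seasonal", "trend", "low_pass"):
        value = kwargs.get(name)
        if value is None:
            continue
        if not is_int(value) or value % 2 == 0:
            problems.append(f"{name} should be an odd integer")
        elif name == "low_pass" and value < 3:
            problems.append("low_pass must be >= 3")

    # LOESS degrees are 0 (constant) or 1 (constant and trend)
    for name in ("seasonal_deg", "trend_deg", "low_pass_deg"):
        value = kwargs.get(name)
        if value is not None and value not in (0, 1):
            problems.append(f"{name} must be 0 or 1")

    # Jump parameters are positive integers
    for name in ("seasonal_jump", "trend_jump", "low_pass_jump"):
        value = kwargs.get(name)
        if value is not None and (not is_int(value) or value < 1):
            problems.append(f"{name} must be a positive integer")

    return problems

# Illustrative parameter choices (assumed values, e.g. monthly data)
stl_kwargs = {"period": 12, "seasonal": 13, "robust": True}
good = check_stl_kwargs(**stl_kwargs)
bad = check_stl_kwargs(seasonal=8, low_pass=1, trend_deg=2, seasonal_jump=0)

# With Kats available, the checked dictionary would then be passed as:
# m = TimeSeriesDecomposition(TSData, "additive", "STL", **stl_kwargs)
```

Here `good` is empty, while `bad` reports one problem for each of the four invalid values.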
Methods
decomposer(): # run the decomposition and return the trend, seasonality and residual components
plot(): # plot the time series decomposition
Example
We use the air passengers data as an example for the decomposition model:
import pandas as pd
from infrastrategy.kats.consts import Params, TimeSeriesData
from infrastrategy.kats.utils.decomposition import TimeSeriesDecomposition
# Reading in the data and creating a TimeSeriesData object
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]
TSData = TimeSeriesData(DATA)
# Creating an additive decomposition object (method defaults to "STL")
m = TimeSeriesDecomposition(TSData, "additive")
out = m.decomposer()
# The components are available in out.
# Plotting the decomposition.
m.plot()