Backtesting

Kats offers four ways to backtest a model:

  1. Simple Backtesting
  2. Fixed Window Ahead Backtesting
  3. Expanding Window Backtesting
  4. Rolling Window Backtesting

In addition to these methods, users can use the CrossValidation class to create their own folds for training and testing their model.

Each of the four backtesting methods is implemented as its own class and inherits most of its attributes and functionality from the abstract parent class, BackTesterParent.

BackTesterParent

API:

# Abstract Parent Class
class BackTesterParent(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        model_class,
        multi: bool,
        offset=0,
        **kwargs
    ):

Parameters:

error_methods: list. List of strings indicating which errors to calculate
               (see backtesters.ALLOWED_ERRORS for exhaustive list)
data: TimeSeriesData. Kats TimeSeriesData object
params: Params. Kats Params object holding the model parameters
model_class: Untyped. Kats model class (not an instance) of the forecasting model
multi (optional): boolean. Flag to use multiprocessing to run each fold in parallel.
       Can throw errors if enabled when code is already running in a child process.
       Default True for rolling window and expanding window backtests.
offset (optional): int. Gap between training and testing datasets. Default 0.
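
To make the offset parameter concrete, here is a minimal sketch of how percentages and an offset could translate into index ranges for a single train/test split. The helper simple_split is hypothetical and not part of Kats, and rounding sizes down is an assumption; the library's own rounding may differ.

```python
def simple_split(n_points, train_pct, test_pct, offset=0):
    """Return (train_range, test_range) as half-open index pairs.

    Hypothetical helper for illustration only; not part of Kats.
    """
    # Convert percentages (0-100) into absolute sizes, rounding down (assumption).
    train_size = int(n_points * train_pct / 100)
    test_size = int(n_points * test_pct / 100)
    # The offset leaves a gap of unused points between training and testing data.
    test_start = train_size + offset
    test_end = min(test_start + test_size, n_points)
    return (0, train_size), (test_start, test_end)


print(simple_split(144, 75, 25))             # ((0, 108), (108, 144))
print(simple_split(144, 50, 25, offset=12))  # ((0, 72), (84, 120))
```

With offset=0 the test data starts immediately after the training data; a positive offset skips that many points first.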

Public Attributes:

results: list. List of tuples (training_data, testing_data, trained_model,
         forecast_predictions) storing forecast results
errors: dict. Dictionary (string -> float) mapping the error type to value
params: Params. Kats Params object holding the model parameters
size: int. Number of datapoints
error_funcs: dict. Dictionary (string -> function) mapping error name to function
             that calculates it
freq: str. Frequency of pandas dataframe (inferred)
raw_errors: List of numpy.arrays storing raw errors (truth - predicted)

Public Methods:

run_backtest(): # Creates all folds and runs backtests on them
get_error_value(error_name): # Gets the error value for the given error name
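
As a rough illustration of how the raw errors (truth - predicted) relate to the summary values stored in errors, the sketch below computes four of the metrics with their standard textbook definitions. The function summarize_errors is hypothetical, and the formulas are assumptions that may not match the Kats implementation exactly.

```python
import math


def summarize_errors(truth, predicted):
    """Compute common forecast error metrics from two equal-length sequences.

    Hypothetical helper using textbook definitions; not the Kats implementation.
    """
    # Raw errors follow the (truth - predicted) convention described above.
    raw = [t - p for t, p in zip(truth, predicted)]
    n = len(raw)
    mae = sum(abs(e) for e in raw) / n
    mse = sum(e * e for e in raw) / n
    rmse = math.sqrt(mse)
    # MAPE is returned as a fraction (0.1 means 10%) and requires nonzero truth values.
    mape = sum(abs(e / t) for e, t in zip(raw, truth)) / n
    return {"mape": mape, "mae": mae, "mse": mse, "rmse": rmse}


errors = summarize_errors([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(errors["mae"])  # 10.0
```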

BackTesterSimple

API:

class BackTesterSimple(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        train_percentage: float,
        test_percentage: float,
        model_class,
        **kwargs
    ):

Additional Parameters (beyond Parent Class BackTesterParent):

train_percentage: float. Percentage of data used for training (between 0-100)
test_percentage: float. Percentage of data used for testing (between 0-100)

Example:

import pandas as pd
import logging
from kats.utils.backtesters import BackTesterSimple
from kats.consts import TimeSeriesData
from kats.models.arima import ARIMAModel, ARIMAParams

# Read in the data
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]

# Create Kats Objects
TSData = TimeSeriesData(DATA)
params = ARIMAParams(p=1, d=1, q=1)
ALL_ERRORS = ['mape', 'smape', 'mae', 'mase', 'mse', 'rmse']

# Run Backtesting
backtester = BackTesterSimple(ALL_ERRORS, TSData, params, 75, 25, ARIMAModel)
backtester.run_backtest()

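# Log Errors (INFO messages are hidden unless logging is configured)
logging.basicConfig(level=logging.INFO, format="%(message)s")
for error, value in backtester.errors.items():
    logging.info("%s %s", error, value)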

"""
Output:
mape 0.11196918606389973
smape 0.11831396728056609
mae 52.10695557149991
mase 2.5646017691584593
mse 5202.055885499536
rmse 72.1252791017098
"""

BackTesterFixedWindow

API:

class BackTesterFixedWindow(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        train_percentage: float,
        test_percentage: float,
        window_percentage: int,
        model_class,
        **kwargs
    ):

Additional Parameters (beyond Parent Class BackTesterParent):

train_percentage: float. Percentage of data used for training (between 0-100)
test_percentage: float. Percentage of data used for testing (between 0-100)
window_percentage: int. Percentage of data used for fixed window (between 0-100)
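
One way to picture the fixed window is as a gap between the training and testing data. The sketch below shows that reading for the 50/25/25 configuration used in the example. The helper fixed_window_split is hypothetical, and both the placement of the window between the two sets and the rounding down of sizes are assumptions, not confirmed Kats behavior.

```python
def fixed_window_split(n_points, train_pct, test_pct, window_pct):
    """Return (train_range, test_range) as half-open index pairs.

    Hypothetical helper for illustration only; not part of Kats.
    """
    train_size = int(n_points * train_pct / 100)
    window_size = int(n_points * window_pct / 100)
    test_size = int(n_points * test_pct / 100)
    # Assumption: the fixed window is skipped between training and testing data.
    test_start = train_size + window_size
    test_end = min(test_start + test_size, n_points)
    return (0, train_size), (test_start, test_end)


print(fixed_window_split(144, 50, 25, 25))  # ((0, 72), (108, 144))
```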

Example:

import pandas as pd
import logging
from kats.utils.backtesters import BackTesterFixedWindow
from kats.consts import TimeSeriesData
from kats.models.arima import ARIMAModel, ARIMAParams

# Read in the data
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]

# Create Kats Objects
TSData = TimeSeriesData(DATA)
params = ARIMAParams(p=1, d=1, q=1)
ALL_ERRORS = ['mape', 'smape', 'mae', 'mase', 'mse', 'rmse']

# Run Backtesting
backtester = BackTesterFixedWindow(
                ALL_ERRORS,
                TSData,
                params,
                50, # Train Percentage
                25, # Test Percentage
                25, # Window Percentage
                ARIMAModel,
             )
backtester.run_backtest()

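# Log Errors (INFO messages are hidden unless logging is configured)
logging.basicConfig(level=logging.INFO, format="%(message)s")
for error, value in backtester.errors.items():
    logging.info("%s %s", error, value)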

"""
Output:
mape 0.1576580864978473
smape 0.17958289869819613
mae 75.88164448597571
mase 4.806063120878033
mse 10227.195099143722
rmse 101.12959556501609
"""

BackTesterExpandingWindow

API:

class BackTesterExpandingWindow(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        start_train_percentage: float,
        end_train_percentage: float,
        test_percentage: float,
        expanding_steps: int,
        model_class,
        multi=True,
        **kwargs
    ):

Additional Parameters (beyond Parent Class BackTesterParent):

start_train_percentage: float. Initial training window (between 0-100)
end_train_percentage: float. Final training window (between 0-100)
test_percentage: float. Percentage of data used for testing (between 0-100)
expanding_steps: int. Number of expanding steps to take (# of folds)
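
The sketch below shows how these parameters could map to folds: every fold trains from the start of the series, and the end of the training window grows from start_train_percentage to end_train_percentage. The helper expanding_window_folds is hypothetical, and the even spacing of the training-window end points is an assumption rather than documented Kats behavior.

```python
def expanding_window_folds(n_points, start_train_pct, end_train_pct, test_pct, steps):
    """Return a list of (train_range, test_range) half-open index pairs.

    Hypothetical helper for illustration only; not part of Kats.
    """
    start = int(n_points * start_train_pct / 100)
    end = int(n_points * end_train_pct / 100)
    test_size = int(n_points * test_pct / 100)
    folds = []
    for i in range(steps):
        # Assumption: training-window end points are evenly spaced.
        frac = i / (steps - 1) if steps > 1 else 0.0
        train_end = start + int(round(frac * (end - start)))
        test_end = min(train_end + test_size, n_points)
        # Training always starts at index 0, so the window only expands.
        folds.append(((0, train_end), (train_end, test_end)))
    return folds


for train, test in expanding_window_folds(144, 50, 75, 25, 3):
    print(train, test)
# (0, 72) (72, 108)
# (0, 90) (90, 126)
# (0, 108) (108, 144)
```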

Example:

import pandas as pd
import logging
from kats.utils.backtesters import BackTesterExpandingWindow
from kats.consts import TimeSeriesData
from kats.models.arima import ARIMAModel, ARIMAParams

# Read in the data
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]

# Create Kats Objects
TSData = TimeSeriesData(DATA)
params = ARIMAParams(p=1, d=1, q=1)
ALL_ERRORS = ['mape', 'smape', 'mae', 'mase', 'mse', 'rmse']

# Run Backtesting
backtester = BackTesterExpandingWindow(
                ALL_ERRORS,
                TSData,
                params,
                50, # Start Train Percentage
                75, # End Train Percentage
                25, # Test Percentage
                3,  # Expanding Steps (num folds)
                ARIMAModel
             )
backtester.run_backtest()

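# Log Errors (INFO messages are hidden unless logging is configured)
logging.basicConfig(level=logging.INFO, format="%(message)s")
for error, value in backtester.errors.items():
    logging.info("%s %s", error, value)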

"""
Output:
mape 0.11646836437857695
smape 0.12484447779715938
mae 47.34768419814879
mase 2.663259144891901
mse 4180.568820457178
rmse 64.3500798305892
"""

BackTesterRollingWindow

API:

class BackTesterRollingWindow(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        train_percentage: float,
        test_percentage: float,
        sliding_steps: int,
        model_class,
        multi=True,
        **kwargs
    ):

Additional Parameters (beyond Parent Class BackTesterParent):

train_percentage: float. Percentage of data used for training (between 0-100)
test_percentage: float. Percentage of data used for testing (between 0-100)
sliding_steps: int. Number of rolling steps to take (# of folds)
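
In contrast to the expanding window, a rolling window keeps the training size fixed and slides both the training and testing ranges forward. The sketch below illustrates this for the parameters above. The helper rolling_window_folds is hypothetical, and the even spacing of the shifts is an assumption rather than documented Kats behavior.

```python
def rolling_window_folds(n_points, train_pct, test_pct, steps):
    """Return a list of (train_range, test_range) half-open index pairs.

    Hypothetical helper for illustration only; not part of Kats.
    """
    train_size = int(n_points * train_pct / 100)
    test_size = int(n_points * test_pct / 100)
    # Largest shift that still leaves room for a full train + test window.
    max_shift = n_points - train_size - test_size
    folds = []
    for i in range(steps):
        # Assumption: shifts are evenly spaced between 0 and max_shift.
        frac = i / (steps - 1) if steps > 1 else 0.0
        shift = int(round(frac * max_shift))
        train = (shift, shift + train_size)
        test = (train[1], train[1] + test_size)
        folds.append((train, test))
    return folds


for train, test in rolling_window_folds(144, 50, 25, 3):
    print(train, test)
# (0, 72) (72, 108)
# (18, 90) (90, 126)
# (36, 108) (108, 144)
```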

Example:

import pandas as pd
import logging
from kats.utils.backtesters import BackTesterRollingWindow
from kats.consts import TimeSeriesData
from kats.models.arima import ARIMAModel, ARIMAParams

# Read in the data
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]

# Create Kats Objects
TSData = TimeSeriesData(DATA)
params = ARIMAParams(p=1, d=1, q=1)
ALL_ERRORS = ['mape', 'smape', 'mae', 'mase', 'mse', 'rmse']

# Run Backtesting
backtester = BackTesterRollingWindow(
                ALL_ERRORS,
                TSData,
                params,
                50, # Train Percentage
                25, # Test Percentage
                3,  # Sliding Steps (num folds)
                ARIMAModel
             )
backtester.run_backtest()

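# Log Errors (INFO messages are hidden unless logging is configured)
logging.basicConfig(level=logging.INFO, format="%(message)s")
for error, value in backtester.errors.items():
    logging.info("%s %s", error, value)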

"""
Output:
mape 0.15088136891940276
smape 0.14958146228491798
mae 57.53915506835433
mase 2.991962943893486
mse 5395.149798608472
rmse 73.08384900313399
"""

CrossValidation

The CrossValidation class lets the user create folds, train the model on each fold, and record the errors. Folds can currently be created with either an expanding window (the default) or a rolling window (rolling_window=True).

API:

class CrossValidation(
        self,
        error_methods: List[str],
        data: TimeSeriesData,
        params: Params,
        train_percentage: float,
        test_percentage: float,
        num_folds: int,
        model_class,
        rolling_window=False,
        multi=True,
    ):

Parameters:

error_methods: list. List of strings indicating which errors to calculate
               (see backtesters.ALLOWED_ERRORS for exhaustive list)
data: TimeSeriesData. Kats TimeSeriesData object
params: Params. Kats Params object holding the model parameters
train_percentage: float. Percentage of data used for training (between 0-100)
test_percentage: float. Percentage of data used for testing (between 0-100)
num_folds: int. Number of folds
model_class: Untyped. Kats model class (not an instance) of the forecasting model
rolling_window (optional): boolean. Flag to use rolling window method. Default False
multi (optional): boolean. Flag to use multiprocessing to run each fold in parallel.
       Can throw errors if enabled when code is already running in a child process.
       Default True for rolling window and expanding window backtests.
offset (optional): int. Gap between training and testing datasets. Default 0.

Public Attributes:

results: list. List of tuples (training_data, testing_data, trained_model,
         forecast_predictions) storing forecast results
errors: dict. Dictionary (string -> float) mapping the error type to value
size: int. Number of datapoints
raw_errors: list. List of numpy.arrays storing raw errors (truth - predicted)
backtester: child class of BackTesterParent. Backtesting Object that does
            training & evaluation

Public Methods:

run_cv(): # Creates all folds and does model training & testing on each.
get_error_value(error_name): # Gets the error value for the given error name

Example:

import pandas as pd
import logging
from kats.utils.backtesters import CrossValidation
from kats.consts import TimeSeriesData
from kats.models.arima import ARIMAModel, ARIMAParams

# Read in the data
DATA = pd.read_csv("air_passengers.csv")
DATA.columns = ["time", "y"]

# Create Kats Objects
TSData = TimeSeriesData(DATA)
params = ARIMAParams(p=1, d=1, q=1)
ALL_ERRORS = ['mape', 'smape', 'mae', 'mase', 'mse', 'rmse']

# Run Backtesting
crossvalidator = CrossValidation(
                    ALL_ERRORS,
                    TSData,
                    params,
                    50, # Train Percentage
                    25, # Test Percentage
                    3,  # Number of Folds
                    ARIMAModel,
                    rolling_window=True
                 )
crossvalidator.run_cv()

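# Log Errors (INFO messages are hidden unless logging is configured)
logging.basicConfig(level=logging.INFO, format="%(message)s")
for error, value in crossvalidator.errors.items():
    logging.info("%s %s", error, value)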

"""
Output:
mape 0.15088136891940276
smape 0.14958146228491798
mae 57.53915506835433
mase 2.991962943893486
mse 5395.149798608472
rmse 73.08384900313399
"""
Copyright © 2021 Kats Project @ Facebook