Cusum Detector
Cusum is a method to detect an upward or downward shift in the mean of a time series. The Kats implementation has two main components:
- Locate the change point: the algorithm iteratively estimates the means before and after the change point and picks the point that maximizes or minimizes the cusum value, repeating until the change point converges. By default, the search starts from the middle of the time series.
- Hypothesis testing: a log-likelihood ratio test is conducted, where the null hypothesis is no change point (one mean) and the alternative hypothesis is one change point (two means).
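The change point search can be sketched in plain Python as follows. This is a minimal illustration of the iteration described in the first bullet, not the Kats source: the function name locate_changepoint, the clamping of the candidate index, and the convention that the returned index is the last point of the "before" segment are all assumptions of this sketch. It expects at least three data points.

```python
from itertools import accumulate
from statistics import fmean


def locate_changepoint(values, direction="increase", max_iter=10, start_point=None):
    """Illustrative single-change-point search; not the Kats source."""
    n = len(values)
    # With the midpoint of the two means subtracted, the cusum bottoms out at
    # an upward shift and peaks at a downward shift.
    pick = min if direction == "increase" else max
    changepoint = n // 2 if start_point is None else start_point
    converged = False
    for _ in range(max_iter):
        mu0 = fmean(values[: changepoint + 1])
        mu1 = fmean(values[changepoint + 1 :])
        midpoint = (mu0 + mu1) / 2
        cusum = list(accumulate(v - midpoint for v in values))
        candidate = pick(range(n), key=cusum.__getitem__)
        candidate = max(1, min(candidate, n - 2))  # keep both segments non-empty
        if candidate == changepoint:
            converged = True
            break
        changepoint = candidate
    mu0 = fmean(values[: changepoint + 1])
    mu1 = fmean(values[changepoint + 1 :])
    return changepoint, mu0, mu1, converged


# Deterministic toy series: 30 points around 1.0, then 30 points around 2.0.
noise = [0.1 * ((i % 3) - 1) for i in range(30)]
series = [1.0 + e for e in noise] + [2.0 + e for e in noise]
changepoint, mu0, mu1, converged = locate_changepoint(series)
print(changepoint, round(mu0, 3), round(mu1, 3), converged)
```

On this toy series the search starts at index 30, moves to index 29 (the last low point) after one iteration, and stops there on the next pass because the candidate no longer changes.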
A few things worth mentioning:
- We assume there is only one increase/decrease change point.
- We use a Gaussian distribution as the underlying model to calculate the cusum value and conduct the hypothesis test.
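Under the Gaussian assumption, the one-mean versus two-means test can be sketched as below. This is an illustration only, not the Kats source: the function names, the use of maximum-likelihood variance estimates, and the chi-square reference distribution with two degrees of freedom are assumptions of this sketch. The residual variance is assumed to be non-zero.

```python
import math
from statistics import fmean


def gaussian_loglik(xs, mu, sigma):
    """Log-likelihood of xs under Normal(mu, sigma**2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in xs
    )


def llr_test(values, changepoint, threshold=0.01):
    """Illustrative one-mean vs. two-means test; not the Kats source."""
    before, after = values[: changepoint + 1], values[changepoint + 1 :]
    n = len(values)
    mu, mu0, mu1 = fmean(values), fmean(before), fmean(after)
    # Maximum-likelihood sigma under each hypothesis (assumed non-zero).
    sigma_null = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    rss_alt = sum((x - mu0) ** 2 for x in before) + sum((x - mu1) ** 2 for x in after)
    sigma_alt = math.sqrt(rss_alt / n)
    ll_null = gaussian_loglik(values, mu, sigma_null)
    ll_alt = gaussian_loglik(before, mu0, sigma_alt) + gaussian_loglik(after, mu1, sigma_alt)
    # Clamp at zero to absorb floating-point noise when the means coincide.
    llr = max(0.0, 2 * (ll_alt - ll_null))
    # Assumption: chi-square reference with 2 degrees of freedom, whose
    # survival function is exp(-x / 2).
    p_value = math.exp(-llr / 2)
    return llr, p_value, p_value <= threshold


noise = [0.1 * ((i % 3) - 1) for i in range(30)]
shifted = [1.0 + e for e in noise] + [2.0 + e for e in noise]
flat = [1.0 + e for e in noise] + [1.0 + e for e in noise]
print(llr_test(shifted, 29)[2], llr_test(flat, 29)[2])
```

The shifted series yields a large ratio and a tiny p-value, while the flat series yields a ratio near zero and a p-value near one, so only the first is flagged at the default threshold.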
API
# Model Class
class CusumDetector(data)
Methods
detector(
    threshold=0.01,
    max_iter=10,
    delta_std_ratio=1.0,
    min_abs_change=0,
    start_point=None,
    change_directions=None,
    interest_window=None,
    magnitude_quantile=None,
    magnitude_ratio=1.3,
    magnitude_comparable_day=0.5
) # run detector
plot() # plot results
Parameters
threshold: float, significance level;
max_iter: int, maximum number of iterations when searching for the changepoint;
delta_std_ratio: float, the mean delta must be larger than this parameter times the std of the data to be considered a change;
min_abs_change: int, minimal absolute delta between mu0 and mu1;
start_point: int, the starting index for the changepoint search; None means the middle of the time series;
change_directions: list<str>, a list containing either or both of 'increase' and 'decrease' to specify which types of change to detect;
interest_window: list<int, int>, a list containing the start and end of the interest window in which we look for a change point. Note that the llr is still calculated using all data points;
magnitude_quantile: float, the quantile for magnitude comparison; if None, the magnitude comparison is skipped;
magnitude_ratio: float, comparable ratio;
magnitude_comparable_day: float, maximal percentage of days that can have comparable magnitude for the change to be considered a regression.
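As an illustration of how threshold, delta_std_ratio and min_abs_change could combine, the sketch below applies the three conditions to a candidate change. The helper passes_change_filters is hypothetical and not part of Kats; whether each comparison is strict, and which standard deviation is used, are assumptions of this sketch.

```python
def passes_change_filters(
    p_value,
    mu0,
    mu1,
    data_std,
    threshold=0.01,
    delta_std_ratio=1.0,
    min_abs_change=0,
):
    """Hypothetical helper: combine the significance and magnitude conditions."""
    delta = mu1 - mu0
    # Significance: the p-value must not exceed the threshold (assumed <=).
    significant = p_value <= threshold
    # Relative size: the mean shift must exceed delta_std_ratio * std.
    large_vs_noise = abs(delta) > delta_std_ratio * data_std
    # Absolute size: the mean shift must reach min_abs_change (assumed >=).
    large_in_absolute_terms = abs(delta) >= min_abs_change
    return significant and large_vs_noise and large_in_absolute_terms


# A shift of 1.0 with std 0.5 passes the default filters ...
print(passes_change_filters(1e-6, mu0=1.0, mu1=2.0, data_std=0.5))
# ... while a shift of 0.2 is too small relative to the std.
print(passes_change_filters(1e-6, mu0=1.0, mu1=1.2, data_std=0.5))
```

Raising delta_std_ratio or min_abs_change makes the detector ignore statistically significant but practically small shifts.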
Outputs
The detector method outputs a tuple of (whether a regression is detected, metadata for the change). The metadata includes:
changepoint: change point index,
mu0: mean before change,
mu1: mean after change,
changetime: the time of the changepoint,
stable_changepoint: whether the changepoint converged,
delta: difference between mu0 and mu1,
llr_int: the log-likelihood ratio in the interest window,
llr: the log-likelihood ratio in the full time series,
p_value: p-value of the log-likelihood ratio test,
regression_detected: whether a regression is detected
Examples
Simulate data set
from infrastrategy.kats.consts import TimeSeriesData, TimeSeriesIterator
from infrastrategy.kats.detectors.regressionDetection import CusumDetector
import numpy as np
import pandas as pd
# simulate data
np.random.seed(10)
df = pd.DataFrame(
    {
        'time': pd.date_range('2019-01-01', '2019-03-01'),
        'increase': np.concatenate([np.random.normal(1, 0.2, 30), np.random.normal(2, 0.2, 30)]),
        'decrease': np.concatenate([np.random.normal(1, 0.3, 50), np.random.normal(0.5, 0.3, 10)]),
    }
)
Detect increase
timeseries = TimeSeriesData(
    df.loc[:, ['time', 'increase']]
)
detector = CusumDetector(timeseries)
# run detector
detector.detector()
# plot the results
detector.plot()
Detect decrease
timeseries = TimeSeriesData(
    df.loc[:, ['time', 'decrease']]
)
detector = CusumDetector(timeseries)
# run detector
detector.detector()
# plot the results
detector.plot()
Use interest_window
timeseries = TimeSeriesData(
    df.loc[:, ['time', 'increase']]
)
detector = CusumDetector(timeseries)
# run detector
detector.detector(interest_window=[40,50])
# plot the results
detector.plot()
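To show what restricting the search to an interest window means, the sketch below picks a change point candidate only from the indices inside the window and maps it back to a position in the full series. It is a single-pass simplification, not the Kats source: the helper candidate_in_window is hypothetical, and treating the window as half-open (start included, end excluded) is an assumption of this sketch.

```python
from itertools import accumulate
from statistics import fmean


def candidate_in_window(values, interest_window, direction="increase"):
    """Hypothetical helper: cusum extreme restricted to the interest window."""
    start, end = interest_window
    segment = values[start:end]  # assumed half-open window [start, end)
    center = fmean(segment)
    cusum = list(accumulate(v - center for v in segment))
    # An upward shift shows up as the cusum minimum, a downward one as the maximum.
    pick = min if direction == "increase" else max
    local_index = pick(range(len(segment)), key=cusum.__getitem__)
    return start + local_index  # index in the full series


# Toy series with an upward shift after index 29.
noise = [0.1 * ((i % 3) - 1) for i in range(30)]
series = [1.0 + e for e in noise] + [2.0 + e for e in noise]
print(candidate_in_window(series, [20, 40]))  # window covers the shift
print(candidate_in_window(series, [40, 50]))  # window misses the shift
```

When the window covers the shift, the candidate lands on index 29. When it does not, the candidate is still forced to lie between 40 and 49, which is why a test over all data points is needed to decide whether that candidate is a real change.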