sensortoolkit.qc._interval_downsampling.downsampling_interval

downsampling_interval(quant_df, thres_quant=0.99, plot_quantiles=True)

Check whether N times the median time delta exceeds the time delta at a threshold quantile (99% by default) for each dataframe, returning the first such multiple as the downsampling interval.

Say we have the following scenario, where a sensor was configured to record data at 60-second intervals but the recording interval occasionally drifted to shorter or longer intervals:

  • threshold quantile ('thres_quant') = 0.99 (99th percentile)

  • threshold recording interval (recording interval at the 99th percentile) = 115 seconds

  • median recording interval (recording interval at the 50th percentile) = 60 seconds

On the first iteration of the downsampling_interval() function, the function will check whether 1*60 seconds is greater than the threshold recording interval. Since 60 < 115 seconds, this is not true, so the function will step the multiplying factor up by 1. The second iteration will check whether 2*60 seconds is greater than the threshold recording interval. Since this is true (120 > 115 seconds), the loop will exit and indicate that the dataset should be downsampled to 120-second intervals.
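The multiplier search described above can be illustrated with a short sketch. This is a simplified, hypothetical re-implementation for illustration only (the helper name find_downsampling_interval is not part of sensortoolkit); it assumes the median and threshold-quantile time deltas are already known and expressed in seconds.

    def find_downsampling_interval(median_delta, thres_delta):
        """Return the first multiple of the median time delta that
        exceeds the time delta at the threshold quantile."""
        mult = 1
        while mult * median_delta <= thres_delta:
            mult += 1
        return mult * median_delta

    # Scenario above: median = 60 s, 99th-percentile time delta = 115 s
    print(find_downsampling_interval(60, 115))  # 120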

Parameters
  • quant_df (pandas DataFrame) – Dataset containing the time delta interval for measurements (for each dataset in the passed df_list) listed by quantile, ranging from 0 to 1 in 0.001 (0.1%) increments. The 0.5 quantile (50th percentile) corresponds to the median of time delta intervals for each dataset.

  • thres_quant (float, optional) – A threshold quantile (normalized between 0 and 1) for the distribution of time deltas in recorded datasets. Downsampling is applied for time delta intervals that are the first multiple of the median time delta that exceeds the time delta corresponding to the threshold quantile. Defaults to 0.99.

  • plot_quantiles (bool) – If True, create a figure displaying the distribution of time delta intervals in recorded datasets (relative frequency of recorded time deltas within each quantile interval vs. the time delta of consecutive recorded timestamps). Defaults to True.

Returns

The downsampling interval, in seconds.

Return type

interval (int or float)
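A hedged usage sketch follows, showing how a quantile table of time deltas might be assembled from raw timestamps and passed to downsampling_interval(). The column name 'sensor_1', the example timestamps, the choice of seconds as units, and the exact quantile indexing are illustrative assumptions, not taken from the library's documentation.

    import numpy as np
    import pandas as pd

    from sensortoolkit.qc._interval_downsampling import downsampling_interval

    # Recorded timestamps: nominally 60-second intervals with some drift.
    timestamps = pd.to_datetime(
        ['2023-01-01 00:00:00', '2023-01-01 00:01:00',
         '2023-01-01 00:02:05', '2023-01-01 00:03:00',
         '2023-01-01 00:04:55', '2023-01-01 00:05:55'])

    # Time deltas between consecutive timestamps, in seconds.
    deltas = pd.Series(timestamps).diff().dt.total_seconds().dropna()

    # Quantiles from 0 to 1 in 0.001 increments, one column per dataset.
    quantiles = np.linspace(0, 1, 1001)
    quant_df = pd.DataFrame({'sensor_1': deltas.quantile(quantiles)})

    interval = downsampling_interval(quant_df, thres_quant=0.99,
                                     plot_quantiles=False)
    print(interval)  # downsampling interval, in seconds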