Percent moving average
Data quality rule that verifies if a data quality check reading is comparable to average. An average is calculated based on previous results. The time window in which we define a number of previous results used to calculate average is customizable.
Parameters
This rule has two parameters that should be declared for each alert.
low
:
rule threshold for a low severity (1) alertmax_percent_above
: floatmax_percent_below
: float
medium
:
rule threshold for a medium severity (2) alertmax_percent_above
: floatmax_percent_below
: float
high
:
rule threshold for a high severity (3) alertmax_percent_above
: floatmax_percent_below
: float
Example
The following example shows how to implement moving_average
rule in regex_match_percent
sensor for a check.
The assigned severity depends on sensor result (see YAML configuration below):
Let's suppose that average X
is calculated based on last n
readings. If result is in defined ranges, then the following alerts are returned
(1 + low.max_percent_above/100) * X > result > (1 - low.max_percent_below/100) * X
, the check is passed: valid result(1 + medium.max_percent_below/100) * X < result < (1 - low.max_percent_below/100) * X
and(1 + medium.max_percent_above/100) * X
>result
>(1 - medium.max_percent_above/100) * X
the severity is low (1)(1 + high.max_percent_below/100) * X < result < (1 - medium.max_percent_below/100) * X
and(1 + high.max_percent_above/100) * > result > (1 - medium.max_percent_above/100) * X
the severity is low (2)(1 + high.max_percent_below/100) * X > result
or(1 + high.max_percent_above/100) < result
the severity is high (3)
As you can see, time series specification is defined. It's necessary to indicate which column in your table is timestamp. You can do it defining fields timestamp_column
, time_gradient
, and mode
in time_series
section.