Column datetime difference percent
Description
The column_datetime_differnece_percent
check calculates a time difference between records in corresponding columns,
compares those values to a threshold, and shows the percentage of rows where time difference is greater than the
threshold.
User has to configure three parameters: column1
, column2
check calculates datetime differences between records
in these columns , max_difference
which is a threshold value. The time_scale
corresponds to the time scale of the
datetime difference. This parameter is optional, and the default time scale is a DAY
.
Info
Although this sensor is designed to work on DATETIME
columns, it is possible to provide DATE
, TIMESTSMP
or
STRING
columns, only they will be cast as DATETIME
.
When to use
This check aims at informing the user about the amount of data that is considered timely.
Used template
Column datetime difference percent
Accepted rules
Parameters
column1
: str
the first column to calculate the time differencecolumn2
: str
the second column to calculate the time differencetime_scale
: str (optional)
a datetime difference timescale, accepted values:DAY
,HOUR
,MINUTE
,SECOND
(default:DAY
)max_diffference
: int
threshold to compare time differences, anything above this threshold is considered as delayed
How to use
The default configuration of column validity check column_datetime_differnece_percent
on columns reference_datetime
and real_datetime
with min_count
rule and default time scale.
DAY
.
SELECT
100.0*SUM(
CASE
WHEN
ABS(DATETIME_DIFF(analyzed_table.reference_datetime, analyzed_table.real_datetime, DAY)) < 10 THEN 1
ELSE 0 END
)/COUNT(*) AS actual_value
FROM `dqo-ai-testing`.`dqo_ai_test_data`.`test_data_timeliness_sensors_1287831065858823464` AS analyzed_table