String length in range percent
Description
The string_length_in_range_percent
check calculates the percent of string values whose length is within
a certain range provided by the user.
If a provided data type is not STRING
, the column is cast as one.
When to use
This check is useful when there is a need to validate value's length, e.g. social security number, postal code.
Info
To find value's length we use built in STRING
function LENGTH
. Keep in mind that white spaces are
included into the count.
Used sensor
String length in range percent
Accepted rules
Parameters
This checks has two optional parameters that configure date format to parse:
min_length
: Integer
minimal string's lengthmax_length
: Integer
maximal string's length
How to use
On the string column
Table configuration in YAML file for column check string_length_in_range_percent
on a column string_length
in table
table in test_data
dataset in dqo-ai
project on BigQuery and min_count
rule ,looks like this
min_length: 6
, max_length: 9
. The rendered query is
SELECT
CASE
WHEN COUNT(analyzed_table.`strings_length`) = 0 THEN NULL
ELSE
100.0 * SUM(
CASE WHEN LENGTH( analyzed_table.`strings_length` ) BETWEEN 6 AND 9 THEN 1
ELSE 0
END
) / COUNT(analyzed_table.`strings_length`) END AS actual_value
FROM `dqo-ai-testing`.`dqo_ai_test_data`.`table` AS analyzed_table
On the numeric column
Table configuration in YAML file for column check string_length_in_range_percent
on a column int_length
in table
table in test_data
dataset in dqo-ai
project on BigQuery and min_count
rule ,looks like this
SELECT
CASE
WHEN COUNT(analyzed_table.`int_length`) = 0 THEN NULL
ELSE
100.0 * SUM(
CASE WHEN LENGTH( SAFE_CAST(analyzed_table.`int_length` AS STRING) ) BETWEEN 6 AND 9 THEN 1
ELSE 0
END
) / COUNT(analyzed_table.`int_length`) END AS actual_value
FROM `dqo-ai-testing`.`dqo_ai_test_data`.`table` AS analyzed_table